
Re: [idn] Some comments



At 23.57 +0000 01-01-13, D. J. Bernstein wrote:
>Patrik writes:
>>  I don't want 8-bit clean protocols, and UTF-8. I want protocols which
>>  can handle UCS-2, UCS-4 or UTF-16
>
>UTF-8 is compatible with ASCII. UCS-4 is not.

It all depends on how you define "compatible". And no, you don't have
to explain, because I know what you are thinking of. We don't agree
here. It's as simple as that.

>Switching a protocol to UTF-8 preserves compatibility; ASCII data is
>unaffected. Switching a protocol to UCS-4 destroys compatibility; ASCII
>bytes turn into 4-byte sequences.

And?

Was it not a long-term, "correct", solution you wanted?
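
For readers following the byte-level point in the quoted text above, here
is a minimal sketch in Python. It assumes Python's "utf-32-be" codec as a
stand-in for UCS-4 in big-endian order without a byte order mark, and the
sample string is made up for illustration.

```python
# Hypothetical sample: a plain ASCII host name.
text = "example.com"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
# Assumption: UTF-32 big-endian without a BOM stands in for UCS-4 here.
ucs4_bytes = text.encode("utf-32-be")

# ASCII data is byte-for-byte identical when encoded as UTF-8.
print(utf8_bytes == ascii_bytes)

# Each ASCII character becomes a 4-byte unit in UCS-4.
print(len(ascii_bytes), len(ucs4_bytes))

# The first unit is three zero bytes followed by the ASCII byte for "e".
print(ucs4_bytes[:4])
```

With this sample the 11 ASCII bytes stay 11 bytes in UTF-8 and become 44
bytes in UCS-4. Whether that counts as breaking "compatibility" is exactly
the disagreement above; the sketch only shows the byte-level effect.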

>Do you seriously believe that the Internet is going to move to UCS-4
>rather than UTF-8? Exactly what benefits is anyone supposed to see in
>this? Occasionally I hear people claiming that a string of Unicode
>numbers is convenient because the displayed width of the string is
>proportional to the number of bytes; but they're wrong, thanks to
>combining characters and double-width characters.

If nothing else, for political reasons. The degree of opposition
to UTF-8 is about the same as the number of bits you need to
represent words in the language that the person opposing it
speaks.

I see a big risk that the solution will _NOT_ be Unicode if UTF-8 is
pushed, simply because of the way a character is encoded. Only
for political reasons.
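
As a side note on the quoted claim about displayed width, combining
characters, and double-width characters, here is a rough sketch in Python.
The helper display_cells is hypothetical and only approximates what a
terminal would do: it counts zero cells for combining marks and two cells
for characters that the Unicode database classifies as wide or fullwidth.
Real rendering depends on the font and the terminal.

```python
import unicodedata


def display_cells(s):
    """Approximate terminal cell count for s (hypothetical helper).

    Combining marks take 0 cells, East Asian wide/fullwidth characters
    take 2 cells, everything else takes 1 cell.
    """
    cells = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue  # combining mark: drawn on top of the previous character
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            cells += 2
        else:
            cells += 1
    return cells


# Three strings of two code points each (8 bytes each in UCS-4).
samples = [
    "ab",            # two ASCII letters
    "e\u0301",       # "e" followed by a combining acute accent
    "\u65e5\u672c",  # two CJK ideographs
]

for s in samples:
    ucs4_len = len(s.encode("utf-32-be"))  # assumption: UTF-32BE as UCS-4
    print(len(s), ucs4_len, display_cells(s))
```

All three samples occupy 8 bytes in UCS-4, yet under this approximation
they take 2, 1, and 4 cells respectively, so byte count alone does not
determine displayed width.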

>>  What I don't like in your arguments is the focus on "applications"
>>  instead of looking at what is specified in the protocols.
>
>What I don't like in your arguments is the focus on protocol specs as
>religious objects, rather than as tools to help implementors provide
>working software to system administrators and users.

Ok.

>>  You talk about quoted-printable encoding and charset parameter as it
>>  was something which destroys the content as shredding of luggage
>>  would do.
>
>``The luggage isn't destroyed, sir. It's just shredded. As I said, you
>can take some time and sew it back together. You haven't lost anything;
>it's all here! By the way, would you like to buy a sewing machine?''

Sigh...

>>  But, software doesn't handle 8 bit stuff correctly today even though
>>  we have in the IETF been talking about it for a very long time.
>
>The IETF should have required 8-bit-clean mail software in 1982. Allman
>would have fixed his software eventually, certainly before version 6.57
>in 1993. UTF-8 header fields should have been allowed in 1996. Everyone
>would have been happily using UTF-8 mail in 2001.

People would not have been happy at all. See above.

>Instead, the IETF mail standards _still_ allow MTAs to drop 8-bit bytes,
>even in message bodies. See http://cr.yp.to/docs/8bit/06.txt.

But you say that the standards are only an aid for implementors. How
come I still get broken email even though I have a sewing machine?

You didn't respond to my questions regarding what language etc. you use, Dan.

   paf