Re: [idn] Some comments
- To: idn@ops.ietf.org
- Subject: Re: [idn] Some comments
- From: Patrik Fältström <paf@cisco.com>
- Date: Sun, 14 Jan 2001 09:23:51 +0100
- Delivery-date: Sun, 14 Jan 2001 00:42:17 -0800
- Envelope-to: idn-data@psg.com
At 23.57 +0000 01-01-13, D. J. Bernstein wrote:
>Patrik writes:
>> I don't want 8-bit clean protocols, and UTF-8. I want protocols which
>> can handle UCS-2, UCS-4 or UTF-16
>
>UTF-8 is compatible with ASCII. UCS-4 is not.
It all depends on how you define "compatible", and no, you don't have
to explain, because I know what you are thinking of. We don't agree
here. It's as simple as that.
>Switching a protocol to UTF-8 preserves compatibility; ASCII data is
>unaffected. Switching a protocol to UCS-4 destroys compatibility; ASCII
>bytes turn into 4-byte sequences.
And?
Was it not a long-term, "correct", solution you wanted?
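For readers following the byte-level point in the quoted text above, here
is a minimal sketch of it. It assumes Python's "utf-32-be" codec as a
stand-in for big-endian UCS-4, and the sample string is only an
illustration; it says nothing about which encoding a protocol should
adopt.

```python
# Compare how pure-ASCII text is represented in UTF-8 and in UCS-4.
# Assumption: "utf-32-be" stands in for big-endian UCS-4.
text = "idn@ops.ietf.org"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
ucs4_bytes = text.encode("utf-32-be")

# UTF-8 leaves ASCII data byte-for-byte unchanged.
assert utf8_bytes == ascii_bytes

# UCS-4 spends four bytes per character; for ASCII, three of them are zero.
assert len(ucs4_bytes) == 4 * len(ascii_bytes)
assert ucs4_bytes[:4] == b"\x00\x00\x00i"

print(len(ascii_bytes), len(utf8_bytes), len(ucs4_bytes))
```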
>Do you seriously believe that the Internet is going to move to UCS-4
>rather than UTF-8? Exactly what benefits is anyone supposed to see in
>this? Occasionally I hear people claiming that a string of Unicode
>numbers is convenient because the displayed width of the string is
>proportional to the number of bytes; but they're wrong, thanks to
>combining characters and double-width characters.
If nothing else, for political reasons. The degree of opposition to
UTF-8 is about proportional to the number of bits needed to represent
words in the language spoken by the person opposing it.
I see a big risk that the solution will _NOT_ be Unicode if UTF-8 is
pushed, and that purely because of the way a character is encoded.
Only for political reasons.
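The display-width remark in the quoted paragraph above can be illustrated
with a rough sketch. The helper display_cells is hypothetical and uses a
simplified rule (combining marks count as 0 cells, East Asian wide or
fullwidth characters as 2, everything else as 1); real terminals and
fonts may differ. Again "utf-32-be" is assumed as a stand-in for UCS-4.

```python
import unicodedata

def display_cells(s: str) -> int:
    """Rough estimate of terminal cells (hypothetical helper).

    Combining marks take 0 cells, East Asian wide/fullwidth
    characters take 2, and everything else takes 1.
    """
    cells = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue  # combining mark: drawn on the previous character
        cells += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cells

samples = {
    "ascii": "abc",
    "combining": "e\u0301",   # 'e' followed by COMBINING ACUTE ACCENT
    "wide": "\u65e5\u672c",   # two CJK ideographs
}

# Print: name, code points, UCS-4 byte count, estimated cells.
for name, s in samples.items():
    print(name, len(s), len(s.encode("utf-32-be")), display_cells(s))
```

Under these assumptions the "combining" and "wide" samples occupy the
same number of UCS-4 bytes but a different number of cells, so byte count
alone does not give the displayed width.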
>> What I don't like in your arguments is the focus on "applications"
>> instead of looking at what is specified in the protocols.
>
>What I don't like in your arguments is the focus on protocol specs as
>religious objects, rather than as tools to help implementors provide
>working software to system administrators and users.
Ok.
>> You talk about quoted-printable encoding and charset parameter as it
>> was something which destroys the content as shredding of luggage
>> would do.
>
>``The luggage isn't destroyed, sir. It's just shredded. As I said, you
>can take some time and sew it back together. You haven't lost anything;
>it's all here! By the way, would you like to buy a sewing machine?''
Sigh...
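As a neutral illustration of what quoted-printable does to 8-bit content,
here is a small sketch using Python's standard quopri module. The sample
string is an assumption chosen for illustration. The sketch only shows
that the transformation is reversible at the byte level; the receiver
still needs the charset parameter to know how to interpret the decoded
bytes, and it does not settle which side of the argument is right.

```python
import quopri

# Sample text with two non-ASCII characters, encoded as UTF-8.
original = "Patrik F\u00e4ltstr\u00f6m".encode("utf-8")

# Quoted-printable rewrites each non-ASCII byte as =XX,
# so the encoded form contains only 7-bit bytes.
encoded = quopri.encodestring(original)
assert all(b < 0x80 for b in encoded)
assert b"=C3=A4" in encoded  # the two UTF-8 bytes of U+00E4

# Decoding restores the original bytes exactly.
decoded = quopri.decodestring(encoded)
assert decoded == original

print(encoded)
print(decoded.decode("utf-8"))
```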
>> But, software doesn't handle 8 bit stuff correctly today even though
>> we have in the IETF been talking about it for a very long time.
>
>The IETF should have required 8-bit-clean mail software in 1982. Allman
>would have fixed his software eventually, certainly before version 6.57
>in 1993. UTF-8 header fields should have been allowed in 1996. Everyone
>would have been happily using UTF-8 mail in 2001.
People would not have been happy at all. See above.
>Instead, the IETF mail standards _still_ allow MTAs to drop 8-bit bytes,
>even in message bodies. See http://cr.yp.to/docs/8bit/06.txt.
But you say that the standards are only an aid for implementors. How
come I still get broken email even though I have a sewing machine?
You didn't respond to my questions regarding what language etc you use, Dan.
paf