Re: [idn] What's wrong with skwan-utf8?
- To: briansp@walid.com
- Subject: Re: [idn] What's wrong with skwan-utf8?
- From: Kenneth Whistler <kenw@sybase.com>
- Date: Thu, 4 Jan 2001 11:14:25 -0800 (PST)
- Cc: idn@ops.ietf.org
- Delivery-date: Thu, 04 Jan 2001 11:14:48 -0800
- Envelope-to: idn-data@psg.com
A terminological quibble here:
> I guess I still don't get why some people are so focused on UTF-8.
> UTF-8 is an 8-bit encoding of the UCS. ACE (whatever flavor) is a 7-bit
> encoding of the UCS.
UTF-8, UTF-16, and UTF-32 are encoding forms of Unicode (or the UCS,
if you prefer). These have a privileged status in the standard(s), and
are implemented as processing forms of the encoded characters, as
well as interchange forms. People treat UTF-8 streams as streams of
the *characters* themselves, not as cryptographic puzzles to be teased
apart by the appropriate API before the characters can be identified.
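To make the "processing form" point concrete, here is a minimal Python sketch. The sample string and variable names are illustrative assumptions, not anything from the thread. It shows that the three encoding forms carry the same code points, and that UTF-8 bytes can be handled directly, because a byte in the ASCII range never appears inside a multi-byte sequence.

```python
# Illustrative sample; the name is hypothetical.
text = "bücher.example"

# Each encoding form represents the same sequence of code points.
# The "-be" variants fix the byte order so that no BOM is emitted.
forms = {name: text.encode(name) for name in ("utf-8", "utf-16-be", "utf-32-be")}
for name, data in forms.items():
    assert data.decode(name) == text

# UTF-8 can be processed as-is: splitting on the ASCII byte b"." cannot
# cut a multi-byte character in half, so no decoding step is needed.
labels = forms["utf-8"].split(b".")
print(labels)
```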
ACE, on the other hand, is one of a large class of things that are
referred to as transfer encoding syntaxes in the Unicode Character Model.
It is an explicit reshuffling of the bits to meet the bit-pattern
constraints of one or more protocols that can't handle the encoding
forms per se. Nobody is going to use ACE (or LACE or RACE or *ACE) as
a processing form of the encoded characters, nor will they use ACE
as a generic interchange form for the encoded characters, in any
but the protocols concerned with IDN.
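The contrast can be sketched in Python as well. One loud assumption: the standard library has no RACE or LACE codec, so this uses its `punycode` codec as a stand-in for "an ACE". Punycode was only standardized after this message was written, so take it purely as an illustration of the category.

```python
# Illustrative label; the name is hypothetical.
label = "bücher"

utf8 = label.encode("utf-8")
# Stand-in for an ACE (assumption: Punycode, not the RACE/LACE drafts).
ace = label.encode("punycode")

# In UTF-8 the character is still visible as its own byte sequence.
u_umlaut_visible = "ü".encode("utf-8") in utf8

# The ACE form is pure 7-bit ASCII: the non-ASCII character has been
# folded into the trailing letters and must be decoded to be recovered.
ace_is_7bit = all(b < 0x80 for b in ace)
recovered = ace.decode("punycode")

print(utf8, ace, u_umlaut_visible, ace_is_7bit, recovered)
```

The ACE output satisfies a protocol that only passes ASCII, but nothing in it can be identified as the "ü" without running the decoder, which is the sense in which it is a transfer encoding syntax and not a processing form.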
That said, I am not advocating one or the other particularly as
an IDN solution. (I see that the ACE advocates have strong arguments
in their favor.) But UTF-8 and ACE are not just morally equivalent
"encodings", and you need to understand that distinction to see why
UTF-8 advocates are so focused on UTF-8.
--Ken