Re: Just send UTF-8 with nameprep (was: RE: [idn] Reality Check)
- To: idn@ops.ietf.org
- Subject: Re: Just send UTF-8 with nameprep (was: RE: [idn] Reality Check)
- From: Eric Brunner-Williams in Portland Maine <brunner@nic-naa.net>
- Date: Wed, 18 Jul 2001 11:50:40 -0400
Recently Keith Moore wrote:
ACE is an *encoding* just like UTF-8 is an *encoding*.
I thought about responding, but Ken wrote something last January in the
discussion of the skwan-utf8 draft that apparently needs reposting.
Eric
------- Forwarded Message
Date: Thu, 4 Jan 2001 11:14:25 -0800 (PST)
From: Kenneth Whistler <kenw@sybase.com>
Message-Id: <200101041914.LAA24879@birdie.sybase.com>
To: briansp@walid.com
Subject: Re: [idn] What's wrong with skwan-utf8?
Cc: idn@ops.ietf.org
X-Sun-Charset: US-ASCII
Sender: owner-idn@ops.ietf.org
Precedence: bulk
A terminological quibble here:
> I guess I still don't get why some people are so focused on UTF-8.
> UTF-8 is an 8-bit encoding of the UCS. ACE (whatever flavor) is a 7-bit
> encoding of the UCS.
UTF-8, UTF-16, and UTF-32 are encoding forms of Unicode (or the UCS,
if you prefer). These have a privileged status in the standard(s), and
are implemented as processing forms of the encoded characters, as
well as interchange forms. People treat UTF-8 streams as streams of
the *characters* themselves, not as cryptographic puzzles to be teased
apart by the appropriate API before the characters can be identified.
ACE, on the other hand, is one of a large class of things that are
referred to as transfer encoding syntaxes in the Unicode Character Model.
It is an explicit reshuffling of the bits to meet the bit-pattern
constraints of one or more protocols that can't handle the encoding
forms per se. Nobody is going to use ACE (or LACE or RACE or *ACE) as
a processing form of the encoded characters, nor will they use ACE
as a generic interchange form for the encoded characters, in any
but the protocols concerned with IDN.
That said, I am not advocating one or the other particularly as
an IDN solution. (I see that the ACE advocates have strong arguments
in their favor.) But you need to understand that UTF-8 and ACE are
not just morally equivalent "encodings" to understand why UTF-8
advocates would be so focussed on it.
--Ken
------- End of Forwarded Message
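A small sketch of the distinction Ken draws, for readers who want to see it in bytes. It is an illustration only and not part of either message. It assumes Python 3 and uses Python's built-in "punycode" codec as a stand-in for "an ACE"; Punycode is not one of the ACE candidates (RACE, LACE) named above, and the sample label is made up for the example.

```python
# Illustration only: Python's built-in "punycode" codec stands in for "an ACE".
# The label below is a made-up sample (b, u-umlaut, c, h, e, r).
label = "b\u00fccher"

utf8_bytes = label.encode("utf-8")     # an encoding form
ace_bytes = label.encode("punycode")   # a transfer encoding syntax

print(utf8_bytes)  # b'b\xc3\xbccher'
print(ace_bytes)   # b'bcher-kva'

# UTF-8 maps each character to its own contiguous byte sequence, so encoding
# character by character gives the same bytes as encoding the whole string.
# A UTF-8 stream can therefore be processed as a stream of characters.
utf8_per_char = b"".join(ch.encode("utf-8") for ch in label)
utf8_is_per_character = (utf8_per_char == utf8_bytes)

# The ACE output reshuffles the string as a whole: the non-ASCII character is
# pulled out of its position and folded into a suffix. Encoding character by
# character does not reproduce the encoding of the whole label.
ace_per_char = b"".join(ch.encode("punycode") for ch in label)
ace_is_per_character = (ace_per_char == ace_bytes)

# What the ACE buys is the 7-bit constraint: every output byte is ASCII.
ace_is_7bit = all(b < 0x80 for b in ace_bytes)
utf8_is_7bit = all(b < 0x80 for b in utf8_bytes)

# The characters are only recoverable by running the matching decoder.
round_trip = ace_bytes.decode("punycode")

print(utf8_is_per_character, ace_is_per_character)  # True False
print(utf8_is_7bit, ace_is_7bit)                    # False True
```

The per-character check is only one way to show the difference. It does not settle which approach suits IDN better, which is the point on which Ken declines to take sides.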