
Re: [idn] What's wrong with skwan-utf8?



Thanks, Ken, for explaining this clearly. A bit of follow-up:

At 01/01/04 11:14 -0800, Kenneth Whistler wrote:
>A terminological quibble here:
>
> >   I guess I still don't get why some people are so focused on UTF-8.
> > UTF-8 is an 8-bit encoding of the UCS.  ACE (whatever flavor) is a 7-bit
> > encoding of the UCS.

ACE is not a 7-bit encoding of the UCS. It's a two-step encoding,
first from the UCS to the legacy host name repertoire, and then
from there to 7-bit octets (using US-ASCII). Of course, for most
people on this list, the second step is just ignored, because
US-ASCII is taken for granted anyway.
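
To make the two steps concrete, here is a rough Python sketch. It
rests on assumptions that are not part of this thread: Punycode
(the ACE later specified in RFC 3492) and the "xn--" prefix stand
in for "ACE (whatever flavor)", and ace_two_step is a hypothetical
helper name. It only handles a single label that contains at least
one non-ASCII character.

```python
def ace_two_step(label: str) -> tuple[str, bytes]:
    """Encode one label in two explicit steps (Punycode stands in for ACE)."""
    # Step 1: UCS -> host name repertoire (letters, digits, hyphen).
    # The result is still text, a string of characters, not octets.
    # Assumption: the label contains at least one non-ASCII character.
    ldh_text = "xn--" + label.encode("punycode").decode("ascii")
    # Step 2: host name repertoire -> 7-bit octets, using US-ASCII.
    octets = ldh_text.encode("ascii")
    return ldh_text, octets


label = "b\u00fccher"  # "bucher" with u-umlaut as the second letter
ldh_text, ace_octets = ace_two_step(label)

# UTF-8, by contrast, goes from the UCS to octets in a single step.
utf8_octets = label.encode("utf-8")

print(ldh_text)     # characters from the host name repertoire
print(ace_octets)   # the same characters as US-ASCII octets
print(utf8_octets)  # the non-ASCII character represented directly
```

The second step looks like a no-op, which is exactly why it tends
to be ignored.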

To get to truly solid internationalization, it is highly desirable
that there is no difference between the handling of US-ASCII and
non-US-ASCII characters, and in particular that the latter are
represented directly, and not via the former.

Also, because ACE maps to ASCII, it is only a very restricted
encoding of the UCS. Can you imagine a terminal emulator for ACE?
An XML page starting with <?xml version='1.0' encoding='ACE'?> ?
An editor that reads in a file in ACE, lets you edit it, and
writes it back out again? It's possible to create such things,
but they won't be useful at all for IDN, because each kind of
data has its own pattern of where domain names appear.


Regards,   Martin.