Re: [idn] UTF-8 as the long-term IDN solution
- To: idn@ops.ietf.org
- Subject: Re: [idn] UTF-8 as the long-term IDN solution
- From: Paul Hoffman / IMC <phoffman@imc.org>
- Date: Tue, 29 May 2001 09:07:07 -0700
- Delivery-date: Tue, 29 May 2001 09:14:53 -0700
- Envelope-to: idn-data@psg.com
At 4:25 AM +0000 5/29/01, D. J. Bernstein wrote:
>I suspect that we have consensus that the long-term IDN solution will
>encode Unicode characters as UTF-8 on the wire.
I suspect this is incorrect.
If ACE works for all the IDN strings we want, then there is no reason
to do the second transition to UTF-8.
If ACE doesn't work for all the IDN strings we want, then it should
not be deployed at all.
And, for those of you who are supporting the idea of the transition,
please consider the companies and people who register names that are
legal in ACE but illegal in UTF-8 because of the 63-octet limit on
name parts. As has been pointed out here earlier, many ACEs compress
some scripts better than UTF-8. Thus, if we do "ACE first then
UTF-8", there will be some names that are legal ACE and illegal UTF-8.
Is this a good policy?
The reverse is also true: there are many names that will fit in
63 octets of UTF-8 that cannot fit in 63 octets of any of the ACEs
proposed to date. Those names will be usable to some people (the ones
who have made the transition) but not to the people who haven't. How
is this transition supposed to work?
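The 63-octet mismatch in both directions can be checked mechanically. What follows is a minimal Python sketch under stated assumptions, not anything proposed on the list: it uses Python's built-in "punycode" codec as a stand-in ACE (Punycode was specified after this message was written and is not one of the ACEs discussed here), and the "xn--" prefix, the helper names, and the two sample labels are hypothetical, chosen only to illustrate each case.

```python
# Sketch: compare the on-the-wire size of one name part in UTF-8 and in an ACE.
# Assumptions: Punycode stands in for "an ACE"; "xn--" stands in for the ACE prefix.

MAX_LABEL_OCTETS = 63  # DNS limit on a single name part


def utf8_octets(label: str) -> int:
    """Octets needed to carry the label as raw UTF-8."""
    return len(label.encode("utf-8"))


def ace_octets(label: str, prefix: str = "xn--") -> int:
    """Octets needed for the label in the stand-in ACE, prefix included."""
    return len(prefix) + len(label.encode("punycode"))


def fits(label: str) -> tuple:
    """Return (fits as UTF-8, fits as ACE) against the 63-octet limit."""
    return (utf8_octets(label) <= MAX_LABEL_OCTETS,
            ace_octets(label) <= MAX_LABEL_OCTETS)


# Contrived label: 22 copies of one CJK character (3 octets each in UTF-8).
# The stand-in ACE compresses the repetition, so it fits as ACE only.
ace_only = "\u4e2d" * 22

# Contrived label: 59 ASCII letters plus one accented letter.
# The ACE prefix and delimiter push it over the limit, so it fits as UTF-8 only.
utf8_only = "a" * 59 + "\u00e9"

for name in (ace_only, utf8_only):
    print(utf8_octets(name), ace_octets(name), fits(name))
```

Any real ACE would give different counts; the point is only that neither encoding's set of legal names contains the other's.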
No transition is needed. Either we do ACE, or we do non-ACE.
Currently, ACE seems to be the strongly preferred solution in the
WG.
--Paul Hoffman, Director
--Internet Mail Consortium