
Re: [idn] UTF-8 as the long-term IDN solution



At 4:25 AM +0000 5/29/01, D. J. Bernstein wrote:
>I suspect that we have consensus that the long-term IDN solution will
>encode Unicode characters as UTF-8 on the wire.

I suspect this is incorrect.

If ACE works for all the IDN strings we want, then there is no reason 
to do the second transition to UTF-8.

If ACE doesn't work for all the IDN strings we want, then it should 
not be deployed at all.

And, for those of you who are supporting the idea of the transition, 
please consider the companies and people who register names that are 
legal in ACE but illegal in UTF-8 because of the 63-octet limit on 
name parts. As has been pointed out here earlier, many ACEs compress 
some scripts better than UTF-8. Thus, if we do "ACE first then 
UTF-8", there will be some names that are legal ACE and illegal UTF-8. 
Is this a good policy?

The reverse is also true: there are many names that will fit in 
63 octets of UTF-8 that cannot fit in 63 octets of any of the ACEs 
proposed to date. Those names will be usable to some people (the ones 
who have made the transition) but not to the people who haven't. How 
is this transition supposed to work?
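
Both mismatches can be checked mechanically. The following is only a 
rough sketch: it assumes Python's built-in "punycode" codec plus an 
"xn--" prefix as a stand-in for "an ACE", which is not one of the ACEs 
proposed here to date, and the helper names and sample labels are 
hypothetical.

```python
# Sketch: does a single name part fit in 63 octets as UTF-8 and as an ACE?
# Assumption: the "punycode" codec with an "xn--" prefix stands in for an
# ACE; other ACEs have different overhead and compression behaviour.

MAX_LABEL_OCTETS = 63  # limit on one name part


def utf8_octets(label: str) -> int:
    """Size of the label when sent as UTF-8."""
    return len(label.encode("utf-8"))


def ace_octets(label: str) -> int:
    """Size of the label in the stand-in ACE, including its prefix."""
    return len(b"xn--" + label.encode("punycode"))


def fits(label: str) -> dict:
    """Report whether the label is within the limit in each encoding."""
    return {
        "utf8": utf8_octets(label) <= MAX_LABEL_OCTETS,
        "ace": ace_octets(label) <= MAX_LABEL_OCTETS,
    }


# Thirty identical CJK characters: three octets each in UTF-8.
cjk_label = "\u4e2d" * 30

# Sixty-one ASCII letters plus one accented letter: the ACE must carry
# the prefix, a delimiter, and the encoded position on top of the ASCII.
mostly_ascii_label = "a" * 61 + "\u00e9"

if __name__ == "__main__":
    for name, label in [("cjk", cjk_label), ("mostly ascii", mostly_ascii_label)]:
        print(name, utf8_octets(label), ace_octets(label), fits(label))
```

With this stand-in, the first label is within the limit only as ACE and 
the second only as UTF-8, so neither encoding's set of legal names 
contains the other's.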

No transition is needed. Either we do ACE, or we do non-ACE. 
Currently, ACE seems to be the strongly preferred solution in the 
WG.

--Paul Hoffman, Director
--Internet Mail Consortium