Re: [idn] UNIX moving to UTF-8
> People are clearly moving to Unicode. Exactly which UTF they choose (8, 16,
> 32) is not as important, since they all can be converted to each other very
> efficiently and without loss.
And since an ACE is just another encoding of Unicode, you can add ACE to
that set.
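As a minimal sketch of that point, the following checks that a string
survives a round trip through each encoding. It assumes Punycode (the ACE
later standardized in RFC 3492) as a stand-in for whichever ACE is chosen;
the function name round_trips is illustrative only.

```python
# Sketch: every codec below is treated as just another lossless
# encoding of the same Unicode string.
# Assumption: Python's built-in "punycode" codec stands in for "an ACE".

CODECS = ("utf-8", "utf-16", "utf-32", "punycode")


def round_trips(text: str) -> bool:
    """Return True if text survives encode/decode in every codec."""
    for codec in CODECS:
        if text.encode(codec).decode(codec) != text:
            return False
    return True


if __name__ == "__main__":
    for label in ("bücher", "日本語"):
        print(label, round_trips(label))
```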
> It is however an overstatement to say that all environments are headed
> towards UTF-8.
And if there's not a single preferred encoding of Unicode that's being
widely supported, using ACE for IDNs makes about as much sense as anything.
The fact that ACE doesn't happen to use the 0x80 bit doesn't strike me
as a particularly good reason to rule it out - especially when (for a
carefully-designed ACE) the encoding will often be more efficient than
either UTF-16 or UTF-8.
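Whether a given ACE comes out shorter depends on the label and the script,
so it is worth measuring rather than assuming. A rough sketch follows,
again assuming Punycode with the "xn--" prefix as the ACE; both the prefix
and the helper name encoded_sizes are assumptions made for illustration.

```python
# Sketch: compare the encoded size of one label in UTF-8, UTF-16 and an ACE.
# Assumption: Punycode plus the "xn--" prefix represents the ACE.

ACE_PREFIX = b"xn--"  # assumed prefix; other ACE proposals differ


def encoded_sizes(label: str) -> dict:
    """Return the number of octets the label needs in each encoding."""
    return {
        "utf-8": len(label.encode("utf-8")),
        # utf-16-be avoids counting a byte-order mark
        "utf-16": len(label.encode("utf-16-be")),
        "ace": len(ACE_PREFIX + label.encode("punycode")),
    }


if __name__ == "__main__":
    for label in ("bücher", "日本語"):
        print(label, encoded_sizes(label))
```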
UTF-8 is optimized for ASCII compatibility, for the C language (so
that, for instance, NUL-terminated strings still work), and for
minimizing the amount of state that must be maintained while decoding.
ACE can be optimized for space efficiency, compatibility with protocols
that were designed for ASCII-only DNS names, and (with nameprep)
ease of comparison. It's not as if one is right and the other wrong.
They are different encodings to accommodate different sets of
transition issues.
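To illustrate the ease-of-comparison point, here is a small sketch that
compares two names by their prepared ACE form. It assumes Python's
built-in "idna" codec (IDNA 2003: nameprep per RFC 3491 plus Punycode) as
the ACE; the function name same_name is hypothetical.

```python
# Sketch: once names are nameprepped and ACE-encoded, comparing them is an
# ordinary ASCII case-insensitive comparison.
# Assumption: the stdlib "idna" codec stands in for nameprep + ACE.


def same_name(a: str, b: str) -> bool:
    """Return True if two domain names are equivalent after preparation."""
    # The codec leaves all-ASCII labels untouched, so fold ASCII case
    # explicitly, as DNS comparison does.
    return a.encode("idna").lower() == b.encode("idna").lower()


if __name__ == "__main__":
    print(same_name("BÜCHER.example", "bücher.EXAMPLE"))
    print(same_name("bücher.example", "bucher.example"))
```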
Keith