
[idn] Why we cannot go directly to UTF-8



It's really quite simple.

If we use UTF-8 names, each component in a signal path that handles
an IDN has to be upgraded before the application will work with IDNs.

For email, this means every UA, MTA, message store, mail filter,
mailing list, etc. that uses the addresses in the header or
envelope of a message.  For the web, this means every web browser,
proxy, cache, and origin server that makes use of domain names
in the request or response (header or payload).  In both cases,
it also means every DNS query library, resolver, cache, and server
involved in the lookups has to support UTF-8 (unless you believe
that the existing ones already support UTF-8 without protocol
extensions, which is far from a given).  There's little incentive
to upgrade any one component because so many others need to be
upgraded before you can get reliable operation.

If we use ASCII compatible names, each component in a signal path
that handles a domain name can upgrade independently, and things
will keep working - they just won't display the name as nicely if
they're not updated.  Only the components that interface with
users need to be upgraded before the users see a benefit.
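
To make that concrete, here is a rough sketch in Python.  It is an
illustration under assumptions, not a description of any particular
ACE proposal: it uses Python's built-in "idna" codec, which produces
the "xn--" Punycode form, as a stand-in for whatever ACE is chosen.
The legacy_accepts() and display() functions and the sample name are
hypothetical, invented only to play the roles of an un-upgraded
component and an upgraded user-facing one.

```python
import re

# Hypothetical stand-in for an un-upgraded component: it only accepts
# labels made of ASCII letters, digits and hyphens (the "LDH" rule).
LDH_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def legacy_accepts(name: bytes) -> bool:
    """Return True if an ASCII-only component would accept this name."""
    try:
        text = name.decode("ascii")
    except UnicodeDecodeError:
        return False
    return all(LDH_LABEL.match(label) for label in text.split("."))


def display(name: bytes) -> str:
    """Hypothetical upgraded user-facing component: decode ACE for display."""
    return name.decode("idna")


# Sample internationalized name; \u00fc is "u with diaeresis".
name = "b\u00fccher.example"

utf8_form = name.encode("utf-8")  # raw UTF-8 octets on the wire
ace_form = name.encode("idna")    # ASCII compatible form (assumed ACE)

print(legacy_accepts(utf8_form))  # the old component rejects UTF-8
print(legacy_accepts(ace_form))   # the old component passes ACE through
print(display(ace_form))          # only the upgraded UI shows the real name
```

The old component never needs to know the name is internationalized;
it just sees an odd-looking but legal ASCII name.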

It's easier to get real IDN support into the various components
using ASCII compatible names because fewer components need to be
upgraded.  And the incentives for adoption are greater with ASCII
compatible names because the benefit of upgrading will be seen sooner.

Users won't care whether the application protocols represent
IDNs in ACE or UTF-8.  But they will care about whether their
applications support IDNs.  ACE lets applications get there far
more quickly.

Keith