
[idn] Debunking the ACE myth



The ACE myth is that we can safely start using ACE IDNs before all the
browsers, mail clients, etc. have been modified.

Here's a counterexample, elaborating on David Conrad's messages. This is
an ACE failure involving a hypothetical ACE IDN, a hypothetical ACE
browser, a real mail client under a real UTF-8 xterm, and a real MTA.

I see a mailto URL in my browser. I don't like the browser's built-in
mailer, and changing the mailto handler is inconvenient, so I simply
copy-and-paste the address into my mail client. The address looks fine.
I type the message and send it. (This is a very common situation.)

The message bounces. In contrast, it would have gone through if the
domain name owner hadn't tried to use IDNs. ``Don't use IDNs!''

What went wrong is that the browser displayed the address as Unicode
characters. The address then passed through the copy-and-paste mechanism
(CTEXT UTF-8, although UTF8_STRING would be better), the UTF-8 xterm,
the mail client, and the 8-bit-clean MTA. None of these converted the
name back to ACE, so the MTA did its DNS lookup on the UTF-8 bytes, and
that lookup produced NXDOMAIN.
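
To make the mismatch concrete, here is a minimal sketch. It assumes
Punycode (via Python's built-in "idna" codec) as a stand-in for the
hypothetical ACE, and a toy zone table in place of a real DNS server;
the name, the address, and lookup() are all made up for illustration.

```python
# Hypothetical zone data: the owner registered only the ACE form.
ZONE = {b"xn--bcher-kva.example": "192.0.2.1"}

def lookup(wire_name: bytes) -> str:
    """Toy resolver: exact byte match, ignoring ASCII case only."""
    return ZONE.get(wire_name.lower(), "NXDOMAIN")

# What the ACE browser shows on screen (u-umlaut written as an escape).
displayed = "b\u00fccher.example"

# What an ACE-aware application would put on the wire.
ace_bytes = displayed.encode("idna")

# What the pasted address becomes when nothing converts it back to ACE.
utf8_bytes = displayed.encode("utf-8")

print(lookup(ace_bytes))   # 192.0.2.1
print(lookup(utf8_bytes))  # NXDOMAIN
```

The two byte strings name the same domain to a human reader, but the
resolver only ever sees bytes, and only one of the two is in the zone.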

You might try to help the mail go through by also providing UTF-8 IDNs.
But we've been told again and again that this doesn't work. ``What about
mail clients that reject 8-bit bytes? What about the default behavior of
UNIX gethostbyname()?'' If we do the software upgrades to make this work
then we can simply use UTF-8 without ACE.

These failures can't occur when there are no ACE applications. They also
can't occur when there are no non-ACE applications. But they can and
will occur if we deploy ACE while only _some_ applications are ACE
applications.
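
The cases can be enumerated in a short sketch. Again this assumes
Punycode (Python's "idna" codec) as a stand-in for the hypothetical ACE;
display(), send(), and the zone contents are illustrative names, not
anything from a real browser or mail client.

```python
from itertools import product

# Hypothetical registered name, stored in the zone in ACE form only.
ACE_NAME = b"xn--bcher-kva.example"
ZONE = {ACE_NAME}

def display(name: bytes, ace_aware: bool) -> str:
    """An ACE browser decodes to Unicode; a non-ACE one shows raw ASCII."""
    return name.decode("idna") if ace_aware else name.decode("ascii")

def send(text: str, ace_aware: bool) -> bytes:
    """An ACE mailer re-encodes to ACE; a non-ACE one passes UTF-8 through."""
    return text.encode("idna") if ace_aware else text.encode("utf-8")

# Try every combination of browser and mailer.
outcome = {}
for browser_ace, mailer_ace in product([False, True], repeat=2):
    wire = send(display(ACE_NAME, browser_ace), mailer_ace)
    outcome[(browser_ace, mailer_ace)] = wire in ZONE
    print(f"browser ACE={browser_ace!s:5} mailer ACE={mailer_ace!s:5} "
          f"-> {'delivered' if wire in ZONE else 'NXDOMAIN'}")
```

Under these assumptions the only failing combination is an ACE browser
feeding a non-ACE mailer, which is exactly the mixed deployment in the
copy-and-paste scenario.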

As I've commented before, moving to ACE (even without the subsequent
switch to UTF-8) appears to be much more expensive than moving directly
to UTF-8. See http://cr.yp.to/proto/idn.html for details. The ACE myth
has been the only counterargument, and it simply isn't true.

---Dan