Re: [idn] UTF-8 / RACE



Some people seem to be arguing that using ACE requires no less (or even
more) upgrading of software than using UTF-8 without ACE.  While it may
be true that ACE-fully-working-everywhere requires as much upgrading as
UTF-8-working-fully-everywhere, that comparison overlooks an important
point.

ACE affords incremental deployment much better than no-ACE.  Suppose I
am considering getting an IDN for my domain.  With ACE, this will make
things better for some users (who have upgraded their clients to decode
the ACE) and worse for others (who have old clients and will see ugly
ACEs) but nothing will actually break (mail will get through, web pages
will load, etc).
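To make the "nothing will actually break" point concrete, here is a minimal sketch. It assumes Python's built-in "idna" codec, which implements the later Punycode-based ACE (the "xn--" prefix from RFC 3490) rather than RACE itself; the helper names and the example domain are hypothetical. The idea it shows is the same: the ACE form is plain letters, digits, and hyphens, so a check written for old ASCII-only software still accepts it.

```python
import re

# One DNS label in the traditional letter-digit-hyphen (LDH) form,
# which is all that an un-upgraded client or server expects to see.
_LDH_LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def to_ace(name: str) -> str:
    """Encode an internationalized name into its ASCII-compatible form."""
    return name.encode("idna").decode("ascii")


def from_ace(ace: str) -> str:
    """Decode an ACE name back to native characters (an upgraded client)."""
    return ace.encode("ascii").decode("idna")


def is_ldh_hostname(name: str) -> bool:
    """Stand-in for the hostname check an old, ASCII-only program might do."""
    return all(_LDH_LABEL.match(label) for label in name.split("."))


if __name__ == "__main__":
    native = "bücher.example"      # hypothetical IDN
    ace = to_ace(native)
    print(ace)                     # ugly, but pure ASCII
    print(is_ldh_hostname(native)) # old software rejects the raw name
    print(is_ldh_hostname(ace))    # old software accepts the ACE
    print(from_ace(ace))           # upgraded software shows the native form
```

An old client would simply carry the ACE string around unchanged; only upgraded clients need to call something like from_ace for display.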

But without ACE, if I get an IDN for my domain, this will make things
better for some users (who have upgraded their clients, or who are lucky
enough to already be using UTF-8 clients) and will *completely* *break*
things for other users (mail will not get through, web pages will not
load, etc).  There may be nothing those users can do to fix it, because
the breakage might be happening in their provider's software.  The
provider might be very slow to upgrade, because 99% of their customers
might be English speakers, and the other 1% are just screwed.

With ACE, people might have to put visible ACEs in some config
files, which is annoying, but at least it will work.  Eventually
the application might get upgraded to support native characters in
the config files and then things will get easier.  Without ACE, the
application will simply be unusable with IDNs until it is upgraded.

Conclusion: There should be ACE.

Next question:  Given that there will be ACE, should DNS support 8-bit
queries in addition to ACE queries?  I don't know.  No matter what
is recommended or discouraged, some DNS servers will probably try to
guess the encoding of 8-bit queries.  This will help old clients, but
increases the risk of spoofing, and allows some applications to be
lazier about upgrading.  I haven't yet formed an opinion on this.
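The guessing problem can be illustrated with a small sketch. The function name and the list of candidate charsets are assumptions made for illustration; no real DNS server is being described. The same 8-bit query bytes decode to different names depending on which charset the server guesses, which is where the spoofing risk comes from.

```python
def interpretations(raw: bytes, charsets=("utf-8", "latin-1")) -> dict:
    """Return every candidate reading of an untagged 8-bit label.

    A charset that cannot decode the bytes at all is skipped.
    """
    readings = {}
    for charset in charsets:
        try:
            readings[charset] = raw.decode(charset)
        except UnicodeDecodeError:
            continue
    return readings


if __name__ == "__main__":
    query = "bücher".encode("utf-8")  # bytes as a UTF-8 client would send them
    for charset, text in interpretations(query).items():
        print(charset, repr(text))
```

Here the bytes are valid in both charsets but spell two different names, so a server that guesses has no reliable way to know which one the client meant.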

Next question:  Should ACE ever be phased out?  I don't think so.  Very
few systems will ever support all Unicode characters, so applications
will sometimes try to display IDNs containing unsupported characters.
What should they display?  They should display the ACE, because that's
no uglier than anything else they might display, and has the added bonus
that you can copy it into any other application and it will still work.
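The fallback rule can be sketched as follows. This again assumes Python's built-in "idna" codec (Punycode-based, not RACE), and it uses "can the name be encoded in a given charset" as a crude stand-in for "can this system display these characters"; the function name and that stand-in are both assumptions for illustration.

```python
def display_name(ace: str, charset: str) -> str:
    """Show the native form if the local charset supports it, else the ACE."""
    native = ace.encode("ascii").decode("idna")
    try:
        # Crude proxy for "this system can render every character".
        native.encode(charset)
    except UnicodeEncodeError:
        # Unsupported characters: fall back to the ACE, which can still
        # be copied into any other application unchanged.
        return ace
    return native


if __name__ == "__main__":
    ace = "bücher.example".encode("idna").decode("ascii")
    print(display_name(ace, "latin-1"))  # charset covers the name
    print(display_name(ace, "ascii"))    # charset does not: ACE is shown
```

Either way the user ends up with a string that works when pasted elsewhere, which is the property the paragraph above relies on.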

Furthermore, domain names are first and foremost *global* identifiers
intended to be used by *humans* *anywhere* to refer to network objects.
For people who know the Unicode characters in the domain name, the
native representation is easiest, but for the billions of people who
don't know those characters, the ACE is much easier to type, write,
speak, and visually compare.  (The Roman alphabet is one of the smallest
alphabets in existence, and is already widely recognized.  If you have
to choose a fallback character set for everyone to learn, that's the best
choice.)

I'm not saying that various protocols shouldn't allow 8-bit encoding in
addition to ACE (I have no opinion on that yet), I'm just saying that
ACE will always serve a useful function, and should never be deprecated.

AMC