Re: [idn] Thoughts on nameprep
- To: idn@ops.ietf.org
- Subject: Re: [idn] Thoughts on nameprep
- From: "Adam M. Costello" <amc@cs.berkeley.edu>
- Date: Sun, 11 Mar 2001 00:51:48 +0000
- Delivery-date: Sat, 10 Mar 2001 16:53:49 -0800
- Envelope-to: idn-data@psg.com
- User-Agent: Mutt/1.3.15i

"D. J. Bernstein" <djb@cr.yp.to> wrote:

> How will the bad names appear?

Hostnames will appear in text files and other electronic documents in
a variety of encodings, not just Unicode. They may have undergone
arbitrary transcoding in the past. In order to do a lookup on those
hostnames, they must eventually be transcoded to Unicode. If that
transcoder is not required to output normalization form KC, then it
could very easily produce a bad name.

Not all the precomposed characters that exist in Unicode 3.0 existed in
earlier versions. There are probably some transcoders still in use that
cannot produce good names because they were written before the required
code points were assigned. And those transcoders may be perfectly
adequate for their intended purpose, which may be completely unrelated to
domain names.

Therefore, an application that is concerned with domain names would be
wise not to make assumptions about Unicode text that is handed to it,
but instead to perform its own normalization.
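
A minimal sketch of what "perform its own normalization" could look
like, assuming Python and its standard unicodedata module (neither is
part of the original discussion). The function name normalize_name is
hypothetical, and the sketch covers only the form KC step, not the rest
of nameprep:

```python
import unicodedata

def normalize_name(text: str) -> str:
    """Return text in Unicode normalization form KC (hypothetical helper)."""
    return unicodedata.normalize("NFKC", text)

# Two spellings that render identically but differ in code points.
decomposed = "cafe\u0301"    # 'e' followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "caf\u00e9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# Before normalization the strings compare unequal.
assert decomposed != precomposed

# After normalization both collapse to the precomposed spelling.
assert normalize_name(decomposed) == precomposed
assert normalize_name(precomposed) == precomposed
```

Normalizing at the point where the application takes over the text means
it does not matter which transcoder or input method produced it.
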
> Now, are you claiming that the usual keyboard interface in some locale
> will produce an e followed by a combining acute accent?

In most locales the keyboard interface doesn't produce Unicode at all,
so we also have to consider how the transcoding is performed.

Keyboard interfaces are used to input text for many many purposes. I
think it would be very presumptuous to think that one narrow purpose,
domain names, should dictate how keyboard interfaces must work.

And as I mentioned above, keyboards are not the only sources of domain
names.

At the end of the day, whether the IETF requires nameprep or not, they
can't hold a gun to anyone's head; people will still be able to write
applications that neglect to do nameprep. But I bet those applications
will cause headaches for users when they occasionally fail in very
mysterious ways (because the user cannot distinguish bad names from good
names).
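
To illustrate the kind of mysterious failure meant here, the following
sketch compares a lookup that skips normalization with one that does
not. Everything in it is an assumption made for illustration: the
table, the name, the address, and both function names are hypothetical,
and a Python dictionary stands in for a real resolver.

```python
import unicodedata

# Hypothetical registry: the name is stored with precomposed U+00E9.
REGISTERED = {"caf\u00e9.example": "192.0.2.1"}

def lookup_raw(name: str):
    """Look the name up exactly as received, with no normalization."""
    return REGISTERED.get(name)

def lookup_normalized(name: str):
    """Normalize to form KC first, then look the name up."""
    return REGISTERED.get(unicodedata.normalize("NFKC", name))

# The same name as some other source might deliver it:
# 'e' followed by U+0301 COMBINING ACUTE ACCENT.
received = "cafe\u0301.example"

# The raw lookup finds nothing, although the name looks correct on screen.
assert lookup_raw(received) is None

# The normalizing lookup finds the registered entry.
assert lookup_normalized(received) == "192.0.2.1"
```

From the user's side the two spellings are indistinguishable, so the
failed lookup gives no hint about what went wrong.
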
AMC