Re: [idn] I-D ACTION:draft-ietf-idn-idna-08.txt
[Edmon, sorry you got two copies of this, it was an accident.]
Edmon Chung <edmon@neteka.com> wrote:
> non-ASCII requests are not "randomly misinterpreted", at least not in
> NeDNS, they get uniquely resolved into the intended domain name that a
> non-aware user is typing in.
Really? So if I type a name, and my IDN-unaware software blindly copies
the name into a DNS request (which won't necessarily happen, but let's
say it does), the server will match on the name I intended no matter
whether my system uses iso-8859-1 or shift_jis or iso-2022-jp or euc-jp
or UTF-8 or UTF-16be or UTF-16le or...? And it will match uppercase
Latin letters with fullwidth lowercase Latin letters? And it will match
precomposed characters with their decomposed equivalents?
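To make those three questions concrete, here is a minimal Python sketch. The label "café" and the variable names are hypothetical, chosen only for illustration; nothing here describes what NeDNS actually does. It shows that one typed name yields different octets under each charset, and that fullwidth and decomposed forms only compare equal after Unicode normalization, which a server cannot apply unless it knows the charset.

```python
import unicodedata

# Hypothetical label a user might type (not taken from any real zone).
name = "caf\u00e9"  # "café" with precomposed U+00E9

# The same name becomes a different octet string under each charset.
charsets = ["iso-8859-1", "utf-8", "utf-16-be", "utf-16-le"]
octets = {cs: name.encode(cs) for cs in charsets}
for cs, raw in octets.items():
    print(f"{cs:12} {raw.hex(' ')}")

# Uppercase Latin vs. fullwidth lowercase Latin: equal only after
# compatibility normalization (NFKC) plus case folding.
upper = "ABC"
fullwidth_lower = "\uff41\uff42\uff43"  # fullwidth "abc"
raw_match = upper == fullwidth_lower
folded_match = (
    unicodedata.normalize("NFKC", upper).casefold()
    == unicodedata.normalize("NFKC", fullwidth_lower).casefold()
)

# Precomposed vs. decomposed: equal only after canonical normalization.
decomposed = "cafe\u0301"  # "e" followed by combining acute accent
nfc_match = unicodedata.normalize("NFC", decomposed) == name
```

An octet-for-octet comparison treats every one of these variants as a different name.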
Isn't it inevitable that there will be collisions between the various
charsets, so that a request for one name using one charset will
accidentally match an unrelated name stored under a different charset?
That's what I mean by random misinterpretation. In the current DNS
protocol, the semantics of octets 80..FF are undefined, so the server
cannot know what the client is really asking for--it cannot know what
the user typing the name really saw on the screen. It can guess, and
maybe guess right most of the time, but it can't really know.
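A small Python sketch of such a collision, under the assumption (mine, for illustration) that one name was entered into a zone as ISO-8859-1 while a client sends a different name as UTF-8. Both names are hypothetical.

```python
# Two unrelated names, as two different users saw them on screen.
stored_name = "\u00c3\u00a9"   # "Ã©", assumed to be stored as ISO-8859-1
requested_name = "\u00e9"      # "é", assumed to be sent as UTF-8

stored_octets = stored_name.encode("iso-8859-1")
request_octets = requested_name.encode("utf-8")

# On the wire the two names are indistinguishable.
collision = stored_octets == request_octets

# Without a charset tag the server can only guess how to read the octets.
guesses = {cs: request_octets.decode(cs) for cs in ("utf-8", "iso-8859-1")}
```

Both decodings are valid text, so nothing in the octets themselves tells the server which name the user meant.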
> Adam, whether you like it or not, the reality is that non-ASCII
> request are reaching registry name servers and whether you resolve
> these domain names or not is the operator's choice.
That's true, I'm just saying it's risky. Whenever text is passed
around without a charset tag (implicit or explicit), there is a risk of
misinterpretation.
If RFC 1035 had said clearly "domain names in zones shall not contain
80..FF until the semantics have been defined", then we could decide
tomorrow that 80..FF are nameprepped UTF-8 (or arbitrary UTF-8 if we
require the server to perform Nameprep), and there would be no problem.
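For illustration, a sketch of what "nameprepped UTF-8" could look like, using the `nameprep` function from Python's standard `encodings.idna` module (an implementation of RFC 3491 Nameprep). The helper `to_wire` and the sample labels are hypothetical, and this is only one possible reading of the idea, not a description of any deployed server.

```python
from encodings.idna import nameprep


def to_wire(label: str) -> bytes:
    """Hypothetical helper: Nameprep a label, then encode it as UTF-8."""
    return nameprep(label).encode("utf-8")


# Variants a user might type for the same intended name.
variants = [
    "caf\u00e9",   # lowercase, precomposed
    "CAF\u00c9",   # uppercase, precomposed
    "cafe\u0301",  # lowercase, decomposed (combining acute)
]
wire_forms = {to_wire(v) for v in variants}

# Fullwidth Latin letters fold to plain ASCII under Nameprep.
fullwidth = to_wire("\uff21\uff22\uff23")  # fullwidth "ABC"
```

With a defined charset and a defined preparation step, all the variants reduce to a single octet string, so an exact octet comparison on the server is sufficient.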
Unfortunately, RFC 1035 contained no such clear prohibition, and various
conflicting interpretations of 80..FF have been deployed, and now any
standard definition will conflict with those deployments, and so people
who want standard 8-bit names are turning to EDNS and/or new classes.
AMC