Re: [idn] conflicts with ACE and STD13



I appreciate that you are trying to help with this problem, but it is not
the DNS protocol itself we are talking about, but what is actually used as
a protocol element in the applications. We need text that explains when and
why we have restrictions (if any) on which codepoints may be used here and
there. Have a look at the restriction list(s) in the nameprep document:
should they apply to domain names, or only to the subset that is host names?

I.e., I see this problem as being orthogonal to the encoding of the Unicode
codepoints, and as having little to do with the DNS protocol, which can
already handle 0x00-0xFF.
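
To make the domain name versus host name distinction concrete, here is a
minimal Python sketch. The function names are hypothetical and only for
illustration. It assumes the usual 63-octet label limit from STD13 and the
letter/digit/hyphen rule for host names; it says nothing about nameprep.

```python
import string

# Letters, digits and hyphen: the octets allowed in a host name label.
# Iterating over a bytes object yields ints, so this is a set of ints.
LDH = frozenset((string.ascii_letters + string.digits + "-").encode("ascii"))

HYPHEN = ord("-")


def is_valid_dns_label(label: bytes) -> bool:
    """Protocol view: any 1 to 63 octets, each in 0x00-0xFF, is a label."""
    return 1 <= len(label) <= 63


def is_host_name_label(label: bytes) -> bool:
    """Application view: the restricted subset used for host names."""
    return (
        is_valid_dns_label(label)
        and all(octet in LDH for octet in label)
        and label[0] != HYPHEN
        and label[-1] != HYPHEN
    )
```

A label such as b"\xe9cole" passes the first check but not the second,
which is exactly the gap the restriction text has to address.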

    paf

--On 2001-11-09 15.43 -0600 "Eric A. Hall" <ehall@ehsco.com> wrote:

> All domain names are unstructured eight-bit sequences, host names are a
> specific subset of that range. Host names are the exception, domain names
> are the rule. Treating domain names as the exception results in the above
> problem. This isn't a simple block of text...
> 
> The draft I'm working on punts with the problem cases cited: labels which
> only contain characters in the range 0x00 through 0x7E must only be
> encoded as STD13 octet sequences and UTF-8, while domain names that have
> any eight-bit value in the label are to be encoded as STD13 octet, ACE and
> UTF-8 equally. If a server is unable to choose between STD13 and ACE
> output encoding, it favors ACE on the assumption that it is more likely to
> be Latin-1 than an eight-bit code, and that ACE has future processing
> characteristics (can be used as CNAME for a host) whereas STD13 octet
> encoding does not. This is definitely a punt which is guaranteed to fail
> in more than one scenario. Some sort of group decision needs to be made on
> this at some point; ambiguous matches in DNS are not cool.
> 
> Note that UTF-8 does not suffer this ambiguity, since it doesn't overload
> a shared label: if the query arrived as UTF-8, the canonical UCS character
> is encoded as UTF-8 and returned for the recipient to decode, so there is
> no ambiguity as to which encoding should be used. Nor does it matter if
> the client wanted STD13 binary domain or an internationalized domain name,
> because there is no difference with this particular encoding scenario;
> they asked for a code point in a specific encoding and we comply.
> 
> -- 
> Eric A. Hall                                        http://www.ehsco.com/
> Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/