
Re: Just send UTF-8 with nameprep (was: RE: [idn] Reality Check)




Keith Moore wrote:

> As for zone files, I've already proffered a suggestion to have an
> exchange format (using ACE) distinct from the native format. The
> native format could use UTF-8 if that were the platform's native
> charset.  But we shouldn't presume that all platforms will use UTF-8.
> A separate UTF-8 exchange format, or a way of labelling native format
> to say that it is using UTF-8, might also be a possibility.

I think that the exchange format (master files and AXFR/IXFR) can and
should be one of the last points we discuss. It doesn't need to be decided
first, and if we debate it later we will have arguments that don't lend
themselves to an "X is preferred, so Y must suck" interpretation.

However, a couple of points.

 a) There will be some requirement for the UTF-8 namespace to
    provide manually-encoded ACE names directly. If we are
    converting UTF-8 to ACE, these fixed names are pretty easy to
    implement. Conversely, if we are doing ACE to UTF-8, then we
    have to come up with a way to prevent the ACE name from being
    converted to its UTF-8 equivalent in order for the ACE
    encoding to be preserved in the UTF-8 namespace.

 b) The exchange format applies to a specific zone. A zone which
    consists of IDNs only has to be consistent across the servers
    for that zone. If we are supporting a UTF-8 namespace for a
    particular zone, then it is okay to use UTF-8 as the default,
    since any additional servers which join the zone must be able
    to use it.
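
To make point (a) concrete, here is a minimal Python sketch of the
asymmetry. It is only an illustration under stated assumptions: Python's
built-in "idna" codec stands in for ACE (it implements the later
Punycode/"xn--" encoding, not anything settled in this thread), and the
function names and the `preserve` parameter are hypothetical.

```python
# Sketch only: Python's "idna" codec is used as a stand-in for "ACE".
# utf8_to_ace, ace_to_utf8 and `preserve` are hypothetical names.

def utf8_to_ace(name: str) -> str:
    """Convert a Unicode name to its ACE form.

    Labels that are already plain ASCII (including manually-encoded
    ACE labels) pass through unchanged, so fixed ACE names survive.
    """
    return name.encode("idna").decode("ascii")


def ace_to_utf8(name: str, preserve=frozenset()) -> str:
    """Convert an ACE name to Unicode, label by label.

    Without `preserve`, every ACE label is decoded, so a name that was
    meant to stay in ACE form is lost. `preserve` is the extra escape
    mechanism that the ACE-to-UTF-8 direction would need.
    """
    labels = []
    for label in name.split("."):
        if label.lower() in preserve:
            labels.append(label)  # keep the ACE encoding as-is
        else:
            labels.append(label.encode("ascii").decode("idna"))
    return ".".join(labels)


if __name__ == "__main__":
    print(utf8_to_ace("bücher.example"))
    print(utf8_to_ace("xn--bcher-kva.example"))
    print(ace_to_utf8("xn--bcher-kva.example"))
    print(ace_to_utf8("xn--bcher-kva.example", preserve={"xn--bcher-kva"}))
```

In the UTF-8-to-ACE direction nothing special is needed for a fixed ACE
name; in the other direction the caller has to supply the exception
list, which is the extra machinery described in (a).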

Like I said, this should probably be one of the last things we do. But
given the impact of (a) and the permission of (b), the default encoding
for zone management operations should probably be UTF-8.

> > Similarly for the DNS protocol: Send UTF-8, and if it's not found,
> > try again with ACE.
> 
> no, that's simply not acceptable.  DNS queries are already too
> slow and too unreliable, and you've just introduced some very
> interesting failure modes.

The default encoding for application functions needs to be ACE, except in
those cases where the application protocol explicitly provides in-stream
support for UTF-8 IDNs.

However, it's also important to remember that many lookups are not
dictated by application protocols (IETF or otherwise). Ping should
certainly be allowed to use the UTF-8 encoding on its command line, for
example. Hostnames in simple apps (like a POP3 server name) could also use
UTF-8, even though the POP3 protocol doesn't address this. It might be prudent to
simply say that "A lookups MAY default to UTF-8 if the operating
environment allows it, but the environment SHOULD explicitly indicate that
it supports this usage." We may need to encourage a usage whereby the
application can do a binding test to check the resolver.
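
As a rough sketch of what such a binding test might look like from the
application side: the code below picks the query encoding based on a
probe of the environment. Everything here is an assumption for
illustration. `resolver_supports_utf8` is a hypothetical callable (no
such resolver API exists that I know of), Python's "idna" codec stands in
for ACE, and nameprep is not applied on the UTF-8 path.

```python
# Sketch only: `resolver_supports_utf8` is a hypothetical probe supplied
# by the caller; it is not a real resolver API.
from typing import Callable


def query_name_bytes(name: str,
                     resolver_supports_utf8: Callable[[], bool]) -> bytes:
    """Return the bytes an application would hand to the resolver.

    ACE is the default. UTF-8 is used only when the environment
    explicitly indicates support, mirroring the MAY/SHOULD wording above.
    """
    if resolver_supports_utf8():
        # Assumption: nameprep would be applied here in a real
        # implementation; it is omitted to keep the sketch short.
        return name.encode("utf-8")
    # Fall back to ACE (Python's "idna" codec used as a stand-in).
    return name.encode("idna")


if __name__ == "__main__":
    print(query_name_bytes("bücher.example", lambda: True))
    print(query_name_bytes("bücher.example", lambda: False))
```

The point of the probe is that the application decides once, up front,
rather than sending UTF-8 and retrying with ACE on failure.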

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/