
Re: [idn] San Diego Meeting Notes



I object to ACE because the initial switch involves massive unnecessary
costs. The costs are discussed in http://cr.yp.to/proto/idn.html. See
below for a transliterated copy of idn.html.

I also object to the idea of specifying a short-term plan without a
long-term plan. How can we rationally evaluate the costs of ACE if we
don't also evaluate the likely costs of a future ACE-to-UTF-8 switch?

I also object to the characterization of ACE as an ``application-only''
solution. Is an MTA an ``application''? How about a DNS server? Unless
we can settle on a clear definition of ``application,'' we shouldn't
even be using the word, let alone making decisions that involve it.

As for nameprep: When are we going to see some working software? I have
a bunch of examples that I'd like to try; it's difficult to evaluate
nameprep without software.

---Dan



   D. J. Bernstein
   Protocols

                        Internationalized domain names

   We want to let people use domain names like abg.com (``alpha beta
   gamma dot com''). They should be able to register the name, set up
   computers under the name, connect to those computers by name, set up
   web pages under the name, set up links to those web pages, browse
   those web pages given the name or a link, send email from an address
   under the name, receive email at that address, etc.

   The big question is how these domain names will be encoded. This web
   page describes two proposals, and tallies the costs of each proposal
   for UNIX. (The costs for Windows are of similar types but generally
   smaller; Microsoft has supported Unicode for years.)

   Important note: An uppercase ABG.com (``Alpha Beta Gamma dot com'') is
   guaranteed to cause confusion: when the uppercase Alpha is printed
   properly, it looks just like an uppercase A. Some other strings are
   also guaranteed to cause confusion. There's a complicated definition
   of good names among all Unicode strings. Good names won't be confused
   with each other. Registrars won't allow registration of bad names.
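
   As a small illustration of why the uppercase form is confusing, the
   sketch below prints the code points behind the two look-alike
   letters. The helper name describe is made up for this example; only
   Python's standard unicodedata module is assumed.

```python
import unicodedata

latin_a = "\u0041"      # LATIN CAPITAL LETTER A
greek_alpha = "\u0391"  # GREEK CAPITAL LETTER ALPHA

def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

print(describe(latin_a))
print(describe(greek_alpha))
# The two characters compare unequal even though they are usually drawn alike.
print(latin_a == greek_alpha)
# The lowercase forms are visually distinct: "a" versus Greek small alpha.
print(describe(latin_a.lower()))
print(describe(greek_alpha.lower()))
```

   The strings differ at the byte level, so software treats them as
   different names even when a reader cannot tell them apart on screen.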

Base costs

   Users need to be able to see common Unicode characters. The necessary
   fonts are available, as is a version of xterm that displays the
   characters given UTF-8 input. These are all included with the current
   version of XFree86, so they are being deployed as part of regular OS
   upgrades.

   Users will sometimes need to type strange addresses from business
   cards. Finding an unusual character in a huge font display is
   difficult, so I expect business cards to provide more information,
   such as Unicode numbers in small type. Keyboard interfaces will have
   to improve to accept this information. (The ISO/IEC 14755 method is to
   hold Shift-Ctrl while typing the hex digits: Shift-Ctrl-222E for
   character 222E.)
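
   To make the ``Unicode numbers'' idea concrete, here is a minimal
   sketch of what an input method would do with the digits once they
   are typed. The function name char_from_unicode_number is
   hypothetical; the conversion itself uses only Python built-ins.

```python
import unicodedata

def char_from_unicode_number(hex_digits: str) -> str:
    """Turn a printed Unicode number such as '222E' into the character itself."""
    code_point = int(hex_digits, 16)
    return chr(code_point)

ch = char_from_unicode_number("222E")
print(unicodedata.name(ch))   # the character's official Unicode name
print(ch.encode("utf-8"))     # the UTF-8 bytes a program would receive
```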

Costs of ACE with slow nameprep

   What it means. Domain names are encoded as 7-bit strings in the
   following contexts:
     * DNS registration forms.
     * DNS queries and responses.
     * Mail message header fields: From, To, Received, etc.
     * POP USER commands. (POP usernames typically include domain names.)
     * Various parts of IMAP.
     * HTTP fields: Host, etc.
     * URLs.

   Domain names are encoded as UTF-8 strings in the following contexts:
     * The argument to gethostbyname.
     * h_name and h_aliases.
     * /etc/hosts and many other network configuration files.
     * /etc/named.boot and zone files.
     * Command lines for ndc, nsupdate, etc.
     * BIND log files.
     * /service/dnscache/root/servers.
     * dnscache log files.
     * /service/tinydns/root/data.
     * Command lines for add-host, add-ns, etc.
     * tinydns log files.
     * /etc/resolv.conf, $LOCALDOMAIN, etc.
     * Command lines for dig, host, etc.
     * Output of dig, host, etc.
     * Command lines for telnet, ssh, etc.
     * /etc/hosts.allow, .ssh/known_hosts, etc.
     * httpd.conf.
     * /public/file.
     * More HTTP server configuration files.
     * lynx.cfg.
     * Command line for lynx.
     * .fetchmailrc.
     * Pine interface: message displays, command line, etc.
     * Mutt interface: message displays, command line, etc.       
     * More mail clients.

   A domain name encoded as a UTF-8 string is permitted to be a bad name
   if it looks just like a good name. It is interpreted as that good
   name.

   Making it work. BIND, tinydns, and other DNS servers need to be   
   upgraded. Domain names in configuration files need to be converted  
   from possibly bad to good, and from UTF-8 to 7-bit. 7-bit domain names
   in queries need to be converted to UTF-8 for logs.

   The gethostbyname DNS client library needs to be upgraded. The input
   domain name needs to be converted from possibly bad to good, and from
   UTF-8 to 7-bit, before it is sent as a DNS query. The output domain
   names in h_name and h_aliases need to be converted from 7-bit to
   UTF-8.
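
   The two conversions described for gethostbyname can be sketched as
   follows. This is only an illustration under loud assumptions: it
   uses Python's built-in idna codec, which implements the nameprep and
   ASCII encoding that were standardized later, as a stand-in for
   whatever ACE and nameprep end up being chosen. The function names
   utf8_to_ace and ace_to_utf8 are invented for the example.

```python
def utf8_to_ace(name_utf8: bytes) -> bytes:
    """Possibly bad UTF-8 name -> good 7-bit name, ready for a DNS query."""
    name = name_utf8.decode("utf-8")
    # The idna codec applies nameprep (case mapping, NFKC) and then
    # encodes each non-ASCII label into a 7-bit form.
    return name.encode("idna")

def ace_to_utf8(name_ace: bytes) -> bytes:
    """7-bit name from a DNS answer -> UTF-8 for h_name, logs, and displays."""
    return name_ace.decode("idna").encode("utf-8")

lower = "\u03b1\u03b2\u03b3.com".encode("utf-8")   # alpha beta gamma dot com
upper = "\u0391\u0392\u0393.com".encode("utf-8")   # Alpha Beta Gamma dot com

query = utf8_to_ace(upper)
print(query)                        # 7-bit form sent in the DNS query
print(ace_to_utf8(query) == lower)  # the answer comes back as the lowercase name
```

   Every program in the lists above that touches both a 7-bit context
   and a UTF-8 context would need some equivalent of this pair of
   functions.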

   The tcpclient networking tool needs to be upgraded. The input domain
   name needs to be converted from possibly bad to good, and from UTF-8
   to 7-bit, before it is sent as a DNS query.

   telnetd, sshd, tcpserver, etc. need to be upgraded. Domain names in
   configuration files need to be converted from possibly bad to good,
   and from UTF-8 to 7-bit. 7-bit domain names need to be converted to 
   UTF-8 for logs.

   Many low-level networking tools need to be upgraded. Domain names in
   configuration files need to be converted from possibly bad to good,
   and from UTF-8 to 7-bit. 7-bit domain names need to be converted to
   UTF-8 for logs.    

   Pine, Mutt, and other text-mode mail clients need to be upgraded.
   7-bit domain names need to be converted to UTF-8 when messages are
   displayed. UTF-8 needs to be spaced properly. Addresses in
   configuration files (for example, ``flag messages to djb@cr.yp.to'')
   need to be converted from possibly bad to good, and from UTF-8 to
   7-bit.

   fetchmail and other POP clients need to be upgraded. Domain names
   embedded in POP usernames need to be converted from possibly bad to
   good, and from UTF-8 to 7-bit.

   Netscape Mail and other graphical mail clients need to be upgraded.
   7-bit domain names need to be displayed as Unicode glyphs. Addresses
   in configuration files need to be converted from possibly bad to good,
   and from UTF-8 to 7-bit.

   Apache, publicfile, and other HTTP servers need to be upgraded. Domain
   names in configuration files need to be converted from possibly bad to
   good, and from UTF-8 to 7-bit. 7-bit domain names need to be converted
   to UTF-8 for logs.

   Lynx and other text-mode browsers need to be upgraded. Domain names in
   configuration files need to be converted from possibly bad to good,
   and from UTF-8 to 7-bit. 7-bit domain names need to be converted to
   UTF-8 for internationalized URL displays.

   Netscape and other graphical browsers need to be upgraded. 7-bit
   domain names need to be displayed as Unicode glyphs. Domain names in
   configuration files need to be converted from possibly bad to good, 
   and from UTF-8 to 7-bit.

Costs of UTF-8 with fast nameprep

   What it means. Domain names are encoded as UTF-8 strings in all of the
   above contexts.

   Bad names are not allowed to appear. (Exception: Users can send bad
   names in DNS registration forms; the registrar will send back a
   rejection notice showing the closest good name.)
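
   The registrar behavior in the exception above might look roughly
   like this. The real definition of a good name is complicated and is
   not reproduced here; as a labeled assumption, the sketch treats
   ``NFKC normalization followed by case folding'' as the mapping to
   the closest good name. Both function names are hypothetical.

```python
import unicodedata

def closest_good_name(name: str) -> str:
    """Assumed stand-in for the real rules: NFKC, then case folding."""
    return unicodedata.normalize("NFKC", name).casefold()

def handle_registration(name: str) -> str:
    """Accept a good name; otherwise reject and show the closest good name."""
    good = closest_good_name(name)
    if name == good:
        return f"accepted: {name}"
    return f"rejected: {name} (closest good name: {good})"

print(handle_registration("\u03b1\u03b2\u03b3.com"))  # lowercase alpha beta gamma
print(handle_registration("\u0391\u0392\u0393.com"))  # uppercase Alpha Beta Gamma
```

   Under this assumption the check runs once, at registration time, so
   no other program has to repeat it.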

   Making it work. Sendmail needs to be upgraded. Current versions
   discard bytes \200 through \237 (octal; 0x80 through 0x9F) in mail
   message headers.
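
   To see why discarding those bytes matters for UTF-8, the sketch
   below simulates the behavior on a header value. It is a model of the
   effect described here, not Sendmail's actual code, and the function
   names are made up for the example.

```python
def strip_200_to_237(header: bytes) -> bytes:
    """Drop every byte in the octal range 200 through 237."""
    return bytes(b for b in header if not (0o200 <= b <= 0o237))

def survives(text: str) -> bool:
    """True if the UTF-8 form of text is unchanged by the stripping."""
    raw = text.encode("utf-8")
    return strip_200_to_237(raw) == raw

print(survives("\u03b1\u03b2\u03b3.com"))  # lowercase alpha beta gamma
print(survives("\u0391\u0392\u0393.com"))  # uppercase Alpha Beta Gamma
```

   UTF-8 continuation bytes run from \200 through \277, so only some
   characters are hit: the lowercase Greek letters here pass through,
   while the uppercase ones each lose a byte and are no longer valid
   UTF-8.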

   The gethostbyname DNS client library needs to be upgraded. Many
   current installations, in violation of RFC 2181, reject DNS answers 
   that contain unusual characters. (However, some versions will work 
   correctly with options allow_special all or options no-check-names in
   /etc/resolv.conf.) 

   Pine, Mutt, and other text-mode mail clients need to be upgraded.
   UTF-8 needs to be spaced properly.
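
   ``Spaced properly'' means that column alignment has to follow the
   displayed width of the text rather than its byte count. A minimal
   sketch, assuming a terminal that draws East Asian wide characters in
   two columns and combining marks in zero; it ignores control
   characters and ambiguous-width cases. The names display_width and
   pad_to are hypothetical.

```python
import unicodedata

def display_width(text: str) -> int:
    """Estimate terminal columns: wide chars take 2, combining marks take 0."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue                      # mark is drawn over the previous char
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2                    # wide or fullwidth character
        else:
            width += 1
    return width

def pad_to(text: str, columns: int) -> str:
    """Pad with spaces by display width rather than by byte or character count."""
    return text + " " * max(0, columns - display_width(text))

for sample in ("abc", "\u65e5\u672c", "e\u0301"):
    print(len(sample.encode("utf-8")), len(sample), display_width(sample))
```

   The three numbers printed per sample (bytes, code points, columns)
   generally differ, which is why a mail client that pads by byte count
   produces ragged message indexes.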

   There's one report that an obsolete version of the Netscape mailer
   crashes under Solaris when it reads UTF-8 messages. I need verifiable
   details.

   Are there more programs that need to be upgraded? Let me know.