[idn] Re: Just send UTF-8 with nameprep



"Eric A. Hall" <ehall@ehsco.com> wrote:

>  a) There will be some requirement for the UTF-8 namespace to provide
>     manually-encoded ACE names directly.  If we are converting
>     UTF-8 to ACE, these fixed names are pretty easy to implement.
>     Conversely, if we are doing ACE to UTF-8, then we have to come up
>     with a way to prevent the ACE name from being converted to its
>     UTF-8 equivalent in order for the ACE encoding to be preserved in
>     the UTF-8 namespace.

There is only one namespace.  Every name in that single namespace has
both an ASCII representation and a UTF-8 representation (which are
identical in the case of ASCII names).  Every protocol, message format,
file format, etc. is free to require the ASCII representation or to
allow both representations.  If both representations are allowed, then
they must be treated the same by the receiver, because they are the
same name.  There is no need to tag some names as must-use-ACE.  Are
you trying to support names that really do begin with the ACE signature
prefix, so that the user sees the ACE prefix even inside IDN-aware
applications?  Such names are forbidden; they simply do not exist in the
namespace.  Trying to support them would be much more trouble than it's
worth.
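
To illustrate the "one namespace, two representations" point, here is a
minimal sketch in Python.  It assumes Python's built-in "idna" codec
(which implements the later IDNA 2003 ToASCII operation with the "xn--"
prefix) as a stand-in for whatever ACE is eventually chosen; the helper
names canonical_name and same_name are hypothetical.  A receiver that
accepts both representations reduces each to the ASCII form before
comparing, so the two spellings cannot be treated differently.

```python
def canonical_name(raw: bytes) -> bytes:
    """Reduce a host name received as UTF-8 or as ACE to its ASCII form.

    Assumption: the "idna" codec stands in for the ACE conversion.
    A name that is already pure ASCII (including one already in ACE
    form) passes through the codec unchanged.
    """
    text = raw.decode("utf-8")          # ASCII is a subset of UTF-8
    return text.encode("idna").lower()  # ASCII comparison is case-insensitive


def same_name(a: bytes, b: bytes) -> bool:
    """True if both byte strings denote the same name in the namespace."""
    return canonical_name(a) == canonical_name(b)


utf8_form = "b\u00fccher.example".encode("utf-8")
ace_form = b"xn--bcher-kva.example"
print(same_name(utf8_form, ace_form))
```

Nothing in this sketch tags a name as must-use-ACE; the ACE form is just
another spelling of the same name.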

> Ping should certainly be allowed to use the UTF-8 encoding on its
> command line, for example.

This is not a protocol issue, but a user-interface issue, and operating
systems have their own ways of dealing with it.  For example, under UNIX,
commands should assume that their command-line arguments use the
encoding indicated by LC_ALL or LANG.
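
As a rough sketch of that lookup in Python: the function below reads the
codeset out of the locale environment variables.  The function name and
the "ASCII" fallback are my own assumptions, and I have added LC_CTYPE
between the two variables named above because POSIX consults it in that
position.  A real command would more likely call setlocale() and ask the
C library for the codeset instead of parsing the variables by hand.

```python
import os


def charset_from_env(env=os.environ, default="ASCII"):
    """Guess the command-line encoding from the locale variables.

    Locale names look like language[_territory][.codeset][@modifier].
    The first variable that is set and non-empty decides the result.
    """
    # LC_CTYPE is an addition to the two variables named in the text.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            locale_name = value.split("@", 1)[0]   # drop any @modifier
            if "." in locale_name:
                return locale_name.split(".", 1)[1]
            return default                         # e.g. "C" or "en_US"
    return default


print(charset_from_env({"LC_ALL": "ja_JP.eucJP", "LANG": "en_US.UTF-8"}))
```

A ping command would then decode its host name argument with that
charset before doing any name preparation.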

> Hostnames in simple apps (like POP3 server) could also use UTF-8, even
> though the POP3 protocol doesn't address this.

I don't know what you mean by the POP3 protocol not addressing this.
The POP3 protocol exchanges a variety of commands, none of which contain
hostnames, and also exchanges RFC 822 messages, which are required to be
ASCII.  I see no way for a POP server to use UTF-8 hostnames until there
is an updated message header format and an updated Post Office Protocol
that uses it.

AMC