Re: [idn] IDN's with any ASCII character



Jarrod Hollingworth <jarrod@backslash.com.au> wrote:

> Will IDN's allow encoding of domain names with *any* ASCII character?
>
> For example, let's say that I want to register the domain name
> "this&that.com" or "100^10.com".
>
> Will IDN allow this or does it only facilitate international
> languages?

IDNA allows the addition of non-ASCII characters to domain names.  For
ASCII characters, IDNA adds no new restrictions, but neither does it relax
the old restrictions.  The ASCII characters & and ^ (and every other
ASCII character besides letters, digits, and hyphen) are not allowed in
the "preferred syntax", which is used for domain names that name hosts
and mail exchangers.

It is not merely by fiat that IDNA keeps the old ASCII restrictions;
it follows from the technical details of the encoding.  In IDNA, every
non-ASCII domain label has an ASCII form, where the non-ASCII characters
are encoded using ASCII letters and digits.  But any ASCII characters
that occur in the non-ASCII label are represented literally, not
encoded.  For example, if we want to put an acute accent over the "a"
in this&that.com, the ASCII form will be xn--this&tht-fza.  As you can
see, IDNA does nothing to help you "sneak" the "&" into the name; it is
still there as "&", so you can't use such a name anywhere that "&" is
forbidden.
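
To see the literal pass-through for yourself, here is a rough sketch in
Python.  It assumes Python's built-in "punycode" codec and adds the "xn--"
prefix by hand.  The helper name to_ace_label is made up for this
illustration, and it is not a full IDNA ToASCII: it skips nameprep and
every validity check.

```python
def to_ace_label(label: str) -> str:
    """Hypothetical helper: Punycode-encode one label and add the ACE prefix.

    Not a full IDNA ToASCII: no nameprep, no length or character checks.
    """
    if label.isascii():
        # All-ASCII labels are left exactly as they are.
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


# "this&that" with an acute accent over the "a" (U+00E1).
ace = to_ace_label("this&th\u00e1t")
print(ace)         # xn--this&tht-fza
print("&" in ace)  # True: the "&" is carried over literally, not encoded
```

Only the accented letter ends up in the encoded tail; every ASCII
character of the label, allowed or not, is copied through unchanged.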

One thing that is by fiat is the restriction on initial and final
hyphens.  Technically, IDNA could have enabled one to sneak an initial
or final hyphen into a label where initial and final hyphens are
forbidden, but IDNA includes optional checks to prevent initial and final
hyphens from sneaking in where they are not allowed.
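
The hyphen case can be sketched the same way.  This again assumes
Python's built-in "punycode" codec with the prefix added by hand, and the
function has_edge_hyphen is a made-up stand-in for the optional check,
not the actual IDNA procedure.  It shows why looking only at the ASCII
form would miss the hyphen, so the check has to look at the label before
it is encoded.

```python
def has_edge_hyphen(label: str) -> bool:
    """Hypothetical check: does the label begin or end with a hyphen?"""
    return label.startswith("-") or label.endswith("-")


# A label consisting of a hyphen followed by "a" with an acute accent.
unicode_label = "-\u00e1"
ace_label = "xn--" + unicode_label.encode("punycode").decode("ascii")

print(has_edge_hyphen(ace_label))      # False: the ASCII form hides the hyphen
print(has_edge_hyphen(unicode_label))  # True: caught before encoding
```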

AMC