
Re: [idn] IDN's with any ASCII character



It should be noted, however, that given the input "this&that.com",
the proper output of ToACE MUST be "this&that.com" and NOT
"xn--this&that-.com", since the label contains no non-ASCII characters.
I raise this issue specifically because we have found that some
IDNA/Punycode implementations actually exhibit this behaviour.
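
A minimal sketch of the expected behaviour, in Python. The helper names
to_ace_label, to_ace and naive_label are hypothetical, and the sketch
deliberately omits Nameprep and all other checks; it only illustrates that an
all-ASCII label must be passed through without Punycode or the ACE prefix.

```python
ACE_PREFIX = "xn--"


def to_ace_label(label: str) -> str:
    """Simplified ToACE for one label (assumption: no Nameprep, no other checks)."""
    # An all-ASCII label is returned unchanged: no Punycode, no prefix.
    if all(ord(ch) < 0x80 for ch in label):
        return label
    # Only labels with non-ASCII characters are Punycode-encoded and prefixed.
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_ace(domain: str) -> str:
    """Apply the simplified per-label conversion to a dotted name."""
    return ".".join(to_ace_label(label) for label in domain.split("."))


def naive_label(label: str) -> str:
    """The faulty approach: Punycode applied unconditionally."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


print(to_ace("this&that.com"))   # this&that.com
print(naive_label("this&that"))  # xn--this&that-
```

The trailing hyphen in the faulty output is the Punycode delimiter, which is
emitted even when nothing follows it.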
Edmon



----- Original Message -----
From: "Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord>
To: "IETF idn working group" <idn@ops.ietf.org>
Sent: Wednesday, April 30, 2003 1:09 AM
Subject: Re: [idn] IDN's with any ASCII character


> Jarrod Hollingworth <jarrod@backslash.com.au> wrote:
>
> > Will IDN's allow encoding of domain names with *any* ASCII character?
> >
> > For example, let's say that I want to register the domain name
> > "this&that.com" or "100^10.com".
> >
> > Will IDN allow this or does it only facilitate international
> > languages?
>
> IDNA allows the addition of non-ASCII characters to domain names.  For
> ASCII characters, IDNA adds no new restrictions, but nor does it relax
> the old restrictions.  The ASCII characters & and ^ (and every other
> ASCII character besides letters, digits, and hyphen) are not allowed in
> the "preferred syntax", which is used for domain names that name hosts
> and mail exchangers.
>
> It is not merely by fiat that IDNA keeps the old ASCII restrictions;
> it follows from the technical details of the encoding.  In IDNA, every
> non-ASCII domain label has an ASCII form, where the non-ASCII characters
> are encoded using ASCII letters and digits.  But any ASCII characters
> that occur in the non-ASCII label are represented literally, not
> encoded.  For example, if we want to put an acute accent over the "a"
> in this&that.com, the ASCII form will be xn--this&tht-fza.  As you can
> see, IDNA does nothing to help you "sneak" the "&" into the name; it is
> still there as "&", so you can't use such a name anywhere that "&" is
> forbidden.
>
> One thing that is by fiat is the restriction on initial and final
> hyphens.  Technically, IDNA could have enabled one to sneak an initial
> or final hyphen into a label where initial and final hyphens are
> forbidden, but IDNA includes optional checks to prevent initial/final
> hyphen from sneaking in where it's not allowed.
>
> AMC
>
>
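
The encoding example in the quoted message can be reproduced with Python's
built-in "punycode" codec. This is only a sketch: the helper names ace_label
and has_edge_hyphen are hypothetical, Nameprep is skipped, and the hyphen
test is a stand-in for the optional check mentioned above rather than a full
implementation of it.

```python
ACE_PREFIX = "xn--"


def ace_label(label: str) -> str:
    """Punycode-encode a label that contains non-ASCII characters (no Nameprep)."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def has_edge_hyphen(label: str) -> bool:
    """Stand-in for the optional check: reject an initial or final hyphen."""
    return label.startswith("-") or label.endswith("-")


# "this&that" with an acute accent over the "a" (U+00E1).
ace = ace_label("this&th\u00e1t")
print(ace)         # xn--this&tht-fza
print("&" in ace)  # True: the "&" is carried over literally, not encoded

# A label with an initial hyphen: the ACE form starts with the prefix, so the
# hyphen is no longer at the edge; the check has to look at the original label.
sneaky = "-\u00e1"
print(has_edge_hyphen(ace_label(sneaky)))  # False
print(has_edge_hyphen(sneaky))             # True
```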