
Re: [idn] IDN's with any ASCII character



Sorry everyone. In my previous message I misused the term ToACE; it
should be ToASCII (there is no ToACE),
and I should have said that ToASCII SHOULD result ... instead of MUST...

Anyway, just to reiterate my point.

Given an input "this&that.com" the output for ToASCII MUST NOT be
"xn--this&that-.com"

And we have come across implementations that mis-convert it to
"xn--this&that-.com"

Also, while ToASCII should fail in this case, it is important that an
implementation not abort the whole process as a result.  More specifically,
when ToASCII fails, the implementation should leave further interpretation up
to the original application and should not attempt to alter or terminate its
path.
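
To make the point concrete, here is a rough sketch in Python. It is not a
full ToASCII: it assumes nameprep is skipped, and it applies a simplified
letters/digits/hyphen check in place of the real UseSTD3ASCIIRules step. The
function names (naive_label, to_ascii_label, try_to_ascii) are made up for
illustration; the "punycode" codec is the one shipped in the Python standard
library.

```python
import re
from typing import Optional

ACE_PREFIX = "xn--"
# Simplified host-name check (an assumption, not the full STD3 rules):
# letters, digits, hyphen; no leading or trailing hyphen; 1 to 63 characters.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def naive_label(label: str) -> str:
    """The mis-conversion: Punycode-encode and prefix every label."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_ascii_label(label: str) -> str:
    """Leave an all-ASCII label alone; encode only non-ASCII labels."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def try_to_ascii(name: str) -> Optional[str]:
    """Return the ASCII form of a dotted name, or None if any label fails.

    Failure is only reported. The caller's input is never modified and no
    exception escapes, so the application decides what happens next.
    """
    labels = [to_ascii_label(label) for label in name.split(".")]
    if all(LDH_LABEL.fullmatch(label) for label in labels):
        return ".".join(labels)
    return None


print(naive_label("this&that"))          # xn--this&that-
print(to_ascii_label("this&that"))       # this&that
print(to_ascii_label("this&th\u00e1t"))  # xn--this&tht-fza
print(try_to_ascii("this&that.com"))     # None
```

The None from try_to_ascii is only a signal. The application still holds its
original "this&that.com" string and is free to treat it however it would
have without IDNA in the picture.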

Edmon





----- Original Message -----
From: "Edmon Chung" <edmon@neteka.com>
To: "IETF idn working group" <idn@ops.ietf.org>
Sent: Friday, May 02, 2003 4:28 PM
Subject: Re: [idn] IDN's with any ASCII character


> It should be noted however that given an input: "this&that.com"
> The proper output for ToACE MUST be "this&that.com" and NOT
> "xn--this&that-.com"
> I specifically raise this issue because we have found that some
> IDNA/Punycode implementation is actually exhibiting this behaviour.
> Edmon
>
>
>
> ----- Original Message -----
> From: "Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord>
> To: "IETF idn working group" <idn@ops.ietf.org>
> Sent: Wednesday, April 30, 2003 1:09 AM
> Subject: Re: [idn] IDN's with any ASCII character
>
>
> > Jarrod Hollingworth <jarrod@backslash.com.au> wrote:
> >
> > > Will IDN's allow encoding of domain names with *any* ASCII character?
> > >
> > > For example, let's say that I want to register the domain name
> > > "this&that.com" or "100^10.com".
> > >
> > > Will IDN allow this or does it only facilitate international
> > > languages?
> >
> > IDNA allows the addition of non-ASCII characters to domain names.  For
> > ASCII characters, IDNA adds no new restrictions, but nor does it relax
> > the old restrictions.  The ASCII characters & and ^ (and every other
> > ASCII character besides letters, digits, and hyphen) are not allowed in
> > the "preferred syntax", which is used for domain names that name hosts
> > and mail exchangers.
> >
> > It is not merely by fiat that IDNA keeps the old ASCII restrictions,
> > it follows from the technical details of the encoding.  In IDNA, every
> > non-ASCII domain label has an ASCII form, where the non-ASCII characters
> > are encoded using ASCII letters and digits.  But any ASCII characters
> > that occur in the non-ASCII label are represented literally, not
> > encoded.  For example, if we want to put an acute accent over the "a"
> > in this&that.com, the ASCII form will be xn--this&tht-fza.  As you can
> > see, IDNA does nothing to help you "sneak" the "&" into the name; it is
> > still there as "&", so you can't use such a name anywhere that "&" is
> > forbidden.
> >
> > One thing that is by fiat is the restriction on initial and final
> > hyphens.  Technically, IDNA could have enabled one to sneak an initial
> > or final hyphen into a label where initial and final hyphens are
> > forbidden, but IDNA includes optional checks to prevent initial/final
> > hyphen from sneaking in where it's not allowed.
> >
> > AMC
> >
> >
>
>
>