Re: [idn] Problems in normalisation and matching
At 08:56 AM 7/1/2002 +0800, James Seng wrote:
> I remember the "dot issue" was extensively discussed by the Nameprep Design
> Team. It was decided that dots (other than U+002E) should be included because
> there are IMEs which generate these dots in place of the normal dot (it
> becomes a hassle to switch in and out of the IME just for the dot).
This confuses user interface issues with protocol issues. The IETF
tries to stay away from user interface standardization, even though domain
names do have a human representation.
User interfaces must adapt to a wide range of usability issues. Protocols
are not supposed to suffer that burden.
It is the job of the user interface to map whatever typing codes it chooses
to, into the constrained protocol codes. The theory behind typical
Internet protocols -- and most other modern protocol standards -- is that
the world chooses ONE way to do a thing and everyone with other ways maps
to that one way.
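As an illustration of that division of labor, here is a minimal sketch of a user-interface helper that folds the dots an IME might produce into the single protocol dot before anything reaches the protocol layer. The function name and the particular set of three alternate dots are assumptions chosen for the example, not something this message specifies.

```python
# Hypothetical UI-layer helper (not part of any specification):
# fold dot characters an IME may emit into the one protocol dot, U+002E.
ALTERNATE_DOTS = {
    "\u3002": ".",  # IDEOGRAPHIC FULL STOP
    "\uff0e": ".",  # FULLWIDTH FULL STOP
    "\uff61": ".",  # HALFWIDTH IDEOGRAPHIC FULL STOP
}
_DOT_TABLE = str.maketrans(ALTERNATE_DOTS)


def to_protocol_form(typed: str) -> str:
    """Return the typed name with every alternate dot replaced by U+002E."""
    return typed.translate(_DOT_TABLE)


# What the user typed (with IME dots) versus what the protocol sees.
print(to_protocol_form("www\u3002example\uff0ecom"))
```

With the mapping done here, the protocol itself only ever has to recognize one separator.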
The concern for cut-and-paste is obviously valid, but it is not the job of
the IETF protocol standards to operate well within a user cut-and-paste
environment.
> Now, some may say IME is out of scope, but on the other hand, we really
> don't need to rehash a topic which has been concluded. Let's move forward.
Introducing user interface issues into a protocol design is a good way to
impair interoperability, because it adds variability. That makes the
protocol not work.
Moving forward is a good idea. Except when it is moving backward.
> The place where IDNs get broken down into labels is in IDNA.
James. Forgive me, but I do not understand this statement. IDNs are
ALREADY a series of separate labels. IDNA does not "separate" a domain name
into labels.
Note that IDN maintains the same kind of dot separator as the "unaware"
legacy domain name world, even if it uses multiple choices for the dot
character.
All IDNA does is to ENCODE those separate labels into a kind of UTF-7 (that
is, ACE).
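To make the "labels first, encoding second" point concrete, here is a small sketch in which the caller already holds the labels and each one is encoded to its ACE form on its own. It relies on Python's built-in "idna" codec, which implements the IDNA specification as later published and so may differ in detail from the drafts under discussion; the function name is an assumption for illustration.

```python
# Sketch: the name arrives as separate labels; encoding is applied
# to each label independently and never looks at separators.
def encode_labels(labels):
    """Encode each label to its ACE form using Python's 'idna' codec."""
    return [label.encode("idna").decode("ascii") for label in labels]


# An ASCII-only label passes through unchanged; a non-ASCII label
# comes back in its ACE form.
print(encode_labels(["bücher", "example"]))
```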
> Comparison is also done on a per-label basis. Two IDNs are considered equivalent
> if and only if all their individual labels are equivalent. The separators
> used during comparison are also irrelevant. (See IDNA Requirement 4)
One bit of confusion that I did not pursue with my suggested revisions is
the idea that an IDN can only be compared in its IDNA form.
In terms of formal specification, this cannot be correct. If there is
another encoding for IDNs, then the IDNA specification is essentially
saying that such encodings must be mapped to IDNA, first, in order to do
comparisons.
If that is what the working group really does mean, it should be stated as
part of an IDN specification, separate from the IDNA specification,
because it is another formal change to the DNS.
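For readers who want to see what "map to the IDNA form first, then compare" amounts to, here is a rough sketch of per-label comparison. It assumes both names already use U+002E as the separator and that Python's built-in "idna" codec is an acceptable stand-in for the encoding step; the function names are hypothetical.

```python
# Sketch of per-label comparison: map each label to its ACE form,
# then compare the ASCII results without regard to case.
def labels_equal(a: str, b: str) -> bool:
    """Compare two labels by their ACE encodings, ignoring ASCII case."""
    return a.encode("idna").lower() == b.encode("idna").lower()


def names_equivalent(name_a: str, name_b: str) -> bool:
    """Two names match only if they have the same number of labels
    and every corresponding pair of labels matches."""
    labels_a = name_a.split(".")
    labels_b = name_b.split(".")
    if len(labels_a) != len(labels_b):
        return False
    return all(labels_equal(x, y) for x, y in zip(labels_a, labels_b))


print(names_equivalent("Bücher.Example", "bücher.example"))
print(names_equivalent("xn--bcher-kva.example", "bücher.example"))
print(names_equivalent("bücher.example", "bucher.example"))
```

Note that the mapping step lives inside the comparison function here, which is exactly the requirement the paragraph above says should be stated explicitly rather than implied.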
d/
----------
Dave Crocker <mailto:dave@tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850