Re: [idn] Prohibiting characters in draft-ietf-idn-nameprep
- To: idn@ops.ietf.org
- Subject: Re: [idn] Prohibiting characters in draft-ietf-idn-nameprep
- From: Paul Hoffman / IMC <phoffman@imc.org>
- Date: Tue, 15 Aug 2000 14:27:57 -0700
- Delivery-date: Tue, 15 Aug 2000 14:29:48 -0700
- Envelope-to: idn-data@psg.com
At 4:11 PM -0400 8/15/00, Edmon wrote:
>I am sure Einstein will never ever agree that E=mc² is the same as E=mc2 !!!
Of course, but why is that relevant? We are talking about host names,
not general canonicalization of all text.
>Form KC doesn't seem to make sense in the context of a "name" of which the
>DNS is about...
>
>I believe form C should be the choice...
To summarize the previous discussion (which did not come to
consensus, I believe):
- Form C preserves the uniqueness of characters, some of which are
visually indistinguishable from each other. This, in turn, causes
surprise when a user asks for a name with a character such as U+F900
and is told that there is no such host because the host was registered
with U+8C48, which looks identical to U+F900.
- Form KC loses the uniqueness of some characters whose compatibility
decomposition is not as clear (such as in the example you give of
U+00B2, superscript 2), but causes less surprise when a user enters a
compatibility character and it is normalized to a single character.
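
As a rough illustration of the trade-off in the two points above, here is a
minimal sketch using Python's standard unicodedata module. The helper name
normalize_label and the fullwidth letter U+FF21 are illustrative choices, not
taken from the draft; only U+00B2 comes from this thread.

```python
import unicodedata

def normalize_label(label: str, form: str) -> str:
    """Apply one Unicode normalization form ("NFC" or "NFKC") to a label.

    Hypothetical helper for illustration; not part of any nameprep draft.
    """
    return unicodedata.normalize(form, label)

# U+00B2 (superscript two) has only a compatibility decomposition,
# so form C leaves it alone and form KC folds it to the digit 2.
superscript = "mc\u00b2"
print(normalize_label(superscript, "NFC") == superscript)  # True
print(normalize_label(superscript, "NFKC"))                # mc2

# U+FF21 (fullwidth A) behaves the same way: kept by C, folded by KC.
fullwidth = "\uff21"
print(normalize_label(fullwidth, "NFC") == fullwidth)      # True
print(normalize_label(fullwidth, "NFKC"))                  # A
```

Under form KC the two spellings of a name collapse to one registered form;
under form C they remain distinct names.
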
The other option for processing, which wasn't popular, is to prohibit
on input the compatibility characters that are "more" ambiguous and
then use form C so that the others (such as U+00B2) pass through.
Many folks would argue that superscript 2 looks too much like digit 2
to want this, but it is certainly doable.
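
For what it is worth, the prohibit-then-form-C option could be sketched
roughly as follows. This assumes that "more ambiguous" is expressed as a
policy list of compatibility characters allowed to pass through; the names
ALLOWED_COMPAT and prepare_label are hypothetical, and the contents of the
list are a placeholder, not a proposal.

```python
import unicodedata

# Hypothetical policy: compatibility characters that may pass through.
# Only U+00B2 is listed because it is the example used in this thread.
ALLOWED_COMPAT = frozenset({"\u00b2"})

def has_compat_decomposition(ch: str) -> bool:
    """True if the character's decomposition mapping is tagged, e.g. '<super> 0032'."""
    return unicodedata.decomposition(ch).startswith("<")

def prepare_label(label: str) -> str:
    """Reject prohibited compatibility characters, then apply form C."""
    for ch in label:
        if has_compat_decomposition(ch) and ch not in ALLOWED_COMPAT:
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    return unicodedata.normalize("NFC", label)

print(prepare_label("mc\u00b2") == "mc\u00b2")  # True: U+00B2 passes through
print(prepare_label("e\u0301") == "\u00e9")     # True: form C composes e + acute
try:
    prepare_label("\uff21bc")                   # fullwidth A is not on the list
except ValueError as err:
    print(err)                                  # prohibited character U+FF21
```

The check runs before normalization so that a prohibited character is
reported as entered rather than silently changed.
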
--Paul Hoffman, Director
--Internet Mail Consortium