
Re: [idn] nameprep failures



These kinds of visual ambiguities can only be resolved by
knowing the language. A business card recipient is unlikely
to be able to type an IDN properly if the script is
unfamiliar to him. He needs a second, romanized domain
name to use.

On the other hand, one of the main benefits of nameprep
is that it prohibits "bad" names at registration time,
preventing the unscrupulous from setting up scam sites
with names similar to legitimate ones. But with so many
supported scripts, mixing and matching unusual characters
will allow registration of confusable names that we would
prefer not to have registered.

Unfortunately, the only way to stop this is to further
burden nameprep with rules that disallow scripts in
bizarre combinations. Is this worth it? I don't know,
but I would be interested in hearing what people think.
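
To make the idea concrete, here is a rough sketch of what a
"no mixed scripts in one label" check might look like. It is
not part of nameprep or any draft. The function names are
made up for illustration, and since Python's standard library
has no real script property, the first word of each Unicode
character name is used as a crude stand-in for the script.

```python
import unicodedata


def scripts_in_label(label):
    """Return the set of rough 'script' names used by the letters in label.

    Assumption: the first word of the Unicode character name (LATIN, GREEK,
    KATAKANA, CJK, ...) approximates the script. This is a heuristic only.
    """
    scripts = set()
    for ch in label:
        if not ch.isalpha():
            continue  # ignore digits, hyphens, and dots
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


def is_mixed_script(label):
    """Hypothetical registration-time rule: flag labels using more than one script."""
    return len(scripts_in_label(label)) > 1


# A capital Alpha followed by Latin "OL" is flagged; plain "AOL" is not.
print(is_mixed_script("\u0391OL"))
print(is_mixed_script("AOL"))
```

Note that a rule this simple would also flag ordinary
Japanese labels that combine katakana with Han characters,
which is part of why the extra burden may not be worth it.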

Bruce


----- Original Message ----- 
From: "Adam M. Costello" <amc@cs.berkeley.edu>
To: <idn@ops.ietf.org>
Sent: Thursday, July 19, 2001 11:19 AM
Subject: Re: [idn] nameprep failures


> "D. J. Bernstein" <djb@cr.yp.to> wrote:
> 
> > I see a domain name on a business card.  I type it in.  I'll be
> > unhappy if I don't get the right name.  Isn't that the whole point of
> > nameprep?
> 
> In my opinion (and this is just my opinion, not an interpretation of
> the nameprep spec or the IDNA spec) the point of nameprep is to make
> sure that users are able to type what they intend to type, and to make
> sure that names are not corrupted by transcoders.  I think the visual
> ambiguity problem is too much for nameprep to solve.  What do you do
> about l and 1?  What about katakana ka and the Han character for power?
> What about the katakana long vowel mark and the Han character for one
> (and the em dash, and the minus sign...)?  I think Mark Davis is right
> that this is a morass.
> 
> > If I see WWW.AOL.COM, where the first dot is actually U+3002, then
> > I'm sure I'll type it incorrectly.  Isn't this why nameprep prohibits
> > U+3002?
> 
> No, the nameprep spec is explicit about the reason for prohibiting
> U+3002:
> 
>     U+3002 is used as if it were U+002E in many input mechanisms,
>     particularly in Asia.  This prohibition allows input mechanisms to
>     safely map U+3002 to U+002E before doing nameprep without worrying
>     about preventing users from accessing legitimate host name parts.
> 
> > If I see WWW.AOL.COM, where the A is actually a capital Alpha, then
> > I'm sure I'll type it incorrectly.  So why doesn't nameprep prohibit
> > Alpha?
> 
> [Nitpick:  I wouldn't say you typed it incorrectly, I'd say you read it
> incorrectly, then typed exactly what you intended to type.]
> 
> No one is going to put <Alpha>OL.COM on a business card and seriously
> expect people to read it correctly.
> 
> AMC