Re: [idn] Re: Back to work (Nameprep) (was: Re: Just send UTF-8 with nameprep (was: RE: [idn] Reality Check))
- To: Mark Davis <mark@macchiato.com>
- Subject: Re: [idn] Re: Back to work (Nameprep) (was: Re: Just send UTF-8 with nameprep (was: RE: [idn] Reality Check))
- From: John C Klensin <klensin@jck.com>
- Date: Wed, 18 Jul 2001 11:54:38 -0400
- cc: idn@ops.ietf.org
--On Wednesday, 18 July, 2001 08:22 -0700 Mark Davis
<mark@macchiato.com> wrote:
> I think the whole notion of trying to prevent cross-script
> confusions in domain names is a morass. It would invariably
> result in complicated rules with both false positives and
> false negatives; it also depends greatly upon the font
> in use on the particular user's machine. We even have that
> now, with one and ell -- depending on the user's font, those
> can look identical.
>
> Better would be to have useful GUIs that detect and signal
> possibly confusing names. For example, in the URL field of a
> browser, the spelling-check-style wavy underline could be used
> under terms like "intеl.com" vs "intel.com" (where in the
> first the "e" is the Cyrillic letter U+0435), to alert the
> user that the URL might be odd. Such tools would not get in
> the way of legitimate domain names that mix scripts or symbols.
Mark,
I think your conclusion is correct, but its implications are
quite broad.
This set of conversations has been very interesting to me, for
the unfortunate reason that it confirms the tentative, but
painful, conclusion I reached a few months ago. Once we get
down to the really fine details of which characters match and
which do not, how to disambiguate glyphs from different scripts
that look (or are) identical, and so on, we need to rely on user
interfaces and human intelligence (or very close approximations
to the latter), rather than depending on absolute matching rules.
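To make that concrete, here is a rough, purely illustrative
Python sketch of the kind of check a UI could run before drawing
Mark's wavy underline. The script bucketing (first word of the
Unicode character name) is deliberately crude, and none of this
is a worked-out design:

import unicodedata

def scripts_used(label):
    # Bucket each character by the first word of its Unicode
    # name ("LATIN", "CYRILLIC", ...) as a crude stand-in for
    # its script.
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch in "-.":
            continue
        name = unicodedata.name(ch, "")
        scripts.add(name.split(" ")[0])
    return scripts

def looks_suspicious(label):
    # Flag labels that mix scripts, e.g. Latin plus Cyrillic.
    return len(scripts_used(label)) > 1

print(looks_suspicious("intel"))        # False: all Latin
print(looks_suspicious("int\u0435l"))   # True: Latin + Cyrillic

Real confusable detection would need an actual table of
look-alike pairs, not just a script-mixing test, but even this
trivial check catches the intel/intеl example above.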
If we have to do that (and I agree that we will), then we almost
certainly need a non-DNS mechanism to support it. If nothing
else, the performance costs of making multiple "maybe it is this
one" probes into the DNS to try to sort out ambiguities will
almost certainly be unacceptable (remember that DNS timeouts are
on the order of seconds and that cached negative responses
cannot have long durations).
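To put a number on that, a back-of-the-envelope sketch (the
one-entry confusable table and the five-second timeout are both
assumptions, chosen only for illustration):

from itertools import product

# Assumed one-entry confusable table; real tables are far larger.
CONFUSABLE = {"e": ["e", "\u0435"]}   # Latin e vs Cyrillic U+0435

def candidates(label):
    # Enumerate every spelling reachable by swapping confusables;
    # k ambiguous positions yield 2**k candidate names.
    choices = [CONFUSABLE.get(ch, [ch]) for ch in label]
    return ["".join(combo) for combo in product(*choices)]

names = candidates("intel")
TIMEOUT_S = 5                 # illustrative per-miss DNS timeout
print(names)                  # ['intel', 'int\u0435l']
print(len(names) * TIMEOUT_S, "seconds of lookups, worst case")

With a realistic confusable table, the candidate set, and hence
the worst-case lookup time, grows exponentially in the number of
ambiguous characters.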
If we _are_ going to fix the UIs to handle these cases, the
amount of work required to use a completely different mechanism
that supports human-assisted disambiguation tools (rather than
one encoding or another within the DNS) is almost certainly
quite small in comparison to other aspects of the effort required.
And, if we need a non-DNS mechanism, the current ACE versus
UTF-8 debate has the potential to turn into a "look in three or
more different ways" story. That, in turn, dramatically
increases the odds of false positives, and false positives cause
both security and sanity problems (as well as driving trademark
lawyers crazy).
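For illustration only (using Python's built-in IDNA codec as a
stand-in for whichever ACE is ultimately chosen), this is what
"look in three or more different ways" means in practice: one
user-visible name, several distinct query strings, each of which
could in principle resolve to a different owner:

def query_forms(name):
    # Three plausible wire forms for the same user-visible name.
    return {
        "utf-8": name.encode("utf-8"),        # raw UTF-8 bytes
        "ace": name.encode("idna"),           # Python's IDNA codec
        "local": name.encode("latin-1", "replace"),  # legacy-charset guess
    }

for kind, wire in query_forms("b\u00fccher.example").items():
    print(kind, wire)
# utf-8 b'b\xc3\xbccher.example'
# ace   b'xn--bcher-kva.example'
# local b'b\xfccher.example'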
john