Re: [idn] Dots, and a path to working IDNs

To: idn@ops.ietf.org
Subject: Re: [idn] Dots, and a path to working IDNs
From: "D. J. Bernstein" <djb@cr.yp.to>
Date: 29 May 2001 19:45:40 -0000
Delivery-date: Tue, 29 May 2001 12:49:02 -0700
Envelope-to: idn-data@psg.com
Mail-Followup-To: idn@ops.ietf.org

Keith Moore writes:
> For instance, it is one thing if typing the "bad" dots causes the
> lookup to fail; quite another if typing the "bad" dots causes a lookup
> for a completely different domain; and still another if the "bad" dots
> work differently in different clients.

I agree that having two different names, one using a good dot and one
using a bad dot, would be confusing for users. It would also mean that,
if we ever decided to move to ``bad dots are converted to good,'' we'd
break some working names. Disaster!

The current situation, however, is that the bad dot isn't used. So
neither problem happens. There's only one name. The worst case is a
failure, forcing the user to change the bad dot to a good dot.

My proposal is to do the same for IDNs. We can allow good UTF-8 IDNs,
and prohibit bad UTF-8 IDNs. Everything necessary to make this work is
something that we want to do anyway. Nothing has been lost if it turns
out that the users also need bad->good conversion.
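A minimal sketch of what "allow good, prohibit bad" could look like at the
point where a name is about to go on the wire. The function name and the
particular set of "bad" dots are assumptions made for illustration; this
message does not enumerate them.

```python
# Hypothetical list of dot look-alikes treated as "bad"; not taken from the message.
BAD_DOTS = {
    "\u3002",  # IDEOGRAPHIC FULL STOP
    "\uff0e",  # FULLWIDTH FULL STOP
    "\uff61",  # HALFWIDTH IDEOGRAPHIC FULL STOP
}


def check_wire_name(raw: bytes) -> str:
    """Return the decoded name if it is acceptable on the wire, else raise ValueError.

    Nothing is converted here: a bad name simply fails, and the user
    has to retype it with a good dot.
    """
    try:
        # Strict decoding rejects malformed UTF-8 byte sequences.
        name = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("name is not valid UTF-8") from exc

    for ch in name:
        if ch in BAD_DOTS:
            raise ValueError(f"bad dot U+{ord(ch):04X}; use '.' instead")
    return name
```

Because the only outcomes are "accepted unchanged" or "rejected," a bad-dot
name can never resolve to a second, different domain.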

> we would like to avoid having millions of programs upgraded only to
> find out that their IDN support is buggy and that (for instance)
> people can't reliably use certain IDNs with certain clients, and that
> there will be yet another massive upgrade

Aha---you're missing a basic point. My plan is compatible with bad->good
conversion: bad IDNs will be prohibited on the wire. If a programmer has
time to do all the work necessary for bad->good conversion, then he can
do that and deploy it. Nothing in my plan slows him down.

What I'm trying to do is make IDNs work as soon as possible. We already
have widespread support for UTF-8 in existing programs. We can take
advantage of this---as many people already have---to get UTF-8 IDNs.

Yes, it's possible that simple UTF-8 IDNs won't be as good as UTF-8 IDNs
with bad->good conversion in thousands of programs. But they're clearly
much better than no IDNs at all. We can still add bad->good conversion;
we simply have to make sure to prohibit bad IDNs on the wire.
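To illustrate why conversion can be added later without breaking anything,
here is a rough sketch of a client-side input step. The helper name and the
set of "bad" dots are hypothetical, chosen only for this example.

```python
# Hypothetical list of dot look-alikes; the message does not specify one.
BAD_DOTS = {"\u3002", "\uff0e", "\uff61"}


def to_good(name: str) -> str:
    """Map every bad dot in user input to the ordinary ASCII dot."""
    return "".join("." if ch in BAD_DOTS else ch for ch in name)
```

The output of to_good() never contains a bad dot, so a program that applies
it before sending still obeys the wire prohibition, and a program that never
adopts it keeps working exactly as before.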

> it strikes me as far more difficult to get platforms to create a
> special input mode for IDNs,

The worst case is exactly what Adam has been proposing for ACE: a
separate little IDN tool, added on to the OS, that reads input from the
user and prints a good IDN.

> we already know of instances where the UTF-8 support isn't good enough.

What are you talking about?

---Dan
