Re: [idn] IURL vs URL, IDNS name vs DNS name
- To: Larry Masinter <LM@att.com>
- Subject: Re: [idn] IURL vs URL, IDNS name vs DNS name
- From: John C Klensin <klensin@jck.com>
- Date: Fri, 11 Feb 2000 02:04:24 -0500
- Cc: idn@ops.ietf.org
- Delivery-date: Thu, 10 Feb 2000 23:06:48 -0800
- Envelope-to: idn-data@psg.com
--On Thursday, 10 February, 2000 14:36 -0800 Larry Masinter
<LM@att.com> wrote:
> In draft-masinter-url-i18n-04.txt, we took the tack of
> defining a _new_ protocol element, an "IURL"
> (Internationalized URL) which allowed 8-bit UTF8 sequences. We
> left "URL" alone, but noted that there might be some
> situations, protocols and contexts that could be upgraded to
> use IURLs instead of URLs. This got us out of the quandary of
> wanting to upgrade technology but dealing with older software
> that couldn't deal with the new representation.
>
> A similar approach could work for "DNS names": define a new
> protocol element (IDNS name), note that existing compliant DNS
> servers _could_ handle IDNS names as well as DNS names, and
> then allow some way of encoding IDNS names in DNS names.
>...
> This is a migration strategy. If you're going to migrate from
> "everyone assumes DNS names are ASCII" to "DNS names allow
> UTF8", you have to allow for an interim state where there are
> some contexts in which DNS names are only ASCII and other
> contexts where they're allowed to have UTF8. You can't even
> talk about this if you say "DNS" for both contexts, so you
> have to make up a new name. So call the context of "DNS names
> that are allowed to have UTF8" the "IDNS" context.
Larry,
Keep in mind that URLs are pretty simple, in the sense that they
more or less reference objects (specifically, things that are
not recursively URLs). And there isn't an object->URL mapping
inherent in anything (sometimes, more the pity, but that is a
separate conversation). In general, the sort of system you
suggest will work when those conditions are met. But, with the
DNS, you'd be talking about some fairly complex situations and
combinations of situations. For example:
* If you put these more or less into the existing DNS, would you
contemplate an "IPTR" record whose RHS is a potentially
non-ASCII name?
* Given, especially, that CNAMEs can point to records in another
domain entirely, would you contemplate
    CNAME   (label and RHS in ASCII)
    ICNAME  (label in something else, RHS in ASCII)
    CNAMEI  (label in ASCII, RHS in something else)
    ICNAMEI (both in non-ASCII)?
It is pretty easy to poke holes in this example, but you get the
picture.
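
To make the combinatorics concrete, here is a minimal Python sketch
that picks which of the four hypothetical record types a given
alias would need. The function name and the example names are
assumptions for illustration only; none of these record types
exists in the DNS.

```python
def cname_variant(label: str, target: str) -> str:
    """Return the hypothetical record type for an alias.

    A leading "I" marks a non-ASCII label (owner name); a trailing
    "I" marks a non-ASCII right-hand side (target).
    """
    prefix = "" if label.isascii() else "I"
    suffix = "" if target.isascii() else "I"
    return f"{prefix}CNAME{suffix}"


# Illustrative names only.
examples = [
    ("www.example", "host.example"),
    ("bücher.example", "host.example"),
    ("www.example", "bücher.example"),
    ("bücher.example", "münchen.example"),
]
for label, target in examples:
    print(cname_variant(label, target), label, "->", target)
```

Each additional place a name can appear in a record doubles the
number of variants, which is why the per-record-type approach
gets out of hand quickly.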
There are, however, at least two variations on this theme. Some
of us suspect that the WG may be driven to one of them once the
interoperability problems with the existing deployed base and
the impossibility of doing a "flag day" (or even "flag year" or
"flag decade") changeover are understood. Note that the two
examples below are just examples -- neither would work without a
lot of details, some quite subtle, being worked out and filled
in.
(1) The place where that "I" symbol goes is not in the record
type, but in the Class, possibly with some very fancy
"additional information" or reinterpretation rules and
recommendations. The new Class would shadow the old "IN" one,
with all of the same record types but different rules for
forming and interpreting strings in labels and predicates. An
I18N-capable resolver might then do a lookup in Class "INN",
rather than Class IN. Servers might be trained, if no Class INN
records were found, to try to do a lookup in Class IN and return
the right stuff. (Of course, that second lookup would fail, and
wouldn't be worth trying, if the query's content was really a
non-ASCII string, but there would be cases in which it would
be.) One of the worrisome cases is that a reverse ("PTR")
lookup in Class IN would fail if the address was registered only
in the Class INN space -- maybe that is ok, maybe we'd need a
kludge.
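
A rough Python sketch of that lookup order, using a toy in-memory
table rather than a real server, may help. Class "INN" is the
hypothetical class from the paragraph above; the table contents,
the function name, and the fallback policy are assumptions for
illustration, not a worked-out design.

```python
# Toy record table keyed by (class, type, name). "INN" is hypothetical.
ZONE = {
    ("INN", "A", "café.example"): "192.0.2.10",
    ("IN", "A", "www.example"): "192.0.2.20",
    # This address is registered only in the INN space.
    ("INN", "PTR", "10.2.0.192.in-addr.arpa"): "café.example",
}


def lookup(name, rtype, rclass="INN"):
    """Look up a record, falling back from Class INN to Class IN."""
    answer = ZONE.get((rclass, rtype, name))
    if answer is not None or rclass != "INN":
        return answer
    # A non-ASCII name cannot be present in Class IN, so the second
    # lookup is not worth trying.
    if not name.isascii():
        return None
    return ZONE.get(("IN", rtype, name))


print(lookup("café.example", "A"))  # found directly in Class INN
print(lookup("www.example", "A"))   # found via fallback to Class IN
# The worrisome case: an old resolver asking in Class IN gets nothing.
print(lookup("10.2.0.192.in-addr.arpa", "PTR", rclass="IN"))
```

The last call shows the reverse-lookup problem: there is no
fallback in the other direction, so a Class IN query cannot see a
PTR record that exists only in Class INN.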
(2) But that model isn't very different from a directory overlay
on the DNS, where the directory is entirely internationalized
and the contents of the DNS itself are eventually viewed as just
a collection of octets in particular ranges (that happen to
correspond to ASCII A-Z, a-z, 0-9, and "-"), i.e., as protocol
elements, not names that are expected to have any human
significance. Use of a directory overlay for this purpose would
have some advantages over using the DNS, e.g., one could do
smarter lookups if one assumed one was dealing with names in a
known natural language than is sensible for the labels of the
current DNS and, should the character set wars break out again,
one might use different systems in different environments. The
reverse mapping problems wouldn't be easy, but might not be
significantly more difficult than would exist if the new names
were embedded in the DNS on a "no flag day and you can't wreck
old servers, resolvers, or applications" basis.
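
As a very small sketch of the overlay idea, the Python below keeps
internationalized names in a separate table and treats the DNS
name as an opaque ASCII protocol element. The class, its methods,
the example names, and the choice of NFKC normalization plus case
folding as the "smarter" matching rule are all assumptions for
illustration; a real directory would need far more than this.

```python
import unicodedata


def fold(name: str) -> str:
    """Normalize a human-facing name for matching (assumed policy)."""
    return unicodedata.normalize("NFKC", name).casefold()


class DirectoryOverlay:
    """Maps internationalized names to opaque ASCII DNS names."""

    def __init__(self):
        self._forward = {}  # folded display name -> DNS name
        self._reverse = {}  # lowercased DNS name -> display names

    def register(self, display_name: str, dns_name: str) -> None:
        if not dns_name.isascii():
            raise ValueError("DNS names stay ASCII in this model")
        self._forward[fold(display_name)] = dns_name
        self._reverse.setdefault(dns_name.lower(), []).append(display_name)

    def resolve(self, display_name: str):
        """Forward mapping: human-facing name to DNS name, or None."""
        return self._forward.get(fold(display_name))

    def names_for(self, dns_name: str):
        """Reverse mapping: DNS name to all registered display names."""
        return list(self._reverse.get(dns_name.lower(), []))


overlay = DirectoryOverlay()
overlay.register("Bücher.example", "host-4711.example")
print(overlay.resolve("BÜCHER.example"))
print(overlay.names_for("HOST-4711.example"))
```

Note that the reverse mapping returns a list: nothing in this
model prevents several human-facing names from pointing at the
same DNS name, which is part of why the reverse problem is not
easy.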
john