[idn] An open letter to the IDN WG (long)



The rumor may have reached many of you by this time that I
made remarks at the ICANN meeting last week and at the
WIPO meeting last month suggesting that the work of this WG is
a dead end and/or that internationalization is hopeless.
Neither is true, although I have come to question, and reject,
some of the fundamental assumptions under which the WG has
been operating.  The issues are, I think, two parts technical
and one part philosophical --more in the epistemological sense
than the religious one-- and the most clear-cut and important
issues may be the philosophical ones.

I. Background and context

In between travels and distractions from various emergencies
I've been working on an exploration and explanation of both
sets of issues.  It doesn't reduce to sound bites, or even to
15 or 20 minute talks: the issues are quite complex and subtle
and I want to completely dissociate myself and my position
from anyone who says "well, if we just do this simple thing,
the IDN problem will be solved (or go away)".  Those notes,
which I had intended to email to the WG, are still not ready
and are, at this point, of I-D length rather than "note"
length.   But I do want to get the philosophical questions
onto the table in the hope that the WG can consider them very
carefully as it evaluates various proposals and presentations
during the coming weeks.

I do this with great fear: the IETF is traditionally lousy at
resolving philosophical debates.  The discussions tend to
produce a great deal of heat, less light than one would
expect, no clear way to make choices, and resentment about
whatever choices are ultimately made.  That shouldn't be
surprising, since, in ordinary society, definitive choices
among philosophical positions are typically made only by
essentially dictatorial mechanisms (including royal edicts and
majority votes to suppress minority positions) or by warfare.
Occasionally, it is possible to resolve such a choice by
perfect knowledge of what will happen in the future, but such
conditions are rare.

Usually, we try to avoid these choices in IETF for these obvious
reasons.   But we are, I believe, at a juncture at which the
whole future of the Internet --and some of our most
deeply-held assumptions-- may be at stake, and I see little
choice but to explicitly open the debate.  You should be aware
that some of my most respected colleagues have predicted a
meltdown if I even send this note or otherwise open these
issues.  My belief is that they are wrong and that, faced with
a sufficiently important issue, the participants in this WG,
and the IETF in general, can and will respond to the challenge
with maturity, careful reasoning, and a focus on the long-term
future of the Internet.   Please don't prove me wrong. 

II.  The Issues - a summary of one view of where the WG, and
the IDN situation generally, stand.

(1) Identifiers and words; Scripts and languages

The DNS was designed to contain identifiers in the strictest
sense: artificial or semi-artificial strings that could be
precisely bound to hosts or other resources.  It is reasonable
to impose fairly significant constraints on the form and use
of identifiers.  Indeed, the "restricted ASCII subset" rules
originally adopted for host names in the pre-DNS days are just
one such set of constraints.  If one chooses to broaden the
rules, then it is quite reasonable to talk about scripts and
WG-imposed (or UTC-imposed) restrictions on the use of those
scripts.

On the other hand, the IDN discussion and requirements have
been driven by explicit or implicit requirements for the use
of "words" and "names" -- names of people and of products, a
desire to spell one's name, or names of family members,
correctly, a desire to use strings in national or native
languages.  All of these are language issues, not script
issues, and the language implications are inevitable.

And the language issues are dominated by some facts about
human languages: They are ambiguous.  They are imprecise.
Each one has its own idiosyncrasies and special rules, and
these may differ even between countries or regions in which
the same language is being used.  And people are very attached
to their languages as keynotes for their cultures -- some
group of engineers is not going to successfully change
language-use rules worldwide, no matter how convenient that
would be for computer systems.   Of course some people might
adapt and do so happily, just as some people have managed to
adapt to various contrived languages and simplified spelling
systems.   But most of the world won't, and most of those
potential converts are probably already using the Internet.

(2) Directory solution needed

It appears to me that, because of the above, almost everyone
who has participated in the WG's work and understood its
implications has concluded that some sort of directory-based
approach (see draft-klensin-dnsrole-00.txt for an overview) is
going to be needed, sooner or later.  The philosophical
questions lie in the "sooner or later" part.  In particular...

If we are going to do a directory, is it worth modifying
the DNS first?  The answer seems less clear than it did a year
ago.  If, as seems inevitable, the DNS is going to remain a
repository of identifiers, there is less argument for making it
"international" or "multilingual" than when we thought we
could make significant advances on the "multilingual" issue in
the DNS context.  And there seems to be general agreement that
even the most mild of modifications to the DNS increases
complexity and risk of breaking something.

That said, if we are going to start making modifications, the
argument has been that the ACE-based solutions are safer,
since they do not require changes to the DNS or the
applications that call it, although applications changes are
needed to make the encoded names intelligible and correctly
rendered.  Some of that argument, however, depends on the
assumption --based largely on European and Japanese experience--
that it will always be possible to enter and render Roman
characters.  That assumption is not true: we have strong indications
from, at least, China and parts of the Arabic-speaking world
that devices which cannot input or render Roman characters are
to be expected, and that weakens, although does not eliminate,
the argument for ACE solutions.
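
For readers who have not looked closely at an ACE, the following
is a minimal sketch of the idea, not any of the specific proposals
before the WG.  It assumes Python's built-in "idna" codec, which
implements the encoding that was standardized later (the "xn--"
prefix form); the helper names to_ace and from_ace and the example
name are hypothetical and chosen only for illustration.

```python
# Illustration only: Python's built-in "idna" codec implements the
# later-standardized ACE (Punycode with an "xn--" prefix), which is an
# assumption here and not one of the drafts discussed in this note.

def to_ace(name: str) -> str:
    """Encode each label of a Unicode name into its ASCII-compatible form."""
    return name.encode("idna").decode("ascii")


def from_ace(name: str) -> str:
    """Decode an ACE name back to Unicode so an application can display it."""
    return name.encode("ascii").decode("idna")


wire_form = to_ace("bücher.example")
print(wire_form)            # what the DNS and unmodified applications see
print(from_ace(wire_form))  # what an updated application would render
```

The DNS itself only ever handles the ASCII form, which is why no
protocol change is needed.  An application that has not been
updated shows the user the encoded string, and entering or reading
that string presumes a device that can handle Roman characters.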

We also have to face a question whose answer cannot be
predicted: our experience with Internet applications has been
that even 25% adequate "solutions" can prevent or delay the
deployment of real fixes forever or nearly so.  We need to
accept and understand the risk that, if we deploy a DNS-
modification approach, we will never see deployment of an
effective directory-based solution.  Perhaps worse, we might
need to wait until the problems with, and attempted
workarounds for, the DNS-based solutions cause the network to
reach the point of collapse or extreme fragmentation before we
can get to a decent solution.

So this is our dilemma.  The WG is perceived as being under
great pressure to produce _something_ immediately, if not
sooner.  While strawman design notes for a directory base --
including the outlines of referencing systems, a keyword
overlay, and discussion of how it might all be made to work
operationally and commercially-- are in progress and should be
ready for IETF review quite soon, there are many details and
no possibility for "immediately".  Producing an in-DNS
mechanism will take some of the pressure off, but risks
delaying good-quality mechanisms for multilingual (and I do
mean "multilingual") use of the Internet for a long time,
perhaps forever.  And, if those delays do occur, we can, I
think, expect to see local approaches, probably incompatible
ones, propagate to solve local problems for local users,
resulting in exactly the sort of fragmentation most of us
would like to avoid.

So, with the understanding that the hardest questions have no
technical resolution, we need to decide how to proceed and, in
particular, whether to go ahead with an identifier, DNS, and
script-based approach to solve a problem that involves words,
terminology, and natural language.

    john