Re: traditional/simplified (Re: [idn] wg milestones update)



--On Monday, April 30, 2001 07:15 +0200 Harald Tveit Alvestrand
<harald@alvestrand.no> wrote:

> At 10:25 29.04.2001 +0800, Sun Guonian wrote:
>> [30] Within a single zone, the zone manager MUST be able to
>> define equivalence rules that suit the purpose of the zone,
>> such as, but not limited to, and not necessarily, non-ASCII
>> case folding, Unicode normalizations (if Unicode is chosen),
>... 
> If it was possible to write down this equivalence rule on a
> single sheet of paper (without the codepoints), I would agree.
> Unfortunately not even the definitions of the different classes
> of traditional/simplified Chinese equivalence seems to fit on a
> single sheet of paper (the 1-n, n-1 and contextual mappings in
> particular).
> 
> A complex specification is prone to implementation errors,
> especially when there are large markets (US, Western Europe)
> where a sloppy implementation will not be challenged by
> real-life usage of the functionality.

Let me try a different answer, or pair of answers, without in any
way disagreeing with Harald (i.e., these are two or three
separate reasons for the same conclusion)...

(i) Even were it technically feasible, it is not desirable to tie
interpretation rules to subdomain trees (zones are even worse).
If there is not to be a single, global set of resolution and
matching rules, the rules applied should be those that best
match the language and perceptions of the user entering the
name.  For the simplified/traditional Chinese example, this may
not make any difference (I'm not educated enough to know).  But
for things like different interpretations of Unicode matching/
case folding, either the rules need to match the perceptions of
the requesting user, or they need to be globally uniform, or the
user will need, ultimately, to remember different rules for each
zone that might be encountered.  Or the users will be continually
astonished, which is not good.

(ii) User-end handling of these things is best accomplished in,
or very close to, the user agent, not in DNS servers or in
interception mechanisms.  If I am visiting a country other than
my own, I ought to be able to use the settings and resolution
rules I am used to, rather than having local ones imposed on me.
To take an example that does not involve Roman-based characters
at all, a traveller from China to an Arabic-speaking country
should be able to use Chinese rules to access Chinese sites
(whether located in China or elsewhere) rather than, e.g., Arabic
ones.

But a user-agent approach increases the risk that, with different
user agent configurations, a putative domain name that works for
me and that I send to you will not be resolvable by you because
your UA has a different set of translation or canonicalization
rules.
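
To make that risk concrete, here is a minimal sketch in Python.
The two "user agents" and their rule sets are hypothetical (they
are my assumptions for illustration, not anything proposed in the
WG): one applies NFC plus simple lowercasing, the other NFKC plus
full Unicode case folding.  The same typed name yields two
different lookup keys.

```python
import unicodedata

def ua_a_canonicalize(name: str) -> str:
    # Hypothetical UA "A": NFC normalization plus simple lowercasing.
    return unicodedata.normalize("NFC", name).lower()

def ua_b_canonicalize(name: str) -> str:
    # Hypothetical UA "B": NFKC normalization plus full case folding.
    return unicodedata.normalize("NFKC", name).casefold()

typed = "Straße.example"
key_a = ua_a_canonicalize(typed)  # keeps the sharp s
key_b = ua_b_canonicalize(typed)  # case folding expands it to "ss"

# The two agents would query different names for the same input.
print(key_a == key_b)
```

If the name was registered under only one of those keys, it
resolves for one of us and not for the other.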
 

Unfortunately, if one believes the above, one is led, I think, to
one of two conclusions:

* We must accept a single, global set of resolution rules,
rather than domain- or zone-specific ones.  That suggests that
the right way to do conversions between simplified and
traditional Chinese, or British and American English, is to
have user agents convert to a single, canonical form for
each case before making a query (and to register only the
canonical form in the DNS).  Alternatively, we could make a "no
mixing" requirement and then register both forms but not the
"some of one and some of the other" variations.  But I have been
told by experts that a "no mixing" requirement is not practical
in many cases, and I must accept that.

* We give up on the DNS as an environment in which to solve these
problems.  There are many advantages to server-based resolution
of names and matching rather than forcing users (or user agents)
into generating canonical forms and hoping that the canonical
forms match, not least of which is that we can have some local
flexibility while retaining very high recall of records the users
perceive as matching.  But the price is moving beyond the
"lookup" mechanisms of the DNS into an environment in which
UA-driven searching with imprecise matching is possible.
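
The first of these options (canonicalize in the user agent,
register only the canonical form) might be sketched as follows.
Everything here is an assumption for illustration: the
two-entry traditional-to-simplified table, the choice of NFKC
plus case folding, and the dictionary standing in for a zone.
In particular, the table is strictly 1-1 and so ignores exactly
the 1-n, n-1 and contextual mappings that make the real problem
hard.

```python
import unicodedata

# Hypothetical, deliberately tiny traditional -> simplified table.
# Real mappings include 1-n, n-1 and contextual cases not modeled here.
T2S = {"國": "国", "學": "学"}

def canonical(label: str) -> str:
    # One global rule set, applied by every user agent before a query.
    folded = unicodedata.normalize("NFKC", label).casefold()
    return "".join(T2S.get(ch, ch) for ch in folded)

# Stand-in for the DNS: only the canonical form is registered.
ZONE = {canonical("中国"): "192.0.2.1"}

def lookup(name: str):
    # Exact-match lookup on the canonical key, as the DNS would do.
    return ZONE.get(canonical(name))

# Simplified, traditional, and mixed spellings share one key.
print(lookup("中国") == lookup("中國"))
print(canonical("國学") == canonical("国學"))
```

The lookup itself stays an exact match; all of the flexibility
lives in the shared canonical() step, which is why it only works
if every user agent applies the same one.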

     john