
Re: [idn] I fear I cannot use IDN in the next 10 years




Dave Crocker wrote:

> uDNS has no significant technical benefit over an ACE

It advances the state of the Internet to a point beyond ASCII hackology,
which directly translates into a significant technical benefit for EVERY
OTHER PROTOCOL. Whereas there may not be any significant technical
difference between the two encodings from a high-level perspective, the
difference they make on the ground -- inside the applications which use
the encodings, particularly those applications and formats which already
use UTF-8 -- is tremendous.

> but it does have a serious problem. It impedes transition, since it
> requires changes to the infrastructure, whereas an ACE does not.

I disagree with both of those assertions:

 1) The changes which are required are optional. Admins that don't
    want to use IDNs will never have to change anything. Admins
    that want to use IDNs can deploy UDNS at their leisure. The
    root servers are the only servers that MUST be upgraded. It is
    entirely optional for every other zone, including the gTLDs,
    ccTLDs and end-user domains. It does not "require changes to
    the infrastructure" in the context you imply (unless your
    implication of "the infrastructure" was "the root servers").
    It can be rolled out at the whim of each and every other zone.

 2) ACE does require every application to implement transliteration
    for the applications to be usable as intended. ACE requires a
    fork-lift upgrade of the entire Internet, and we're left with
    ASCII hacks when that's over. UDNS is an incremental cost to
    this upgrade process, and leaves us with an infrastructure
    which is completely internationalized.

> >but we haven't fully discussed the benefits as of yet. Clearly, ACE
> >has many costs, some of which are quite high AND ongoing
> 
> if you mean the encode/decode cost, please be serious.  remember how
> short the string is.  if you mean something else, please enumerate.

As was illustrated, the principal ongoing cost is that every application
which is compliant with BCP18 will have to maintain separate mapping
services for ACE. Many protocols are intimately tied to domain names, and
all of them will have to implement a secondary mapping service.

> and please remember that UTF-8 is also an encoding and, therefore,
> carries an encode/decode cost.

If the OS and/or development environment is already UTF-8 clean, the
UTF-8 mode costs next to nothing; the encode/decode cost is incurred only
in the ACE mode.
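
To make the asymmetry concrete, here is a minimal sketch. It assumes
Python's built-in "idna" codec as a stand-in for "an ACE" (that codec is a
Punycode-based scheme, not any particular proposal before this group), and
the helper names to_utf8, to_ace and from_ace are hypothetical. In a UTF-8
clean environment the first helper is the encoding the application already
performs for all of its other text; the other two are the extra mapping
step that an ACE adds.

```python
# Illustration only: the "idna" codec stands in for an ACE. It is an
# assumption of this sketch, not the encoding debated in this thread.

def to_utf8(name):
    # The same call the application already makes for any other string.
    return name.encode("utf-8")

def to_ace(name):
    # Extra, domain-name-specific mapping step required by an ACE.
    return name.encode("idna")

def from_ace(wire):
    # The reverse mapping, needed before the name can be shown to a user.
    return wire.decode("idna")

name = "b\u00fccher.example"      # "bücher.example", written with an escape
utf8_wire = to_utf8(name)         # bytes on the wire in the UTF-8 mode
ace_wire = to_ace(name)           # ASCII-only bytes in the ACE mode
```

The string is short either way, as you say. My point is about where the
second and third helpers have to live: in every application that touches
a domain name.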

> >  * BCP18's is "Official Internet Policy" which requires support for
> >    UTF-8 in all new protocols,
> 
> Please re-read section 2:
> 
> >    This document does not mandate a policy on name internationalization,
> >    but requires that all protocols describe whether names are
> >    internationalized or US-ASCII.

It is hard to imagine how any application can comply with the objectives
given in BCP18 and not use UTF-8 for protocol data such as embedded domain
name components. That section 2 disclaims "names" in the context of
protocol labels is no surprise. That it would be interpreted as reasonable
cause to enforce yet another mandatory encoding on every protocol not yet
invented is questionable, IMO. That is the result of ACE-only.

Is there some unspoken intention of promoting ACE to the status of
~Internet-standard-encoding for new applications?

> You are also neglecting to observe the fact that the DNS protocol
> already exists.  BCP18 is not mandating a change to existing protocols.

It appears that I misread section 3.2. I concede this point.

> >  * Without a UTF-8 DNS interface, no new protocols or applications
> >    can be developed that are UTF-8 clean. Instead, they will be
> >    UTF-8 for everything EXCEPT domain names, and in some cases this
> >    will be fatal.
> 
> "Fatal" is a very strong word.  Also an incorrect one.  Feel free to
> provide empirical data that supports your certitude.

"Extreme hindrance to the point of not bothering" is more accurate,
although that translates to "fatal". One example is cited below.

> >One example we have already discussed for this is
> >    mapping between LDAPv3 distinguished names and DNS domain names
> >    (mapping dc= RDNs to DNS). Failure to support UTF-8 is a heavy
> >    blow to such efforts
> 
> Why would an encoding form affect the ability to translate between two
> different naming environment?  Encoding is for transport, rather than
> for "native" form.

So what's the solution? Should LDAP DNs store dc components as ACE-encoded
strings, even though the DN is UTF-8 aware already? Or should LDAP clients
and servers transliterate every DN they encounter for comparison?

Why force this choice of lesser evils, when we do not have to, and when we
arguably should not do so in the first place?
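
The two options can be sketched as follows. This is a rough illustration
under stated assumptions: the DN parser is deliberately naive (it splits
on commas and ignores LDAP escaping rules), Python's built-in "idna" codec
again stands in for an ACE, and dc_labels, dn_to_domain and same_domain
are hypothetical names.

```python
def dc_labels(dn):
    """Collect the values of the dc= RDNs in a DN, in order.

    Naive on purpose: no handling of escaped commas or multi-valued RDNs.
    """
    labels = []
    for rdn in dn.split(","):
        attr, _, value = rdn.strip().partition("=")
        if attr.lower() == "dc":
            labels.append(value)
    return labels

def dn_to_domain(dn):
    # With UTF-8 on both sides, the mapping is a plain join.
    return ".".join(dc_labels(dn))

def same_domain(dn, dns_name):
    """Compare a UTF-8 DN against a DNS name that may arrive in ACE form."""
    if dns_name.isascii():
        # The ACE case: the DNS side has to be decoded before comparing.
        dns_name = dns_name.encode("ascii").decode("idna")
    return dn_to_domain(dn).lower() == dns_name.lower()

dn = "cn=web,dc=b\u00fccher,dc=example"
```

With UTF-8 in the DNS, dn_to_domain is the whole story. With an ACE, either
the directory stores the encoded label in dc=, or something like the decode
step in same_domain runs on every comparison.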

> A more serious limitation is the core difference between LDAP and DNS
> naming rules.

No, the dc= syntax already implies these rules. The problem is reflective
of applications which embrace UTF-8 and which also use DNS names as
protocol data. Every such application invented from here on out will face
this dilemma.

> And this raises the major problem that protocol efforts, like the
> current one, must not be used to try to fix larger and unrelated
> problems, such as mapping between two, incompatible name spaces.

The incompatibilities between dc= and DNS are exaggerated by ACE.

> >  * UTF-8 is infinitely more manageable and serviceable than ACE.
> 
> What is the empirical basis for this assertion?

 | ACE libs can certainly address [] part
 | of the issue, the extent of the support for tasks like importing
 | trace data into a spreadsheet or viewing it in an editor is not
 | as compelling. The massive number of UTF-8 tools are extremely
 | compelling in terms of general manageability and serviceability
 | of the global DNS.

For example, there are plenty of large network admins who still maintain
address assignment spreadsheets. International users manipulating this
data in ACE form are faced with much greater tasks than would be required
if they could use UTF-8. At the least, UTF-8 is likely to be a supported
encoding in the OS, if not the spreadsheet app, while ACE is likely to be
supported in neither. These things add up to affect serviceability: the
admin is left asking "where did I transliterate that name wrong?"
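
As a small illustration of the spreadsheet case, assume an export whose
host column is in ACE form. The sample data, the column names and the
readable_rows helper are all made up for this sketch, and Python's built-in
"idna" codec is once more only a stand-in for an ACE.

```python
import csv
import io

# Hypothetical export of an address assignment sheet, hosts in ACE form.
ACE_EXPORT = (
    "host,address\n"
    "xn--bcher-kva.example,192.0.2.10\n"
    "www.example,192.0.2.11\n"
)

def readable_rows(text):
    """Return the rows with the host column decoded for human readers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Extra step that a generic spreadsheet or editor will not do.
        row["host"] = row["host"].encode("ascii").decode("idna")
        rows.append(row)
    return rows

rows = readable_rows(ACE_EXPORT)
```

Had the export been UTF-8, the decode line would not exist, and the file
would already be legible in any UTF-8 capable editor.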

> >  * Finally, there are some problems that ACE cannot solve, which
> >    UDNS can. The clipboard problem practically goes away,
> 
> 1.  You are talking about representation inside a computer, rather than
> transport across a network.  Internet protocols pertain to transport,
> not storage.

We sure spent a lot of time talking about it.

> 2.  You are also wrong about this particular problem.  This choice does
> not affect the ability to clipboard a name.

It does affect transliteration requirements.

> A more immediate problem is that uDNS is an idea, not a complete
> specification.  The current document is substantially incomplete.

Yes, it needs some work, but the underlying principles are sound (IMO).

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/