
Re: [idn] Thoughts on nameprep



"D. J. Bernstein" <djb@cr.yp.to> wrote:
[...]
>Suppose we settle on fast nameprep: it's the keyboard interface's job to
>help you type good domain names, so that other programs don't have to
>worry about bad domain names. What changes would you make in the
>keyboard interface to support this?
>

Forcing the keyboard interface to do "fast nameprep" may be a solution for
Japanese (though I still doubt it), but it is definitely no solution for most
European languages, because it significantly changes the average "Joe User's"
perception of how DNS (and his computer) works.

Users have become accustomed to the fact that it does not matter whether
they enter letters in domain names in uppercase or lowercase; yet, most
European languages use some "national special characters" as letters, and
these additional letters come in uppercase and lowercase flavours.

In German (I am using this language because I am most familiar with it; but
I had the opportunity to get to know the respective conventions of other
languages in a recent project) these special characters are 'umlaut a'
(lowercase U+00E4; uppercase U+00C4), 'umlaut o' (lowercase U+00F6,
uppercase U+00D6), 'umlaut u' (lowercase U+00FC, uppercase U+00DC) and the
'sharp s' (lowercase U+00DF; no uppercase equivalent).

The average user has got used to the fact that e.g. the domain names
"www.kraeuterpaul.at" and "WWW.KRAEUTERPAUL.AT" are equivalent, and because
they perceive e.g. 'umlaut a' as another letter, they simply won't
understand that "www.kr[U+00E4]uterpaul.at" and "WWW.KR[U+00C4]UTERPAUL.AT"
are not.
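
To make the mismatch concrete, here is a minimal sketch in Python. It only
illustrates the user's expectation described above; the helper name
fold_name is hypothetical, and plain lowercasing is used as a stand-in for
whatever case mapping an IDN scheme would actually specify.

```python
def fold_name(name: str) -> str:
    """Hypothetical helper: lowercase a name so case variants compare equal."""
    return name.lower()


# The two spellings from the example above, with 'umlaut a' written as
# U+00E4 (lowercase) and U+00C4 (uppercase).
umlaut_lower = "www.kr\u00e4uterpaul.at"
umlaut_upper = "WWW.KR\u00c4UTERPAUL.AT"

# Compared code point by code point, the two names differ ...
same_raw = umlaut_lower == umlaut_upper

# ... but after case folding they are the same name, which is what a user
# who is used to ASCII domain names expects.
same_folded = fold_name(umlaut_lower) == fold_name(umlaut_upper)
```

Here same_raw is False and same_folded is True; the question in this thread
is only where the folding step should live.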

Let us see how a keyboard interface doing "fast nameprep" might fit the bill:
(1) You suggest forcing the users to switch "keyboard mode"; well, it is
not hard to predict that European users will forget to do so - simply
because they have not needed to do it so far, and being forced to change
habits just to make an IDN proposal work is unlikely to be much of a
motivation.
(2) You might suggest doing the translations all the time when something is
entered via the keyboard; well, not long ago, we had a reform of German
orthography, and - although the original goals were very ambitious (to
greatly simplify German orthography in order to make it much easier to get
right) - the results were only modest changes (merely because conservative
ministers of education thought the original rules were of great educational
value), and those results are still debated. A "just write lowercase"
proposal - just to get an IDN proposal working - definitely won't get much
(political) support.
(3) There is some mysterious magic by which a keyboard interface determines
automatically whether the user is typing a domain name or not; maybe I am
not very imaginative, but it remains mysterious to me how this "magic"
should work (after all, present computers are not very good at doing
something that is not well specified; if they were, we could tell them to
just do IDN right, close down the IDN WG and thus get rid of all those
tedious discussions that come up again and again although a reasonable
solution has been found already.)

But there is another place where we know that we are in fact dealing with
domain names, and nothing else: anywhere in the DNS resolution process -
and the "slow nameprep" proposal specifies that nameprep be done exactly
there.

(It may still be debated whether applications are the best place; other
options are DNS resolver libraries and DNS servers. IMHO, putting any
IDN-related code into just the applications, or into dynamically linked
libraries called by them, has the big advantage that users can upgrade just
those and are immediately rewarded by being able to use IDNs, which will
result in a very fast shift to IDN.)
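
As a rough sketch of what "nameprep at resolution time" could look like on
the application side, the following Python fragment prepares each label of a
name just before it would be handed to the resolver. It relies on the
nameprep function from Python's standard encodings.idna module, which
implements the nameprep profile as it was later published, not necessarily
the draft under discussion here; the wrapper prepare_domain_name is a
hypothetical name chosen for illustration.

```python
from encodings.idna import nameprep


def prepare_domain_name(name: str) -> str:
    """Hypothetical wrapper: apply nameprep to every label of a domain name.

    The name is split on ASCII dots only; a real implementation would also
    have to decide how to treat other dot-like characters.
    """
    labels = name.split(".")
    return ".".join(nameprep(label) for label in labels)


# Whatever the user typed, the application normalises it at the one place
# where it is certain to be a domain name: right before the lookup.
typed = "WWW.KR\u00c4UTERPAUL.AT"
prepared = prepare_domain_name(typed)
```

With this arrangement the keyboard interface does nothing special, and the
user may type uppercase or lowercase umlauts as he is used to.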

>> Half-width kana will be obsolete
>
>So, if we take the slow nameprep approach, then in twenty years we'll
>have a bunch of networking programs with the useless skill of converting
>half-width kana to full-width kana. Right?
>
>As for numbers, Bruce seemed to say that most applications expect ASCII
>digits, and that double-width numbers won't work. If that's correct, why
>is anybody using double-width numbers?
>
>---Dan

All that nameprep specifies is a generic transformation algorithm, with the
actual transformations defined by a set of tables; once those characters
have really fallen out of use, all that needs to be done to remove this
"useless skill" is to replace those tables - no software upgrade necessary!
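
The separation of algorithm and tables can be sketched as follows. This is
only an illustration of the idea, not the nameprep algorithm itself: the
function apply_mapping and both tables are hypothetical, the tables contain
just a handful of entries, and normalization and prohibited-character checks
are left out.

```python
from typing import Mapping


def apply_mapping(label: str, table: Mapping[str, str]) -> str:
    """Generic step: replace each character that has an entry in the table."""
    return "".join(table.get(ch, ch) for ch in label)


# Illustrative excerpt of a table as it might look today.
TABLE_TODAY = {
    "\u00c4": "\u00e4",  # uppercase umlaut a -> lowercase umlaut a
    "\u00d6": "\u00f6",  # uppercase umlaut o -> lowercase umlaut o
    "\u00dc": "\u00fc",  # uppercase umlaut u -> lowercase umlaut u
    "\uff76": "\u30ab",  # half-width katakana KA -> full-width katakana KA
}

# The same table once half-width kana are obsolete: the entry is dropped,
# while apply_mapping itself stays untouched.
TABLE_LATER = {k: v for k, v in TABLE_TODAY.items() if k != "\uff76"}

with_kana_rule = apply_mapping("\uff76", TABLE_TODAY)
without_kana_rule = apply_mapping("\uff76", TABLE_LATER)
```

Only the data changes between the two calls; the code that walks the label
is the same in both cases.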

Sincerely Yours - Alfred Novacek

------------------------------------------------------------------------
Dipl.-Ing. Alfred Novacek
Institute for Data Processing in Business Administrations,
     Economics and Social Sciences
Johannes Kepler University Linz / Austria
E-Mail: Novacek@idv.uni-linz.ac.at
        Novacek@pop.idv.uni-linz.ac.at
WWW: http://www.idv.uni-linz.ac.at