
Re: [idn] Dots, and a path to working IDNs



Liana,

Thanks for taking the time to explain what you are trying to do.
I hope I understand it now.  The comments below are based on my
current understanding -- if I have gotten it wrong, please try
to keep explaining.

--On Thursday, 31 May, 2001 23:40 -0700 liana.ydisg@juno.com
wrote:

>...
>> (You also propose to represent the building blocks
>> phonetically using ASCII, but I think that's an orthogonal
>> issue.)
>> 
>> Is my understanding roughly correct?
> 
> Yes.
>...
 
>> Regarding the use of English keyboards, I'm not sure what
>> your point is.  The representation of text in a file or on
>> a wire is unrelated to how users type that text, right?
>> 
> Not quite, but good catch. It depends on which operating
> model you are assuming.  But that is another subject.

_If_ I correctly understand this, you will quickly run into the
problem that many languages contain words that are pronounced
the same way but spelled differently and words that are spelled
the same way but pronounced differently.  As long as the DNS is
essentially/historically text-based, overlaying a phonetic
system on it is a problem for those languages and words (i.e.,
the mappings are not unique).
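
To make the non-uniqueness concrete, here is a minimal sketch in
Python.  The word list and the rough phonetic keys are invented
for illustration (they are not IPA and not taken from any
proposal on this list); the only point is that the mapping fails
to be one-to-one in both directions.

```python
from collections import defaultdict

# Hypothetical toy data: (spelling, rough phonetic key) pairs.
PRONUNCIATIONS = [
    ("right", "rait"),
    ("write", "rait"),
    ("rite", "rait"),
    ("lead", "liid"),  # as in "to lead"
    ("lead", "led"),   # the metal
    ("led", "led"),
]


def build_maps(pairs):
    """Index the pairs by sound and by spelling."""
    by_sound = defaultdict(set)
    by_spelling = defaultdict(set)
    for spelling, sound in pairs:
        by_sound[sound].add(spelling)
        by_spelling[spelling].add(sound)
    return by_sound, by_spelling


by_sound, by_spelling = build_maps(PRONUNCIATIONS)

# Sounds shared by several spellings (same sound, different text).
ambiguous_sounds = {k: sorted(v) for k, v in by_sound.items() if len(v) > 1}

# Spellings with several sounds (same text, different sound).
ambiguous_spellings = {k: sorted(v) for k, v in by_spelling.items() if len(v) > 1}

print(ambiguous_sounds)
print(ambiguous_spellings)
```

Any entry in either dictionary is a case where a phonetic label
and a text label cannot be converted into each other without
extra information.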

Worse, it is not clear that one can establish a set of
conventions for representations of characters that work across
scripts.  From very young ages, people get used to variations on
ways to print or write characters in their own languages
(scripts) but not in others: some variations in characters
derived from the Roman alphabet that are considered "font
details" would be sufficient to make a "different character" in
other scripts.  Indeed, as has been suggested in other contexts,
the glyphs for "i" and "j" may represent the same character or
not depending on whether the particular language treats them as
distinct; a different language may have different rules using
the same script.

Similar problems presumably apply with phonetic representations.
The International Phonetic Alphabet (see pages 164ff in the
Unicode 3.0 book) has changed, and presumably improved, since I
learned how to use it 40-odd years ago. But I recall its being
considered "good enough", rather than "exact", and that the
phonetics community does not believe that a better system exists
or that a significantly better one is possible across language
groups.

Switching from a script-and-text-based model to a character-form
or phonetic one consequently exchanges one set of problems for
another rather than, IMO, reducing the total number of problems.

But I think the most serious problem, as Adam suggests, is just
compatibility between systems: a mix of the two is probably
worse than either by itself.

That said, one might think of either phonetic or character-shape
schemes as candidates for a "layer 3" search system as outlined
in my "DNS search" discussion (draft-klensin-dns-search-00.txt).
If one could use such systems as a means of input that would
then go through a software process of both searching databases
for possible matches (rather than doing exact-match DNS lookups)
and, where necessary, holding a dialogue with the users, they
might become very strong candidates, especially where the users
could not easily enter more direct representations of the
characters from keyboards.
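
As a rough illustration of that split between searching and
exact-match lookup, here is a minimal Python sketch.  The
directory contents, the phonetic keys, and the `resolve` and
`choose` names are all hypothetical; nothing here is specified
in draft-klensin-dns-search-00.txt.  The idea is only that the
ambiguous input is resolved above the DNS, so that what finally
reaches the DNS is a single exact name.

```python
# Hypothetical search database: phonetic key -> exact registered names.
DIRECTORY = {
    "rait": ["right.example", "write.example"],
    "led": ["lead.example"],
}


def resolve(phonetic_key, directory, choose):
    """Return one exact name for the key, or None if nothing matches.

    `choose` is a callback standing in for the dialogue with the
    user; it is consulted only when the search finds more than one
    candidate.
    """
    candidates = directory.get(phonetic_key, [])
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    return choose(candidates)


# A stand-in for a real dialogue: always take the first candidate.
def pick_first(names):
    return names[0]


print(resolve("led", DIRECTORY, pick_first))
print(resolve("rait", DIRECTORY, pick_first))
print(resolve("nosuch", DIRECTORY, pick_first))
```

Only the name returned by `resolve` would be handed to an
ordinary exact-match DNS lookup.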

   john