Re: [idn] Dots, and a path to working IDNs
John,
Thanks for a great explanation. I'd like everybody to
read it, and I'd like this language to be in our final draft
and proposed to WIPO for the WHOIS database
solution specified in their RFC-3.
Liana Ye
On Fri, 01 Jun 2001 09:55:01 -0400 John C Klensin <klensin@jck.com>
writes:
> Liana,
>
> Thanks for taking the time to explain what you are trying to do.
> I hope I understand it now. The comments below are based on my
> current understanding -- if I have gotten it wrong, please try
> to keep explaining.
>
> --On Thursday, 31 May, 2001 23:40 -0700 liana.ydisg@juno.com
> wrote:
>
> >...
> >> (You also propose to represent the building blocks
> >> phonetically using
> >> ASCII, but I think that's an orthogonal issue.)
> >>
> >> Is my understanding roughly correct?
> >
> > Yes.
> >...
>
> >> Regarding the use of English keyboards, I'm not sure what
> >> your point is.
> >> The representation of text in a file or on a wire is
> >> unrelated to how
> >> users type that text, right?
> >>
> > Not quite, but good catch. It depends on which operating
> > model you are assuming. But that is another subject.
>
> _If_ I correctly understand this, you will quickly run into the
> problem that many languages contain words that are pronounced
> the same way but spelled differently and words that are spelled
> the same way but pronounced differently. As long as the DNS is
> essentially/historically text-based, overlaying a phonetic
> system on it is a problem for those languages and words (i.e.,
> the mappings are not unique).
>
> Worse, it is not clear that one can establish a set of
> conventions for representations of characters that work across
> scripts. From very young ages, people get used to variations on
> ways to print or write characters in their own languages
> (scripts) but not in others: some variations in characters
> derived from the Roman alphabet that are considered "font
> details" would be enough to constitute a "different character"
> in other scripts. Indeed, as has been suggested in other contexts,
> the glyphs for "i" and "j" may represent the same character or
> not depending on whether the particular language treats them as
> distinct; a different language may have different rules using
> the same script.
>
> Similar problems presumably apply with phonetic representations.
> The International Phonetic Alphabet (see pp. 164ff in the
> Unicode 3.0 book) has changed, and presumably improved, since I
> learned how to use it 40-odd years ago. But I recall its being
> considered "good enough", rather than "exact", and that the
> phonetic community does not believe that a better system exists
> or that a significantly better one is possible across language
> groups.
>
> Switching from a script-and-text-based model to a character-form
> or phonetic one consequently exchanges one set of problems for
> another, rather than reducing, IMO, the total number of problems.
>
> But I think the most serious problem, as Adam suggests, is just
> compatibility between systems: a mix of the two is probably
> worse than either by itself.
>
> That said, one might think of either phonetic or character-shape
> schemes as candidates for a "layer 3" search system as outlined
> in my "DNS search" discussion (draft-klensin-dns-search-00.txt).
> If such systems were used as a means of input, feeding a
> software process that both searches databases for possible
> matches (rather than doing exact-match DNS lookups) and, where
> necessary, enters a dialogue with the user, they might become
> very strong candidates, especially where users cannot easily
> enter more direct representations of the characters from
> keyboards.
>
> john
>
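
To make the non-uniqueness point above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the word list, the rough ASCII "pronunciations", and the names build_phonetic_index and candidates are hypothetical and come from no draft or existing system. It shows that one sound can map to several spellings and one spelling to several sounds, and that a search-style front end would return a list of candidates for the user to choose from rather than a single exact match.

```python
from collections import defaultdict

# Hypothetical toy data: rough ASCII pronunciations, for illustration only.
PRONUNCIATIONS = {
    "right": ["rait"],
    "write": ["rait"],
    "rite": ["rait"],
    "read": ["riid", "red"],  # one spelling, two pronunciations
    "red": ["red"],
}


def build_phonetic_index(pronunciations):
    """Invert spelling -> pronunciations into pronunciation -> spellings."""
    index = defaultdict(set)
    for spelling, sounds in pronunciations.items():
        for sound in sounds:
            index[sound].add(spelling)
    return index


def candidates(index, sound):
    """Return every spelling that matches a phonetic input, sorted."""
    # .get() avoids creating an empty entry for an unknown sound.
    return sorted(index.get(sound, ()))


INDEX = build_phonetic_index(PRONUNCIATIONS)

if __name__ == "__main__":
    # One sound, several spellings: an exact-match lookup cannot decide.
    print(candidates(INDEX, "rait"))
    # A sound shared by two different words.
    print(candidates(INDEX, "red"))
```

An empty or multi-entry result is the case where the dialogue with the user described above would be needed; only a single-entry result could be passed on to an exact-match lookup.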