Re: [idn] I-D ACTION:draft-ietf-idn-vidn-00.txt
- To: "FDU - Sung Jae Shim" <sshim@mailbox.fdu.edu>
- Subject: Re: [idn] I-D ACTION:draft-ietf-idn-vidn-00.txt
- From: Harald Alvestrand <Harald@Alvestrand.no>
- Date: Mon, 27 Nov 2000 06:02:18 +0100
- Cc: <idn@ops.ietf.org>
- Delivery-date: Sun, 26 Nov 2000 22:34:07 -0800
- Envelope-to: idn-data@psg.com
I have tried to read your draft (again). It is problematic because you
chose to represent non-ASCII characters as "??" rather than as character
names or Unicode codepoints.
Your specification does not say where the "pre-assigned codes" are stored,
transferred and checked. This is critical to evaluating your draft.
Please revise.
One point:
First, each entity-defined portion of a virtual domain name in the
local language is decomposed into individual characters or sets of
characters so that each individual character or set of characters can
represent an individual phoneme of the local language, which is the
inverse of transcription of phonemes into characters. Second, each
individual phoneme of the local language is matched with an
equivalent phoneme of English that has the same or most proximate
sound. Third, each phoneme of English is transcribed into the
corresponding character or set of characters in English. Finally, all
the characters or sets of characters converted into English are
united to compose the corresponding entity-defined portion of an
actual domain name in English.
This process is severely underdefined; English is not a good language for
finding systematic phoneme representations.
If you start off with an English word and do this, you will usually end up
with a different word, and sometimes this word will be English.
Consider that the pronunciation of "bridge" is (roughly) "bri-tsch". And
that "cite" and "sight" have the same pronunciation in English (homophones).
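To make the collision concrete, here is a minimal sketch of the last two steps of the quoted process (phoneme to characters, then joining). It is not taken from the draft: the pronunciation lexicon, the ARPAbet-style phoneme labels, the one-spelling-per-phoneme table and the function names are all assumptions made up for illustration.

```python
# Toy pronunciation lexicon, hand-written for illustration only.
# Phoneme labels are ARPAbet-style; this is an assumption, not the draft's notation.
PRONUNCIATIONS = {
    "cite": ("S", "AY", "T"),
    "sight": ("S", "AY", "T"),
    "bridge": ("B", "R", "IH", "JH"),
}

# Hypothetical fixed spelling for each phoneme. Any such table has to pick
# one spelling per sound, which is exactly where information gets lost.
PHONEME_TO_LETTERS = {
    "S": "s", "AY": "ai", "T": "t",
    "B": "b", "R": "r", "IH": "i", "JH": "j",
}


def respell(word):
    """Look up a word's phonemes and write each one back as letters."""
    return "".join(PHONEME_TO_LETTERS[p] for p in PRONUNCIATIONS[word])


def collisions(words):
    """Group words by their respelled form; keep only groups of two or more."""
    groups = {}
    for w in words:
        groups.setdefault(respell(w), []).append(w)
    return {spelling: ws for spelling, ws in groups.items() if len(ws) > 1}


if __name__ == "__main__":
    print(respell("bridge"))                        # brij
    print(collisions(["cite", "sight", "bridge"]))  # {'sait': ['cite', 'sight']}
```

Under these assumed tables, "bridge" does not come back as "bridge", and "cite" and "sight" map to the same string, so two distinct names would resolve to one.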
If you do this to a language using a Latin-based script, the result is even
more confusing; "skjerm" in Norwegian is pronounced almost as if it were
"charm" in English. And there are the homographs - "lever" (liver) and
"lever" (alive) are pronounced differently (lev-ER and LE-ver respectively).
I do not understand how a process of "disambiguation" that cannot be made
to work with any pair of languages I understand is going to help much with
the ones I don't.
But I could be wrong.
--
Harald Tveit Alvestrand, alvestrand@cisco.com
+47 41 44 29 94
Personal email: Harald@Alvestrand.no