Re: [idn] Interesting links
In a message dated 2002-02-12 9:50:11 Pacific Standard Time,
hoho@iis.sinica.edu.tw writes:
> Dear Dough,
Hmm, I guess it really is difficult to type names correctly. :-)
> As several members of the list tried to explain to the group, it is
> very difficult to ask a user to enter a Chinese domain name exactly
> in a Unicode input environment. This is because the complexity of
> getting the exact domain name is exponential in the length of a
> domain name. In other words, if there are 10 characters with variants
> in a domain name, then there are at least 1,024 different Unicode
> strings corresponding to the same name. It is frustrating for users to
> deal with such a high complexity.
Continuing to talk about 1,024 variants is unrealistic and is causing you to
lose credibility, because
(1) the vast majority of Chinese domain names (in all surveys and in existing
testbeds) are, and will be, much shorter than 10 Han characters, and
(2) a significant percentage of Han characters do not exist in TC/SC pairs.
If you consider a Chinese domain name (CDN) that is, much more realistically, 6 characters in
length, and 4 of those 6 characters can be expressed as either TC or SC, then
you have 2^4 = 16 possible variants. Now, it may well be that registering 16
names is an unreasonable burden. I am not saying that there is no problem at
all. But it does no good to exaggerate the problem by claiming that it will
cause THOUSANDS of variants of a typical domain name.
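
For anyone who wants to check the arithmetic, here is a rough Python sketch of
the counting. It assumes each affected position has exactly two forms (TC and
SC) and that the choices are independent. The function names and the
placeholder letters standing in for Han characters are made up for
illustration; this is not taken from any IDN draft or testbed.

```python
from itertools import product


def count_variants(positions_with_pairs: int) -> int:
    """Number of distinct strings when each such position has 2 forms."""
    return 2 ** positions_with_pairs


def enumerate_variants(slots):
    """Build every string from per-position alternatives.

    `slots` is a list of tuples; each tuple holds the allowed forms for one
    character position (one form if there is no TC/SC pair, two if there is).
    """
    return ["".join(combo) for combo in product(*slots)]


# Hypothetical 6-character name: 4 positions have a TC/SC pair, 2 do not.
# Upper/lower-case Latin letters are placeholders for the two forms.
slots = [("A", "a"), ("B", "b"), ("X",), ("C", "c"), ("Y",), ("D", "d")]
variants = enumerate_variants(slots)

print(len(variants))        # 2^4 = 16
print(count_variants(10))   # 2^10 = 1024
```

The count depends only on how many positions actually have a pair, not on the
total length of the name, which is why the 6-character case gives 16 and not
64.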
-Doug Ewell
Fullerton, California
(address will soon change to dewell at adelphia dot net)