
[idn] Summary of TS-SC discussion



Ignore the last mail. I realized how unreadable it is. (Note to self: bad
idea to hit the "send" button before reading it through once myself,
especially after going through various cut & paste ;-)

---CUT HERE ---
>     Sorry,  it is not true , " Taiwan " means 2 chinese characters, in
> these two character , the first one has 2 scripts , all are TC , but one
> of them are only used in PRC-SC. The all-traditional need at least 2
> record in "Taiwan University" , this is a example of  1(SC) --> 2 (TC).
> 1-n mapping are forced to selected one in registration is based on
> the assumption the 1-1 mapping had treated in nameprep. You can
> not cut them to simplify it and forget they may be 2^^n combinations
> without SC/TC 1-1 mapping to reduce them.

Prof Tseng,

In Latin, if we have A/a B/b C/c, it is reasonable for someone to type
in "ABC", "abc" or "Abc" and so on... this gives us 2^n possibilities.
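
To make the 2^n count concrete, here is a minimal sketch in Python. The
function name case_variants is made up for illustration; it simply treats
every letter as an independent upper/lower choice, which is the Latin case
described above.

```python
from itertools import product


def case_variants(label):
    """Return every upper/lower spelling of `label`.

    Hypothetical helper for illustration only: each cased letter is an
    independent two-way choice, so n letters give 2^n spellings.
    """
    # For each character, collect its distinct lower/upper forms.
    choices = [sorted({ch.lower(), ch.upper()}) for ch in label]
    # The Cartesian product walks through every combination of choices.
    return ["".join(combo) for combo in product(*choices)]


variants = case_variants("abc")
print(len(variants))  # 8, i.e. 2^3
print(variants)
```

With three letters the list already has eight entries, including mixed
forms such as "Abc" and "ABc".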

Now if A/a is a TC-SC pair, and the same goes for B/b and C/c, is it
usual for one to write a variant like "Abc" or "ABc"? To be exact, from
all our experience, it is more likely that only "ABC" or "abc" exists,
which gives us n*2.

But TC-SC also has Traditional-Traditional, Traditional-Simplified,
Simplified-Traditional and Simplified-Simplified variants; this gives us
a max of n*4 combinations, one for each. ("Taiwan" in your example falls
into the T-T and T-S variants.)

This is why TC/SC is not really 2^n combinations. Operationally, it is
closer to n*2 to n*4 from our experience.

As Mark said, we should compare this 'confusion' to Latin A (U+0041) and
Greek A (U+0391). This is not normalized in Nameprep either, and if AOL
wishes to have both Latin and Greek, they have to put it in the zonefile
twice.
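
A rough way to see the Latin/Greek point is the Python sketch below. It does
not run real Nameprep: rough_prep is a hypothetical stand-in that applies
only case folding and NFKC normalization, which are the steps relevant
here. Under that assumption, the two look-alike letters stay distinct.

```python
import unicodedata

LATIN_A = "\u0041"      # LATIN CAPITAL LETTER A
GREEK_ALPHA = "\u0391"  # GREEK CAPITAL LETTER ALPHA


def rough_prep(label):
    """Approximate the mapping/normalization part of name preparation.

    Assumption: case folding followed by NFKC stands in for Nameprep;
    this is an illustration, not a conforming implementation.
    """
    return unicodedata.normalize("NFKC", label.casefold())


for ch in (LATIN_A, GREEK_ALPHA):
    prepped = rough_prep(ch)
    print(unicodedata.name(ch), "->", "U+%04X" % ord(prepped))

# The two prepared forms differ, so each label needs its own entry.
print(rough_prep(LATIN_A) == rough_prep(GREEK_ALPHA))
```

The Latin letter folds to U+0061 and the Greek letter to U+03B1, so a
registrant who wants both spellings has to list both.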

-James Seng