
Re: [idn] Summary of TS-SC discussion



> Sorry, it is not true. "Taiwan" means 2 Chinese characters. Of these
> two characters, the first one has 2 scripts, both are TC, but one of
> them is only used in PRC-SC. The all-traditional form needs at least
> 2 records in "Taiwan University"; this is an example of 1 (SC) --> 2
> (TC). Forcing 1-n mappings to select one at registration is based on
> the assumption that the 1-1 mapping has been treated in nameprep. You
> cannot cut them to simplify it and forget that there may be 2^^n
> combinations without the SC/TC 1-1 mapping to reduce them.

Prof Tseng,

In Latin, if we have A/a, B/b and C/c, it is reasonable for someone to
type "ABC", "abc", "Abc" and so on. This gives us 2^n possibilities.
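
To make the 2^n count concrete, here is a minimal Python sketch that
enumerates every upper/lower-case spelling of a short ASCII label. The
function name case_variants is made up for this illustration and is not
part of any IDN draft or library.

```python
from itertools import product


def case_variants(label: str) -> list[str]:
    """Return every upper/lower-case spelling of an ASCII label."""
    # Each character contributes two independent choices (upper, lower),
    # so a label of n letters yields 2**n spellings.
    choices = [(ch.upper(), ch.lower()) for ch in label]
    return ["".join(combo) for combo in product(*choices)]


variants = case_variants("abc")
print(len(variants))  # number of spellings for n = 3
print(variants)
```

For a three-letter label this lists all 8 spellings, including the mixed
ones such as "Abc" and "ABc".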

Now if A/a is TC-SC, and similarly for B/b and C/c, it is unusual for
one to use a mixed variant like "Abc" or "ABc". It is like having "ABC"
or "abc" only, which gives us n*2 combinations.

But because TC-SC also has Traditional-Traditional,
Traditional-Simplified, Simplified-Traditional and Simplified-Simplified
variants, this gives us a maximum of n*4 combinations, one for each.
("Taiwan" in your example falls into the T-T and T-S variants.)

This is why TC/SC is not really 2^n combinations. Operationally, from
experience, it is closer to n*2 to n*4.

As Mark said, we should compare this 'confusion' to Latin A (U+0041) and
Greek Alpha (U+0391). These are not normalized to each other in Nameprep
either, and if AOL wishes to have both the Latin and the Greek form, they
have to put the name in the zone file twice.
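
The Latin/Greek point can be checked with a short Python sketch. It
assumes that NFKC normalization followed by str.casefold() is a fair
stand-in for Nameprep's mapping and normalization steps; it is not a
full Nameprep implementation, and the helper name approx_fold is made
up for this example.

```python
import unicodedata


def approx_fold(text: str) -> str:
    # Assumption: NFKC + case folding approximates Nameprep's mapping
    # step; prohibited-character and bidi checks are not modelled here.
    return unicodedata.normalize("NFKC", text).casefold()


latin_a = "\u0041"      # LATIN CAPITAL LETTER A
greek_alpha = "\u0391"  # GREEK CAPITAL LETTER ALPHA

# Case variants of the same script collapse to one form ...
print(approx_fold("A") == approx_fold("a"))
# ... but the two look-alike capitals stay distinct code points.
print(approx_fold(latin_a) == approx_fold(greek_alpha))
```

Under this approximation the two letters fold to U+0061 and U+03B1
respectively, so names that differ only in this letter remain two
separate names.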

-James Seng