Re: [idn] An ignorant question about TC<-> SC



In a message dated 2001-10-23 11:13:14 Pacific Daylight Time, klensin@jck.com 
writes:

>  On the other hand, one problem is more severe
>  than in the Chinese case: in the general case, a Serbo-Croatian
>  string written in Cyrillic cannot be distinguished, on a
>  character string basis, from uses of Cyrillic for other languages
>  (e.g., Russian), which should not be mapped and, similarly, a
>  string written in Roman-based characters cannot be distinguished,
>  on a character string basis, from the Roman-based characters of
>  another language (English?) which, again, cannot be mapped.

But this problem *does* exist in the Chinese case, because certain Han 
characters can also be used to write Japanese or (I've been told) Korean.  In 
a Japanese or Korean context, it wouldn't make any sense to map the correct 
"traditional" Han character to a simplified "equivalent"; the simplified 
character is only equivalent if the language is Chinese.  And we're not 
tagging languages, so we don't know when this mapping is appropriate and when 
it's not.
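
To make the point concrete, here is a minimal sketch of a character-by-character TC-to-SC mapping applied with no language information. The two-entry table and the function name map_tc_to_sc are hypothetical, chosen only for illustration; a real mapping table would be far larger. The mapper produces a sensible result for a Chinese string but also rewrites a Japanese string into something that is not Japanese, because it sees only code points.

```python
# Hypothetical two-entry mapping table, for illustration only.
TC_TO_SC = {
    "\u6f22": "\u6c49",  # U+6F22 (traditional) -> U+6C49 (simplified)
    "\u8a9e": "\u8bed",  # U+8A9E (traditional) -> U+8BED (simplified)
}


def map_tc_to_sc(label: str) -> str:
    """Map each character through the table; no language tag is consulted."""
    return "".join(TC_TO_SC.get(ch, ch) for ch in label)


# U+6F22 U+8A9E: "Chinese language" written in traditional Chinese.
chinese_label = "\u6f22\u8a9e"
# U+65E5 U+672C U+8A9E: "Japanese language" as written in Japanese.
japanese_label = "\u65e5\u672c\u8a9e"

# Appropriate for Chinese: both characters become their simplified forms.
print(map_tc_to_sc(chinese_label))

# Inappropriate for Japanese: the last character is replaced by U+8BED,
# a simplified form that Japanese does not use, yet the mapper has no
# way to know the string was Japanese.
print(map_tc_to_sc(japanese_label))
```

Nothing in the input strings themselves distinguishes the two cases, which is the whole difficulty.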

-Doug Ewell
 Fullerton, California