
Re: [idn] opting out of SC/TC equivalence



--On Wednesday, 29 August, 2001 10:20 +0200 Harald Tveit
Alvestrand <harald@alvestrand.no> wrote:

> draft-ietf-idn-tsconv-00 describes a TC/SC mapping for 2064
> traditional/ simplified pairs, saying that other tables are
> needed for single/many and many/single mappings.
> 
> This means that we have a documented proposal on what to do
> with 4128 characters. In Unicode 3.0, there are 23.658 *more*
> characters classified as "Han"; Unicode 3.1 adds 42.711 more,
> and it has been noted here that because of the way Chinese
> linguistics work, it is almost 100% certain that there will be
> more added. I assume (foolishly) that for some large class of
> these characters, the answer is "don't touch them" when
> mapping TC/SC - but I have no way of telling which characters
> belong in that class.
> 
> If you can come up with a proposal that describes what to do
> about ALL the Han characters in Unicode, I will be very happy
> to hear it.
> 
> Until then, I have to say that I have not seen any complete
> proposal. Remember - the implementations of the algorithm for
> the non-Chinese part of the world will mainly be done by
> non-Chinese-speaking programmers; it's got to be simple &
> complete enough that even I can get it right...

And, unless whatever is done is isolated -- through procedures,
structure, or layering -- in a way that is robust, accessible to
all users of URLs based in the Chinese language, and that does
not fragment the Internet, we also need to be sure that the
tables and mechanisms that do SC/TC equivalence don't
accidentally map Kanji or Hangul into SC.
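
To make the "don't touch them" rule concrete, here is a minimal sketch
of a one-to-one fold that passes every unlisted character through
unchanged. The three table entries and the name fold_tc_to_sc are
illustrative assumptions, not taken from draft-ietf-idn-tsconv-00, whose
table has 2064 pairs.

```python
# Hypothetical three-entry excerpt of a one-to-one TC -> SC table.
# The real draft table is said to hold 2064 pairs (4128 characters).
TC_TO_SC = {
    "\u570b": "\u56fd",  # traditional "country" -> simplified form
    "\u5b78": "\u5b66",  # traditional "study"   -> simplified form
    "\u9f8d": "\u9f99",  # traditional "dragon"  -> simplified form
}


def fold_tc_to_sc(label: str) -> str:
    """Map each listed traditional character to its simplified pair.

    Any character not in the table (other Han characters, Hangul
    syllables, ASCII) is returned untouched.
    """
    return "".join(TC_TO_SC.get(ch, ch) for ch in label)


if __name__ == "__main__":
    print(fold_tc_to_sc("\u4e2d\u570b"))  # second character is in the table
    print(fold_tc_to_sc("\ud55c\uad6d"))  # Hangul syllables, not in the table
```

A per-character table like this cannot tell a Chinese use of a Han
character from a Japanese or Korean use of the same code point, so a
name written with U+570B would be folded whatever language it was
meant in. That is the accidental-mapping risk in miniature.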

   john