
Re: [idn] draft about Tradition and Simplified Chinese Conversion



Some comments.
 
1. I had always heard that T/S conversion is quite complex and requires contextual mappings. With a simple character-by-character mapping, there would be cases where word 1 is treated as synonymous with word 2 even though the two have very different meanings.
 
It sounds like you are saying that although this is true, these are only a few edge cases; that in the vast majority of cases a relatively simple transformation could be incorporated into nameprep, and that in practice Chinese speakers would be willing to put up with the cases of mismatch in order to have the benefits of the folding. Is this your position?
 
2. I found the discussion of Rule A and Rule B somewhat confusing. It sounds like it is roughly equivalent to what is done for case folding (see http://www.unicode.org/unicode/reports/tr21/):

"For those concerned with the details. Case-folding logically involves a set of equivalence classes, constructed from the Unicode Character Database case mappings as follows.

For each character X in Unicode:

  1. If X is already in an equivalence class, continue to next character.
  2. Otherwise, form a new equivalence class, and add X.
  3. Then add whatever upper-, lower- or titlecases to anything in the set.
  4. Then add whatever anything in the set upper-, lower- or titlecases to.
  5. Repeat #3 and #4 until nothing further is added.

Each equivalence class is completely disjoint from all the others, and together they form a partition of the entire Unicode code space. From each class, one representative element (a single lowercase letter where possible) is chosen to be the common form."
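
To make the quoted procedure concrete, here is a rough Python sketch of it. It is only an illustration under stated assumptions: Python's str.upper(), str.lower() and str.title() stand in for the Unicode Character Database case mappings, mappings that expand to more than one character are skipped, and the names case_variants, build_classes and representative are my own, not taken from TR21 or from the draft.

```python
from collections import defaultdict


def case_variants(ch):
    """Single-character upper-, lower- and titlecase forms of ch.

    Assumption: Python's built-in case methods approximate the UCD mappings;
    multi-character results (for example 'ß'.upper() == 'SS') are ignored.
    """
    return {v for v in (ch.upper(), ch.lower(), ch.title()) if len(v) == 1}


def build_classes(chars):
    """Partition chars into case-equivalence classes (steps 1-5 of the quote)."""
    chars = list(chars)
    universe = set(chars)
    # forward[x]: what x maps to; inverse[x]: what maps to x.
    forward = {c: case_variants(c) & universe for c in chars}
    inverse = defaultdict(set)
    for c, targets in forward.items():
        for t in targets:
            inverse[t].add(c)

    class_of = {}
    for x in chars:
        if x in class_of:          # step 1: already placed
            continue
        members = {x}              # step 2: new class containing x
        frontier = [x]
        while frontier:            # steps 3-5: close under both directions
            cur = frontier.pop()
            for n in forward[cur] | inverse[cur]:
                if n not in members:
                    members.add(n)
                    frontier.append(n)
        frozen = frozenset(members)
        for m in members:
            class_of[m] = frozen
    return class_of


def representative(cls):
    """Pick a lowercase member where possible, otherwise the smallest code point."""
    lowers = sorted(c for c in cls if c.islower())
    return lowers[0] if lowers else min(cls)
```

Run over a small sample that includes U+212A KELVIN SIGN, this puts 'K', 'k' and the Kelvin sign in one class, which is the kind of many-to-one grouping I am asking about below.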

Applying that to this case, it would imply that:
 
M1) if two different traditional characters map to a single simplified character, all three would be considered equivalent for matching (and in nameprep one of them would be chosen as the representative).
 
M2) if two different simplified characters map to a single traditional character, all three would be considered equivalent for matching (and in nameprep one of them would be chosen as the representative).
 
However, as I said, I found the discussion somewhat difficult to follow, so I may be misinterpreting it. Could you confirm whether (M1) and (M2) are what you meant?
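
In case it helps pin down the question, this is how I would sketch (M1) and (M2) in Python. It is my reading, not the draft's Rule A or Rule B: the helper names fold_table and fold are hypothetical, the choice of the smallest code point as representative is arbitrary, and the one real pair list used here (發 and 髮 both simplifying to 发) is just a well-known example, not a table from the draft.

```python
def fold_table(pairs):
    """Build {char: representative} from (traditional, simplified) pairs.

    Characters linked directly or transitively by any pair end up in the
    same class, which covers both M1 and M2. The representative is the
    smallest code point in the class (an arbitrary, deterministic choice).
    """
    parent = {}

    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for trad, simp in pairs:
        a, b = find(trad), find(simp)
        if a != b:
            parent[max(a, b)] = min(a, b)  # keep the smaller code point as root

    return {c: find(c) for c in parent}


def fold(label, table):
    """Replace each character by its class representative; others pass through."""
    return "".join(table.get(ch, ch) for ch in label)


# M1: two traditional characters share one simplified form.
m1_table = fold_table([("發", "发"), ("髮", "发")])
same = fold("發", m1_table) == fold("髮", m1_table)  # True: both fold to 发
```

Note that this folds 發 and 髮 together even though they are different words, which is exactly the kind of mismatch I asked about in point 1.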
 
Mark
----- Original Message -----
From: lee
Sent: Thursday, June 28, 2001 05:49
Subject: [idn] draft about Tradition and Simplified Chinese Conversion

Dear IETF&IDN WG:
 
    Attached is our draft on Traditional and Simplified Chinese Conversion. It explains why we perform the conversion, gives a specification for it, and suggests a solution to the conversion problem based on two rules, A and B.
    Thanks for any suggestions and comments!
 
Xiaodong LEE, HSU NAI-WEN, Erin Chen, GuoNian SUN
 
______________________________________
 
                X.D. Lee             
______________________________________
 
  Tel. (O): +86-10-62619750-3020      
  Email(O): LEE@cnnic.net.cn         
______________________________________