Re: [idn] draft-ietf-idn-tsconv-01.txt
Dear Ted Hardie:
> It is not so clear, however, that these goals should or can be
> met using the DNS infrastructure as described. Probably one of the
> most important issues raised by the draft is in this note:
>
> [Editor's note: As Chinese character's in common use by CJK
> people, so such table may be modified after making consensus with
> language experts of CJK area.]
>
>
> ..... The authors apparently
> feel that this could be managed with exclusion lists. I believe that
> a reasonable list of such exclusions would run into several thousand
> and that some of the most common characters would fall into that list.
I am sorry if my discussion on this list of the inhibition approach to
mixed TC/SC caused a misunderstanding. That is my personal suggestion, not
that of the authors of the draft, even though I joined the work to study it.
The announced tsconv draft does not cover this issue yet.
Someone suggested solving the 2^n problem by forcing the user to input an
all-TC or all-SC name. This approach needs a helper procedure to tell users
that they have typed a mixed TC/SC hostname. If the conversion problem
cannot be solved at present, then we can check for and inhibit the mixing to
avoid the problem of unpredictable resolving. That is my personal suggestion
for the character overlap trouble. Suppose you want SC converted to TC, but
the relation may be 1-n: you cannot find the real mapping path from a single
character. You can, however, check all the characters in the hostname and
find an SC character mixed with a TC character that belongs to some (SC,TC)
pair. Because the SC character cannot be mapped to exactly one TC character,
and another paired TC character is present, the name is mixed SC/TC. The
check covers only these 1-1 and 1-n SC/TC pairs, so its table is almost the
same size as the original SC/TC simplification table.
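The mixing check described above could be sketched roughly as follows. This
is only an illustration under stated assumptions, not part of the tsconv
draft: the table SC_TO_TC is a hypothetical three-entry sample (a real table
would hold thousands of pairs), and the names build_sets and is_mixed are
made up for this example. Characters that appear on both sides of the table
are treated as neutral, which is one possible choice.

```python
# Hypothetical sample of an SC -> TC table; not taken from any draft.
SC_TO_TC = {
    "国": ["國"],        # 1-1 pair
    "学": ["學"],        # 1-1 pair
    "发": ["發", "髮"],  # 1-n pair: one SC maps to two TC characters
}


def build_sets(table):
    """Return (SC-only characters, TC-only characters) from the pair table."""
    sc = set(table)
    tc = {t for variants in table.values() for t in variants}
    # A character listed on both sides is ambiguous, so it counts for neither.
    return sc - tc, tc - sc


SC_ONLY, TC_ONLY = build_sets(SC_TO_TC)


def is_mixed(label):
    """True if the label contains both a paired SC and a paired TC character."""
    has_sc = any(ch in SC_ONLY for ch in label)
    has_tc = any(ch in TC_ONLY for ch in label)
    return has_sc and has_tc


if __name__ == "__main__":
    for name in ["中国", "中國", "国學", "發国"]:
        verdict = "mixed, ask user to retype" if is_mixed(name) else "accepted"
        print(name, "->", verdict)
```

Note that the check never decides which TC character a 1-n SC character
should map to. It only needs set membership, so the lookup data stays about
the size of the SC/TC table itself.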
> I think that the use of an exclusion list of that size is likely to
> diminish the effectiveness of this approach to the point of
> unusability. If the user community cannot know whether two characters
> map to equivalence without knowledge of an extensive exclusion list,
> they are considerably worse off than if they were dealing with just
> complex and simplified characters sets.
>
> As a trivial example, the character "guo" used in "zhongguo"
> (China) is also used in some form by kanji and hanja.
Inhibition checks for the mixing of TC/SC; it does not do equivalence
mapping. If the characters match a uniform form that users have mapped and
that exists in the registered file, then the name can be converted to ACE in
that form. If they do not, the name is mixed, so the user needs to try again
at the client input.
Inhibition of mixing just stops the unknown mapping relation and hands it
back to users to do the correct mapping.
Thank you for your attention to my suggestion.
L.M.Tseng