Re: [idn] An ignorant question about TC<-> SC
Hi Patrik Fältström,
----- Original Message -----
From: "Patrik Fältström" <paf@cisco.com>
>
> Paul is completely correct. The important thing is to have one
> well-defined set of rules for "equality". Given these rules, three things
> can happen:
>
> - A registry can create well-defined rules for what can be requested
> to be registered
> - A registry can create well-defined rules for what goes into the
> DNS zonefile given a registration request
> - The dispute resolution process for a specific domain can define
> their rules, and come up with their own practice for decision making
>
> What is NOT acceptable is to have rules which doesn't settle down, changes
> all the time etc.
>
Right. The PRC-only-SC characters were derived by fixed rules, and those
rules have been kept for more than 40 years.
> Just like Paul I see too many players trying to come up with engineering
> solutions to real world problems which have to do with "matching of words"
> and not "matching of identifiers" which DNS is about.
>
> paf
>
The term "word" is easily confused between Chinese-speaking and
non-Chinese-speaking people. A "Chinese word" may be a phrase of several
Han characters or a single Han character. 1-n mapping is based on phrases,
so it is context sensitive. 1-1 mapping is different, especially for the
PRC-only-SC characters, which are derived from TC quick-written forms:
these are characters with the same pronunciation, and each can replace the
other in the same character position of a phrase or word. The basic unit of
a CJK code point is a single character, which is also a single-character
word with some meaning. An identifier is just a string of symbols. "a",
"b" and "c" have no explicit meaning as individual characters, but each CJK
character has an inherent meaning, and a combination of CJK characters
carries the combination of their meanings. That is why identifiers made of
CJK characters have a higher information density. Even so, although each
character has its own meaning, you cannot forget that it is only a
character in the string of an identifier.
1-1 mapped TC/SC characters can directly replace each other, but
they have not appeared together in each area because each area's native
encoding excludes the other form. Unicode-based character input and
display, as in win2000, has broken that separation barrier. Now the mixing
of TC/SC is more like the case of ASCII characters.
Matching words is not the work of layer 1 DNS, but matching
identifiers is. Even when an identifier is composed of CJK
single-character words, it is still an identifier, and the 1-1 mapped TC/SC
characters are script characters too. A single Chinese character has an
inherent meaning, like a word. But an IDN Chinese identifier is composed of
these single-character words, and you cannot forget the fact that a CJK
character is a character. 1-1 mapping of TC/SC does not convert the meaning
of a whole phrase. It only tries to reduce the number of combinations that
pair-replaceable characters produce, with the constraint that they are
displayed in the form the user prefers.
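
To make the point about 1-1 mapping concrete, here is a minimal sketch in
Python. The table TC_TO_SC and the function names are only assumptions for
illustration: the table holds a handful of well-known TC/SC pairs, not any
registry's real table. It shows position-by-position replacement without
looking at phrases, and how the number of mixed TC/SC spellings of one
identifier grows with each pair-replaceable character.

```python
from itertools import product

# Illustrative table only: a few well-known TC -> SC pairs.
# A real 1-1 table would be much larger; this one is an assumption.
TC_TO_SC = {
    "國": "国",
    "學": "学",
    "門": "门",
    "馬": "马",
    "東": "东",
}
SC_TO_TC = {sc: tc for tc, sc in TC_TO_SC.items()}


def fold_label(label: str) -> str:
    """Replace each TC character by its SC partner, one position at a time.

    No phrase or context is examined, so the length never changes.
    """
    return "".join(TC_TO_SC.get(ch, ch) for ch in label)


def same_identifier(a: str, b: str) -> bool:
    """Two labels match as identifiers if their folded forms are equal."""
    return fold_label(a) == fold_label(b)


def variants(label: str) -> list[str]:
    """List every mixed TC/SC spelling that folds to the same identifier."""
    choices = []
    for ch in label:
        sc = TC_TO_SC.get(ch, ch)
        tc = SC_TO_TC.get(sc)
        # A pair-replaceable position has two choices, any other has one.
        choices.append((sc, tc) if tc else (sc,))
    return ["".join(p) for p in product(*choices)]
```

With this sketch, same_identifier("中國", "中国") is true, and a label with n
pair-replaceable characters has 2**n spellings that all fold to one
identifier, while the label as the user typed it can still be kept for
display.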
L.M.Tseng