Re: [idn] opting out of SC/TC equivalence
--On 2001-09-01 17.59 +0800
"tsenglm@計網中心.中大.tw"
<tsenglm@cc.ncu.edu.tw> wrote:
>> > By this principle, why partial set of CJK characters that
>> > are partial setted by local language tag can be used with
>> > different TLD (cn,jp,tw..). Because the TLD implied the language tag
>> > and will let them differentiable from other TLD .
>>
>> You have to be more precise.
> We talk the version , updating and how to keep backward
> compatible.
> And you give a good example:
> Let's assume that the table day one include the following characters:
> {A,B,C,D,E,F,G,H,c}
> We agree that the simple equivalence rules for 1-1 mapping maps A->B.
> This means that the following characters are available for domain name
> registration:
> {B,C,D,E,F}
No. If the mapping is A->B, then you can use {B,C,D,E,F,G,H,c}.
Further, A is not allowed in the zone file, but a user can type 'A', i.e. he
can use 'A' as input to a domain name lookup.
> First, let us assume G and H are reserved, because they are not frequently
> used.
What do you mean by "reserved"? Do you mean "forbidden"?
> and A->B is not there because the two are easy to confuse; they have
> different font shapes but the same meaning. c->C is there because they are
> easy to confuse, being similar not only in meaning but also in shape.
Wrong. The mapping A->B is there because the Unicode Consortium Technical
Report #15 has defined the normalization process that way.
You start talking about "confusion", and that is a term which this WG has
defined as being out of scope. The users WILL be confused. They already are
today.
> Compared with the two sets above, the 1-1 mapping means {A,c} are not
> allowed in the registration table.
Wrong. When a user wants to register something with 'A', 'B' will be
registered. This means that when a user types 'A', he will get a match on 'B'.
When a user types 'c', he will match 'C'.
Only 'B' and 'C' are registered.
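A minimal Python sketch of this behaviour may help. It assumes a toy table holding only the two mappings from the example; the names MAPPING, prepare, register and lookup are made up for illustration and are not part of nameprep.

```python
# Toy 1-1 mapping table from the example; real nameprep tables are far larger.
MAPPING = {"A": "B", "c": "C"}


def prepare(label: str) -> str:
    """Replace every mapped character with its target; leave the rest alone."""
    return "".join(MAPPING.get(ch, ch) for ch in label)


def register(zone: set, label: str) -> bool:
    """Store the mapped form only. Return False if it is already taken."""
    mapped = prepare(label)
    if mapped in zone:
        return False
    zone.add(mapped)
    return True


def lookup(zone: set, label: str) -> bool:
    """Apply the same mapping to user input before matching."""
    return prepare(label) in zone


zone = set()
register(zone, "A")   # stored as 'B'
register(zone, "c")   # stored as 'C'
```

After the two register calls the zone holds only 'B' and 'C', while a lookup for 'A' or 'c' still matches.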
> {G ,H} are also not in the registration table
No, because they are forbidden (I guess).
That is a different problem.
> It
> is as if {A,G,H,c} were held back for future extension.
Wrong. A and c are already included in the mapping tables, to B and C
respectively.
Only G and H are not used.
> So,
> you still have the chance to include A in a future set.
No. That is completely wrong!
> After
> A.example.com is allowed in the registration table, you can first let it
> be assigned as B.example.com for backward compatibility.
No.
> Then you
> can make "A" available to others.
No.
> But {c,C} is in the easy-to-confuse set, so {c} has a fixed
> mapping to {C}.
Correct.
> (BC.tw, Bc.cn) and (AD.tw, BD.cn) are all workable and may not be confused
> over {A,B} and {C,c}.
They will be, as 'A' and 'B' are defined to be the same character.
> But (BC.com, Bc.com) and (AD.com, BD.com) will be confused within
> the same domain.
They are already defined to be the same, as we have the mapping 'A'->'B'.
So, they are the same. You cannot register both A.tw and B.tw, just like
you cannot register both A.com and B.com. It doesn't matter what TLD you talk
about.
The matching rules must be the same.
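As a rough illustration, assuming the same toy table as in the example (the function names are hypothetical), the comparison never looks at the TLD when it applies the mapping:

```python
# Hypothetical toy table; the letters stand in for CJK characters.
MAPPING = {"A": "B", "c": "C"}


def canonical(name: str) -> str:
    """Map the part of NAME before the last dot; NAME is assumed to contain one."""
    label, dot, tld = name.rpartition(".")
    # Only the label is mapped in this sketch, because the toy letters would
    # otherwise collide with ASCII TLDs such as "cn".
    mapped = "".join(MAPPING.get(ch, ch) for ch in label)
    return mapped + dot + tld


def same_name(a: str, b: str) -> bool:
    """Two names match when their mapped forms are identical."""
    return canonical(a) == canonical(b)
```

With this sketch, same_name("AD.tw", "BD.tw") and same_name("AD.com", "BD.com") both hold, because the mapping step has no TLD-specific branch.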
I have told you this several times now, and I am REALLY tired of repeating
myself all the time!
> If you must allow {A,G,H,c} in .com, we suggest dividing them
> into 3 parts: {c,C} in nameprep, {A,B} in a language-related keyword
> system, and {G,H} reserved. By keeping the frequently used character set
> small, the problems can be fixed easily with 1-1 mapping.
This will not work. Full stop! See above.
>>
>> (A) In DNS, we can only use one character set. One only. We have picked
>> Unicode.
>>
>> (B) In DNS, the matching algorithm has to be the same in every piece of
>> software which uses the DNS. It can NOT differ between different
>> languages used by the client. It can NOT differ between different
>> domains. It can NOT differ between geographical regions. The matching
>> algorithm we have is nameprep.
>>
>> (C) A TLD CAN have a policy which says that only a subset of the
>> characters allowed according to (A) is allowed.
>>
>>
>> You don't specify when you say "TLD" and "language tag" above whether you
>> imply a restriction according to (C) (which is ok) or if you have a
>> requirement which is a violation of (B), or if you have a violation of
>> (A).
>>
>> Be more specific.
>>
> (C) is policy-based, so how do you keep .COM from
> producing more confusion?
I don't care.
Read the paragraph above (C) again.
> The character-mapping solution in nameprep will assign one
> standard character in each equivalence set, so the CJK area will not be
> happy with this situation. But if they find out later that confusion from
> mixing characters in gTLDs will happen, they will be even angrier.
> Realname's keyword system is fine for language separation, but it also
> needs to keep names unique.
Read the drafts by Dr Klensin and Mr Mealling again, as you obviously have
not understood at all what they are talking about.
> I do not like to hear that someone hates our use of an 8-bit clean,
> BIND-compatible DNS server that can do multibyte character lookup and TC/SC
> equivalence comparisons.
See the rules (A), (B) and (C) above. You break those rules.
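To keep (B) and (C) apart, here is a small sketch under stated assumptions: the mapping table and the per-TLD character sets are invented for illustration, and the names are hypothetical. The mapping is global, and a TLD policy may only narrow what can be registered afterwards.

```python
# Rule (B): one matching table, shared by every domain (toy values).
MAPPING = {"A": "B", "c": "C"}

# Rule (C): per-TLD policy, each a subset of the allowed characters
# (made-up values for illustration).
TLD_POLICY = {
    "tw": set("BCDEF"),
    "com": set("BCDEFGH"),
}


def prepare(label: str) -> str:
    """Global mapping step; it takes no TLD argument at all."""
    return "".join(MAPPING.get(ch, ch) for ch in label)


def may_register(label: str, tld: str) -> bool:
    """Policy check applied to the mapped label for the given TLD."""
    allowed = TLD_POLICY[tld]
    return all(ch in allowed for ch in prepare(label))
```

A server that adds its own equivalences to prepare() for some domains only would change the matching step itself, which is what (B) rules out; changing TLD_POLICY does not.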
> MicroSoft has showed all the possibility , I
> think there are a few differences .
No, they have shown that it is impossible.
>> What you and Liana are doing is claiming that the work of the last 20 years,
>> the cooperation between all countries which use CJK characters in ISO is
>> wrong, and that IETF should do the job instead.
>>
> That is impossible for the IETF. The TC/SC problems come from the civil war
> between the PRC and Taiwan, and the mixing of TC/SC will happen in HK, Mo.
> Even now, TC/SC is not a good topic in Taiwan.
But you are the one who says that we should ignore what the
representatives from the governments have agreed to in ISO!
paf