
[idn] Mixed TC/SC (was Re: Layer 2 and "idn identities")



-----BEGIN PGP SIGNED MESSAGE-----

liana Ye wrote:
> We are discussing how the registrars to "avoid registering mixed scripts
> 'names' ". Can you suggest any way of doing this, or any feasible guidelines?

It's fairly trivial: define a formal syntax that registrars can check
candidate names against, for domains that should exclude mixed names.

The general form of that syntax would be:

['\' means set difference, '/' means union, '&' means intersection]

  SCONLY = ... ; set of Han ideograph code points that are only used in SC
  TCONLY = ... ; set of Han ideograph code points that are only used in TC

  ; a string that is not mixed either contains no SCONLY code points,
  ; or no TCONLY code points.
  NotMixed = *(UNICODE \ SCONLY) / *(UNICODE \ TCONLY)

  HostString = ... ; whatever syntax is decided on
  NotMixedHostString = NotMixed & HostString
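
As a rough illustration of how a registrar might implement the NotMixed
check, here is a minimal Python sketch. The two-element SCONLY and TCONLY
sets are placeholders chosen for the example only (the real tables are not
given here), and is_host_string is a hypothetical predicate standing in for
whatever HostString syntax is decided on.

```python
# Placeholder tables: a real deployment would load the full SCONLY and
# TCONLY sets, which this message does not specify.
SCONLY = frozenset("\u56fd\u8d1d")  # assumed SC-only: U+56FD, U+8D1D
TCONLY = frozenset("\u570b\u8c9d")  # assumed TC-only: U+570B, U+8C9D


def is_not_mixed(s: str) -> bool:
    """NotMixed = *(UNICODE \\ SCONLY) / *(UNICODE \\ TCONLY).

    True when the string has no SCONLY code points, or no TCONLY ones.
    """
    chars = set(s)
    return chars.isdisjoint(SCONLY) or chars.isdisjoint(TCONLY)


def is_not_mixed_host_string(s: str, is_host_string) -> bool:
    """NotMixedHostString = NotMixed & HostString.

    is_host_string is a caller-supplied (hypothetical) predicate for the
    HostString syntax, which is left open here.
    """
    return is_not_mixed(s) and is_host_string(s)
```

Because the rule is just two set-disjointness tests, the check is linear in
the length of the name and needs no knowledge of TC/SC conversion mappings.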


If you want to allow mixed TC/SC in strings that also contain Katakana,
Hiragana, Bopomofo or Hangul (on the grounds that these are not Chinese),
then define:

  MixedAllowed = *UNICODE (Scripts:HIRAGANA /
                           Scripts:KATAKANA /
                           Scripts:BOPOMOFO /
                           Scripts:HANGUL) *UNICODE

  AllowedHostString = (NotMixed / MixedAllowed) & HostString
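
The relaxed rule can be sketched the same way. This is only an
approximation: Python's standard library does not expose the Unicode Script
property, so the sketch guesses the script from the start of the character
name, which misclassifies a few characters (for example, halfwidth Katakana
is missed and some punctuation is included). The SCONLY/TCONLY sets are
placeholders and is_host_string is again a hypothetical stand-in for the
undecided HostString syntax.

```python
import unicodedata

# Placeholder tables, for illustration only.
SCONLY = frozenset("\u56fd\u8d1d")  # assumed SC-only: U+56FD, U+8D1D
TCONLY = frozenset("\u570b\u8c9d")  # assumed TC-only: U+570B, U+8C9D

# Assumption: character-name prefixes approximate the Script property.
EXEMPT_PREFIXES = ("HIRAGANA ", "KATAKANA ", "BOPOMOFO ", "HANGUL ")


def is_not_mixed(s: str) -> bool:
    """True when s has no SCONLY code points, or no TCONLY code points."""
    chars = set(s)
    return chars.isdisjoint(SCONLY) or chars.isdisjoint(TCONLY)


def in_exempting_script(ch: str) -> bool:
    """Approximate test for Hiragana, Katakana, Bopomofo or Hangul."""
    return unicodedata.name(ch, "").startswith(EXEMPT_PREFIXES)


def is_mixed_allowed(s: str) -> bool:
    """MixedAllowed: at least one code point from an exempting script."""
    return any(in_exempting_script(ch) for ch in s)


def is_allowed_host_string(s: str, is_host_string) -> bool:
    """AllowedHostString = (NotMixed / MixedAllowed) & HostString."""
    return (is_not_mixed(s) or is_mixed_allowed(s)) and is_host_string(s)
```

A production check would want a proper Scripts table (from the Unicode
Character Database) in place of the name-prefix heuristic.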


Note that SCONLY and TCONLY can include code points that have one-to-many
or many-to-one mappings under TC/SC conversion, as well as code points
with one-to-one mappings.

- -- 
David Hopwood <david.hopwood@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5  0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip


-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPBBg8DkCAxeYt5gVAQHYlAgAn+ZCVM1J6LR5JvM5J5L+OuXPuefoDnTC
GJyFEJv7Nk0kAlLFBKQBAjr2Vx5m5gQJ89+yuTczEFx6DdckaOhCDtDuw9rFF56S
fo76QlE8kH4VvuMXid+JVWKS2hvcHibHuDT3HwIVE20GY2ATfrqLr91nST/3WHI5
HBz98FGRuH68sFeNqB2S/OyNk/VXVr+csssSk05K7W8tmNu1gSarLQkblrAKPCLd
JdtrDZW4RhZiHr1cXPRHfifug6rwxMl7WhNFoycKGSVfNT86KRvUEEc3HLCrvxaP
oXbr0gQh3x1QnjfUd+BxpWYKZqOoCYpjEKPcdvflEiNsmlW6eY9QwQ==
=gvKd
-----END PGP SIGNATURE-----