Re: [idn] case preservation
some corrections:
----- Original Message -----
From: "Soobok Lee" <lsb@postel.co.kr>
To: "Dan Ebert" <dan@enic.cc>; "Martin Duerst" <duerst@w3.org>
Cc: <idn@ops.ietf.org>
Sent: Thursday, October 11, 2001 12:55 AM
Subject: Re: [idn] case preservation
>
> ----- Original Message -----
> From: "Martin Duerst" <duerst@w3.org>
> To: "Soobok Lee" <lsb@postel.co.kr>; "Dan Ebert" <dan@enic.cc>
> Cc: <idn@ops.ietf.org>
> Sent: Wednesday, October 10, 2001 5:31 PM
> Subject: Re: [idn] case preservation
>
>
> > At 17:12 01/10/10 +0900, Soobok Lee wrote:
> > >No. Legitimate registrants could own the domains and use them publicly.
> > >Cyrillic 'H' (Cyrillic capital EN) is read differently from Latin 'H'.
> > >Cyrillic 'HOME' has nothing to do with English "HOME".
> >
> > There is indeed a non-zero (but very, very small) probability
> > for such cases. But if domain names are written in lower case
> > the way they mostly have been up to now, a word in a language
> > written in Cyrillic looking the same as a word in a language
> > written in Latin would be about as rare as a four-leaf clover.
> >
No. It is much more frequent than you guess.
Cyrillic small 'a' 'e' 'o' 'c' 'p' 'x' 'y' 'i' 'j' 's' look exactly the same as the Latin small letters. Every word built from these letters will collide visually with the corresponding LDH name under .ru.
For example, lowercase Cyrillic "cape", "sexy", "epoxy", "ceo", "pay" and "cpa"
cannot be distinguished from the English ones in most rendering environments.
See http://www.unicode.org/charts/PDF/U0400.pdf
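
To make the collision concrete, here is a rough Python sketch. The table name LOOKALIKES and the function to_cyrillic_lookalike are my own illustrative names, not part of any IDN draft, and the table covers only the ten letters listed above. The code points are taken from the Cyrillic chart; whether two glyphs really render identically depends on the font.

```python
# Latin small letter -> Cyrillic small letter with (typically) the same glyph.
# Illustrative table only; it covers just the ten letters named in this mail.
LOOKALIKES = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "i": "\u0456",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "j": "\u0458",  # CYRILLIC SMALL LETTER JE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "s": "\u0455",  # CYRILLIC SMALL LETTER DZE
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
    "y": "\u0443",  # CYRILLIC SMALL LETTER U
}


def to_cyrillic_lookalike(word):
    """Return the all-Cyrillic spelling that looks like `word`,
    or None if some letter has no entry in LOOKALIKES."""
    if not all(ch in LOOKALIKES for ch in word):
        return None
    return "".join(LOOKALIKES[ch] for ch in word)


for w in ["cape", "sexy", "epoxy", "ceo", "pay", "cpa", "home"]:
    spoof = to_cyrillic_lookalike(w)
    if spoof is None:
        print(w, "-> no full lookalike in this table")
    else:
        # ascii() shows the escaped code points, since the glyphs
        # themselves would look just like the Latin word.
        print(w, "->", ascii(spoof), "equal to Latin?", spoof == w)
```

Each converted string compares unequal to the Latin word even though it would be displayed the same way, which is exactly the problem for registrants and users.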
Only Cyrillic 'B' 'H' 'M' 'T' have lowercase forms that look the same as their uppercase forms.
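
A small Python sketch of what case folding does to the Cyrillic "HOME" from the earlier mail, assuming it is spelled with capital EN, O, EM and IE (the helper name char_names is just for illustration). After lowercasing, the string is still Cyrillic and still differs from Latin "home" at the code point level.

```python
import unicodedata

# Cyrillic capitals EN, O, EM, IE: typically rendered like Latin "HOME".
cyrillic_upper = "\u041d\u041e\u041c\u0415"
cyrillic_lower = cyrillic_upper.lower()


def char_names(text):
    """Return the Unicode character name of every character in `text`."""
    return [unicodedata.name(ch) for ch in text]


print(char_names(cyrillic_upper))
print(char_names(cyrillic_lower))
print("same as Latin 'home'?", cyrillic_lower == "home")
```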
Soobok Lee