Re: Matching and comparison
Paul Hoffman / IMC wrote:
>
> At 05:47 PM 1/20/00 +0900, Martin J. Duerst wrote:
> > > Unless we can show a need for case-insensitivity *in the
> > > internationalized characters*, we shouldn't force it.
> >
> >The largest need, already discussed, is clearly that a lot of people
> >don't want to have to register ibm/ibM/iBm/iBM/Ibm/IbM/IBm/IBM to
> >make sure nobody else registers. And three-letter companies still
> >have an easy job.
>
> That will always be a problem, regardless of what we do with case
> sensitivity. Using the same logic, the Dürst company would not only have to
> register Dürst.com, it would have to register Dûrst.com, Dúrst.com,
> Dùrst.com, Dûrst.com, and Dùrst.com, not to mention about a dozen more that
> my Eudora MUA didn't want to type for me. And this is just the European
> scripts; I think that Indic and Arabic scripts would have very similar
> problems.
Well, I think that is too strong, but you can make a more realistic example
with Du\:rst and Durst (say, for Dutch customers, where the German u\: == u
phonetically and the u == ue). Or much the same for things like the #, AE and
ae in the Scandinavian languages, or the ij, dz, ts or nj in eastern Europe:
each is a harmless ligature in one language (easily replaced by its two
components and/or case-folded), whilst in another language the essential
meaning changes when it is folded or replaced by the two visually similar
single-glyph components.
But then again, this is largely an entry/encoding issue; the Unicode spec
happily normalizes them in a lot of cases into something which is visually
the same.
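
To make the normalization point concrete, here is a rough sketch in Python.
The match_key() helper is my own invention for illustration (nothing
anyone has proposed on this list), and the choice of NFKC followed by case
folding is an assumption, not a recommendation. It shows which of the
variants above collapse to one key and which stay distinct:

```python
import unicodedata

def match_key(label: str) -> str:
    """Hypothetical comparison key: NFKC-normalize, then case-fold."""
    return unicodedata.normalize("NFKC", label).casefold()

# Precomposed u-umlaut and "u" + combining diaeresis give the same key,
# so the entry/encoding difference disappears.
assert match_key("D\u00fcrst") == match_key("Du\u0308rst")

# Case variants collapse to one key.
assert match_key("IBM") == match_key("iBm")

# The ij ligature (U+0133) is replaced by its two components.
assert match_key("\u0133") == "ij"

# But u-umlaut is not folded to plain u, and the ae letter (U+00E6)
# is not split into "a" + "e".
assert match_key("D\u00fcrst") != match_key("Durst")
assert match_key("\u00e6") != "ae"
```

So normalization handles the pure encoding differences, but the
language-dependent equivalences (u\: versus u versus ue) are left untouched.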
Dw.