RE: [idn] case folding
> From: RJ Atkinson <rja@inet.org>
> In Unicode, there are pre-composed characters and also composed
> characters. If there is no
> pre-composed form for a letter, but there might be (hypothetically)
> multiple ways of composing that letter, then there needs to
> be normalisation to a single form for a given letter prior to
> comparison for DNS purposes.
If I remember correctly, Unicode has a block of characters for "Combining
Diacritical Marks". At that point, normalization is simple: if there is no
specific precomposed character, you write the letter as the sum of its
components (in our case, "o" + "^" + "`"), sort the combining marks in
(Unicode) lowest-to-highest code point order, and you're done.
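A rough sketch of the scheme described above, in Python. The name
naive_normalize is made up for illustration. It leans on the standard
unicodedata module to do the decomposition step, then sorts each run of
combining marks by code point. Note this is only the simple rule from
this message, not Unicode's own normalization: as far as I understand,
the standard forms reorder marks by combining class and keep the relative
order of marks that share a class, so they would treat "o" + "^" + "`"
and "o" + "`" + "^" as different strings.

```python
import unicodedata


def naive_normalize(text: str) -> str:
    """Decompose, then sort each run of combining marks by code point.

    Hypothetical helper illustrating the rule from this message; it is
    not one of the standard Unicode normalization forms.
    """
    # Step 1: break precomposed characters into base + combining marks.
    decomposed = unicodedata.normalize("NFD", text)

    out = []
    marks = []  # combining marks attached to the most recent base character
    for ch in decomposed:
        if unicodedata.combining(ch) != 0:
            # Non-zero combining class: treat as a mark on the current base.
            marks.append(ch)
        else:
            # New base character: flush the previous base's marks, sorted
            # lowest-to-highest by code point.
            out.extend(sorted(marks))
            marks = []
            out.append(ch)
    out.extend(sorted(marks))
    return "".join(out)


# "o" with circumflex (U+0302) and grave (U+0300), in both orders,
# plus the precomposed letter U+1ED3.
a = naive_normalize("o\u0302\u0300")
b = naive_normalize("o\u0300\u0302")
c = naive_normalize("\u1ed3")
print(a == b == c)
```

With this rule all three spellings compare equal, which is the property
wanted for name comparison; whether collapsing the two mark orders is
actually desirable is a separate question.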
This won't work for ideograms, probably, but we already excluded
them in this discussion.
(btw: are Vietnamese characters found in row 1E, Latin Extended
Additional?)
ciao, .mau.