Re: Unicode Public Review Issues update
On Tuesday, Feb 4, 2003, at 17:35 Europe/Stockholm, RJ Atkinson wrote:
> Example is that the lunar new year is called "Te^'t",
> with both ^ and ' being above the "e". And the question is just
> whether the 3 different ways of composing that character are
> properly handled by normalisation.
They should be (in theory), because normalization consists of:
- Decomposition of characters into as many code points as possible
- Sorting of the diacritics (canonical ordering)
- Recomposition into as few code points as possible
See section 3.10 of the Unicode Standard 3.0, pages 50-52, about
canonical ordering.
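
The three steps above can be tried out with a short sketch. It is not
from the original thread; it assumes Python's standard unicodedata
module, whose "NFC" form does decompose, reorder and recompose, and
whose "NFD" form stops after the reordering. The variable names are
only illustrative.

```python
import unicodedata

# Three spellings of the "e" with circumflex and acute in "Tet".
precomposed = "\u1ebf"        # single code point U+1EBF
partly = "\u00ea\u0301"       # e-circumflex + COMBINING ACUTE ACCENT
decomposed = "e\u0302\u0301"  # e + COMBINING CIRCUMFLEX + COMBINING ACUTE

forms = [precomposed, partly, decomposed]

# Decompose + reorder + recompose: all three collapse to one string.
nfc = {unicodedata.normalize("NFC", s) for s in forms}
# Decompose + reorder only: again one string, the fully decomposed one.
nfd = {unicodedata.normalize("NFD", s) for s in forms}

# Canonical ordering sorts marks by combining class. Dot below (class 220)
# and circumflex (class 230) differ in class, so either input order yields
# the same normalized result.
dot_first = "e\u0323\u0302"
circ_first = "e\u0302\u0323"
same_after_sort = (
    unicodedata.normalize("NFD", dot_first)
    == unicodedata.normalize("NFD", circ_first)
)

# Circumflex and acute share class 230, so their relative order is kept.
# Typing the acute before the circumflex is therefore a different
# sequence, and it does not normalize to U+1EBF.
swapped = unicodedata.normalize("NFC", "e\u0301\u0302")

print(nfc == {precomposed}, nfd == {decomposed}, same_after_sort)
print(swapped == precomposed)
```

So the three spellings in question do end up identical, but only
because the circumflex comes before the acute in each of them. Marks of
the same combining class are never reordered.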
Now, the rules for canonical ordering might be wrong for Vietnamese,
and then that should be pointed out.
paf