
Re: Unicode Public Review Issues update



On Tuesday, Feb 4, 2003, at 17:35 Europe/Stockholm, RJ Atkinson wrote:

Example is that the lunar new year is called "Te^'t",
with both ^ and ' being above the "e". And the question is just whether
the 3 different ways of composing that character are properly handled
by normalisation.

It should (in theory), because normalization consists of:

- Decomposition of characters into as many codepoints as possible
- Sorting of the diacritics (canonical ordering)
- Recomposition to as few codepoints as possible
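
As a rough illustration of those three steps (a sketch only, assuming Python's standard unicodedata module, which is not something the original discussion referred to), the three spellings of the "e" in the example can be compared after normalization. The variable names are made up for the illustration.

```python
import unicodedata

# Three canonically equivalent ways to write the vowel in the example word:
forms = [
    "T\u1ebft",         # precomposed U+1EBF (e with circumflex and acute)
    "T\u00ea\u0301t",   # U+00EA (e with circumflex) + combining acute
    "Te\u0302\u0301t",  # e + combining circumflex + combining acute
]

# Full decomposition (NFD) and decomposition followed by recomposition (NFC).
nfd = {unicodedata.normalize("NFD", s) for s in forms}
nfc = {unicodedata.normalize("NFC", s) for s in forms}

# If normalization handles all three spellings, each set has one member.
print(len(nfd), len(nfc))

# Canonical ordering only reorders marks with different combining classes,
# for example circumflex (above, class 230) and dot below (class 220).
reordered = unicodedata.normalize("NFD", "e\u0302\u0323")
print([hex(ord(c)) for c in reordered])
```

Note that the circumflex and the acute are both "above" marks with the same combining class, so canonical ordering leaves their relative order alone; an "e" followed by acute and then circumflex is a different sequence and is not expected to normalize to the same result.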

See section 3.10 of the Unicode Standard 3.0, pages 50-52, on canonical ordering.

Now, the rules for canonical ordering might be wrong for Vietnamese, and if so, that should be pointed out.

paf