Re: [idn] time to move



Roozbeh Pournader <roozbeh@sharif.edu> wrote:

> Would you describe why do you see UTF-8 bad for Arabic and Cyrillic?

Because for those scripts the ACE encoding is usually significantly
smaller than the UTF-8 encoding.

Seen another way, the Cyrillic and Arabic alphabets are about the same
size as the Latin alphabet, so the amount of information per character
is about the same for all three scripts. Yet UTF-8 spends two octets on
each Cyrillic or Arabic letter and only one on each ASCII Latin letter,
so text in those scripts takes almost double the number of octets.
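
To make the comparison concrete, here is a rough Python sketch that
counts octets for one short label per script. The sample labels and the
helper names are my own illustrations, and Python's built-in "punycode"
codec is only an assumed stand-in for whichever ACE is finally chosen
(no ACE prefix is counted), so treat the numbers as indicative only.

```python
def utf8_octets_per_char(text: str) -> float:
    """Average number of UTF-8 octets per character in a non-empty string."""
    if not text:
        raise ValueError("text must not be empty")
    return len(text.encode("utf-8")) / len(text)


def ace_length(label: str) -> int:
    """Length of the label in an ASCII-compatible encoding.

    Assumption: the "punycode" codec stands in for the ACE under
    discussion; any ACE prefix is deliberately left out of the count.
    """
    return len(label.encode("punycode"))


# Hypothetical sample labels, one per script.
samples = {"Latin": "example", "Cyrillic": "пример", "Arabic": "مثال"}

for script, label in samples.items():
    utf8_len = len(label.encode("utf-8"))
    print(f"{script}: {len(label)} chars, {utf8_len} UTF-8 octets "
          f"({utf8_octets_per_char(label):.1f} per char), "
          f"{ace_length(label)} ACE octets")
```

Labels that mix in ASCII digits or hyphens pull the UTF-8 average below
two octets per character, which is why "almost double" is the fair
description for real names.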

AMC