Re: [idn] Unicode tagging
RJ Atkinson wrote:
> I'd bet a Dim Sum lunch that there are other languages with similar
> issues in Unicode/ISO-10646.
No need to bet. Indic scripts in Unicode are currently in much the same
situation, and so is Thai.
Correct me if I am wrong, but the principle Unicode has adopted is that if
a character can be represented by a decomposed sequence (as in NFD), it uses
the decomposed form rather than assigning a code point to the composed form.
This does not prevent anyone from trying to change it, though :-) For
example, there are two Tamil encoding standards endorsed by Tamil Nadu, known
as TAB and TAM (Bilingual/Monolingual). TAM contains composed forms of Tamil
characters, each of which takes 2 to 3 Unicode code points to form.
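
To make the 2-to-3 code point count concrete, here is a minimal Python sketch using the standard unicodedata module. The two sample syllables (KA + vowel sign I, and KA + vowel sign O) are my own picks for illustration and are not taken from the TAB/TAM tables; the helper name code_points is likewise just made up for this example.

```python
import unicodedata

def code_points(text, form):
    """Normalize text to the given form and list its code points."""
    normalized = unicodedata.normalize(form, text)
    return [f"U+{ord(ch):04X}" for ch in normalized]

# Illustrative Tamil syllables (assumed examples, not from TAB/TAM):
ki = "\u0B95\u0BBF"  # TAMIL LETTER KA + TAMIL VOWEL SIGN I
ko = "\u0B95\u0BCA"  # TAMIL LETTER KA + TAMIL VOWEL SIGN O

for label, syllable in (("ki", ki), ("ko", ko)):
    for form in ("NFC", "NFD"):
        print(label, form, code_points(syllable, form))
```

Neither syllable has a single precomposed code point, so even NFC leaves each one as 2 code points. Under NFD the vowel sign O splits further into two parts, which gives 3 code points for a single written syllable.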
-James Seng
> The bottom line is that a hard limit does not appear reasonable
> to define and implement -- at least not in a manner that is fair
> to all language groups and fairness was the objective of having
> such a hard limit.
>
> Yours,
>
> Ran
> rja@inet.org