
Re: [idn] Re: [JET-member 464] Re: Fw: Re: new members invitation



John,

"Beats me" is a fine answer, as is looking critically at the stated, and
unstated, problem constraints.

If we were to consider the character sets of Abenaki/Mikmaq/Maliseet/...
(scripts derived from 17th-century French script) and of other Indian
languages (contemporary, even modern, scripts derived from Spanish and
English), alongside "Roman", we'd be looking at the following equivalence
problem:

                U+0038          "8"     when in an alpha-string
                U+0222          "8"     LATIN CAPITAL LETTER OU
                U+0223          "8"     LATIN SMALL LETTER OU
                U+004F,U+0055   "OU"
                U+004F,U+0075   "Ou"
                U+006F,U+0075   "ou"
                U+0057          "W"
                U+0077          "w"

This is just for the "8". "TC" in this context would be Abenaki (etc.)
written with niche-market IBM Selectric type sets, similar to Inuktitut, and
"SC" would be the same written in ASCII on an unforgiving 101-key IBM PC
keyboard. We call these "diacritically simplified" characters -- basically
everything is promoted to its nearest ASCII look-alike, and all or almost
all diacriticals are stripped.
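
To make the shape of the problem concrete, here is a rough sketch in Python
of what such a folding might look like. It is only an illustration under
assumptions of my own: the function names, the choice of U+0223 as the
canonical key, and the rule "a digit 8 counts as OU only if the string also
contains a letter" are all hypothetical, and replacing every "ou" or "w"
is certainly too aggressive for real text. The sample strings are
placeholders, not real words.

```python
import unicodedata

# Hypothetical canonical key for the "8" equivalence class (an assumption).
OU_KEY = "\u0223"  # LATIN SMALL LETTER OU

def strip_diacritics(text):
    """Decompose to NFD and drop combining marks ("diacritically simplified")."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def fold_ou(label):
    """Map every spelling of the OU letter listed above to one key."""
    # Assumed rule: a digit 8 stands for OU only inside an alpha-string.
    has_alpha = any(ch.isalpha() for ch in label)
    s = label.replace("\u0222", OU_KEY).lower()
    s = s.replace("ou", OU_KEY).replace("w", OU_KEY)
    if has_alpha:
        s = s.replace("8", OU_KEY)
    return s

def simplify(label):
    """Strip diacritics first, then fold the OU variants."""
    return fold_ou(strip_diacritics(label))

if __name__ == "__main__":
    for sample in ["a8a", "aOUa", "aWa", "a\u0222a", "88"]:
        print(repr(sample), "->", repr(simplify(sample)))
```

With this sketch the first four placeholder strings fold to the same key,
while the purely numeric "88" is left alone.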

I don't claim that "solving for Abenaki <or even Cree or Dine'>" is nearly
as important as the SC/TC problem.

Eric