
[idn] language scripts classified



While gathering usage-frequency data for many language scripts,
I classified them into six categories, #0 through #5 (see enclosed):
 
  Will all of these scripts be used frequently in IDN?
  At least the #3 (archaic) and #4 (vertical-only) scripts will not be.
  For #1 and #2, I am neutral for now.
  I am still working on #5.
 
I find many similarities and ambiguities across these scripts.
Many Indian scripts descend from the same parent script,
Brahmi, and have similar-looking letters and numerals.
 
 see http://www.omniglot.com/writing/yi.htm
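
As a rough illustration of the numeral overlap mentioned above, here is a
small Python sketch. It is not part of the survey itself; it uses only the
standard unicodedata module, and the selection of scripts and the names
INDIC_ZEROS and digit_values are my own assumptions. It shows that several
Brahmi-derived scripts each have their own code points for the digits 0-9,
all carrying the same decimal values.

```python
import unicodedata

# Code point of DIGIT ZERO in several Brahmi-derived scripts.
# Illustrative selection only, not an exhaustive list.
INDIC_ZEROS = {
    "Devanagari": 0x0966,
    "Bengali": 0x09E6,
    "Gurmukhi": 0x0A66,
    "Gujarati": 0x0AE6,
    "Oriya": 0x0B66,
    "Telugu": 0x0C66,
    "Kannada": 0x0CE6,
    "Malayalam": 0x0D66,
}


def digit_values(zeros):
    """Map each script to the decimal values of its ten consecutive digits."""
    return {
        script: [unicodedata.digit(chr(cp + i)) for i in range(10)]
        for script, cp in zeros.items()
    }


if __name__ == "__main__":
    # Print each script's zero with its code point and Unicode name.
    for script, cp in INDIC_ZEROS.items():
        print(f"{script:<11} U+{cp:04X} {chr(cp)}  {unicodedata.name(chr(cp))}")
```

Every script in the table yields the values 0 through 9, so a digit typed in
one script can stand for the same number as a different-looking digit in
another.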
 
 
Soobok Lee 
------------------------------------------------------------------------
 
#0 Already in the ML.com testbed (I have large usage-frequency samples for these)
Hebrew:
Arabic:
Thai:
Hindi (Devanagari):
Cyrillic:
Greek:
Hiragana:
Katakana:
Hangul:
Han Ideograph:
 

#1 Few native speakers
 
Georgian: < 3.5 million
Cherokee: 100,000 (North America)
Thaana:   100,000 (Maldives)
Armenian: 2 million
 
 
#2 Two languages in one country:

Bengali: Indian language (> 180 million)
Gujarati: Indian language
Gurmukhi: Indian language
Kannada: Indian language
Malayalam: Indian language
Oriya: Indian language
Telugu: Indian language

Tibetan: Chinese (mainland China), < 5 million
Yi Syllables: Chinese (mainland China), < 5 million
 
 
#3 Archaic
Ogham:
Runic:
Gothic:
Syriac: used only for liturgical purposes
Unified Canadian Aboriginal Syllabics:
 
 
#4 Written only vertically
Mongolian: (was once abolished and later restored)
 
#5 Others

Sinhala: Sri Lanka
Tamil: Sri Lanka
Khmer: Cambodia
Lao: Laos, Thailand
Ethiopic: Ethiopia
Myanmar: Burmese
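
For anyone who wants to produce a comparable per-script tally from their own
label samples, a minimal Python sketch follows. It is not the tool used for
the ML.com samples. The function names script_of and script_frequencies and
the sample labels are hypothetical. Because Python's standard library does
not expose the Unicode Script property, the script is guessed from the first
word of each character's Unicode name, so Han ideographs are reported as
"CJK".

```python
import unicodedata
from collections import Counter


def script_of(ch):
    """Guess a script label from the first word of the Unicode character name.

    Heuristic assumption: the name prefix stands in for the real Script
    property, which the standard library does not provide.
    """
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:  # code point has no name (e.g. a control character)
        return "UNKNOWN"


def script_frequencies(labels):
    """Count alphabetic characters per guessed script across all labels."""
    counts = Counter()
    for label in labels:
        for ch in label:
            if ch.isalpha():  # skip digits, hyphens, dots, combining marks
                counts[script_of(ch)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample labels, for illustration only.
    sample = ["пример", "αβγ", "한국"]
    for script, n in script_frequencies(sample).most_common():
        print(script, n)
```

Note that isalpha() skips combining marks, so scripts that rely heavily on
them (Thai and the Indian scripts, for example) would need a more careful
filter than this one.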