Re: [idn] reordering strawpoll



In a message dated 2001-11-12 14:41:29 Pacific Standard Time, 
lsb@postel.co.kr writes:

>  If you encode each Hangul syllabic in 3 jamos in utf8,
>  it need 3 octets * 3 = 9 octets, while 3 basic latin letter need 3 octets
>  in utf8.
>  3 times more space!  if there were any real "compaction" on hangul
>  syllable code points, that may be just the bare minimum.

But one paragraph earlier, Soobok stated that each hangul character is 
roughly equivalent to (i.e. carries roughly as much information as) 2.2 to 
2.7 Latin letters.  So the 9 octets of UTF-8 actually encode the equivalent 
of 6.6 to 8.1 Latin letters, which means Hangul encoding is 10% to 27% less 
efficient than Latin encoding.  Representing it as two-thirds (67%) less 
efficient is obviously misleading.  Such claims only draw attention away 
from any merit the reordering plan may have.
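
For anyone who wants to check the arithmetic, here is a rough Python sketch.  
It uses only the figures quoted above (3 jamos at 3 octets each, and 2.2 to 
2.7 Latin letters per hangul character) and assumes one octet per basic 
Latin letter.  The function name is made up for illustration.

```python
# Figures taken from the thread: 3 jamos at 3 UTF-8 octets each, and
# 2.2 to 2.7 Latin letters of information per hangul character.
OCTETS = 3 * 3  # 9 octets of UTF-8


def percent_less_efficient(latin_letters_carried, octets=OCTETS):
    """Percent shortfall versus Latin, assumed to carry one letter per octet."""
    return round(100 * (1 - latin_letters_carried / octets))


weighted_low = percent_less_efficient(3 * 2.7)   # 8.1 letters in 9 octets
weighted_high = percent_less_efficient(3 * 2.2)  # 6.6 letters in 9 octets
unweighted = percent_less_efficient(3 * 1.0)     # each character counted as 1 letter

print(weighted_low, weighted_high, unweighted)   # prints: 10 27 67
```

The 67% figure only appears when each hangul character is counted as a 
single Latin letter, which is the comparison I am objecting to.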

>  From What i get from reorering experiments, It became clear that 
>  long han/hangul code points sequence of length N can be represented 
>  by 2.0~2.2 * N  latin letters. Without reordering, it would be 3.0~3.1.
>  33% improvement is possible! Why should we go without reordering
>  which merely require simple mapping tables with so many benefits?

James Seng has stated repeatedly that there is no need to reiterate, yet 
again, the supposed benefits of reordering.  Every proposal, including this 
one, has both advantages and disadvantages which must be weighed against each 
other.

-Doug Ewell
 Fullerton, California