
Re: [idn] reordering strawpoll



Written by Soobok:

> You may have missed my other answer to Kent, which gives 2.6 as the average
> researched by Adam. My argument in the examples above stands even with 2.6 jamos.
> 2.6*3 octets vs 2.6 Latin letters == 3 times more space needed. That was my point.
> The 3-jamo syllable is just an example. Okay? :-)

I do see what Adam researched, and it was that there are about 2.6 English letters per
Hangul in terms of information content. (He is comparing the length of the Hangul
translation of the Bible with the English.) This is a completely different number from
the number of jamos per syllable. This is getting more confusing than it needs to be, but
it would help if you quoted a bit of the original research you refer to. Like this:

Written by Adam:

> Here are the counts for Genesis chapter 1:
>
> King James:     3167 letters
> Basic English:  3088 letters
> Chinese Union:   778 ideographs
> Korean Revised: 1201 Hangul
>
> references:
> http://www.ccim.org/bible/
> http://bible.wisenet.co.kr/
>
> So it's about 4.0 English letters per Chinese ideograph, and about 2.6
> English letters per Korean Hangul.
>
> Each Korean Hangul takes about 2.9 octets in AMC-ACE-Z, which means a
> maximal Korean domain label (20 hangul) holds about as much information
> as a 52-letter English string, which is about 17% less information than
> a maximal English domain label (63 letters), and about 38% less
> information than a maximal Chinese domain label (19 ideographs).
>
> I now retract this statement:
> 
>> Of all the languages I've looked at, Korean is by far the least dense
>> when encoded using AMC-ACE-Z.
>
> In light of the new data, I doubt that Korean is the least dense.
>
> AMC
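
For anyone who wants to check the arithmetic in Adam's message, here is a rough sketch. The counts are copied from his message. Averaging the two English translations is my assumption (he does not say how he arrived at 4.0 and 2.6), and the 20-hangul and 63-letter label limits are taken from his text as given. All names in the code are made up for illustration.

```python
# Counts for Genesis chapter 1, copied from Adam's message.
COUNTS = {
    "King James": 3167,      # English letters
    "Basic English": 3088,   # English letters
    "Chinese Union": 778,    # ideographs
    "Korean Revised": 1201,  # Hangul syllables
}

# Assumption: use the mean of the two English translations as the baseline.
english_letters = (COUNTS["King James"] + COUNTS["Basic English"]) / 2

# English letters carried by one unit of each script.
letters_per_ideograph = english_letters / COUNTS["Chinese Union"]
letters_per_hangul = english_letters / COUNTS["Korean Revised"]

# Label limits as stated in the message (not derived here).
MAX_KOREAN_LABEL_HANGUL = 20
MAX_ENGLISH_LABEL_LETTERS = 63

# English-letter equivalent of a maximal Korean label, using the rounded ratio.
korean_equivalent = MAX_KOREAN_LABEL_HANGUL * round(letters_per_hangul, 1)

# Fraction by which that falls short of a maximal English label.
korean_shortfall = 1 - korean_equivalent / MAX_ENGLISH_LABEL_LETTERS

if __name__ == "__main__":
    print(f"letters per ideograph: {letters_per_ideograph:.1f}")
    print(f"letters per hangul:    {letters_per_hangul:.1f}")
    print(f"20 hangul ~ {korean_equivalent:.0f} English letters")
    print(f"shortfall vs 63 letters: {korean_shortfall:.0%}")
```

Note that these are letters-per-Hangul ratios measured on translated text. Nothing in this calculation says anything about jamos per syllable.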