Re: [idn] reordering strawpoll
Written by Soobok:
> You may have missed my other answer to Kent, which provides 2.6 as the average
> researched by Adam. My above examples argument stands even with 2.6 jamos.
> 2.6*3 octets vs 2.6 latins == 3 times more space needed. That was my point.
> 3 jamos syllable is just for an example. Okay? :-)
I do see what Adam researched: he found about 2.6 English letters per Hangul
in terms of information content. (He is comparing the length of the Korean translation
of the Bible with the English one.) That is a completely different number from the number
of jamos per syllable. This is getting more confusing than it needs to be, and it would help
if you quoted a bit of the original research you refer to. Like this:
Written by Adam:
> Here are the counts for Genesis chapter 1:
>
> King James: 3167 letters
> Basic English: 3088 letters
> Chinese Union: 778 ideographs
> Korean Revised: 1201 Hangul
>
> references:
> http://www.ccim.org/bible/
> http://bible.wisenet.co.kr/
>
> So it's about 4.0 English letters per Chinese ideograph, and about 2.6
> English letters per Korean Hangul.
>
> Each Korean Hangul takes about 2.9 octets in AMC-ACE-Z, which means a
> maximal Korean domain label (20 hangul) holds about as much information
> as a 52-letter English string, which is about 17% less information than
> a maximal English domain label (63 letters), and about 38% less
> information than a maximal Chinese domain label (19 ideographs).
>
> I now retract this statement:
>
>> Of all the languages I've looked at, Korean is by far the least dense
>> when encoded using AMC-ACE-Z.
>
> In light of the new data, I doubt that Korean is the least dense.
>
> AMC
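
To keep the two numbers apart, here is a rough sketch in Python that computes
them separately. The counts are the ones quoted above. Averaging the King James
and Basic English ratios is only my guess at how "about 2.6" and "about 4.0"
were reached, and the helper names and the sample strings are made up for
illustration. Jamos per syllable is counted by decomposing each precomposed
Hangul syllable with Unicode NFD normalization.

```python
import unicodedata

# Counts quoted above for Genesis chapter 1.
KING_JAMES_LETTERS = 3167
BASIC_ENGLISH_LETTERS = 3088
CHINESE_IDEOGRAPHS = 778
KOREAN_HANGUL = 1201


def letters_per_unit(english_letters, units):
    """English letters needed per ideograph or per Hangul (information density)."""
    return english_letters / units


def mean_letters_per_unit(units):
    """Average of the two English translations (an assumption, see lead-in)."""
    return (letters_per_unit(KING_JAMES_LETTERS, units)
            + letters_per_unit(BASIC_ENGLISH_LETTERS, units)) / 2


def jamos_per_syllable(text):
    """Average number of jamos per precomposed Hangul syllable in text."""
    syllables = [ch for ch in text if 0xAC00 <= ord(ch) <= 0xD7A3]
    if not syllables:
        raise ValueError("text contains no precomposed Hangul syllables")
    # NFD splits each syllable into 2 or 3 conjoining jamos.
    jamo_count = sum(len(unicodedata.normalize("NFD", ch)) for ch in syllables)
    return jamo_count / len(syllables)


per_hangul = mean_letters_per_unit(KOREAN_HANGUL)
per_ideograph = mean_letters_per_unit(CHINESE_IDEOGRAPHS)

# A maximal Korean label of 20 Hangul, expressed in English letters,
# compared with a maximal 63-letter English label.
korean_label_letters = 20 * round(per_hangul, 1)
shortfall = 1 - korean_label_letters / 63

print(f"English letters per Hangul:    {per_hangul:.2f}")
print(f"English letters per ideograph: {per_ideograph:.2f}")
print(f"20 Hangul ~ {korean_label_letters:.0f} English letters, "
      f"{shortfall:.0%} less than a 63-letter label")

# Hypothetical sample strings, only to show the other quantity.
print(f"jamos per syllable in '한글':   {jamos_per_syllable('한글'):.1f}")
print(f"jamos per syllable in '가나다': {jamos_per_syllable('가나다'):.1f}")
```

The first three lines of output concern information content (English letters
per Hangul), while the last two concern how a syllable is built (jamos per
syllable). The second quantity can only ever be 2 or 3 for a single syllable,
whatever the first one turns out to be.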