
Re: [idn] Re: hi



Some Hangul characters are equivalent to only one English vowel,
but most are equivalent to one consonant and one vowel (like
Japanese kana). In addition, many are equivalent to 2 consonants
and 1 vowel, so Hangul is more efficient than kana. Perhaps
2.2 English letters per Hangul character, at a guess? Also
remember that Kanji are used together with Hangul, which tends
to increase that number.
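
A rough way to check a guess like this is to split each precomposed
Hangul syllable into its letters (jamo) using the standard Unicode
arithmetic and average the counts. The sketch below is only an
illustration: the function names are made up here, treating the silent
initial ieung as zero letters is my own assumption, and a jamo count is
at best a proxy for "English letters", not a measurement of information
content.

```python
# Unicode Hangul syllable constants (precomposed block U+AC00..U+D7A3).
S_BASE = 0xAC00
V_COUNT = 21           # medial vowels
T_COUNT = 28           # final consonants, index 0 meaning "no final"
N_COUNT = V_COUNT * T_COUNT
S_COUNT = 19 * N_COUNT  # 19 initial consonants
SILENT_INITIAL = 11     # index of the placeholder initial (ieung)


def sounded_jamo(ch):
    """Count the sounded letters in one precomposed Hangul syllable.

    Assumption for this sketch: the silent initial ieung counts as zero,
    so a syllable such as U+C544 counts as a single vowel.
    """
    index = ord(ch) - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable: %r" % ch)
    initial = index // N_COUNT
    final = index % T_COUNT
    count = 1                       # the vowel is always present
    if initial != SILENT_INITIAL:
        count += 1                  # sounded initial consonant
    if final != 0:
        count += 1                  # final consonant
    return count


def mean_jamo(text):
    """Average sounded-letter count over the Hangul syllables in text."""
    counts = [sounded_jamo(c) for c in text
              if 0 <= ord(c) - S_BASE < S_COUNT]
    return sum(counts) / len(counts) if counts else 0.0


if __name__ == "__main__":
    for sample in ("아", "가", "한", "한국"):
        print(sample, mean_jamo(sample))
```

Running mean_jamo over a real list of names or running text would give
an average to compare against the 1.5 and 2.2 figures. For the label
arithmetic in the quoted message, 20 Hangul at 1.5 letters each is 30
letters, and at 2.2 it would be 44.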

Bruce

----- Original Message ----- 
From: "Martin Duerst" <duerst@w3.org>
To: "Adam M. Costello" <amc+5nb0vi@nicemice.net>; "IETF idn working group" <idn@ops.ietf.org>
Sent: Monday, October 22, 2001 12:01 PM
Subject: Re: [idn] Re: hi


> Hello Adam,
> 
> At 18:31 01/10/20 -0700, Adam M. Costello wrote:
> 
> >The situation is much worse for Korean.  I think each Hangul character
> >carries the information of only about 1.5 English letters,
> 
> It may be lower than Chinese, but I'm very surprised it should
> be that low. Any pointers to sources? Are they for running text,
> or for names? For running text, Korean uses spaces, but Chinese
> doesn't, so that already could explain quite a bit of the
> difference.
> 
> 
> >but still
> >takes about 2.9 octets in AMC-ACE-Z, which means a maximal Korean domain
> >label (20 hangul) holds about as much information as a 30-letter English
> >string.  Of all the languages I've looked at, Korean is by far the least
> >dense when encoded using AMC-ACE-Z.
> 
> Even if this were true, a 30-letter limit in English would still be
> nothing really bad in actual practice.
> 
> Regards,   Martin.
> 
>