
Re: [idn] Report from the ACE design team



This is a problem of designing a uniform encoding for all the code points
in UNICODE.  Some languages need 26x1 as in ASCII, some need 26x2 as in
English, and some need 26x3 as in Latin.  Chinese needs 26x10, not because
it can cover only 260 particles, but because of the phonetic element
description of about 1200 sounds of "Mandarin" Chinese.
When different Chinese characters are grouped together, they form a
"normally" unique identification that is good enough for a historically
accepted phrase or idiom, and by extrapolation good enough for a
registered name.  A phrase semantically equivalent to "Hitting two birds
with one stone" is 4 characters in Chinese.  The English is 27 user-readable
octets, not counting spaces.  As a registered domain name using StepCode
this would be "yishiliangniao1233", 18 user-readable octets.
In UNICODE this is 16 octets, and the same in DUDE.
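
The three counts above can be checked with a short Python sketch. It
assumes that the figure of 27 leaves out the spaces, and that the 16
octets come from 4 characters at 4 octets each, which is the
per-character cost this message attributes to DUDE. The variable names
are only illustrative.

```python
# Octet counts for the example phrase, under the assumptions stated above.
english = "Hitting two birds with one stone"
stepcode_label = "yishiliangniao1233"  # label exactly as given in the message

# Letters only: the five spaces are left out of the count.
english_octets = len(english.replace(" ", ""))

# Every character in the label is ASCII, so each one is a single octet.
stepcode_octets = len(stepcode_label.encode("ascii"))

# Assumption: 4 characters in the idiom, 4 octets per character.
CHARS_IN_IDIOM = 4
OCTETS_PER_CHAR = 4
dude_octets = CHARS_IN_IDIOM * OCTETS_PER_CHAR

print(english_octets, stepcode_octets, dude_octets)
```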

I don't know in detail how the Japanese phonetic system is defined. But
at least the information density of each symbol differs from that of the
26 letters, and Hiragana and Katakana together need only 26x8
code points.  So for frequently used Japanese characters, 2 octets may
well be enough.  Now DUDE forces Japanese to use 4 octets for each
character. How can you let that go when your principal design concern
is compression rate?

Liana



On Mon, 25 Jun 2001 13:36:34 -0700 Paul Hoffman / IMC <phoffman@imc.org>
writes:
> At 1:32 AM +0900 6/26/01, Makoto Ishisone wrote:
> >The following points are my main concern:
> >
> >1) Is 14-15 character is enough?
> >    At least for Japanese domain names, name of a company or an
> >    organization is sometimes quite long.  My question is whether the
> >    maximum of 14-15 character name for CJK is enough or not.
> 
> There is an assumption in those statements, which is that a Japanese
> or Chinese company with a very long name would need a DNS name that
> is as long as their very long name. As we have discussed on this list
> many times, companies and organizations with "difficult" names (such
> as very long or hard to remember) often would prefer to use shorter
> domain names that are more useful to end users. These might be
> abbreviations or pseudonyms, but the result is the same: a domain
> name that end users can transcribe more easily. So, the question of
> maximum 15 characters is not "are there names that are this long"
> because certainly there are (just as there are European company names
> that are too long for any ACE encoding), but "are there names that
> this long that people would actually want end users to enter". This,
> of course, is a judgement call.
> 
> >2) Potential migration problem
> 
> I agree with the others who have said that this should not be an
> issue. If we try to balance with RACE, we must also balance with all
> the other pre-standard encodings, some of which can handle CJK names
> that are longer than any ACE could possibly achieve.
> 
> --Paul Hoffman, Director
> --Internet Mail Consortium
>