
Re: [idn] Question for the Kanji & Hanja cognosentee



This is the core of the problem.  People want to use their
own script for names, and there will be character conflicts,
just as in any symbol system.  This is the feature that
internationalization wanted, isn't that correct?   Let us
use your example for the solution and study the trade-offs.

1. There should be:

>   1) in Korea:  kuk
>   2) in Japan:  kuni, kok, ...
>   3) in China:  kuo, ....

> If you designate a unique romanization sequence in 3) for han
> 'nation',
>   1) and 2) and even someone in 3) will be unhappy with it,

So we shall designate three mapping tables!  And remember,
these are character codes, not word codes.  When you do
registration, the input is "zhongguo" <china>, and the registration
system should check whether that Unicode string has already
been taken by a Chinese registrant.  But this does not
exclude a Korean name such as <KoreanChinaTrading>,
since the Chinese convention for this is
<ChinaKoreanTrading>.  So you will have
>   1) in Korea:  kuk
>   2) in Japan:  kuni, kok, ...
>   3) in China:  kuo, ....
and Unicode <KoreanChinaTrading>.

The naming conventions are quite different among Chinese,
Japanese and Korean names using Kanji.  Similarly, in the US we
can look at surnames to tell people's origins, and it is
more than 95% correct.  The registration process is there to
catch the remaining, say, 5% of name conflicts.
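
A minimal sketch of such a registration-time check, in Python.  The
Registry class, the registrant names and the sample labels are all
hypothetical, and the NFC normalization step is my own assumption,
not something any draft requires:

```python
import unicodedata


class Registry:
    """Toy registry: a label is refused if its Unicode string is taken."""

    def __init__(self):
        self._owners = {}  # normalized label -> registrant

    def register(self, label, owner):
        # Assumption: compare labels after NFC normalization so that two
        # encodings of the same text count as the same string.
        key = unicodedata.normalize("NFC", label)
        if key in self._owners:
            return False  # conflict caught at registration time
        self._owners[key] = owner
        return True


registry = Registry()
# Hypothetical labels: <china>, then a longer Korean-style trading name.
first = registry.register("\u4e2d\u570b", "registrant-A")
second = registry.register("\u4e2d\u570b", "registrant-B")
third = registry.register("\u97d3\u4e2d\u8cbf\u6613", "registrant-C")
print(first, second, third)
```

The second attempt on the same string is refused, while the longer
name that merely contains some of the same characters is accepted.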

The three mapping tables are similar to Latin upper-to-lower
case folding, only larger. So the Korean table has five columns:

Unicode    Unicode    KSC code    KSC code    StepCoded
Hangul     Hanja      Hanja       Hangul      Romaji

It is certain that the Hanja list will be longer than the Hangul
list.  So there will be cases of 2 Hangul to 1 Hanja, and your IDNA,
not the [nameprep], can decide which one to use.  The StepCode is
the concatenation of Romaji+digit, and the table's author has to
make it one-to-one within this kro- table.
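
As a rough sketch of the Romaji+digit idea, in Python: the function
name, the numbering rule (count up in table order) and the sample rows
are my own assumptions for illustration, not a proposed table.

```python
from collections import defaultdict


def build_stepcode_table(rows):
    """Give each character a unique code of the form romaji + number.

    rows is a list of (character, romaji) pairs in table order.
    Returns two dicts: character -> code and code -> character.
    """
    counters = defaultdict(int)  # romaji -> how many characters so far
    to_code = {}
    to_char = {}
    for char, romaji in rows:
        if char in to_code:
            raise ValueError("character listed twice: %r" % char)
        counters[romaji] += 1
        code = "%s%d" % (romaji, counters[romaji])
        to_code[char] = code
        to_char[code] = char
    return to_code, to_char


# Hypothetical sample rows: three Hanja read "kuk" and two read "han".
sample_rows = [
    ("\u570b", "kuk"),  # 'nation'
    ("\u5c40", "kuk"),
    ("\u83ca", "kuk"),
    ("\u97d3", "han"),
    ("\u6f22", "han"),
]
to_code, to_char = build_stepcode_table(sample_rows)
print(to_code["\u570b"], to_code["\u83ca"], to_code["\u6f22"])
```

Because each code is issued once and each character appears once, the
mapping can be reversed, which is the one-to-one property the table
needs.  The homophones differ only in the trailing number.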

Liana

On Fri, 17 Aug 2001 17:53:58 +0900 "Soobok Lee" <lsb@postel.co.kr>
writes:
> For example,
>  Han letter 'nation' is pronounced :
>   1) in Korea:  kuk
>   2) in Japan:  kuni, kok, ...
>   3) in China:  kuo, ....
> If you designate a unique romanization sequence in 3) for han
> 'nation',
>   1) and 2) and even someone in 3) will be unhappy with it,
>    saying "looks like a RACE label!",
>   since romanization loses its merit: written as it is pronounced!!!
> 
> Soobok
>   
> 
> ----- Original Message ----- 
> From: <liana.ydisg@juno.com>
> To: <lsb@postel.co.kr>
> Cc: <liana.ydisg@juno.com>; <bthomson@fm-net.ne.jp>; 
> <idn@ops.ietf.org>
> Sent: Friday, August 17, 2001 6:06 PM
> Subject: Re: [idn] Question for the Kanji & Hanja cognosentee
> 
> 
> > If you use another feature of the character, not based on
> > sound, in addition to pronunciation, and fix it in your
> > case folding table, then you will have a one-to-one
> > mapping, and the language/semantic context is kept out
> > of the table.
> > 
> > Liana
> > 
> > On Fri, 17 Aug 2001 17:23:45 +0900 "Soobok Lee" <lsb@postel.co.kr>
> > writes:
> > > 
> > > ----- Original Message ----- 
> > > From: <liana.ydisg@juno.com>
> > > To: <lsb@postel.co.kr>
> > > Cc: <bthomson@fm-net.ne.jp>; <idn@ops.ietf.org>
> > > Sent: Friday, August 17, 2001 4:54 PM
> > > Subject: Re: [idn] Question for the Kanji & Hanja cognosentee
> > > 
> > > 
> > > > If Hangul is mapped to Latin letters like Romaji, and a
> > > > number is then added to select one Kanji among a few
> > > > homophones, would this be good enough to identify a Hanja
> > > > name in DNS?
> > > 
> > > Some hangul trailing jamos,
> > > for example di-geuth, hi-euth and ti-euth,
> > > have the same sound, while their leading jamos
> > > have different sounds. You need some differentiating
> > > representation of trailing hangul jamos when romanizing hangeul,
> > > and that may cause some overhead...
> > > 
> > > Even a Hanja/Kanji/TC/SC letter often has multiple pronunciations
> > > in different words, and so multiple romanizations for a hanja
> > > letter are possible!!
> > > 
> > > IMHO, pronunciation-based romanization of Hanja/Kanji/TC/SC
> > > should be performed in word/language context
> > > (not in individual Unicode code point context), but that is not
> > > achievable in DNS,
> > > which may have no language/script context (in .com) and often has
> > > no word semantics in a label (a single han letter label).
> > > 
> > > 
> > > Soobok
> > > 
> > > > 
> > > > The same question goes to Bruce Thomson:
> > > > Can Romaji be reverted back to a Kanji-Kana sequence at a
> > > > near 100% rate (with or without case endings)?
> > > > 
> > > > Liana
> > > > 
> > > > On Fri, 17 Aug 2001 16:14:04 +0900 "Soobok Lee" <lsb@postel.co.kr>
> > > > writes:
> > > > > 
> > > > > ----- Original Message ----- 
> > > > > From: <liana.ydisg@juno.com>
> > > > > To: <lsb@postel.co.kr>
> > > > > Cc: <liana.ydisg@juno.com>; <bthomson@fm-net.ne.jp>; 
> > > > > <idn@ops.ietf.org>
> > > > > Sent: Friday, August 17, 2001 4:08 PM
> > > > > Subject: Re: [idn] Question for the Kanji & Hanja cognosentee
> > > > > 
> > > > > 
> > > > > > That is correct, there will be no disambiguation in
> > > > > > DNS for anyone.  It has to be resolved at registration
> > > > > > time.  Then do you need Hanja in domain names at all?
> > > > > 
> > > > > Yes, but rarely.
> > > > > Some Japanese/Chinese restaurants in Seoul, Korea
> > > > > have their primary name in Hanja (Kanji).
> > > > > Most Korean individuals/companies won't pay for
> > > > > rarely used Hanja domains, I guess.
> > > > > 
> > > > > > Why? If Hanja names are only used for Chinese and Japanese,
> > > > > > then how are Korean people distinguished from each other?
> > > > > > Are there many people with the same Hangul names?
> > > > > 
> > > > > Most Koreans have TC-form full names. Many Korean
> > > > > businesses do, too. But they are not used as frequently
> > > > > as the hangul ones.
> > > > > 
> > > > > 
> > > > > In my rough estimation, the 5000 most frequent hangul personal
> > > > > full names form the set of distinct full names of about 90%
> > > > > of the Korean population.
> > > > > 
> > > > > The South Korean population reached 47,000,000 recently.
> > > > > 
> > > > > > 
> > > > > > I have heard of a lawsuit here between two Vietnamese
> > > > > > people in the San Francisco area, in which both
> > > > > > sides of the case and a witness all had
> > > > > > exactly the same name!  And they all needed interpreters too.
> > > > > > Imagine the headaches for the lawyers!
> > > > > > 
> > > > > 
> > > > > :-))
> > > > > 
> > > > > Soobok
> > > > > 
> > > > > > Liana 
> > > > > > 
> > > > > > 
> > > > > > On Fri, 17 Aug 2001 15:06:01 +0900 "Soobok Lee" <lsb@postel.co.kr>
> > > > > > writes:
> > > > > > > Hi, Liana
> > > > > > > 
> > > > > > > ----- Original Message ----- 
> > > > > > > From: <liana.ydisg@juno.com>
> > > > > > > > What happens when people read newspapers in Hangul
> > > > > > > > without Hanja, as is the case in North Korea?
> > > > > > > > How do you get a Hanja from hangul if it is a one-to-many
> > > > > > > > correspondence?
> > > > > > > > 
> > > > > > > Koreans are familiar with many hangeul homonyms that
> > > > > > > share the same hangeul word but have different TC forms/meanings
> > > > > > > and optionally different sounds (long or short vowel etc.).
> > > > > > > Ordinary Koreans can disambiguate them only by the surrounding
> > > > > > > semantic context (sentence or paragraph) in which they appear.
> > > > > > > 
> > > > > > > In DNS, we have no such contextual clue for disambiguation.
> > > > > > > 
> > > > > > > Soobok
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
>