
Re: [idn] NFC vs NFKC



This is not an issue.

1. You still have never shown a need for compatibility jamo. I have
requested real examples on several occasions, but they never showed up. Even
more useful would be hard data. In how many of the Korean domain names that
have been registered in testbeds are compatibility jamo an issue? That is,
in how many cases does a name contain an initial compatibility jamo
followed *immediately* by a medial compatibility jamo?

2. It is a trivial exercise to map sequences of conjoining jamo (with
fillers) back to compatibility jamo when converting Unicode to KS C 5601
(e.g. for display in a web browser). If you really want, I'm sure that we
can get the UTC to include an explicit statement of how to do this in the
next version of the Unicode Standard, 3.2, due 2002Q1.
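
A minimal sketch of such a reverse mapping, in Python. It is not taken from
any UTC text. It assumes the filler convention from the mapping-table
examples quoted further down this thread (initial + medial filler, initial
filler + medial), plus one assumed extension for isolated finals (initial
filler + medial filler + final). The names TO_COMPAT and to_compat_jamo are
hypothetical.

```python
import unicodedata

CHOSEONG_FILLER = "\u115F"   # initial (leading consonant) filler
JUNGSEONG_FILLER = "\u1160"  # medial (vowel) filler

# Invert the compatibility decompositions of U+3131..U+318E.
# U+3164 HANGUL FILLER is skipped because it maps to a filler, not a letter.
TO_COMPAT = {}
for cp in range(0x3131, 0x318F):
    if cp == 0x3164:
        continue
    conjoining = unicodedata.normalize("NFKD", chr(cp))
    if len(conjoining) == 1:
        TO_COMPAT.setdefault(conjoining, chr(cp))

def is_initial(c):
    return "\u1100" <= c <= "\u115E"

def is_medial(c):
    return "\u1161" <= c <= "\u11A7"

def is_final(c):
    return "\u11A8" <= c <= "\u11FF"

def to_compat_jamo(text):
    """Replace filler-padded isolated jamo with compatibility jamo."""
    out = []
    i = 0
    n = len(text)
    while i < n:
        c = text[i]
        nxt = text[i + 1] if i + 1 < n else ""
        nxt2 = text[i + 2] if i + 2 < n else ""
        if (is_initial(c) and nxt == JUNGSEONG_FILLER
                and not is_final(nxt2) and c in TO_COMPAT):
            # initial + medial filler, with no final following
            out.append(TO_COMPAT[c])
            i += 2
        elif c == CHOSEONG_FILLER and is_medial(nxt) and nxt in TO_COMPAT:
            # initial filler + medial
            out.append(TO_COMPAT[nxt])
            i += 2
        elif (c == CHOSEONG_FILLER and nxt == JUNGSEONG_FILLER
                and is_final(nxt2) and nxt2 in TO_COMPAT):
            # initial filler + medial filler + final (assumed convention)
            out.append(TO_COMPAT[nxt2])
            i += 3
        else:
            # precomposed syllables and everything else pass through
            out.append(c)
            i += 1
    return "".join(out)
```

The result contains only compatibility jamo and whatever else was already
in the string, so it could then be passed to an ordinary KS C 5601
converter (for example Python's "euc_kr" codec).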

Mark
—————

Δός μοι ποῦ στῶ, καὶ κινῶ τὴν γῆν — Ἀρχιμήδης
[http://www.macchiato.com]

----- Original Message -----
From: "Soobok Lee" <lsb@postel.co.kr>
To: "Soobok Lee" <lsb@postel.co.kr>; "Mark Davis" <mark@macchiato.com>;
"Yves Arrouye" <yves@realnames.com>; <idn@ops.ietf.org>; "Martin Duerst"
<duerst@w3.org>
Sent: Sunday, October 28, 2001 17:48
Subject: Re: [idn] NFC vs NFKC


Hi, Mark and Martin,

We still have the round-trip conversion problem in mapping compatibility
jamo into a conjoining jamo + filler sequence.

Current UTC standards do not specify how to convert a conjoining jamo +
filler sequence back into compatibility jamo, which can be mapped 1:1 to
KS C 5601 isolated jamo for localized rendering platforms. Many Korean
mobile devices and applications support only KS C 5601.

For proper rendering of isolated jamo in Unicode, the mapping from a
conjoining jamo + filler sequence into compatibility jamo should be defined
somewhere in the UTC standards. That is missing now, so rendering engine
vendors have no authoritative reference for it.

Without that, Mark's suggestion will fail to work on most Korean commercial
platforms, both now and in the foreseeable *FUTURE*.
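
For reference, the forward half of that round trip can be sketched in
Python as follows. This is only an illustration, not nameprep text: the
function name fold_compat_jamo is hypothetical, and padding isolated finals
with both fillers is an assumption. It compares plain NFKC, which fuses an
initial followed immediately by a medial into a syllable, with filler
folding applied before NFKC.

```python
import unicodedata

CHOSEONG_FILLER = "\u115F"   # initial filler
JUNGSEONG_FILLER = "\u1160"  # medial filler

def fold_compat_jamo(text):
    """Fold compatibility jamo to conjoining jamo padded with fillers."""
    out = []
    for ch in text:
        if "\u3131" <= ch <= "\u318E" and ch != "\u3164":
            jamo = unicodedata.normalize("NFKD", ch)
            if "\u1100" <= jamo <= "\u115E":
                # initial: append the medial filler
                out.append(jamo + JUNGSEONG_FILLER)
            elif "\u1161" <= jamo <= "\u11A7":
                # medial: prepend the initial filler
                out.append(CHOSEONG_FILLER + jamo)
            else:
                # final: prepend both fillers (assumed convention)
                out.append(CHOSEONG_FILLER + JUNGSEONG_FILLER + jamo)
        else:
            out.append(ch)
    return "".join(out)

# U+3131 HANGUL LETTER KIYEOK followed immediately by U+314F HANGUL LETTER A
name = "\u3131\u314F"

# Plain NFKC composes the pair into the syllable U+AC00.
plain = unicodedata.normalize("NFKC", name)

# With fillers inserted first, NFKC leaves the two jamo separate.
folded = unicodedata.normalize("NFKC", fold_compat_jamo(name))

print([hex(ord(c)) for c in plain])
print([hex(ord(c)) for c in folded])
```

Whether a given platform can render or convert the folded form is exactly
the open question raised above; the sketch only shows that the folded
sequence survives normalization without forming a syllable.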

Soobok Lee


----- Original Message -----
From: "Soobok Lee" <lsb@postel.co.kr>
To: "Mark Davis" <mark@macchiato.com>; "Yves Arrouye" <yves@realnames.com>;
<idn@ops.ietf.org>; "Martin Duerst" <duerst@w3.org>
Sent: Friday, October 26, 2001 2:09 AM
Subject: Re: [idn] NFC vs NFKC


>
> Not bad for the future. But as for now, most current rendering platforms
> (on Windows and Unix/Linux) would fail to render filler sequences and
> would display fillers as "white spaces".
> That would cause confusion and misinterpretation for a long time, until
> all user platforms are upgraded to support fillers.
>
> Soobok Lee
>
> ----- Original Message -----
> From: "Mark Davis" <mark@macchiato.com>
> To: "Soobok Lee" <lsb@postel.co.kr>; "Yves Arrouye" <yves@realnames.com>;
> <idn@ops.ietf.org>; "Martin Duerst" <duerst@w3.org>
> Sent: Friday, October 26, 2001 1:46 AM
> Subject: Re: [idn] NFC vs NFKC
>
>
> > Either deletion, prohibition or producing fillers can be accomplished, if
> > desired, by adding mappings to "D. Mapping Tables" in
> > http://www.ietf.org/internet-drafts/draft-ietf-idn-nameprep-06.txt. And this
> > is *without* having to change anything else in nameprep. For example, for
> > U+3131 HANGUL LETTER KIYEOK here are the three options:
> >
> > Deletion:
> > 3131; ; Deletion
> >
> > Prohibition:
> > 3131; 0000; Prohibition
> >
> > Additional folding:
> > 3131; 1100 1160; Additional Folding
> >
> > In the last case, the medial filler is added. It is sufficient to do this
> > only for the initials, since that prevents any syllable formation. However,
> > if one wanted, the initial filler could be added before medials, such as for
> > U+314F HANGUL LETTER A below (or initial and medial fillers could be added
> > before finals).
> >
> > Additional folding:
> > 314F; 115F 1161; Additional Folding
> >
> > Mark
> >
> > —————
> >
> > Δός μοι ποῦ στῶ, καὶ κινῶ τὴν γῆν — Ἀρχιμήδης
> > [http://www.macchiato.com]
> >
> > ----- Original Message -----
> > From: "Soobok Lee" <lsb@postel.co.kr>
> > To: "Yves Arrouye" <yves@realnames.com>; <idn@ops.ietf.org>; "Martin Duerst"
> > <duerst@w3.org>
> > Sent: Thursday, October 25, 2001 09:04
> > Subject: Re: [idn] NFC vs NFKC
> >
> >
> >
> > ----- Original Message -----
> > From: "Martin Duerst" <duerst@w3.org>
> > >
> > > Hangul domain names in KS C 5601 would normally be encoded as
> > > precombined syllables. KS C 5601 provides about 2300 of them,
> > > the ones most often used. Users who want to use other Hangul
> > > syllables won't use non-conjoining Jamo (compatibility Jamo),
> > > because they will be displayed one Jamo at a time.
> > >
> > > In the nameprep design team, we discussed some cases where users
> > > might want to use sequences of independent consonant Jamo. There
> > > are a few web pages with some examples of these, but not too many.
> > > We are not sure whether they should be allowed in domain names or
> > > not, and if they are, how they should be represented.
> > >
> >
> > Current Windows 98 and above support only compatibility jamo, and do
> > not support the "conjoining jamo + filler" sequence for isolated jamo,
> > which are often found in informal texts and business names.
> > There is no display and no input method for "conjoining jamo + filler"
> > in the Windows O/S.
> >
> > Compatibility jamo became the de facto standard for isolated jamo
> > inadvertently, contrary to the intent of the ISO Korean representatives
> > and UTC members.
> >
> > NFKC maps compatibility jamo to conjoining jamo withOUT fillers.
> >
> > Nameprep's position on NFKC's problematic handling of compatibility jamo
> > should be clarified. There are two alternatives:
> >
> >   1) NFC is used instead
> >   2) NFKC is used, but bypasses all characters from the compatibility
> >      jamo block
> >
> > Soobok Lee
> >
> >
> > >
> > > >[Edit] Is that because nobody uses these anymore, and so you would not
> > > >expect a modern invention like IDN to appear in an anachronistic way? Is
> > > >that the case with all LEGACY characters?
> > >
> > > I don't want IDN to carry unnecessary legacy compatibility
> > > baggage. If it turns out that something really helps the user
> > > (e.g. for the full-width Latin letters), I have nothing against
> > > mapping them. But I don't think it's a good idea to map
> > > 3000 characters wholesale, most of them not really used and very
> > > difficult to type in, just to get a few dozen mapped the right way.
> > > We can easily do the latter without having to do the former.
> > >
> > >
> > > Regards,   Martin.
> > >
> >
> >
>
>