
Re: [idn] complexity/simplicity: NAMEPREP code vs ACE codes

To: "Makoto Ishisone" <ishisone@sra.co.jp>
Subject: Re: [idn] complexity/simplicity: NAMEPREP code vs ACE codes
From: "Soobok Lee" <lsb@postel.co.kr>
Date: Fri, 29 Jun 2001 00:25:25 +0900
Cc: <idn@ops.ietf.org>
Delivery-date: Thu, 28 Jun 2001 08:20:45 -0700
Envelope-to: idn-data@psg.com

You are right.
KC normalization (NFKC) is hard to learn and to implement from scratch.
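
As a small illustration of what KC normalization has to handle, here is a sketch that uses Python's standard unicodedata module rather than MDNkit's C code. The helper name nfkc is only for illustration; the point is that compatibility mappings, canonical composition, and Hangul composition are all hidden behind one call, and each of them needs its own tables or rules.

```python
import unicodedata


def nfkc(text):
    """Return text in Unicode Normalization Form KC."""
    return unicodedata.normalize("NFKC", text)


# Each pair is (input, description); all inputs change under NFKC.
samples = [
    ("\ufb01", "ligature fi -> two letters 'f' 'i'"),
    ("\uff21", "fullwidth A -> ASCII 'A'"),
    ("e\u0301", "e + combining acute -> precomposed U+00E9"),
    ("\u1112\u1161\u11ab", "Hangul jamo sequence -> one syllable U+D55C"),
]

for raw, description in samples:
    normalized = nfkc(raw)
    codepoints = " ".join("U+%04X" % ord(ch) for ch in normalized)
    print(description, "=>", codepoints)
    # Normalizing twice must give the same result as normalizing once.
    assert nfkc(normalized) == normalized
```

Even this short list touches three different mechanisms, which may be one reason a from-scratch implementation is hard to test.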

Here are more KC-norm-related sources in MDNkit:
[root@bora lib]# wc *norm*c
    632    1975   16581 normalizer.c
    459    1710   12201 unormalize.c ( not related to KC norm ???)
   1091    3685   28782 total

If all of you think that even huge mapping tables do not add complexity,
then my 'reorder_by_char_frequency-before-encode' idea
adds no complexity to DUDE either, since it adds only simple
mapping functions and tables.  That's not bad.    :-)
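
A minimal sketch of such a mapping step, under my own simplifying assumptions and not the exact tables from the draft: the frequent characters are permuted among their own code points, so that the most frequent one gets the smallest code point of the set, and everything else passes through unchanged. The names build_tables and remap are hypothetical.

```python
def build_tables(chars_by_frequency):
    """Build forward and inverse remapping tables.

    chars_by_frequency: distinct characters, most frequent first.
    The i-th most frequent character is mapped to the i-th smallest
    code point of the same set, so the mapping is a permutation and
    can always be undone.
    """
    slots = sorted(chars_by_frequency)
    forward = dict(zip(chars_by_frequency, slots))
    inverse = dict(zip(slots, chars_by_frequency))
    return forward, inverse


def remap(text, table):
    """Apply a table; characters not in the table are left unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


# Toy frequency list (an assumption for illustration, not measured data).
frequent = ["\ud55c", "\uac00", "\ub098"]
forward, inverse = build_tables(frequent)

label = "\ud55c\uac00-x"
reordered = remap(label, forward)   # run before the ACE encoder
restored = remap(reordered, inverse)  # run after the ACE decoder
assert restored == label
```

The encoder and decoder themselves stay untouched; only a table lookup is added on each side.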

http://www.postel.co.kr/idn-lsb-00.txt
(I am now adding SC/TW support to this. I measured a 15%~20% improvement
after adding tables for the 2048 most frequent han syllables.)

Soobok Lee




----- Original Message ----- 
From: "Makoto Ishisone" <ishisone@sra.co.jp>
To: <lsb@postel.co.kr>
Cc: <idn@ops.ietf.org>
Sent: Friday, June 29, 2001 12:01 AM
Subject: Re: [idn] complexity/simplicity: NAMEPREP code vs ACE codes


> In message <001e01c0ffd0$1a717ee0$ed1bd9d2@postel.co.kr>,
> "Soobok Lee" <lsb@postel.co.kr> wrote:
> > For whom had never looked into NAMEPREP codes in MDNkit of JPNIC,
> >  ...
> > [root@bora lib]# wc name*[hc] uni*[hc]
> >     296    1109    8554 nameprep.c
> >     136     804    5475 nameprep_template.c
> >    1694   11778   73804 nameprepdata.c
> >     484    1822   12314 unicode.c
> >    6806   38573  327222 unicodedata.c
> >    9416   54086  427369 total
> 
> If you look closer, you'll find that nameprepdata.c and unicodedata.c
> above contain only data -- some large tables, which are generated from
> NAMEPREP draft and Unicode Character Database.  So I don't think it is
> fair to count them when you compare complexity.  On the other hand
> you overlooked unormalize.c, which implements Unicode Normalization
> Forms.
> 
> Anyway I agree that NAMEPREP (NFKC in particular) is no simpler than
> most of the proposed ACEs.  Before implementing NFKC you have to read
> the specification, which is longer than any ACE I-Ds, and relevant
> documents, understand what's going on, and generate tables from the
> data...  Also I think it is harder to test the correctness of the
> implementation.
> 
> -- ishisone@sra.co.jp
>
