
[idn] New revision 1.0 of my I-D on REORDERING



Today, revision 1.0 of my I-D on REORDERING appeared
in the IDN WG directory.
http://www.ietf.org/internet-drafts/draft-ietf-idn-lsb-ace-01.txt
This link has the same URL as that of version 0.9 but a
different title and content (due to a mix-up between me and the IETF).
Please open this link and press the "Refresh" button in your browser
to fetch the new draft.

Next week, I will submit another new draft, version 2.0,
with new experiments with
AMC-ACE-Z and other ACEs. It will support
Hindi, Hiragana, Russian, and Arabic
in addition to Han, Hangul, Katakana, and Latin.

Regards,

Soobok Lee


----- Original Message ----- 
From: "Soobok Lee" <lsb@postel.co.kr>
To: "idn working group" <idn@ops.ietf.org>
Sent: Friday, July 13, 2001 7:09 AM
Subject: call for support for REORDERING


> 
> My I-D on "Improving ACE using code point reordering"
> is looking forward to your support. It has no contenders;
> it only needs your interest.
> 
> It helps bring the mean ACE label lengths of AMC-ACE-Z/DUDE
> close to that of UCS-2 (2*n) for typical Han/Hangul IDNs of
> 7 or more letters, and it gives better compression for
> other scripts, too.
> 
> To express your interest in REORDERING (which does not necessarily
> mean you support all aspects of my I-D version 1.0),
> do not hesitate to send mail to me at
> <lsb@postel.co.kr>.
> 
> You can find my recent I-D here (soon to be revised on the IDN WG home page):
> http://www.postel.co.kr/lsb-ace-01.txt   (version 1.0)
> 
> Soobok Lee
> 
> =========================================================
> The following attachment shows the amount of
> improvement achieved for large ML.com samples.
> 
> 
> For Chinese ML.com samples.
> 
> N:            length of a domain label (# of code points)
> FREQ:         number of domains of length N
> N*FREQ:       total # of code points in domains of length N
> SUM OF AMCZ:  sum of lengths of AMCZ labels
> X:            SUM OF AMCZ / (N*FREQ)
> SUM OF LAMCZ: sum of lengths of LAMCZ labels
> Y:            SUM OF LAMCZ / (N*FREQ)
> COMP:         (SUM OF AMCZ - SUM OF LAMCZ) / SUM OF AMCZ * 100
> 
> |  N|    FREQ|    N*FREQ|  SUM OF AMCZ(X)| SUM OF LAMCZ(Y)| COMP|
> 
> |  1|    4642|      4642|     15804(3.40)|     14807(3.19)|6.31|
> |  2|   59708|    119416|    401549(3.36)|    352022(2.95)|12.33|
> |  3|   49471|    148413|    484456(3.26)|    415104(2.80)|14.32|
> |  4|   99402|    397608|   1269398(3.19)|   1034646(2.60)|18.49|
> |  5|   29974|    149870|    467070(3.12)|    381651(2.55)|18.29|
> |  6|   20809|    124854|    384013(3.08)|    304635(2.44)|20.67|
> |  7|    8860|     62020|    186347(3.00)|    147111(2.37)|21.06|
> |  8|    5251|     42008|    124325(2.96)|     97303(2.32)|21.73|
> |  9|    2666|     23994|     69234(2.89)|     54697(2.28)|21.00|
> | 10|    2008|     20080|     57887(2.88)|     44270(2.20)|23.52|
> | 11|     859|      9449|     26914(2.85)|     20836(2.21)|22.58|
> | 12|     671|      8052|     22819(2.83)|     17294(2.15)|24.21|
> | 13|     346|      4498|     12217(2.72)|      9581(2.13)|21.58|
> | 14|     235|      3290|      9084(2.76)|      6933(2.11)|23.68|
> | 15|     117|      1755|      4723(2.69)|      3721(2.12)|21.22|
> | 16|      68|      1088|      2884(2.65)|      2258(2.08)|21.71|
> | 17|      21|       357|       911(2.55)|       704(1.97)|22.72|
> 
> |   |  285108|   1121394|   3539635(3.16)|   2907573(2.59)|17.86|
> 
> 
> 
> 
> 
> For Korean ML.com samples.
> 
> N:            length of a domain label (# of code points)
> FREQ:         number of domains of length N
> N*FREQ:       total # of code points in domains of length N
> SUM OF AMCZ:  sum of lengths of AMCZ labels
> X:            SUM OF AMCZ / (N*FREQ)
> SUM OF LAMCZ: sum of lengths of LAMCZ labels
> Y:            SUM OF LAMCZ / (N*FREQ)
> COMP:         (SUM OF AMCZ - SUM OF LAMCZ) / SUM OF AMCZ * 100
> 
> |  N|    FREQ|    N*FREQ|  SUM OF AMCZ(X)| SUM OF LAMCZ(Y)| COMP|
> 
> |  1|    1941|      1941|      7764(4.00)|      7764(4.00)|0.00|
> |  2|   16978|     33956|    123248(3.63)|    105628(3.11)|14.30|
> |  3|   38852|    116556|    394410(3.38)|    322373(2.77)|18.26|
> |  4|   61642|    246568|    803121(3.26)|    625970(2.54)|22.06|
> |  5|   40375|    201875|    639079(3.17)|    483118(2.39)|24.40|
> |  6|   24561|    147366|    458978(3.11)|    337398(2.29)|26.49|
> |  7|   13034|     91238|    280346(3.07)|    203406(2.23)|27.44|
> |  8|    5596|     44768|    136452(3.05)|     97248(2.17)|28.73|
> |  9|    2421|     21789|     65504(3.01)|     46536(2.14)|28.96|
> | 10|    1033|     10330|     29964(2.90)|     21330(2.06)|28.81|
> | 11|     427|      4697|     13845(2.95)|      9739(2.07)|29.66|
> | 12|     173|      2076|      5905(2.84)|      4261(2.05)|27.84|
> | 13|      96|      1248|      3588(2.88)|      2539(2.03)|29.24|
> | 14|      32|       448|      1331(2.97)|       921(2.06)|30.80|
> | 15|      22|       330|       927(2.81)|       675(2.05)|27.18|
> | 16|      15|       240|       606(2.52)|       471(1.96)|22.28|
> | 17|       8|       136|       378(2.78)|       267(1.96)|29.37|
> | 19|       1|        19|        26(1.37)|        26(1.37)|0.00|
> 
> |   |  207207|    925581|   2965472(3.20)|   2269670(2.45)|23.46|
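
The X, Y and COMP columns in the two tables above can be recomputed from the raw sums. The following is a minimal sketch in Python, not part of the original measurements: the function name `label_stats` and its parameter names are hypothetical, and it assumes that COMP is the percentage reduction of the reordered (LAMCZ) total relative to the plain (AMCZ) total, which is what the tabulated values correspond to.

```python
def label_stats(n, freq, sum_base, sum_reordered):
    """Recompute the X, Y and COMP columns for one table row.

    n             -- label length in code points (column N)
    freq          -- number of domains of that length (column FREQ)
    sum_base      -- total ACE length without reordering (e.g. SUM OF AMCZ)
    sum_reordered -- total ACE length with reordering (e.g. SUM OF LAMCZ)
    """
    code_points = n * freq                      # column N*FREQ
    x = sum_base / code_points                  # ACE characters per code point
    y = sum_reordered / code_points             # same, with reordering
    comp = (sum_base - sum_reordered) / sum_base * 100  # percent saved
    return x, y, comp


# Row N=4 of the Chinese table: FREQ=99402, AMCZ=1269398, LAMCZ=1034646
x, y, comp = label_stats(4, 99402, 1269398, 1034646)
print(f"X={x:.2f} Y={y:.2f} COMP={comp:.2f}")

# Row N=4 of the Korean table: FREQ=61642, AMCZ=803121, LAMCZ=625970
kx, ky, kcomp = label_stats(4, 61642, 803121, 625970)
print(f"X={kx:.2f} Y={ky:.2f} COMP={kcomp:.2f}")
```

A Y value near 2.0 means the ACE label costs about two characters per code point, which is the UCS-2 figure (2*n) mentioned in the message.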
> 
> ----- Original Message ----- 
> From: "Soobok Lee" <lsb@postel.co.kr>
> To: <idn@ops.ietf.org>
> Sent: Tuesday, July 10, 2001 10:37 PM
> Subject: chinese/hangul ML.com statistics with DUDE/LDUDE
> 
> 
> >   
> > The next table is from
> > 285108 Chinese ML.com samples (old raw data from VGRS).
> >
> > The "COMP" column gives the improvement ratio of LDUDE over DUDE.
> > The "Y" column shows that for long Chinese domains, LDUDE's label
> > length is close to (2.0~2.5)*(input domain length).
> > 
> > 
> > N:            length of a domain label (# of code points)
> > FREQ:         number of domains of length N
> > N*FREQ:       total # of code points in domains of length N
> > SUM OF DUDE:  sum of lengths of DUDE labels
> > X:            SUM OF DUDE / (N*FREQ)
> > SUM OF LDUDE: sum of lengths of LDUDE labels
> > Y:            SUM OF LDUDE / (N*FREQ)
> > COMP:         (SUM OF DUDE - SUM OF LDUDE) / SUM OF DUDE * 100
> > 
> > |  N|    FREQ|    N*FREQ|  SUM OF DUDE(X)| SUM OF LDUDE(Y)| COMP|
> > 
> > |  1|    4642|      4642|     18568(4.00)|     18568(4.00)| 0.00|
> > |  2|   59708|    119416|    462031(3.87)|    415599(3.48)|10.05|
> > |  3|   49471|    148413|    566440(3.82)|    477649(3.22)|15.68|
> > |  4|   99402|    397608|   1509929(3.80)|   1168378(2.94)|22.62|
> > |  5|   29974|    149870|    554237(3.70)|    426226(2.84)|23.10|
> > |  6|   20809|    124854|    457412(3.66)|    333416(2.67)|27.11|
> > |  7|    8860|     62020|    220880(3.56)|    160563(2.59)|27.31|
> > |  8|    5251|     42008|    146822(3.50)|    103903(2.47)|29.23|
> > |  9|    2666|     23994|     81433(3.39)|     58657(2.44)|27.97|
> > | 10|    2008|     20080|     68385(3.41)|     46708(2.33)|31.70|
> > | 11|     859|      9449|     31596(3.34)|     22111(2.34)|30.02|
> > | 12|     671|      8052|     27039(3.36)|     18135(2.25)|32.93|
> > | 13|     346|      4498|     14306(3.18)|     10088(2.24)|29.48|
> > | 14|     235|      3290|     10676(3.24)|      7230(2.20)|32.28|
> > | 15|     117|      1755|      5568(3.17)|      3854(2.20)|30.78|
> > | 16|      68|      1088|      3383(3.11)|      2376(2.18)|29.77|
> > | 17|      21|       357|      1075(3.01)|       750(2.10)|30.23|
> > 
> > |   |  285108|   1121394|   4179780(3.73)|   3274211(2.92)|21.67|
> > 
> > 
> > 
> > The next table is from
> > 207207 Hangul ML.com samples (old raw data from VGRS).
> > 
> > |  N|    FREQ|    N*FREQ|  SUM OF DUDE(X)| SUM OF LDUDE(Y)| COMP|
> > 
> > |  1|    1941|      1941|      7764(4.00)|      7764(4.00)|0.00|
> > |  2|   16978|     33956|    129239(3.81)|    111308(3.28)|13.87|
> > |  3|   38852|    116556|    436845(3.75)|    341333(2.93)|21.86|
> > |  4|   61642|    246568|    915355(3.71)|    653736(2.65)|28.58|
> > |  5|   40375|    201875|    743090(3.68)|    502097(2.49)|32.43|
> > |  6|   24561|    147366|    540245(3.67)|    349710(2.37)|35.27|
> > |  7|   13034|     91238|    332964(3.65)|    211206(2.31)|36.57|
> > |  8|    5596|     44768|    162833(3.64)|    100618(2.25)|38.21|
> > |  9|    2421|     21789|     78945(3.62)|     48633(2.23)|38.40|
> > | 10|    1033|     10330|     36144(3.50)|     22323(2.16)|38.24|
> > | 11|     427|      4697|     16744(3.56)|     10259(2.18)|38.73|
> > | 12|     173|      2076|      7178(3.46)|      4578(2.21)|36.22|
> > | 13|      96|      1248|      4386(3.51)|      2725(2.18)|37.87|
> > | 14|      32|       448|      1656(3.70)|      1006(2.25)|39.25|
> > | 15|      22|       330|      1168(3.54)|       750(2.27)|35.79|
> > | 16|      15|       240|       757(3.15)|       529(2.20)|30.12|
> > | 17|       8|       136|       470(3.46)|       299(2.20)|36.38|
> > | 19|       1|        19|        30(1.58)|        30(1.58)|0.00|
> > 
> > |   |  207207|    925581|   3415813(3.69)|   2368904(2.56)|30.65|
> > 
> > 
> > 
> > 
> > 
> > 
> 
>