I have run another experiment with *PURE* UniHan/Hangul ML.com
domains, that is, domains whose labels contain no characters from
other scripts such as digits, Hiragana, or Katakana.
-------------------------------------------------------------------
LAMCZ with pure UniHan ML.com

N: length of a domain label (# of code points)
FREQ: number of domains of length N
N*FREQ: sum of # of code points of domains of length N
SUM OF AMCZ: sum of lengths of AMCZ labels
X: SUM OF AMCZ / (N*FREQ)
SUM OF LAMCZ: sum of lengths of LAMCZ labels
Y: SUM OF LAMCZ / (N*FREQ)
COMP: (SUM OF AMCZ - SUM OF LAMCZ) / SUM OF AMCZ * 100

|  N|   FREQ| N*FREQ|  SUM OF AMCZ(X)| SUM OF LAMCZ(Y)|  COMP|
|  1|   3735|   3735|     12177(3.26)|     11589(3.10)|  4.83|
|  2|  42793|  85586|    281303(3.29)|    249930(2.92)| 11.15|
|  3|  28033|  84099|    267446(3.18)|    230951(2.75)| 13.65|
|  4|  54607| 218428|    685584(3.14)|    562349(2.57)| 17.98|
|  5|  12591|  62955|    195596(3.11)|    157259(2.50)| 19.60|
|  6|   7680|  46080|    141927(3.08)|    110465(2.40)| 22.17|
|  7|   2761|  19327|     59231(3.06)|     44754(2.32)| 24.44|
|  8|   1336|  10688|     32554(3.05)|     24120(2.26)| 25.91|
|  9|    641|   5769|     17490(3.03)|     12833(2.22)| 26.63|
| 10|    298|   2980|      8962(3.01)|      6570(2.20)| 26.69|
| 11|    137|   1507|      4575(3.04)|      3226(2.14)| 29.49|
| 12|     57|    684|      2057(3.01)|      1529(2.24)| 25.67|
| 13|     25|    325|       983(3.02)|       727(2.24)| 26.04|
| 14|      6|     84|       253(3.01)|       181(2.15)| 28.46|
| 15|      6|     90|       266(2.96)|       195(2.17)| 26.69|
| 17|      1|     17|        48(2.82)|        32(1.88)| 33.33|
|   | 154707| 542354|   1710452(3.15)|   1416710(2.61)| 17.17|
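
The derived columns can be recomputed from the raw counts. The following is only a minimal sketch, not the script used for the experiment: the function name row_metrics and its parameter names are hypothetical. It assumes COMP is the relative reduction of the second encoding against the first, which is the reading consistent with the printed values.

```python
def row_metrics(n, freq, sum_base, sum_local):
    """Return (X, Y, COMP) for one table row, rounded to two decimals.

    n         -- label length in code points (column N)
    freq      -- number of domains of that length (column FREQ)
    sum_base  -- total encoded length with the base encoding (AMCZ or DUDE)
    sum_local -- total encoded length with the L-variant (LAMCZ or LDUDE)
    """
    code_points = n * freq                  # column N*FREQ
    x = sum_base / code_points              # encoded length per code point, base
    y = sum_local / code_points             # encoded length per code point, L-variant
    comp = (sum_base - sum_local) / sum_base * 100  # percent saved by the L-variant
    return round(x, 2), round(y, 2), round(comp, 2)


# Row N=2 of the LAMCZ / pure UniHan table.
print(row_metrics(2, 42793, 281303, 249930))  # (3.29, 2.92, 11.15)
```

The same call with the DUDE/LDUDE sums gives the values of the LDUDE tables.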
-------------------------------------------------------------------
LAMCZ with pure Hangul ML.com

N: length of a domain label (# of code points)
FREQ: number of domains of length N
N*FREQ: sum of # of code points of domains of length N
SUM OF AMCZ: sum of lengths of AMCZ labels
X: SUM OF AMCZ / (N*FREQ)
SUM OF LAMCZ: sum of lengths of LAMCZ labels
Y: SUM OF LAMCZ / (N*FREQ)
COMP: (SUM OF AMCZ - SUM OF LAMCZ) / SUM OF AMCZ * 100

|  N|   FREQ| N*FREQ|  SUM OF AMCZ(X)| SUM OF LAMCZ(Y)|  COMP|
|  1|   1940|   1940|      7760(4.00)|      7760(4.00)|  0.00|
|  2|  16492|  32984|    119927(3.64)|    102659(3.11)| 14.40|
|  3|  37406| 112218|    380305(3.39)|    310666(2.77)| 18.31|
|  4|  57732| 230928|    756089(3.27)|    587684(2.54)| 22.27|
|  5|  36661| 183305|    587547(3.21)|    440929(2.41)| 24.95|
|  6|  22090| 132540|    418286(3.16)|    304984(2.30)| 27.09|
|  7|  11503|  80521|    251226(3.12)|    180533(2.24)| 28.14|
|  8|   4963|  39704|    122742(3.09)|     86642(2.18)| 29.41|
|  9|   2104|  18936|     57964(3.06)|     40725(2.15)| 29.74|
| 10|    833|   8330|     25332(3.04)|     17599(2.11)| 30.53|
| 11|    358|   3938|     11919(3.03)|      8225(2.09)| 30.99|
| 12|    123|   1476|      4422(3.00)|      3092(2.09)| 30.08|
| 13|     71|    923|      2752(2.98)|      1901(2.06)| 30.92|
| 14|     28|    392|      1160(2.96)|       805(2.05)| 30.60|
| 15|     18|    270|       798(2.96)|       565(2.09)| 29.20|
| 16|     10|    160|       460(2.88)|       342(2.14)| 25.65|
| 17|      7|    119|       354(2.97)|       243(2.04)| 31.36|
|   | 192339| 848684|   2749043(3.24)|   2095354(2.47)| 23.78|
-------------------------------------------------------------------
LDUDE with pure UniHan ML.com

N: length of a domain label (# of code points)
FREQ: number of domains of length N
N*FREQ: sum of # of code points of domains of length N
SUM OF DUDE: sum of lengths of DUDE labels
X: SUM OF DUDE / (N*FREQ)
SUM OF LDUDE: sum of lengths of LDUDE labels
Y: SUM OF LDUDE / (N*FREQ)
COMP: (SUM OF DUDE - SUM OF LDUDE) / SUM OF DUDE * 100

|  N|   FREQ| N*FREQ|  SUM OF DUDE(X)| SUM OF LDUDE(Y)|  COMP|
|  1|   3735|   3735|     14940(4.00)|     14940(4.00)|  0.00|
|  2|  42793|  85586|    328641(3.84)|    296679(3.47)|  9.73|
|  3|  28033|  84099|    318916(3.79)|    268987(3.20)| 15.66|
|  4|  54607| 218428|    826020(3.78)|    639417(2.93)| 22.59|
|  5|  12591|  62955|    237862(3.78)|    178866(2.84)| 24.80|
|  6|   7680|  46080|    173063(3.76)|    122010(2.65)| 29.50|
|  7|   2761|  19327|     72750(3.76)|     49710(2.57)| 31.67|
|  8|   1336|  10688|     40078(3.75)|     26187(2.45)| 34.66|
|  9|    641|   5769|     21554(3.74)|     14009(2.43)| 35.01|
| 10|    298|   2980|     11164(3.75)|      7095(2.38)| 36.45|
| 11|    137|   1507|      5671(3.76)|      3482(2.31)| 38.60|
| 12|     57|    684|      2528(3.70)|      1678(2.45)| 33.62|
| 13|     25|    325|      1188(3.66)|       807(2.48)| 32.07|
| 14|      6|     84|       309(3.68)|       202(2.40)| 34.63|
| 15|      6|     90|       323(3.59)|       204(2.27)| 36.84|
| 17|      1|     17|        55(3.24)|        39(2.29)| 29.09|
|   | 154707| 542354|   2055062(3.79)|   1624312(2.99)| 20.96|
-------------------------------------------------------------------
LDUDE with pure Hangul ML.com

N: length of a domain label (# of code points)
FREQ: number of domains of length N
N*FREQ: sum of # of code points of domains of length N
SUM OF DUDE: sum of lengths of DUDE labels
X: SUM OF DUDE / (N*FREQ)
SUM OF LDUDE: sum of lengths of LDUDE labels
Y: SUM OF LDUDE / (N*FREQ)
COMP: (SUM OF DUDE - SUM OF LDUDE) / SUM OF DUDE * 100

|  N|   FREQ| N*FREQ|  SUM OF DUDE(X)| SUM OF LDUDE(Y)|  COMP|
|  1|   1940|   1940|      7760(4.00)|      7760(4.00)|  0.00|
|  2|  16492|  32984|    125398(3.80)|    107795(3.27)| 14.04|
|  3|  37406| 112218|    420812(3.75)|    328202(2.92)| 22.01|
|  4|  57732| 230928|    860712(3.73)|    610053(2.64)| 29.12|
|  5|  36661| 183305|    682356(3.72)|    454656(2.48)| 33.37|
|  6|  22090| 132540|    492239(3.71)|    313997(2.37)| 36.21|
|  7|  11503|  80521|    298365(3.71)|    186251(2.31)| 37.58|
|  8|   4963|  39704|    146595(3.69)|     89221(2.25)| 39.14|
|  9|   2104|  18936|     69955(3.69)|     42413(2.24)| 39.37|
| 10|    833|   8330|     30705(3.69)|     18333(2.20)| 40.29|
| 11|    358|   3938|     14452(3.67)|      8628(2.19)| 40.30|
| 12|    123|   1476|      5397(3.66)|      3314(2.25)| 38.60|
| 13|     71|    923|      3387(3.67)|      2055(2.23)| 39.33|
| 14|     28|    392|      1452(3.70)|       893(2.28)| 38.50|
| 15|     18|    270|       998(3.70)|       632(2.34)| 36.67|
| 16|     10|    160|       582(3.64)|       386(2.41)| 33.68|
| 17|      7|    119|       438(3.68)|       272(2.29)| 37.90|
|   | 192339| 848684|   3161603(3.73)|   2174861(2.56)| 31.21|
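
For anyone who wants to cross-check the totals row, the FREQ and N*FREQ totals follow directly from the per-length counts. A small sketch using the Hangul length distribution copied from the tables (the variable names are hypothetical and this is not part of the original experiment code):

```python
# Label length N (in code points) -> number of pure Hangul domains (FREQ).
hangul_freq = {
    1: 1940, 2: 16492, 3: 37406, 4: 57732, 5: 36661, 6: 22090,
    7: 11503, 8: 4963, 9: 2104, 10: 833, 11: 358, 12: 123,
    13: 71, 14: 28, 15: 18, 16: 10, 17: 7,
}

# Totals row: number of labels and number of code points over all lengths.
total_labels = sum(hangul_freq.values())
total_code_points = sum(n * freq for n, freq in hangul_freq.items())

print(total_labels, total_code_points)  # 192339 848684
```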