Re: [idn] Re: permission <draft-ietf-idn-ace37-00.txt (attach)



Edmon <edmon@neteka.com> wrote:

> Worst case scenario CJK could have 21 han characters!

That's assuming the encoded string can be up to 63 octets.  For IDNA
the limit is 59 octets, in which case ACE37 can support up to 19 Han
characters, same as several other ACEs (but better than DUDE's 15).
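The 21 and 19 figures above can be reproduced with a rough sketch. It rests on two assumptions that are inferred from the quoted numbers and not taken from the ACE37 draft: that the 4-octet gap between the 63- and 59-octet limits is the room reserved for an ACE prefix, and that each Han character costs a flat 3 octets. The function name and its parameters are hypothetical.

```python
DNS_LABEL_LIMIT = 63     # maximum octets in a DNS label
ASSUMED_PREFIX_LEN = 4   # assumption: 63 - 59, the gap between the two limits quoted above
ASSUMED_HAN_COST = 3     # assumption: flat octets per Han character, inferred from 21 and 19


def max_chars(budget_octets, octets_per_char, overhead_octets=0):
    """Return how many fixed-cost characters fit in the given octet budget."""
    if octets_per_char <= 0:
        raise ValueError("octets_per_char must be positive")
    usable = max(budget_octets - overhead_octets, 0)
    return usable // octets_per_char


# Whole label available to the encoded string.
print(max_chars(DNS_LABEL_LIMIT, ASSUMED_HAN_COST))
# IDNA case: the prefix is taken out of the budget first.
print(max_chars(DNS_LABEL_LIMIT, ASSUMED_HAN_COST, overhead_octets=ASSUMED_PREFIX_LEN))
```

Under those assumptions the first call gives 21 and the second gives 19.  A real ACE has a variable per-character cost, so this is only an upper-bound style estimate, not a substitute for running the encoder.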

But I'm pretty sure that ACE37 is less efficient than DUDE for all
non-CJK scripts.

> All the while, the algorithm is kept to be as simple as DUDE.

It doesn't look as simple as DUDE to me.  It looks no simpler than
AMC-ACE-W, which is as efficient as ACE37 for CJK, as efficient as DUDE
for single-row scripts, and more efficient than DUDE for Latin-based
scripts.  Whereas AMC-ACE-W's complexity is in a state machine, ACE37's
complexity is in encoding/recognizing lots of different patterns.  I
know nothing about Excel (and do not have it), but maybe patterns are
easier than state machines in Excel, and maybe ACE37 appears simpler
than AMC-ACE-W when both are implemented in Excel.

By the way, I'm now in the process of tweaking AMC-ACE-Z, which is as
simple as AMC-ACE-W (if not simpler) and also more efficient.  A draft
will be coming soon, along with an evaluation of many ACE proposals.
Edmon, if you'd like to provide me with a C implementation of ACE37, I
can include it in the evaluation.  You are welcome (and even encouraged)
to use my example implementation of DUDE as a template.

AMC