There seem to be a number of advantages to
using DUDE, as suggested by the design team. Looking into the draft, though, I
found that the base-32 mapping is perhaps not necessary. I
have just submitted a draft to the WG on an enhanced version of DUDE:
ACE using Extended Hex Values (ACE16x). Since there has been a lot
of interest in DUDE recently, if you are interested you can check it out at: http://www.dnsii.org/idn-ace16x-00.txt.
In brief, ACE16x uses largely the same
mechanism as DUDE, including the one-pass and XOR features. However, it
does not require a base-32 mapping scheme or any 5-bit handling. Instead it
uses extended hex (16x) characters to indicate the last quartet of a
compressed code point (as opposed to prepending "0" or "1" to form a quintet as
specified in DUDE). The 16x character is calculated rather than
looked up in a mapping table, which makes it more efficient. Additionally,
most quartets are simply hex dumped. Size-wise, performance
is not compromised: the resulting string is exactly the same length as with DUDE
(and hence supports roughly 15-39 IDN characters).
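As a rough illustration of what "calculable" means here, the sketch below applies the offset quoted later from Section 6 of the draft (+0x67, or +0x47 for uppercase). It assumes "Original hex" means the 4-bit value 0..15, which puts the 16x characters in the range g..v (or G..V). The function names are made up for this example and do not come from the draft.

```python
def to_16x(quartet: int, upper: bool = False) -> str:
    """Shift a 4-bit value (0..15) into the extended hex range g..v (or G..V).

    Assumption: 'Original hex' in the draft means the numeric quartet value.
    """
    if not 0 <= quartet <= 0xF:
        raise ValueError("quartet must be in the range 0..15")
    return chr(quartet + (0x47 if upper else 0x67))


def from_16x(ch: str) -> int:
    """Inverse shift: recover the 4-bit value from a 16x character."""
    value = ord(ch.lower()) - 0x67
    if not 0 <= value <= 0xF:
        raise ValueError(f"{ch!r} is not an extended hex (16x) character")
    return value
```

Both directions are a single addition or subtraction, so no lookup table is involved.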
Since simplicity was the intent for DUDE and a
key criterion of the ACE design team, I think ACE16x is an improvement because it
provides a simpler mechanism than DUDE. I have also created an
Excel spreadsheet that does the ACE16x encoding; you can find it at http://www.dnsii.org/ace16x/ace16x-encode.xls. (It
can do DUDE encoding too, in a separate worksheet, and you will see
that DUDE is much more complicated.) I have also chosen to use an
initial value of 0x30 (instead of 0x60) so that all domains starting with
a digit (0-9) will be shorter. For CJK names, where it is more important to conserve
character space, a leading digit is a more likely scenario than a leading English letter.
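The effect of the initial value can be checked with a little arithmetic. The sketch below assumes that the first code point is XORed with the initial value and that leading zero quartets are then dropped, as the description of the XOR feature suggests. The helper name is hypothetical.

```python
def quartets_needed(code_point: int, initial: int) -> int:
    """Count the 4-bit groups left after XORing with the initial value.

    Assumption: leading zero quartets are dropped, but at least one
    quartet is always written.
    """
    diff = code_point ^ initial
    return max(1, (diff.bit_length() + 3) // 4)


# A leading digit such as "7" (0x37) versus a leading letter such as "a" (0x61).
digit_with_0x30 = quartets_needed(ord("7"), 0x30)   # 0x37 ^ 0x30 = 0x07
digit_with_0x60 = quartets_needed(ord("7"), 0x60)   # 0x37 ^ 0x60 = 0x57
letter_with_0x30 = quartets_needed(ord("a"), 0x30)  # 0x61 ^ 0x30 = 0x51
letter_with_0x60 = quartets_needed(ord("a"), 0x60)  # 0x61 ^ 0x60 = 0x01
```

Under these assumptions a leading digit takes one quartet with 0x30 and two with 0x60, while a leading lowercase letter shows the reverse, so the choice is a trade-off in favor of digits.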
Edmon
The following is extracted from Section 6 of my
draft:
6. Key Improvements of ACE16x in comparison with DUDE-02
- ACE16x does NOT need character mapping. Instead it uses a shifting mechanism that is calculable: 16x = Original hex + 0x67 (or +0x47 for uppercase)
- ACE16x maintains the one-pass system and utilizes XOR instead of masking as in DUDE-01
- ACE16x does not employ a 5-bit mechanism, and therefore increases efficiency
- The initial value is set to 0x30 so that all domains beginning with a digit will be shorter when encoded
- ACE16x simply hex dumps most quartets, improving processing time in both encoding and decoding.
- The overall processing time will be reduced by means of the following:
  1) Hex dump versus base-32 mapping
  2) Shifting versus base-32 mapping
  3) No need to prepend a "1" or "0" bit (during encode)
  4) No need to strip the first bit (during decode)
- ACE16x is a much simpler algorithm without compromising performance. The encoding mechanism is so simple that it can easily be expressed in an Excel spreadsheet: http://www.dnsii.org/ace16x/ace16x-encode.xls (The DUDE encode mechanism is also represented in a separate worksheet. It can be observed that ACE16x is much simpler than DUDE.)

Abstract
ACE16x is a simplified version of DUDE [DUDE-02] that requires no 5-bit or base-32 mapping. ACE16x encoding results in a string that performs as well as DUDE technically. Instead of resorting to a quartet-to-quintet mapping mechanism, ACE16x simply uses the hex values with an extended hex (16x) scheme for compression. In essence, instead of prepending an extra bit, ACE16x shifts the last quartet of a compressed code point up to another character. Additionally, the 16x value is calculable instead of needing to be mapped.
Full text: http://www.dnsii.org/idn-ace16x-00.txt
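The one-pass XOR and shifting steps described in the abstract can be sketched roughly as follows. This is only one reading of the description, not the algorithm as specified in the draft. It assumes that each code point is XORed with the previous one (starting from 0x30), that the result is hex dumped without leading zeros, and that the final quartet is shifted by 0x67. It ignores anything else the draft may specify, such as an ACE prefix, hyphen handling, or case rules. The function names are hypothetical.

```python
HEX_DIGITS = "0123456789abcdef"
INITIAL = 0x30   # initial "previous" value mentioned in the post
SHIFT = 0x67     # offset that moves a quartet into the range g..v


def encode(label: str) -> str:
    """One pass: XOR with the previous code point, hex dump, shift last quartet."""
    prev = INITIAL
    out = []
    for ch in label:
        cp = ord(ch)
        digits = format(cp ^ prev, "x")      # hex dump, leading zeros dropped
        out.append(digits[:-1])              # all but the last quartet stay plain hex
        out.append(chr(int(digits[-1], 16) + SHIFT))  # last quartet becomes g..v
        prev = cp
    return "".join(out)


def decode(encoded: str) -> str:
    """Reverse the sketch above; a g..v character closes each code point."""
    prev = INITIAL
    acc = 0
    pending = False                          # True while a code point is unfinished
    out = []
    for ch in encoded.lower():
        if ch in HEX_DIGITS:
            acc = (acc << 4) | int(ch, 16)
            pending = True
        elif "g" <= ch <= "v":
            acc = (acc << 4) | (ord(ch) - SHIFT)
            prev ^= acc                      # undo the XOR against the previous value
            out.append(chr(prev))
            acc = 0
            pending = False
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if pending:
        raise ValueError("input ends in the middle of a code point")
    return "".join(out)
```

Because a g..v character always ends a code point, the decoder needs no extra bit to find the boundaries, which is the role the prepended bit plays in the quintet scheme described for DUDE.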