
[idn] ACE16x (An enhanced version of DUDE)



There seem to be a number of advantages to using DUDE, as suggested by the design team.  Looking into the draft, however, I found that the base-32 mapping may not be necessary.  I have just submitted a draft to the WG on an enhanced version of DUDE: ACE using Extended Hex Values (ACE16x).  Since there has been a lot of interest in DUDE recently, those interested can check it out at http://www.dnsii.org/idn-ace16x-00.txt
 
In brief, ACE16x uses largely the same mechanism as DUDE, including the one-pass and XOR features.  However, it does not require a base-32 mapping scheme or any 5-bit handling.  Instead, it uses extended hex (16x) characters to indicate the last quartet of a compressed code point (as opposed to prepending "0" or "1" to form a quintet, as specified in DUDE).  The 16x character is calculated rather than looked up in a mapping table, which makes it much more efficient.  Additionally, in most cases a simple hex dump is used.  Size-wise, performance is not compromised: the resulting string is exactly the same length as with DUDE (and hence supports ~15-39 IDN characters).
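
The mechanism described above can be sketched in a few lines of Python.  This is only an illustration built from the description in this message, not the algorithm text of the draft: the function name is made up, the 16x character is assumed to be the quartet value (0-15) plus 0x67, the "previous" value is assumed to be the preceding code point, and details such as hyphen handling, case, and any ACE prefix are left out.  The draft itself is authoritative.

```python
def ace16x_encode_sketch(label: str, initial: int = 0x30) -> str:
    """Illustrative one-pass encoder (hypothetical helper, not from the draft)."""
    out = []
    prev = initial  # initial value 0x30, as chosen in the post
    for ch in label:
        cp = ord(ch)
        delta = cp ^ prev              # XOR against the previous value
        digits = format(delta, "x")    # plain hex dump, leading zeros dropped
        last = int(digits[-1], 16)     # last quartet of this code point
        # All quartets but the last stay ordinary hex digits; the last one
        # is shifted into the extended range (assumed 'g'..'v') so that it
        # also marks the end of the code point.
        out.append(digits[:-1] + chr(last + 0x67))
        prev = cp                      # assumption: previous = last code point
    return "".join(out)


print(ace16x_encode_sketch("5"))             # leading digit: one character
print(ace16x_encode_sketch("a"))             # leading letter: two characters
print(ace16x_encode_sketch("\u4e2d\u6587"))  # two CJK code points
```

Under these assumptions no table lookup is needed anywhere: every output character is either a hex digit or a hex digit plus a constant.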
 
Since simplicity was the intent of DUDE and a key criterion of the ACE design team, I think ACE16x is an improvement because it provides a simpler mechanism than DUDE.  In fact, I have also created an Excel spreadsheet that performs the ACE16x encoding; you can find it at http://www.dnsii.org/ace16x/ace16x-encode.xls (it also does DUDE encoding in a separate worksheet, and you will see that DUDE is much more complicated).  I have also chosen an initial value of 0x30 (instead of 0x60) so that all domains starting with a digit (0-9) will be shorter.  For CJK names, where conserving character space matters more, a leading digit is a more likely scenario than a leading English letter.
 
Edmon
 
 
The following is extracted from Section 6 of my draft:
 
6. Key Improvements of ACE16x in comparison with DUDE-02
   
   - ACE16x does NOT need character mapping.  Instead it uses a
     shifting mechanism that is calculable: 
     
          16x = Original hex + 0x67 (or +0x47 for uppercase)
   
   - ACE16x maintains the one pass system and utilizes XOR instead of
     masking as in DUDE-01
   
   - ACE16x does not employ a 5-bit mechanism, which increases
     efficiency
   
   - The initial value is set to 0x30 so that all domains beginning
     with a digit will be shorter when encoded
   
   - ACE16x simply hex dumps most quartets, improving processing time
     in both encoding and decoding.
   
   - The overall processing time will be reduced by means of the
     following:
         1) Hex dump versus base-32 mapping
         2) Shifting versus base-32 mapping
         3) No need to prepend a "1" or "0" bit (during encode)
         4) No need to strip the first bit (during decode)
   
   - ACE16x is a much simpler algorithm that does not compromise
     performance.  The encoding mechanism is so simple that it can
     easily be expressed in an Excel spreadsheet:
     http://www.dnsii.org/ace16x/ace16x-encode.xls (The DUDE encoding
     mechanism is also represented in a separate worksheet, where it
     can be seen that ACE16x is much simpler than DUDE.)
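
To make the "calculable" shift in the first point of the list concrete, here is a minimal Python sketch.  It reads "Original hex" as the numeric quartet value 0-15, which is an assumption; under that reading the lowercase 16x characters are 'g'..'v' and the uppercase ones 'G'..'V'.  The helper names to_16x and from_16x are hypothetical and do not come from the draft.

```python
def to_16x(quartet: int, uppercase: bool = False) -> str:
    """Shift a quartet value (0-15) into the extended hex range."""
    if not 0 <= quartet <= 0xF:
        raise ValueError("quartet must be in the range 0..15")
    # 16x = original hex + 0x67 (or + 0x47 for uppercase), per the list above
    return chr(quartet + (0x47 if uppercase else 0x67))


def from_16x(ch: str) -> int:
    """Undo the shift; accepts either case (assumed to be equivalent)."""
    value = ord(ch.lower()) - 0x67
    if not 0 <= value <= 0xF:
        raise ValueError(f"{ch!r} is not a 16x character")
    return value


for q in (0x0, 0x5, 0xF):
    print(hex(q), to_16x(q), to_16x(q, uppercase=True), from_16x(to_16x(q)))
```

Both directions are a single addition or subtraction, which is the contrast with a base-32 lookup table that the list draws.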
 
 
 
Abstract
   
   ACE16x is a simplified version of DUDE [DUDE-02] that requires no
   5-bit or base-32 mapping.  ACE16x encoding results in a string that
   performs as well as DUDE technically.
   
   Instead of resorting to a quartet-to-quintet mapping mechanism,
   ACE16x simply uses the hex values with an extended hex (16x) scheme
   for compression.  In essence, instead of pre-pending an extra bit,
   ACE16x shifts the last quartet of a compressed code point up to
   another character.  Additionally, the 16x value is calculable
   instead of needing to be mapped.
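
Because the shifted character doubles as an end-of-code-point marker, decoding needs no bit stripping.  The Python sketch below is a hypothetical decoder written from the description in this message only: it assumes the 16x characters are 'g'..'v' (quartet value plus 0x67), that upper and lower case are equivalent, that each code point was XORed with the preceding one, and that the initial value is 0x30.  Consult the draft for the actual rules.

```python
def ace16x_decode_sketch(encoded: str, initial: int = 0x30) -> str:
    """Illustrative one-pass decoder (hypothetical helper, not from the draft)."""
    out = []
    prev = initial
    delta = 0
    pending = False  # True while hex digits await a closing 16x character
    for ch in encoded.lower():
        if ch in "0123456789abcdef":
            delta = (delta << 4) | int(ch, 16)           # ordinary hex quartet
            pending = True
        elif "g" <= ch <= "v":
            delta = (delta << 4) | (ord(ch) - 0x67)      # shifted last quartet
            prev ^= delta                                # undo the XOR step
            out.append(chr(prev))
            delta = 0
            pending = False
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if pending:
        raise ValueError("input ends in the middle of a code point")
    return "".join(out)


print(ace16x_decode_sketch("l"))    # a single digit label
print(ace16x_decode_sketch("5hg"))  # two identical letters
```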
 
Full text: http://www.dnsii.org/idn-ace16x-00.txt