
Re: [idn] Report from the ACE design team



http://www.ntia.doc.gov/ntiahome/domainname/130dftmail/unir.txt
??? DUDE
??? ACE
4:86 RACE


Jim Fleming
http://www.unir.com
Mars 128n 128e
http://www.unir.com/images/architech.gif
http://www.unir.com/images/address.gif
http://www.unir.com/images/headers.gif
http://msdn.microsoft.com/downloads/sdks/platform/tpipv6/start.asp
http://www.ietf.org/mail-archive/ietf/Current/msg12213.html
http://www.ietf.org/mail-archive/ietf/Current/msg12223.html


----- Original Message ----- 
From: "Makoto Ishisone" <ishisone@sra.co.jp>
To: <idn@ops.ietf.org>
Sent: Monday, June 25, 2001 11:32 AM
Subject: Re: [idn] Report from the ACE design team


> In message <p0510032cb74fd055e43c@[165.227.249.18]>,
> Paul Hoffman / IMC <phoffman@imc.org> wrote:
> > Greetings again. This report from the ACE design team was turned in 
> > yesterday and will appear in the official Internet Drafts directory 
> > on Monday.
> > ...
> 
> As draft-ietf-idn-ace-report-00 says, the current recommendation of the
> ACE design team is DUDE.  However, as stated in the draft, the design
> team has not yet come to a complete agreement on DUDE, and I'm one of
> the members who is not convinced of the recommendation.
> 
> My opinion is certainly reflected in the draft, but I'd like to
> explain here the reason why I think DUDE may not be the best choice.
> 
> What worries me about DUDE is its relative inefficiency for CJK
> scripts.
> 
> DUDE's compression algorithm (variable-length differential encoding)
> seems to work very efficiently when the code points of the characters
> in a name are clustered in a small range.  This is the case for most
> Western scripts.
> 
> However, for languages with a large number of characters (such as CJK),
> the algorithm tends to work poorly.  In the worst case DUDE encodes
> one Unicode character (in the Basic Multilingual Plane) as a 4-octet
> sequence.  This happens frequently for CJK Han or Hangul names because
> the characters in these scripts are scattered across the Unicode code
> point space.
> 
> This means that in the worst case a name of 15 characters might not
> fit into a 63-octet label (assuming a 4-octet prefix such as 'dq--').
> We expect that names of up to about 15 characters can typically be
> encoded by DUDE.
> 
> The following points are my main concerns:
> 
> 1) Are 14-15 characters enough?
>    At least for Japanese domain names, the name of a company or an
>    organization is sometimes quite long.  My question is whether a
>    maximum of 14-15 characters per name is enough for CJK or not.  If
>    it is, DUDE would be fine.  But if it isn't, another ACE which is
>    more efficient (in dealing with long names) but less simple might
>    be better.
> 
> 2) Potential migration problem
>    Many NICs have already begun registering internationalized domain
>    names using RACE as the ACE.  In RACE, any name of up to
>    17 characters fits in a 63-octet label.  So it is possible that
>    some of the registered names will suddenly become invalid when the
>    migration from RACE to DUDE takes place.  Of course this is a risk
>    that they have to take, but if choosing another ACE can prevent it,
>    or lower the possibility, that might be a better choice.
> 
> -- ishisone@sra.co.jp
>
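
The label-length arithmetic in the quoted message can be checked with a
short sketch.  The 63-octet label limit, the 4-octet prefix, the 4-octet
worst case for DUDE and the 17-character figure for RACE all come from the
message above.  The RACE payload layout used here (base32 at 5 bits per
label octet, one header octet, two octets per character when the name
cannot be compressed) is an assumption added for illustration, and the
function names are hypothetical.

```python
LABEL_MAX = 63    # maximum length of a DNS label, in octets
PREFIX_LEN = 4    # ACE prefix such as 'dq--'


def worst_case_chars(octets_per_char, label_max=LABEL_MAX, prefix_len=PREFIX_LEN):
    """Longest name guaranteed to fit when every character costs
    `octets_per_char` octets of the label."""
    return (label_max - prefix_len) // octets_per_char


def race_worst_case_chars(label_max=LABEL_MAX, prefix_len=PREFIX_LEN):
    """Same bound for RACE, under the assumed layout described above."""
    payload_label_octets = label_max - prefix_len     # 59 base32 characters
    payload_octets = payload_label_octets * 5 // 8    # whole octets they carry
    return (payload_octets - 1) // 2                  # drop header, 2 octets/char


if __name__ == "__main__":
    print("DUDE worst case:", worst_case_chars(4))
    print("RACE worst case:", race_worst_case_chars())
```

Under these assumptions a name that hits DUDE's worst case for every
character is limited to 14 characters, so a 15-character name would not
fit, while the RACE bound agrees with the 17 characters quoted above.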