Re: [idn] Report from the ACE design team



In message <p0510032cb74fd055e43c@[165.227.249.18]>,
Paul Hoffman / IMC <phoffman@imc.org> wrote:
> Greetings again. This report from the ACE design team was turned in 
> yesterday and will appear in the official Internet Drafts directory 
> on Monday.
> ...

As draft-ietf-idn-ace-report-00 says, the current recommendation of the
ACE design team is DUDE.  However, as stated in the draft, the design
team has not yet reached complete agreement on DUDE, and I'm one of
the members who is not convinced by the recommendation.

My opinion is certainly reflected in the draft, but I'd like to
explain here why I think DUDE may not be the best choice.

What worries me about DUDE is its relative inefficiency for CJK
scripts.

DUDE's compression algorithm (variable-length differential encoding)
seems to work very efficiently when the code points of the characters
in a name are clustered in a small range.  This is the case for most
Western scripts.

However, for languages with a large number of characters (such as CJK),
the algorithm tends to work poorly.  In the worst case DUDE encodes
one Unicode character (in the Basic Multilingual Plane) as a 4-octet
sequence.  This happens frequently for CJK Han or Hangul names because
the characters in these scripts are scattered across the Unicode code
point space.

This means that in the worst case a name of 15 characters might not
fit into a 63-octet label (assuming a 4-octet prefix such as 'dq--').
We expect that DUDE can typically encode names of up to 15
characters.
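
The arithmetic behind that figure can be sketched as follows.  This is
only a rough check, not DUDE itself: it assumes every character costs
the worst-case 4 octets and that the prefix is the only overhead.  The
function names are made up for illustration.

```python
LABEL_LIMIT = 63            # maximum octets in one DNS label
PREFIX = "dq--"             # example 4-octet ACE prefix from this message
WORST_OCTETS_PER_CHAR = 4   # assumed flat worst-case cost per BMP character


def fits(n_chars, octets_per_char=WORST_OCTETS_PER_CHAR):
    """True if n_chars at the given per-character cost fit in one label."""
    return len(PREFIX) + n_chars * octets_per_char <= LABEL_LIMIT


def max_worst_case_chars(octets_per_char=WORST_OCTETS_PER_CHAR):
    """Longest name that fits when every character costs octets_per_char."""
    return (LABEL_LIMIT - len(PREFIX)) // octets_per_char


if __name__ == "__main__":
    print(max_worst_case_chars())  # 59 octets left after the prefix, 4 each
    print(fits(14), fits(15))
```

Under these assumptions 14 characters need 60 octets including the
prefix, while 15 characters would need 64, one more than a label
allows.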

My main concerns are the following:

1) Are 14-15 characters enough?
   At least for Japanese domain names, the name of a company or an
   organization is sometimes quite long.  My question is whether a
   maximum of 14-15 characters per name is enough for CJK or not.  If
   it is, DUDE would be fine.  But if it isn't, another ACE that is
   more efficient (in dealing with long names) but less simple might
   be better.

2) Potential migration problem
   Many NICs have already begun registering internationalized domain
   names using RACE as the ACE.  In RACE, any name of up to 17
   characters fits in a 63-octet label.  So it is possible that some
   of the registered names will suddenly become invalid when the
   migration from RACE to DUDE takes place.  Of course it is a risk
   that they have to take, but if choosing another ACE can prevent it,
   or lower the possibility, it might be a better choice.
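
To make the migration concern concrete, here is a rough sketch of
where the 17-character figure for RACE comes from, and which name
lengths would be at risk.  It rests on assumptions not stated in this
message: that an uncompressed RACE label is the 4-octet prefix 'bq--'
followed by the unpadded Base32 form of one header octet plus two
octets per character, and that the replacement ACE costs a flat 4
octets per character after a 4-octet prefix.  The function names are
hypothetical.

```python
LABEL_LIMIT = 63     # maximum octets in one DNS label
PREFIX_LEN = 4       # both 'bq--' (assumed for RACE) and 'dq--' are 4 octets


def race_label_length(n_chars):
    """Worst-case RACE label length under the assumptions above."""
    payload_bits = (1 + 2 * n_chars) * 8      # header octet + UTF-16 octets
    base32_chars = (payload_bits + 4) // 5    # ceiling of payload_bits / 5
    return PREFIX_LEN + base32_chars


def worst_case_4_octet_length(n_chars):
    """Label length if every character costs 4 octets (assumed worst case)."""
    return PREFIX_LEN + 4 * n_chars


def max_chars(length_fn):
    """Largest n for which length_fn(n) still fits in one label."""
    n = 0
    while length_fn(n + 1) <= LABEL_LIMIT:
        n += 1
    return n


if __name__ == "__main__":
    race_max = max_chars(race_label_length)
    new_max = max_chars(worst_case_4_octet_length)
    # Lengths that always fit under RACE but may overflow in the worst case.
    at_risk = list(range(new_max + 1, race_max + 1))
    print(race_max, new_max, at_risk)
```

With these assumptions, names of 15 to 17 characters are the ones that
are always registrable under RACE but could overflow the label in the
worst case.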

						-- ishisone@sra.co.jp