Re: [idn] Report from the ACE design team
- To: idn@ops.ietf.org
- Subject: Re: [idn] Report from the ACE design team
- From: Makoto Ishisone <ishisone@sra.co.jp>
- Date: Tue, 26 Jun 2001 01:32:16 +0900
- Delivery-date: Mon, 25 Jun 2001 09:35:16 -0700
- Envelope-to: idn-data@psg.com
In message <p0510032cb74fd055e43c@[165.227.249.18]>,
Paul Hoffman / IMC <phoffman@imc.org> wrote:
> Greetings again. This report from the ACE design team was turned in
> yesterday and will appear in the official Internet Drafts directory
> on Monday.
> ...
As draft-ietf-idn-ace-report-00 says, the current recommendation of the
ACE design team is DUDE. However, as stated in the draft, the design
team has not yet reached complete agreement on DUDE, and I'm one of
the members who are not convinced by the recommendation.
My opinion is certainly reflected in the draft, but I'd like to
explain here the reason why I think DUDE may not be the best choice.
What worries me about DUDE is its relative inefficiency for CJK
scripts.
DUDE's compression algorithm (variable-length differential encoding)
seems to work very efficiently when the code points of the characters
in a name are clustered in a small range. This is the case for most
Western scripts.
However, for languages with a large number of characters (such as CJK),
the algorithm tends to work poorly. In the worst case DUDE encodes
one Unicode character (in the Basic Multilingual Plane) as a 4-octet
sequence. This happens frequently for CJK Han or Hangul names because
the characters in these scripts are scattered across the Unicode code
point space.
This means that in the worst case a name of 15 characters might not
fit into a 63-octet label (assuming a 4-octet prefix such as 'dq--').
We expect that names of up to 15 characters can typically be encoded
by DUDE.
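The arithmetic behind this bound can be sketched as follows. This is
only an illustration: the function name max_worst_case_chars is
hypothetical, and the inputs are the figures assumed above (a 4-octet
prefix and a worst case of 4 octets per character).

```python
LABEL_LIMIT = 63  # maximum number of octets in a DNS label


def max_worst_case_chars(prefix_len: int, octets_per_char: int) -> int:
    """Characters guaranteed to fit in one label when every character
    costs `octets_per_char` octets after a `prefix_len`-octet prefix.

    Hypothetical helper for illustration; it models only the worst case
    described in the text, not the actual DUDE encoding.
    """
    payload = LABEL_LIMIT - prefix_len  # octets left after the prefix
    return payload // octets_per_char


# 63 - 4 = 59 octets of payload; 59 // 4 = 14 characters are guaranteed,
# while 15 worst-case characters would need 4 + 15 * 4 = 64 octets.
print(max_worst_case_chars(prefix_len=4, octets_per_char=4))
```

So only 14 characters are guaranteed to fit; a 15-character name fits
only if at least one character encodes in fewer than 4 octets.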
The following points are my main concerns:
1) Are 14-15 characters enough?
At least for Japanese domain names, the name of a company or an
organization is sometimes quite long. My question is whether a
maximum of 14-15 characters for CJK names is enough. If it
is, DUDE would be fine. But if it isn't, another ACE that is
more efficient (in dealing with long names) but less simple might be
better.
2) Potential migration problem
Many NICs have already begun registering internationalized domain
names using RACE as the ACE. In RACE, any name of up to
17 characters fits in a 63-octet label. So it is possible that
some registered names will suddenly become invalid when the migration
from RACE to DUDE takes place. Of course this is a risk that they
have to take, but if choosing another ACE can prevent it, or lower
the possibility, it might be a better choice.
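For comparison, the 17-character figure for RACE can be reproduced
under some assumptions about its format that are not stated in this
message: a 4-octet prefix, one leading compression-flag octet, two
octets per character when no compression applies, and base32 output
carrying 5 bits per output character. The function name is
hypothetical, and this is a rough sketch rather than the RACE
algorithm itself.

```python
LABEL_LIMIT = 63  # maximum number of octets in a DNS label
PREFIX_LEN = 4    # assumed ACE prefix length


def race_worst_case_len(n_chars: int) -> int:
    """Worst-case label length for an n-character name, ASSUMING
    1 flag octet + 2 octets per character, then base32 (5 bits per
    output character). Hypothetical model for illustration only."""
    raw_bits = (1 + 2 * n_chars) * 8
    encoded_len = (raw_bits + 4) // 5  # ceiling division by 5
    return PREFIX_LEN + encoded_len


# Largest name length whose worst-case encoding still fits in a label.
max_fit = max(n for n in range(1, 64) if race_worst_case_len(n) <= LABEL_LIMIT)
print(max_fit, race_worst_case_len(max_fit), race_worst_case_len(max_fit + 1))
```

Under these assumptions, 17 characters need 60 octets and 18 need 64,
so names of 15 to 17 characters are the ones at risk in a migration to
an ACE that guarantees only 14.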
-- ishisone@sra.co.jp