Re: [idn] CIDNUC in action
- To: idn@ops.ietf.org
- Subject: Re: [idn] CIDNUC in action
- From: Paul Hoffman / IMC <phoffman@imc.org>
- Date: Thu, 23 Mar 2000 09:42:12 -0800
- Delivery-date: Thu, 23 Mar 2000 09:42:10 -0800
- Envelope-to: idn-data@psg.com
At 10:56 PM 3/23/00 +0800, James Seng wrote:
>1. Implementation of CIDNUC, while not very complicated, is *very* time
> consuming to write and to debug! (Argghh!! I have to spend my time going
> through it bit by bit to get it right)
Quite true. Bit-twiddling is always error-prone.
>2. Its compression is useful but unfortunately has not much effect on CJK.
> In fact, it makes CJK even longer *doh*.
Longer than what? It makes it shorter than UTF-5.
> It is useful for the many languages
> which do not use more than 256 code points.
Which is every script in 10646 other than Han, Yi, and Hangul syllables.
This WG should decide how important it is for name parts in these scripts
to be allowed to run longer than 14 characters (UTF-5), 17 characters
(CIDNUC), or 8 characters (8&down). Non-BMP characters have longer
encodings in all three proposals.
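To give a feel for where the per-character cost comes from, here is a minimal Python sketch of a UTF-5-style encoder. It assumes the scheme as I understand it from the UTF-5 draft (each code point written as hex nibbles, with the first nibble mapped to G-V and the rest to 0-9/A-F); the function name utf5_encode is mine, and the sketch does not try to reproduce the exact length limits quoted above.

```python
# Sketch only: assumes UTF-5 writes each code point as hex nibbles, with the
# leading nibble mapped to the letters G-V and the remaining nibbles to 0-9/A-F.
# utf5_encode is a hypothetical helper name, not taken from any draft.

LEAD = "GHIJKLMNOPQRSTUV"  # leading-nibble alphabet: G = 0 ... V = 15


def utf5_encode(text: str) -> str:
    """Encode text one code point at a time, with no compression."""
    out = []
    for ch in text:
        digits = format(ord(ch), "X")        # hex digits, no leading zeros
        out.append(LEAD[int(digits[0], 16)] + digits[1:])
    return "".join(out)


if __name__ == "__main__":
    # Code points below U+0100 cost at most 2 characters; any code point from
    # U+1000 through U+FFFF, which covers Han, Yi, and Hangul syllables,
    # costs 4 characters.
    for sample in ("\u00e9", "\u03b1", "\u4e2d", "\uc548"):
        print(f"U+{ord(sample):04X} -> {utf5_encode(sample)}")
```

Under these assumptions, every Han, Yi, or Hangul syllable character takes four ASCII characters of the label, which is why those scripts hit the label length limit so much sooner than the others.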
>(Would be nice if we have a
> generic LZW or Huffman compression or UTR#6?)
If you can design one that does not overly restrict many scripts, that
would be great!
>3. No explanation of what the encoding or decoding algorithm should do when
> it encounters an invalid character.
Sure it does; see sections 2.3.2, 2.3.3, and 2.3.4.
>4. www.t[..jp (SJIS www.yahoo.co.jp) in cidnuc will be
> www.aq8gdsnl7a.aq83bhru6j6.jp, leaving www and jp intact. :)
>
> Still don't really like the aq8; without it the name would be a bit shorter.
Agree. However, all proposals need a way to differentiate the
ASCII-encoded name from non-IDN names. Otherwise, there will be
errors in trying to decode a non-IDN name from the ASCII encoding. Dan and
I have been talking about this, and both the methods in cidnuc and 8&down
have positive and negative attributes. When (if?) the WG is ready to start
picking a single proposal, we'll need to revisit this. Fortunately, all of
the proposals can use any of the proposed tagging mechanisms.
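As a rough illustration of the tagging issue, here is a small Python sketch that classifies the labels of a name by prefix. It assumes the tag is the literal prefix "aq8", which is only inferred from the example name quoted above; TAG, split_tagged, and payload are hypothetical names, and none of the drafts is being quoted here.

```python
# Sketch only: TAG is inferred from the example name in this thread, and the
# helper names are hypothetical.
TAG = "aq8"


def split_tagged(name: str):
    """Return (label, is_tagged) pairs for each dot-separated label."""
    return [(label, label.lower().startswith(TAG)) for label in name.split(".")]


def payload(label: str):
    """Return the part after the tag, or None if the label is not tagged."""
    if label.lower().startswith(TAG):
        return label[len(TAG):]
    return None


if __name__ == "__main__":
    for label, tagged in split_tagged("www.aq8gdsnl7a.aq83bhru6j6.jp"):
        print(label, "encoded" if tagged else "plain ASCII", payload(label))
```

A plain prefix check like this also shows the weakness: an ordinary non-IDN label that happens to begin with the same three characters would be treated as encoded, so whichever tagging mechanism is chosen has to keep such collisions unlikely or rule them out.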
--Paul Hoffman, Director
--Internet Mail Consortium