Re: [idn] Punicode: Upper-case in example

To: IETF idn working group <idn@ops.ietf.org>
Subject: Re: [idn] Punicode: Upper-case in example
From: Martin Duerst <duerst@w3.org>
Date: Thu, 28 Nov 2002 05:41:28 +0900
In-reply-to: <20021126225013.GB8215@nicemice.net>
References: <4.2.0.58.J.20021127035248.03c13bf8@localhost><E189mrm-0004QZ-00@psg.com><200211071321.VAA04362@msr.hinet.net><200211071321.VAA04362@msr.hinet.net><4.2.0.58.J.20021127035248.03c13bf8@localhost>

Hello Adam,

Many thanks for your quick response.

At 22:50 02/11/26 +0000, Adam M. Costello wrote:

> Martin Duerst <duerst@w3.org> wrote:
>
> > In http://www.ietf.org/internet-drafts/draft-ietf-idn-punycode-03.txt,
> > example (I) says:
> >
> >  (I) Russian (Cyrillic):
> >         U+043F u+043E u+0447 u+0435 u+043C u+0443 u+0436 u+0435 u+043E
> >         u+043D u+0438 u+043D u+0435 u+0433 u+043E u+0432 u+043E u+0440
> >         u+044F u+0442 u+043F u+043E u+0440 u+0443 u+0441 u+0441 u+043A
> >         u+0438
> >         Punycode: b1abfaaepdrnnbgefbaDotcwatmq2g4l
> >
> > The presence of the upper-case 'D' (not to say the string 'Dot' :-)
> > is confusing, because it seems completely arbitrary.  There is no
> > upper-case letter in the Cyrillic string.
> >
> > How did the upper-case D get in there?
>
> It corresponds to the uppercase U in one of the code points in the u+
> notation.  The sample Punycode implementation uses the case of the u
> as a 1-bit annotation.

I see. I don't think it is a very good idea to use the case of the
U+ prefix for this distinction, for the following reasons:

1) The convention (u+ -> lower case, U+ -> upper case) is not
   documented anywhere in the Punycode draft (or at least I didn't
   find it). If it is used at all, it should be documented right at
   the start of the examples.

2) The above convention is very easy to overlook, in particular because
   u+ and U+ look so very similar. It is close to the widely
   established U+XXXX notation for code points, but differs slightly
   from it.

3) Punycode can be used in different ways: on mixed-case strings, on
   lower-case strings that still carry the original casing
   information, and on pure lower-case strings. Maybe there should be
   separate examples for each of these three uses.
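
To make the convention concrete, here is a rough sketch of a decoder
that returns the case of the last digit of each delta as a per-code-point
flag. It is not the sample implementation from the draft; the names
parse_u_notation and decode_with_flags are made up for this
illustration, and the sketch assumes that only the case of the final
digit of a delta carries the annotation. The constants are the Punycode
parameter values.

```python
BASE, TMIN, TMAX, SKEW, DAMP = 36, 1, 26, 38, 700
INITIAL_BIAS, INITIAL_N = 72, 128


def parse_u_notation(notation):
    """Hypothetical helper: 'U+043F u+043E' -> [(0x43F, True), (0x43E, False)]."""
    return [(int(tok[2:], 16), tok[0] == "U") for tok in notation.split()]


def adapt(delta, numpoints, firsttime):
    """Bias adaptation for the variable-length integers."""
    delta = delta // DAMP if firsttime else delta // 2
    delta += delta // numpoints
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + (BASE - TMIN + 1) * delta // (delta + SKEW)


def digit_value(ch):
    """Map a basic code point to its digit value, ignoring case."""
    c = ch.lower()
    if "a" <= c <= "z":
        return ord(c) - ord("a")
    if "0" <= c <= "9":
        return ord(c) - ord("0") + 26
    raise ValueError("not a Punycode digit: %r" % ch)


def decode_with_flags(text):
    """Decode Punycode; return (string, list of per-code-point case flags)."""
    # Everything before the last '-' is copied literally; its own case
    # is the flag. Without a '-', the whole input is extended digits.
    basic, _, extended = text.rpartition("-")
    out = [(ord(c), c.isupper()) for c in basic]
    n, i, bias = INITIAL_N, 0, INITIAL_BIAS
    pos = 0
    while pos < len(extended):
        old_i, w, k = i, 1, BASE
        while True:
            ch = extended[pos]
            pos += 1
            d = digit_value(ch)
            i += d * w
            t = min(max(k - bias, TMIN), TMAX)
            if d < t:
                break  # ch is the last digit of this delta
            w *= BASE - t
            k += BASE
        bias = adapt(i - old_i, len(out) + 1, old_i == 0)
        n += i // (len(out) + 1)
        i %= len(out) + 1
        # The case of the last digit is the 1-bit annotation.
        out.insert(i, (n, ch.isupper()))
        i += 1
    return "".join(chr(cp) for cp, _ in out), [flag for _, flag in out]


EXAMPLE_I = """
    U+043F u+043E u+0447 u+0435 u+043C u+0443 u+0436 u+0435 u+043E
    u+043D u+0438 u+043D u+0435 u+0433 u+043E u+0432 u+043E u+0440
    u+044F u+0442 u+043F u+043E u+0440 u+0443 u+0441 u+0441 u+043A
    u+0438
"""

if __name__ == "__main__":
    text, flags = decode_with_flags("b1abfaaepdrnnbgefbaDotcwatmq2g4l")
    for (cp, want), ch, got in zip(parse_u_notation(EXAMPLE_I), text, flags):
        print("%s+%04X %s flag=%s" % ("U" if want else "u", cp, ch, got))
```

Run on example (I), the only flagged position is the first code point,
the one written as U+043F, which is where the 'D' comes from. Read this
way, the three uses in point 3 differ only in where the flags come from:
the case of the input itself, a separate record of the original casing,
or no flags at all.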

Regards,   Martin.
