
Re: [idn] UTF-8 / RACE



As I said, I will try to give some comments on UTF-8 with fast nameprep
once I understand what you mean by that.
Costs of UTF-8 with fast nameprep
What it means. Domain names are encoded as UTF-8 strings in all of the
above contexts.
Bad names are not allowed to appear. (Exception: Users can send bad
names in DNS registration forms; the registrar will send back a
rejection notice showing the closest good name.)

JS: What is considered a "bad name"? From some of your notes, you
suggested that only lower-case characters are allowed in DNS. Do these
two match: "françious" vs "FRANÇIOUS"? (By your fast-nameprep
definition, they won't match unless the user keys the name in exactly
the same way the administrator entered it.)
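
To make the question concrete, here is a minimal Python sketch of the two
comparisons. It assumes both spellings use the precomposed characters
U+00E7 and U+00C7, and it uses Unicode case folding only as a stand-in for
a preparation step; it is not the nameprep algorithm, and the function
names are made up for illustration.

```python
# Two spellings of the same name that differ only in case.
# Both use precomposed characters (an assumption of this sketch).
lower = "fran\u00e7ious"   # "françious", U+00E7 LATIN SMALL LETTER C WITH CEDILLA
upper = "FRAN\u00c7IOUS"   # "FRANÇIOUS", U+00C7 LATIN CAPITAL LETTER C WITH CEDILLA


def bytes_equal(a: str, b: str) -> bool:
    """Compare the UTF-8 encodings byte for byte, with no preparation."""
    return a.encode("utf-8") == b.encode("utf-8")


def folded_equal(a: str, b: str) -> bool:
    """Compare after Unicode case folding (a stand-in, not real nameprep)."""
    return a.casefold().encode("utf-8") == b.casefold().encode("utf-8")


print(bytes_equal(lower, upper))   # False: C3 A7 vs C3 87, plus the ASCII case
print(folded_equal(lower, upper))  # True
```

Without some folding step on one side or the other, the two names are
simply different byte strings.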

Making it work. gethostbyname needs to be upgraded. Many current
installations, in violation of RFC 2181, reject DNS answers that contain
unusual characters. (However, some versions will work correctly with
options allow_special all or options no-check-names in
/etc/resolv.conf.)

JS: Agreed, although it is probably a simplistic summary of the pain.

mutt needs to be upgraded. UTF-8 needs to be spaced properly.

pine needs to be upgraded. UTF-8 needs to be spaced properly.

JS: Let's expand this to any non-UTF-8 application which uses domain
names.

sendmail needs to be upgraded. Current versions are not 8-bit clean:
they discard bytes \200 through \237 in mail message headers, because
those bytes are used for other purposes inside sendmail's
string-handling routines. The relevant section of code is in collect.c:
...
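
The effect described above can be simulated with a short Python sketch. It
is not sendmail's code (the collect.c excerpt is elided here), only a
model of "discard bytes \200 through \237", that is, 0x80 through 0x9F.
The header value and the function names are hypothetical.

```python
def strip_200_237(header: bytes) -> bytes:
    """Drop every byte in the octal range 200-237 (0x80-0x9F)."""
    return bytes(b for b in header if not 0x80 <= b <= 0x9F)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical name; U+00DC encodes as C3 9C, and 9C falls in the range.
raw = "M\u00dcNCHEN.example".encode("utf-8")
damaged = strip_200_237(raw)

print(raw)                     # b'M\xc3\x9cNCHEN.example'
print(damaged)                 # b'M\xc3NCHEN.example'
print(is_valid_utf8(raw))      # True
print(is_valid_utf8(damaged))  # False: C3 is left without its continuation byte
```

Not every non-ASCII character is hit (U+00E7 encodes as C3 A7 and passes
through), which makes the damage intermittent and harder to spot.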

JS: It is too early to predict what I18N email will look like, and so to
decide what changes would be required in sendmail, but I suspect making
it 8-bit clean as you suggest here will be needed if UTF-8 is used in
IDN.

There's one report that an obsolete version of the Netscape mailer
crashes under Solaris when it reads UTF-8 messages. I need verifiable
details.

JS: Let's go through the list you made for IDNA; I see most of them
would need to be patched or made 8-bit clean to work (including bind).
Applications which do not support UTF-8 have to be upgraded to support
UTF-8.

But most important of all, you left out the need to 'patch/replace' the
keyboard so that 'bad names' cannot be entered.
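
A small Python sketch of the keyboard problem, using the U+3002 case from
the quoted message below. The name is hypothetical, and mapping only
U+3002 to the ASCII dot is an assumption made for illustration; it is not
a complete list of alternate dots.

```python
IDEOGRAPHIC_FULL_STOP = "\u3002"   # what a Chinese/Japanese IME may emit


def split_labels(name: str) -> list:
    """Split on the ASCII dot only, as an unmodified application would."""
    return name.split(".")


def split_labels_lenient(name: str) -> list:
    """Map U+3002 to U+002E first (an assumed, incomplete mapping)."""
    return name.replace(IDEOGRAPHIC_FULL_STOP, ".").split(".")


typed = "example\u3002com"          # hypothetical input from an IME

print(len(split_labels(typed)))     # 1: the whole string is seen as one label
print(split_labels_lenient(typed))  # ['example', 'com']
```

Either every application does something like the second function, or the
keyboard interface has to guarantee that only U+002E is ever produced.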

-James Seng

----- Original Message -----
From: "James Seng/Personal" <James@seng.cc>
To: "D. J. Bernstein" <djb@cr.yp.to>; <idn@ops.ietf.org>
Sent: Monday, May 28, 2001 3:09 PM
Subject: Re: [idn] UTF-8 / RACE


> Is this your definition of Fast Nameprep?
>
> If that is the case, are you suggesting we change the keyboard interface
> to do "Nameprep" so there is no 'bad' dot? Please explain further how
> you propose to interact with the keyboard interface...
>
> Incidentally, for those who care, on a Chinese/Japanese IME, a dot can
> be either U+3002 or U+002E depending on whether it is full or half
> width, and on a Korean IME, a dot can be either U+FF9E or U+002E.
>
> -James Seng
>
> > This is just like the current handling of dots. Yes, there are bad
> > dots, but the keyboard interface helps the user type domain names
> > with the ASCII dot, so applications don't have to worry about bad
> > dots. Are you going to demand that we change and redeploy thousands
> > of programs to accept non-canonical dots in domain names?
> >
> > ---Dan
> >
>
>