Re: [idn] UTF-8 / RACE
- To: "D. J. Bernstein" <djb@cr.yp.to>, <idn@ops.ietf.org>
- Subject: Re: [idn] UTF-8 / RACE
- From: "James Seng/Personal" <James@Seng.cc>
- Date: Mon, 28 May 2001 05:29:21 +0800
- Delivery-date: Sun, 27 May 2001 14:30:46 -0700
- Envelope-to: idn-data@psg.com
Let's go down this track a bit further, since you touched on the topic
of "seeing on the screen".
In an OS or application, we can consider an I18N UI to have three
layers, although this is not always obvious:
Layer 1: Rendering of Glyphs/Fonts
Layer 2: Application Encoding
Layer 3: Machine Encoding
Layer 3 is what the OS uses internally to represent code points.
Some OSes use Unicode (UTF-16), very few use UTF-8, and a lot of legacy
ones use their own native encodings. On Windows, for example, this would
be either UTF-16 (>Win98) or native encodings (<Win98SE).
Layer 2 is what the application uses internally. As with the OS,
different apps choose different encodings. For example, Microsoft IE
uses native encodings (varying according to the OS locale) at its
interface layer, but you can switch it to support multiple encodings,
including UTF-8.
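As a rough illustration of how the same text differs between encodings
at Layers 2 and 3, here is a minimal Python sketch. The sample text and
the choice of Big5 as the "native encoding" are assumptions made purely
for illustration; nothing here is specific to any particular OS or
application.

```python
# The same two Chinese characters in three encodings.
# Sample text and the Big5 "native encoding" are illustrative assumptions.
text = "\u4e2d\u6587"

utf16 = text.encode("utf-16-be")  # what a UTF-16 based OS might hold
utf8 = text.encode("utf-8")       # what a UTF-8 based application might hold
big5 = text.encode("big5")        # one possible legacy native encoding

for name, data in (("UTF-16BE", utf16), ("UTF-8", utf8), ("Big5", big5)):
    # Same characters, different byte sequences.
    print(f"{name:8} {len(data)} bytes  {data.hex(' ')}")

# Every encoding round-trips to the same code points.
assert utf16.decode("utf-16-be") == utf8.decode("utf-8") == big5.decode("big5")
```

The byte sequences differ, but each decodes back to the same code
points. Converting between them is mechanical and does not involve
Layer 1 at all.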
Layer 1 is where Layer 2 encodings are rendered into proper characters
so they can be displayed. This is by far the most troublesome layer,
since we have to assume the users have the right fonts and rendering
engine to display them.
Layer 1 causes the problem of 'seeing on the screen'. Suppose I were to
attach some Chinese characters to this email using UTF-8, but you are
not equipped with Chinese fonts: all you end up seeing is some
gibberish. This is a problem which cannot be solved by using UTF-8
(Layer 2/3); it is rather a rendering and font problem (Layer 1).
AFAICS, seeing some 8-bit gibberish is no different from seeing some
ACE gibberish if I am not ready to handle your scripts. Gibberish is
gibberish if I can't read it, even if it is encoded in UTF-8. Hence,
the end user has to upgrade/update the client in some way.
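The comparison between 8-bit gibberish and ACE gibberish can be
sketched in Python. Two assumptions are worth stating loudly: the
un-upgraded client is modelled as one that treats incoming bytes as
Latin-1, and Python's built-in "punycode" codec stands in for the ACE,
since RACE itself is not available in the standard library. Punycode is
a different ACE from RACE, so the exact ASCII string would differ.

```python
# Assumed sample label, for illustration only.
label = "\u4e2d\u6587"

# What a client that assumes Latin-1 would show for the UTF-8 bytes.
mojibake = label.encode("utf-8").decode("latin-1")

# An ASCII-compatible encoding of the same label.
# Punycode is used here as a stand-in; it is NOT RACE.
ace = "xn--" + label.encode("punycode").decode("ascii")

print(repr(mojibake))  # unreadable 8-bit form
print(ace)             # unreadable ASCII form

# Both forms carry the full information; only the display differs.
assert mojibake.encode("latin-1").decode("utf-8") == label
assert ace[4:].encode("ascii").decode("punycode") == label
```

Neither string is readable to the user, yet both can be converted back
to the original label by a client that knows how.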
Thus, "seeing on the screen" is probably not as simple a problem as
'use UTF-8 and we are done'.
As an aside, I am a bit confused by your definition of "UTF-8 with Fast
Nameprep". When you say "Fast Nameprep", do you mean we don't do
Nameprep at all? I would be glad to comment on the second part of
your website once this is cleared up.
-James Seng
----- Original Message -----
From: "D. J. Bernstein" <djb@cr.yp.to>
To: <idn@ops.ietf.org>
Sent: Monday, May 28, 2001 4:26 AM
Subject: Re: [idn] UTF-8 / RACE
> My web page carefully specifies details of the ACE solution: where ACE
> is used, where it is not used, and what has to be changed to make that
> work. The reason that gethostbyname() has to be upgraded, for example,
> is that ACE is not used in ``command lines for telnet, ssh, etc.'' but
> ACE is used in ``DNS queries and responses.''
>
> My web page then carefully specifies details of the UTF-8 solution.
> See http://cr.yp.to/proto/idn.html.
>
> I'm willing to analyze the costs of other IDN solutions. But I'm not
> interested in proposals that don't actually provide internationalized
> domain names. It's idiotic to claim that we have Greek domain names if
> Greek users aren't seeing Greek alphas and betas on the screen. (Does
> anyone dispute this?)
>
> James Seng/Personal writes:
> > The administrator should already have the ACE for his domain and
> > that could goes directly into the config file.
>
> No. That would not be a working IDN solution. A Greek administrator
> would see unreadable ACE gobbledygook instead of a readable Greek
> alpha.
>
> ---Dan
>