Re: [idn] UTF-8 / RACE



On Sun, 27 May 2001, Adam M. Costello wrote:

> What should they display?  They should display the ACE, because that's
> no uglier than anything else they might display, and has the added bonus
> that you can copy it into any other application and it will still work.

Yes, it is uglier than the alternatives. How much uglier depends on
which ACE is chosen. In the worst case everything is shown as garbage
even though most of the characters could have been displayed correctly.
In the best case it is only unnecessarily ugly, because it shows the
user an internal prefix and some of the characters in an uncommon way.

Furthermore, a computer can't always recognize what is a domain name
and what is not. I think it's pretty darn ugly that the same domain,
e.g. in the header of a mail message and in the body of the same
message, should be displayed in two completely different ways.

The problem of characters which can't be displayed should be solved in
exactly the same way as it is today. Which is: Not standardized. Some
systems or programs can use a default character. Others, like Emacs,
could use a backslash ('\') and a code point number. And so on.
Since the code point is stored internally even though it can't be
displayed, there should be no problems with copying and pasting it.
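
To make the two fallbacks concrete, here is a rough sketch in Python.
It is only an illustration, not a proposal: the helper name display(),
the sample name, and the choice of Latin-1 as the "display" charset are
all assumptions of mine. It relies on the standard "replace" and
"backslashreplace" error handlers of str.encode().

```python
def display(name, charset, errors):
    """Render `name` as a display limited to `charset` might show it.

    Hypothetical helper: characters outside `charset` are handled by the
    given codec error handler; the stored string itself is never changed.
    """
    return name.encode(charset, errors=errors).decode(charset)


# Assumed sample: mostly Latin-1 characters plus one CJK character.
name = "r\u00e4ksm\u00f6rg\u00e5s.\u4f8b"

# Fallback 1: substitute a default character for what can't be shown.
default_char = display(name, "latin-1", "replace")

# Fallback 2: show a backslash and the code point number instead.
escaped = display(name, "latin-1", "backslashreplace")

print(default_char)
print(escaped)
```

In both cases only the one character outside the display charset is
affected, and the string held in memory still contains the original
code point, so copying and pasting it is unaffected by how it is drawn.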

ACE has its advantages, but the display problem is not a reason to
choose ACE. I think it's obvious that it should eventually be phased
out. Once again, in the long run we need a common encoding
like ASCII: The Unified Encoding that is used for every single piece
of text that is transmitted on the Internet.


> (The Roman alphabet is one of the smallest
> alphabets in existence, and is already widely recognized.  If you have
> to chose a fallback character set for everyone to learn, that's the best
> choice.)

Since you're not talking about the Roman alphabet here, I suggest you
use the term "English alphabet" instead. Even though the latter is
based on the former, they are not the same. Being correct doesn't hurt.

/Magnus