Re: [idn] UTF-8 / RACE
>
> Let's say John Luser sends the following mail to you (where <oe> is o
> with umlaut).
>
> From: John Luser <luser@f<oe>reningen.org>
> To: Keith Moore <moore@cs.utk.edu>
> Subject: Hello
> Content-Type: TEXT/PLAIN; charset=ISO-8859-1
> Content-Transfer-Encoding: 8BIT
>
> Hello Keith,
>
> Why don't you take a look at our new website on f<oe>reningen.org?
>
> Have a nice day!
>
> John
>
>
> Now, let's say your local system knows about ISO 8859-1 and is able to
> convert it to the internal encoding but unable to display <oe>. Thus
> it displays it as [] (square box - your system's default character).
>
>
> With Mr. Costello's suggestion, this would be displayed to you as:
>
> From: John Luser <luser@px--sdn3fnfuwy4rn5wutn.org>
> To: Keith Moore <moore@cs.utk.edu>
> Subject: Hello
>
> Hello Keith,
>
> Why don't you take a look at our new website on f[]reningen.org?
>
> Have a nice day!
>
> John
>
>
> (Or whatever the chosen ACE encoding turns it into.)
>
> It's hard to recognize px--sdn3fnfuwy4rn5wutn.org as the same domain
> as f[]reningen.org. It's a lot uglier as well.
Okay, I see your point. I can see why a user might want either form -
the "ugly" format so that he/she could transcribe the ACE format, and
the "pretty" format so that it wouldn't be so distracting. If I were
writing an MUA I'd probably display the pretty one by default but
have an option to display the ugly form (just as MUAs have options to
display certain message headers that aren't displayed by default).
I don't think I'd want the protocol standards to dictate how
non-presentable characters are displayed, except perhaps to say
that they shouldn't mislead the recipient. (i.e., don't display
o with umlaut as o or oe and make the recipient think that it's
a different address than it really is.)
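
A minimal sketch of the "pretty by default, ugly on request" display option described above. It assumes Python's built-in "idna" codec (IDNA 2003, "xn--" prefix) as a stand-in for whichever ACE is eventually chosen; it is not the RACE "px--" form quoted in this thread. The function name display_domain and the show_ace flag are hypothetical.

```python
def display_domain(domain: str, show_ace: bool = False) -> str:
    """Return the form of a domain name an MUA might show to the user.

    Hypothetical helper. Accepts either the Unicode form or the ACE form.
    The stdlib "idna" codec is an assumption standing in for the ACE
    under discussion.
    """
    # Normalize to the ASCII-compatible form first; labels that are
    # already plain ASCII pass through unchanged.
    ace = domain.encode("idna").decode("ascii")
    if show_ace:
        # "Ugly" form: useful when the user needs to transcribe the name.
        return ace
    # "Pretty" form: decode back to Unicode for display.
    return ace.encode("ascii").decode("idna")


if __name__ == "__main__":
    name = "f\u00f6reningen.org"
    print(display_domain(name))                 # default: pretty form
    print(display_domain(name, show_ace=True))  # optional: ACE form
```

What the display layer does with a character it cannot render is deliberately left out; the only constraint suggested above is that the result should not mislead the recipient.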
> ACE is always a worse and an uglier thing to display when a system
> encounters unknown characters. Thus it's hardly an argument for using
> ACE. I can't agree that displaying the domain in the header as ACE,
> i.e., as "px--f+xg-reningen.org", "px--sdn3fnfuwy4rn5wutn.org" or
> whatever, is "no uglier than anything else they might display".
Nor can I. And the experience with RFC 2047 encodings would also
seem to support your argument (and those were designed to be at least
somewhat readable for supersets of ASCII). People found display of
2047 encodings annoying even when they could manage to read them.
To a lesser extent this was also true of quoted-printable in message
bodies.
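
For readers who have not seen these encodings raw, here is a small sketch using Python's standard email.header and quopri modules. The sample word is an assumption chosen to match the example domain in this thread. It shows what a user sees when an MUA fails to decode an RFC 2047 encoded-word or a quoted-printable body.

```python
import quopri
from email.header import Header, decode_header

text = "F\u00f6reningen"  # sample word, chosen for illustration only

# RFC 2047 encoded-word, as it would appear undecoded in a header field.
encoded_word = Header(text, "iso-8859-1").encode()
print(encoded_word)

# An MUA that understands RFC 2047 recovers the original text.
raw, charset = decode_header(encoded_word)[0]
print(raw.decode(charset))

# Quoted-printable, as it would appear undecoded in a message body.
qp = quopri.encodestring(text.encode("iso-8859-1"))
print(qp)
```

Both encodings leave the ASCII letters readable and escape only the non-ASCII octet, which is why they are tolerable to read but still annoying.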
> Do you understand what I'm saying now?
I think so (though perhaps I should not be so presumptuous as to
claim that I do for sure?) thanks for the clarification.
> > I am not sure that we will find that Holy Grail anytime soon.
> > Even if we adopt UTF-8 we will still have to deal with various ways
> > of encoding "rich" text.
>
> True, but we have to start somewhere. Every journey begins with a
> single step. (Although, this journey started with BCP 18. This is just
> a very large and important step on the way.)
True. But (as long as we are being philosophical) just because you start
on a journey does not mean you know where it will take you, or how you
will get there. :)
> > But this is all beside the point. The IDN WG cannot legislate the
> > encodings that are used by other applications; and it cannot legislate
> > that existing applications change their behavior. It can only recommend
> > how to solve the I18N problem for domain names. If the solution that
> > IDN recommends doesn't work well for some applications, it will not get
> > adopted for those applications - even if the recommended solution gets
> > approved as a standard.
> >
> > It's very important to be realistic about what can be achieved.
>
> Sure, but it is hardly a reason or excuse for this WG to suggest a bad
> solution. And I fail to see what it has to do with displaying domains
> as ACE when the application encounters unknown characters.
I think that ACE is necessary for other reasons, but I agree that this
particular argument does not lend much support to ACE.
Keith