
Re: [idn] Let's go forward with IDNA and UTF-8



Patrik said:

>I see two differences between UTF-8 and ACE which was what I used as
>arguments when I made up my mind:
>
>(1) Risk of loss of information
>
>ACE approach is NOT destroying any information when data is sent from
>sender to receiver in any of the protocols we have today. It is 100%
>backward compatible.
Probably 100%, but some software may compare the ACE form of a name
with the decoded form and fail.

But ACE+nameprep IS destroying information.
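
A minimal sketch of that loss. It assumes Python's built-in "idna" codec
(IDNA 2003: nameprep followed by Punycode) as a stand-in for whichever ACE
is finally chosen; the helper name ace_round_trip is made up for this
illustration.

```python
# Assumption: Python's "idna" codec (nameprep + Punycode) stands in for the
# ACE under discussion; the actual ACE may differ in its encoding details.
def ace_round_trip(name: str) -> str:
    """Encode a name to its ACE form and decode it back to Unicode."""
    return name.encode("idna").decode("idna")


for original in ("BÜCHER.example", "straße.example"):
    restored = ace_round_trip(original)
    status = "same" if restored == original else "changed"
    print(f"{original} -> {restored} ({status})")
```

Nameprep folds case and maps some characters to others before the ACE
step, so the name that comes back is not always the name that went in.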

Also, ACE can only be used for host names. It may not fit all types of
DNS names, and it cannot be used for other textual data in DNS.
So with ACE you still need to implement handling of UTF-8 (or whatever
encoding is used) for all other textual data in DNS.
When we internationalise DNS so that IDNs can be handled, the work must
also include internationalisation of all text data handled in DNS.

>
>Use of UTF-8 in any place in any protocol where we have a domainname today
>MIGHT (I don't use any stronger term, but I don't see I have to either)
>destroy some data during transport.
>
>(2) How to fix broken domainnames
>
>In an ACE approach, the enduser himself can change his own applications to
>get rid of the ACE encoding and instead see the IDN as it was expected.
>
>In an UTF-8 approach the user MIGHT have to change his applications so they
>can display UTF-8 characters and (here comes the important part) someone
>else than the enduser MUST upgrade the intermediary boxes (HTTP proxies,
>SMTP gateways, SMTP MTA's, POP servers, IMAP servers etc) which is not
>under his control.
>
>I.e. if we compare with when we launched MIME, many people have told me
>that "but the goal is that the user should not see the ACE, and geee what
>Quoted Printable was ugly, let's not use the same mechanism again".

There is one major difference between MIME and ACE/UTF-8 in DNS.
In MIME the most important element for delivery of e-mail, the
e-mail address, was not changed and was forced to remain in ASCII.
This means that if an address is copied from an application that
understands MIME to one that does not, it will still work.

With DNS it is the address itself that is changed.
If you have an application that understands ACE and decodes it
into the real address, I am sure people will copy the decoded form
into applications that do not know about ACE, and the result will be
UTF-8 or some other encoding being sent into DNS.
You will get this problem with the UTF-8 solution also.
Some of these problems can be handled if you use DNS servers
like those I have in UDNS, which can handle both ACE and UTF-8 as
well as some local character sets.
So I am sure several of the problems you see with UTF-8 will
also exist with the ACE-only solution, though with ACE only,
decoded ACE names will never work.
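
As a rough sketch of what accepting both forms could look like (this is
not the UDNS code; the function name canonical_name is hypothetical, and
Python's "idna" codec is again assumed as the ACE), a server could reduce
every incoming name to one lookup key:

```python
# Hypothetical sketch, not UDNS: map ASCII, ACE and raw UTF-8 input to one key.
# Assumption: Python's "idna" codec (nameprep + Punycode) plays the ACE role.
# Local character sets are left out; they would need a per-client charset hint.
def canonical_name(wire: bytes) -> bytes:
    """Return a single ACE lookup key for a name as received on the wire."""
    try:
        # Plain ASCII and already-encoded ACE names decode cleanly here.
        text = wire.decode("ascii")
    except UnicodeDecodeError:
        # Otherwise assume raw UTF-8, e.g. a decoded name pasted into an
        # application that knows nothing about ACE.
        text = wire.decode("utf-8")
    # ASCII labels pass through unchanged; others get nameprep + ACE.
    return text.encode("idna").lower()


print(canonical_name("bücher.example".encode("utf-8")))
print(canonical_name(b"xn--bcher-kva.example"))
```

Both calls yield the same key, so a zone lookup succeeds whichever form
the client sent. An ACE-only server has no equivalent of the UTF-8 branch.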

-
One more thing everybody should think about: even with UTF-8 as
the only encoding, there must be a way to display names that
cannot be displayed using the characters available in a client.
So an ACE or a (local charset) Compatible Encoding is still
needed. And I prefer one that can be used by all programs, so that
all programs display non-displayable names in the same way.
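
A small sketch of such a display fallback, assuming the client knows its
own character set and again using Python's "idna" codec as the common ACE;
display_form and client_charset are names invented for this example.

```python
# Hypothetical sketch: show a label natively when the client's charset can
# represent it, and fall back to the shared ACE form when it cannot.
# Assumption: Python's "idna" codec stands in for the agreed ACE.
def display_form(name: str, client_charset: str) -> str:
    """Return the name with non-displayable labels replaced by their ACE form."""
    shown = []
    for label in name.split("."):
        try:
            label.encode(client_charset)  # can this client display the label?
            shown.append(label)
        except UnicodeEncodeError:
            shown.append(label.encode("idna").decode("ascii"))
    return ".".join(shown)


print(display_form("bücher.example", "latin-1"))
print(display_form("bücher.example", "ascii"))
```

Because the fallback is the same ACE everywhere, two different programs
that cannot render a name will at least show the user the same string.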


   Dan