[idn] Some other aspects
We are very focused on host names, but maybe we should think a
little broader.
Let us suppose we have reached a time when all protocols use UTF-8,
and your system works with and uses UTF-8 (or some other form of UCS).
Still, it will not be able to display every character in UCS (and
that might not even be wanted).
What is going to happen with text that cannot be displayed?
For a domain name an ACE could be used, but why not use a single
standard form for all text?
One easy way would be to write it like this:
abc\uXXXXef\UXXXXXXXXgh\xXXij,
where \uXXXX represents a 16-bit UCS value as four hex digits, and
the other two forms represent 31-bit and 8-bit UCS values.
All programs could use this notation. There is no need for something
like an ACE to make it compact and fit some ASCII world.
It would be much simpler than what we have now: ACE, quoted-printable,
%-encoding and probably some more special encodings.
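
The \xXX / \uXXXX / \UXXXXXXXX notation described above can be sketched
in a few lines of Python. This is only an illustration under some
assumptions of mine: the names escape_undisplayable, unescape and
can_display are hypothetical, the backslash itself is always escaped
(the notation above does not say how to handle it), and Python limits
code points to U+10FFFF rather than the full 31-bit range.

```python
import re

def escape_undisplayable(text, can_display):
    """Replace characters the system cannot display with hex escapes.

    can_display is a caller-supplied predicate (hypothetical); a real
    system would ask its font or terminal layer.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        # Assumption: a literal backslash is always escaped so that the
        # result can be decoded unambiguously.
        if ch != "\\" and can_display(ch):
            out.append(ch)
        elif cp <= 0xFF:
            out.append("\\x%02X" % cp)      # 8-bit form
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)      # 16-bit form
        else:
            out.append("\\U%08X" % cp)      # long form
    return "".join(out)

_ESCAPE = re.compile(
    r"\\x([0-9A-Fa-f]{2})|\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})"
)

def unescape(text):
    """Turn the hex escapes back into the characters they stand for."""
    def repl(match):
        digits = match.group(1) or match.group(2) or match.group(3)
        return chr(int(digits, 16))
    return _ESCAPE.sub(repl, text)

def ascii_only(ch):
    """Example predicate: only printable ASCII can be displayed."""
    return 0x20 <= ord(ch) < 0x7F
```

With ascii_only as the predicate, escape_undisplayable("abc\u20acef",
ascii_only) gives the text abc\u20ACef, and unescape restores the
original string.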
What I want to say with the above is that, yes, UTF-8 (UCS) is fine,
but it will not make all of UCS displayable everywhere, so we still
need a way to present non-displayable characters to the user.
But let's not do this by introducing a special ASCII encoding for
every application.
-
I also think IDN is the wrong level to start implementing international DNS.
For internationalisation of DNS, we should start by making DNS use
UCS for all character data by having it normalised and using well
defined matching rules.
At an upper layer, above basic DNS, we can add IDN rules such as
which characters are forbidden. But these rules should not be applied to
the lower levels of DNS because they restrict DNS to be only
"host names", now and forever.
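
The layering described above could look roughly like this in Python.
It is a sketch under my own assumptions, not a proposal for the actual
rules: NFC plus case folding stands in for the "well defined matching
rules", the letters/digits/hyphen check stands in for the upper-layer
IDN rules, and the function names are hypothetical.

```python
import unicodedata

def normalise_label(label):
    """Lower layer: normalise any character data in a DNS label.

    Assumption: NFC and Unicode case folding; the real normalisation
    and matching rules would have to be defined by the standard.
    """
    folded = unicodedata.normalize("NFC", label).casefold()
    # Case folding can denormalise the text, so normalise once more.
    return unicodedata.normalize("NFC", folded)

def labels_match(a, b):
    """Lower layer: two labels match if their normalised forms are equal."""
    return normalise_label(a) == normalise_label(b)

def hostname_allowed(label):
    """Upper layer: an example IDN rule set for host names only.

    Assumption: letters, digits and hyphen, with no leading or
    trailing hyphen.
    """
    n = normalise_label(label)
    return (
        bool(n)
        and not n.startswith("-")
        and not n.endswith("-")
        and all(ch.isalnum() or ch == "-" for ch in n)
    )
```

A label such as "my_service" is rejected by hostname_allowed but still
matches "MY_SERVICE" in the lower layer, so the host name restrictions
do not leak into DNS data that is not a host name.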
Dan