
[idn] Charter, refocus



There has been a lot of talk about domain names and the charter
recently. And the charter contains many milestones for drafts whose
meaning I do not understand.

Here I will go back to what I think DNS stands for, how that relates to
how people expect things to work, and what we need to do.

The foundation of DNS is a binding of a "domain name" to one or more
objects.
The RFCs directly or indirectly state that the "domain name" is a
printable text string.
There is an RFC for binary labels with EDNS, but the standard label is a
text label.
While a label is allowed to contain "binary", the normal usage
everywhere is a printable character string.
The normal expected usage, and what the RFCs define, is that text labels
(domain names) are matched case-insensitively. As a result, within a
domain you cannot define two labels that differ only in case.

So the expected usage of DNS is "a text string" (domain name) bound to
one or more objects. The same domain name can be bound to both a "host"
and to something else.
Because of this, DNS does not have special rules for "host names". It
only has rules for "domain names". The same format and matching rules
apply to all domain names: a sequence of text labels, each 1-63
characters long, at most 255 for the complete domain name, and each
label matched case-insensitively.
And current DNS has only handled characters in the ASCII range.
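
To make the rules above concrete, here is a minimal Python sketch of
them. The function names are my own, and I assume the 255 limit counts
the name as it goes on the wire (one length octet per label plus the
terminating root octet); that is one reading of the limit, not
something stated above.

```python
MAX_LABEL_LEN = 63   # per-label limit
MAX_NAME_LEN = 255   # limit for the complete domain name


def split_labels(name: str) -> list[str]:
    """Split a dotted name into labels, ignoring one trailing root dot."""
    if name.endswith("."):
        name = name[:-1]
    return name.split(".")


def is_valid_name(name: str) -> bool:
    """Check the format rules: ASCII only, labels 1-63, whole name <= 255."""
    if not name.isascii():
        return False
    labels = split_labels(name)
    if any(not (1 <= len(label) <= MAX_LABEL_LEN) for label in labels):
        return False
    # Assumption: length is measured in wire format, where every label
    # carries one length octet and the root adds one final zero octet.
    wire_len = sum(len(label) + 1 for label in labels) + 1
    return wire_len <= MAX_NAME_LEN


def same_name(a: str, b: str) -> bool:
    """Compare two ASCII domain names label by label, ignoring case."""
    return ([label.lower() for label in split_labels(a)]
            == [label.lower() for label in split_labels(b)])
```

Note that nothing here is specific to "host names": the same two checks
apply to any domain name, whatever it is bound to.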

Current DNS also defines that stored case shall be returned, if
possible. This means that a host name returned in a PTR

There is something called a "host name", or maybe an ARPANET host name,
whose definition has been changed several times. I expect the
restriction on allowed characters was meant to simplify things for
programmers more than for users. DNS itself does allow a host name to
contain any printable character.

That is the world today.

Now we are defining how DNS should handle domain names when they include
non-ASCII characters. This means that a "domain name" can now include
any printable character in UCS.

Following the standard workings of DNS, which are also what users
expect, this means that DNS should still be a database with "domain
names" bound to one or more objects. "Domain names" should still be
printable text strings and they should still match case-insensitively.
And because a "domain name" can be bound to both a "host" and something
else, DNS must treat all domain names the same way.

Now when I look at the charter I see that we should produce a
domain name normalisation draft. What is this? I have not seen any
such thing yet.

Unicode and W3C use the word "normalisation" of a text to
mean: encode each character in a single standard way.
This means that, for example, the Angstrom sign is replaced
with "A with a ring above" and "double width" Latin letters are
replaced by standard width ones. It does NOT mean converting to
lower case.
This is needed because UCS allows the same character to
be encoded in many ways, and that makes software quite complex
when matching strings.
As we want to use UCS as the single character set for domain names,
all domain names must be normalised. Otherwise they cannot be compared
easily. A suitable starting point is Unicode normalisation form KC.
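
As an illustration of what form KC does and does not do, here is a
small sketch using the unicodedata module from the Python standard
library. The helper name "normalise" is my own.

```python
import unicodedata


def normalise(text: str) -> str:
    """Return text in Unicode normalisation form KC."""
    return unicodedata.normalize("NFKC", text)


angstrom_sign = "\u212B"        # ANGSTROM SIGN
a_with_ring = "\u00C5"          # LATIN CAPITAL LETTER A WITH RING ABOVE
a_plus_ring = "A\u030A"         # "A" followed by COMBINING RING ABOVE
fullwidth = "\uFF21\uFF22\uFF43"  # fullwidth "A", "B" and "c"

# Three encodings of the same character collapse into one.
assert normalise(angstrom_sign) == a_with_ring
assert normalise(a_plus_ring) == a_with_ring

# Fullwidth letters become standard width; their case is left alone.
assert normalise(fullwidth) == "ABc"
```

Case folding is a separate step, which is why matching needs its own
rule in addition to normalisation.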

So what I see we need for DNS is:

1) An RFC stating the standard normalisation of domain names.

2) An RFC defining how domain names must be matched. As a minimum
   it must require case-insensitive matching for characters with case.

When that is done, we can add:
3) An RFC for ACE so we have a standard way to encode domain names
   in ASCII, complying with the legacy "host name" character
   restriction.

4) An RFC updating what applications should accept as "host names".

5) An RFC on how to layer ACE on top of the current DNS protocol to
   allow legacy software to handle new domain names (for example IDNA).

6) An RFC defining how DNS itself can handle UCS based domain names
   using UTF-8.
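
A rough sketch of how items 1, 2, 3 and 6 relate, in Python. The
function names are hypothetical. For the ACE I use Python's built-in
"idna" codec, which implements the IDNA scheme of RFC 3490 with
Punycode; treat that as one possible ACE, not necessarily the one this
group would choose. The matching key (form KC plus Unicode case
folding) is likewise only an assumption about what a matching rule
could look like.

```python
import unicodedata


def match_key(name: str) -> str:
    """Normalise (form KC) and case-fold a name for comparison."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    # Case folding can undo normalisation, so normalise once more.
    return unicodedata.normalize("NFKC", folded)


def names_match(a: str, b: str) -> bool:
    """Items 1 and 2: compare normalised names case-insensitively."""
    return match_key(a) == match_key(b)


def to_ace(name: str) -> bytes:
    """Item 3: an ASCII-compatible encoding (here: the "idna" codec)."""
    return match_key(name).encode("idna")


def to_utf8(name: str) -> bytes:
    """Item 6: the same name as UTF-8 octets."""
    return match_key(name).encode("utf-8")


# "u" + COMBINING DIAERESIS and upper case, versus precomposed lower case.
assert names_match("Bu\u0308cher.EXAMPLE", "b\u00fccher.example")
assert to_ace("b\u00fccher.example") == b"xn--bcher-kva.example"
assert len(to_utf8("b\u00fccher")) == 7  # six characters, seven octets
```

The last line shows one thing item 6 would have to settle: whether
label length limits count characters or octets.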

Now in the charter it says:
- produce a draft on normalisation of domain name identifiers.
  Do we have a draft on this? I have not seen one. The existing
  nameprep draft does have parts of normalisation but
  includes much outside normalisation.

- produce a draft for architecture on many things.
   Do we have this?

- produce an IDN protocol draft.
  What is this? IDN protocol? Is this the IDNA layer on top of DNS?


Can we not start with the basic needs of DNS instead of jumping at
"host names"?

      Dan