
Re: [idn] naming syntax rules




Dan Oscarsson wrote:

> >No, DNS has two rules: octet strings with ASCII, and a hostname subset.
> 
> The DNS standard only has one type of label: octet strings with the
> code values 0-127 treated as ASCII during comparison.
> The DNS server does not have a hostname subset and may not reject
> any name not following it.

This is what I meant by my followup message yesterday when I said that we
are probably in agreement here. The proposed syntax rules only define one
"domain name" syntax (non-normalized, non-lowercased, any UCS code), but
there are several application-specific subset definitions.
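
As a rough illustration of "one base rule plus application-specific
subsets", here is a minimal Python sketch. It assumes the base rule is the
one described in the quoted text (octets 0-127 compared as ASCII without
regard to case, all other octets compared as-is) and layers a hostname
(LDH) check on top as a separate, application-level test. The function
names and the regular expression are hypothetical and are not taken from
STD13 or from the proposed syntax rules.

```python
import re

def fold_ascii(label: bytes) -> bytes:
    """Lowercase only the ASCII letters A-Z; leave every other octet alone."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in label)

def labels_equal(a: bytes, b: bytes) -> bool:
    """Base rule (assumed): octet strings, ASCII range compared caselessly."""
    return fold_ascii(a) == fold_ascii(b)

# Application-specific subset (assumed): letters, digits and hyphen,
# with no leading or trailing hyphen, 1 to 63 octets long.
_LDH = re.compile(rb"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")

def is_hostname_label(label: bytes) -> bool:
    """Subset check that an application may apply; the base rule does not."""
    return 1 <= len(label) <= 63 and _LDH.fullmatch(label) is not None
```

In this sketch a label such as b"a_b" is a perfectly valid label under the
base rule and only fails the hostname subset check.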

Note that the subset definitions are either implicitly or explicitly
defined for STD13. For example, the mailbox rules are not defined in STD13
but instead section 8 of RFC1035 kinda sorta delegates the rules to
RFC822. The difference here is that the IDN rules (should) explicitly
declare the mailbox rules as being defined by the local-part definitions
in 2822.

Anyway, I don't think we're in disagreement on this part.


> >Mailbox names must be case-preserved in order to satisfy protocol
> >dependencies, and are not used in lookups so normalization is not
> >required.
> 
> Normalisation is still needed, otherwise some clients may interpret
> the name as something else. I could use the ASCII range to encode all
> characters I need for Swedish and use that in labels, but you would
> not get the right name if you assume they are ASCII.
> That is why character data must be normalised.

If an i18n successor to 2822 allows encoded versus unencoded mailbox names
to collide, it will have failed.

Under the current rules, they can only contain US-ASCII and must be
case-preserved, so that is all we should worry about. If/When the mailbox
rules are changed, they should be backwards compatible. If we enforce our
own biases and rulesets on these names -- even though it is not our area
of responsibility -- we will impose additional considerations on the
successor to 2822. We do not have any authority to do so, have no
technical justification for doing so, and therefore we must not do so.


> Just like some systems have case-sensitive mailboxes, some systems have
> case-sensitive host names (Unix is one of them). So you need to retain
> case to avoid breaking them.

I want to call out this topic for discussion separately.


> > this change would mean ~"if all of the characters in the delegation
> > are LDH, then the minimum length is 2 characters, otherwise it is
> > one character."
> 
> I do not know what you mean by current delegation rules. In .com domain?
> 
> Some domains allow 1 character subdomains.

I was looking through ICANN's site for a canonical reference on this point
and cannot find one. However:

  a) you cannot register a.[com|net|org]

  b) some of the new gTLDs have minimum length rules of 3 characters,
     and not just two characters

  c) I haven't ever seen any ccTLD take 1 char delegations but I will
     believe you.

Since those three points are non-normative and no canonical ruleset exists
to reference (that I can find), I would say that we should redefine this
rule to be a minimum of 1 character, with the disclaimer that delegation
rulesets may be different. If com|net|org still only accept 2 chars in
delegations then that will be their issue to deal with.
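
For illustration only, the rule quoted above and the proposed replacement
could be sketched in Python as follows. The function names and the
registry_min parameter are hypothetical, and no canonical delegation
ruleset is being quoted; length is counted in characters (code points).

```python
import re

# Letters, digits and hyphen only.
_LDH_ONLY = re.compile(r"[A-Za-z0-9-]+")

def min_length_quoted_rule(label: str) -> int:
    """Quoted rule: all-LDH labels need 2 characters, anything else 1."""
    return 2 if _LDH_ONLY.fullmatch(label) else 1

def syntax_ok_proposed(label: str) -> bool:
    """Proposed rule: the syntax itself only requires 1 character."""
    return len(label) >= 1

def delegation_ok(label: str, registry_min: int = 1) -> bool:
    """A registry may layer a stricter minimum on top of the syntax rule."""
    return syntax_ok_proposed(label) and len(label) >= registry_min
```

With this split, a one-character label passes the syntax rule, and a
registry that wants two or three characters expresses that through its own
minimum rather than through the naming syntax.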

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/