
Re: [idn] Update Charter revision 2



-----BEGIN PGP SIGNED MESSAGE-----

James Seng/Personal wrote:
> 
> > - There is no target for 'Requirements' in milestone.  I think it
> >   should be.
> 
> It is already move forward for IESG action. it is technically out of the
> hand of the wg.

Nonsense. There has been no Last Call. I made comments on the draft that
have not been adequately addressed (in fact none of my technical points
were addressed at all). Until the end of the Last Call period, the document
certainly isn't "out of the hands of the WG".

It is worth pointing out that:

 - IDNA does not satisfy requirements 11 (since it allows unassigned
   codepoints in queries without versioning) and 18 (since it allows
   more than one representation of a name, differing in case),

 - no solution that requires changes to the DNS protocol can satisfy
   requirement 1,

 - no solution that uses both ACE and transparent representations can
   satisfy requirement 18.

IMHO there is little point in publishing a document specifying
requirements that are unachievable.


Here are my comments again, and hopefully we can have some substantive
technical discussion of them this time.

- -----

The original (-08) document is quoted using '>', and my suggested changes
are quoted using '=>'.

> 6. A transfer encoding syntax (TES) is a reversible transform of encoded
>    data which may (or may not) include textual data represented in
>    one or more character encoding schemes. Examples: 8bit,
>    Quoted-Printable, BASE64, UTF-7 (defunct), UTF-5, and RACE.

This definition is never used.

> ... This document attempts to
> set requirements for an equivalent of the "used services" given above,
> where "hostname" is replaced by "Internationalized Domain Name". This
> does not preclude the fact that IDN should work with any kind of DNS
> queries. IDN is a new service. Since existing protocols like SMTP or
> HTTP use the old service, it is a matter of great concern how the new
> and old services work together, and how other protocols can take
> advantage of the new service.

IDN is not a new service; it makes more sense to consider it as an
extension of all the existing services. For example, in IDNA, the
existing IP-to-hostname service can return an (ACE-encoded) IDN, or a
non-IDN query can follow a DNAME record that points to an IDN. These
cases wouldn't be possible if IDN were a separate service.

=> ... This document attempts to
=> set requirements for extensions of the "used services" given above,
=> where "hostname" is replaced by "Internationalized Domain Name".
=> That is, IDN should work with any kind of DNS queries. Since we are
=> extending services used by existing protocols like SMTP or HTTP,
=> compatibility with these existing uses is a matter of great concern,
=> as well as how both new and old protocols can take advantage of the
=> new facilities.


> 2. General Requirements
> 
> These requirements address two concerns: The service offered to the
> users (the application service), and the protocol extensions, if needed,
> added to support this service.
> 
> In the requirements, we attempt to use the term "service" whenever a
> requirement concerns the service, and "protocol" whenever a requirement
> is believed to constrain the possible implementation.

What we are setting requirements for are IDN proposals. Some (not all)
of the cases where "service" or "protocol" is used should actually say
"proposal", IMHO. I've made those changes below without further comment.


> [1] The DNS is essential to the entire Internet. Therefore, the service
> MUST NOT damage present DNS protocol interoperability. It MUST make the
> minimum number of changes to existing protocols on all layers of the
> stack.

Requiring the "minimum number of changes" fails to consider the cost
or feasibility of any change; it is requiring an absolute, which is
always a bad idea.

I.e. if proposal B needs protocol changes in addition to those of proposal
A, then no matter how insignificant the cost of those changes is, or what
other advantages B has, this requirement would eliminate B. The effect is
that minimising protocol changes overrides every other consideration.

> It MUST continue to allow any system anywhere that implements
> the IDN specification to resolve any internationalized domain name.

"continue to" should be deleted. Obviously no system can resolve an IDN
at the moment.

=> [1] The DNS is essential to the entire Internet. Therefore, the proposal
=> MUST NOT damage present DNS protocol interoperability. It MUST allow any
=> system anywhere that implements the IDN specification to resolve any
=> internationalized domain name. The overall cost and feasibility of
=> changes to existing protocols, on all layers of the stack, is of great
=> importance when evaluating any IDN proposal.


> [3] The DNS protocol (the packet formats that go on the wire) MUST
> NOT limit the codepoints that can be used. A service defined on top of
> the DNS, for instance the IDN-to-address function, MAY limit the
> codepoints that can be used. The service descriptions MUST describe
> what limitations are imposed.

The packet formats that go on the wire use octet strings, not strings
of codepoints. In order to maintain compatibility with the requirements
of RFC 2181, it is the set of octet strings that must not be limited.

Also, there may be other restrictions on host names besides the set of
allowed codepoints (for example relating to mixing of left-to-right and
right-to-left scripts, or names that start with an ACE prefix).

=> [3] The DNS protocol (the packet formats that go on the wire) MUST
=> NOT limit, apart from in length, the set of octet strings that can be
=> used as an encoded domain name. A service defined on top of the DNS,
=> for instance the IDN-to-address function, MUST define a mapping between
=> host name strings and these octet string encodings, and MAY impose
=> limitations on host names, for example by restricting the set of
=> allowed codepoints. The service descriptions MUST describe what
=> limitations are imposed.
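
To make the octet-string point concrete, here is a rough sketch of the
RFC 1035 name encoding (a length octet followed by that many label octets,
ending with a zero octet). The function name encode_wire_name and the
sample names are mine, purely for illustration; the only limits it applies
are the length limits, which is what the suggested text above asks for.

```python
def encode_wire_name(labels):
    """Encode a sequence of labels (bytes objects) as a DNS wire-format name.

    Illustrative helper, not taken from any proposal. Only the RFC 1035
    length limits are enforced; label contents are arbitrary octets.
    """
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("label length must be 1..63 octets")
        out.append(len(label))   # length octet
        out += label             # label octets, uninterpreted
    out.append(0)                # zero-length root label ends the name
    if len(out) > 255:
        raise ValueError("encoded name exceeds 255 octets")
    return bytes(out)


# A plain ASCII name.
ascii_name = encode_wire_name([b"www", b"example", b"com"])

# A label holding non-ASCII octets (here the UTF-8 form of "bücher").
# The wire format itself does not care what the octets mean.
binary_name = encode_wire_name(["bücher".encode("utf-8"), b"example"])
```

Whether a given octet string corresponds to a valid host name is then a
question for the service defined on top, not for the packet format.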


> [4] The protocol MUST work for all features of DNS, IPv4, and
> IPv6. The protocol MUST NOT allow an IDN to be returned to a requestor
> that requests the IP-to-(old)-domain-name mapping service.

This is unclear. Returning an ACE name to an "old" requestor will
clearly not break anything, and an ACE name is an (encoded) IDN. It also
doesn't take into account that some resolver interfaces are already
Unicode-aware, in which case they would not require any distinction
between old and new requests (this is true for InetAddress.getHostName
in the Java API, for example, or for getipnodebyaddr, etc. in Plan-9).

=> [4] The proposal MUST work for all features of DNS, IPv4, and IPv6.
=> The proposal MUST ensure that the responses to requests for an IP
=> to domain name mapping will not break existing requestors.


> [5] The same name resolution request MUST generate the same response,
> regardless of the location or localization settings in the resolver, in
> the master server, and in any slave servers involved in the resolution
> process.

This is also unclear (with respect to the resolver; I agree about the
rest). Surely it wasn't meant to prohibit any use of locale information
in defining the API to the resolver?

=> [5] The same name resolution request MUST generate the same response,
=> regardless of the location or localization settings in the master
=> server, or in any slave servers involved in the resolution process.
=> Any description of functionality required of a resolver API MUST
=> discuss any dependency of the API on localization settings in the
=> client.


> [8] The service MAY modify the DNS protocol RFC 1035 and other related
> work undertaken by the DNSEXT WG. However, these changes SHOULD be as
> small as possible and any changes SHOULD be coordinated with the
> DNSEXT WG.

"As small as possible" is too strong - it is possible to support IDNs
without changing the DNS/DNSEXT protocol at all, and so this is equivalent
to "the DNS/DNSEXT protocol SHOULD NOT be changed". Nevertheless, changing
it may have advantages.

=> [8] The proposal MAY modify the DNS protocol RFC 1035 and other related
=> work undertaken by the DNSEXT WG. However, any such changes SHOULD be as
=> small as needed to support the basic design of the proposal, and SHOULD
=> be coordinated with the DNSEXT WG.


> [9] The protocol supporting the service SHOULD be as simple as possible
> from the user's perspective. Ideally, users SHOULD NOT realize that IDN
> was added on to the existing DNS.

A better way of expressing this is:

=> [9] The user's perspective of each service SHOULD be as simple as can
=> be practically achieved. Ideally, users will find the services no more
=> difficult to use than if internationalised names had always been
=> supported by the DNS.


> [11] The protocol should handle with care new revisions of the CCS.
> Undefined codepoints should not be allowed unless a new revision of
> the protocol can handle it. Protocol revisions should be tagged.

The current version of nameprep allows unassigned code points in queries
without revision tagging, for good reasons.

=> [11] The proposal should handle with care new revisions of the CCS.
=> Proposals MUST discuss how undefined codepoints are handled.
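
As a sketch of what "discussing how undefined codepoints are handled"
could look like in practice: Python's standard stringprep module exposes
table A.1 of RFC 3454 (code points unassigned in Unicode 3.2) as
in_table_a1. That RFC is later than this message, so treat it as a
stand-in for whatever table a proposal picks. The helper names and the
allow_unassigned flag are hypothetical, and U+0221 is used as the example
on the assumption that it was unassigned in Unicode 3.2 and assigned later.

```python
import stringprep


def unassigned_codepoints(label):
    """Return the characters of label that are unassigned per table A.1."""
    return [c for c in label if stringprep.in_table_a1(c)]


def check_label(label, allow_unassigned):
    """Accept or reject a label depending on the stated policy.

    allow_unassigned=True models a query (pass unknown code points
    through); allow_unassigned=False models a stored/registered name.
    """
    bad = unassigned_codepoints(label)
    if bad and not allow_unassigned:
        raise ValueError("unassigned code points: %r" % bad)
    return label


query_label = check_label("a\u0221", allow_unassigned=True)
```

The design point is only that the policy is explicit and differs by
context, rather than being a blanket prohibition.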


> [12] Internationalized characters MUST be allowed to be represented and
> used in DNS names and records. The protocol MUST specify what charset is
> used when resolving domain names and how characters are encoded in DNS
> records.

Note that "charset" usually means a MIME-registered charset, which is
not necessarily the case here (for example when the encoding is ACE, or
when more than one possible encoding is allowed - see [18] below).

In fact the modified version of [3] above requires the mapping between
host names and the octet strings used on the wire to be specified (and
in any case, not doing this would be a violation of BCP 18), so the
second sentence is now redundant.

=> [12] Internationalized characters MUST be allowed to be represented and
=> used in DNS names and records.


[14], [15], [16], [17]: protocol -> proposal.


> [18] While there are a wide range of devices that use the DNS and a
> wide range of characteristics of international scripts and methods of
> domain name input and display, IDN is only concerned with the
> protocol. Therefore, there MUST be a single way of encoding an
> internationalized domain name within the DNS.

No, there does not need to be a single way of encoding an IDN (there is
not even one for pure ASCII names, because equivalent names that differ
in case are allowed in the DNS protocol). For example, a solution where
names on the wire can be either ACE or UTF-8 will work perfectly well.
The wire encoding isn't even directly visible to users or applications,
so I don't see why the requirements document should be saying anything
about it.
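
A small illustration of the point, using Python's built-in "idna" codec
as a stand-in for an ACE. That codec implements the later RFC 3490 with
its "xn--" prefix, which is an assumption here and not something fixed at
the time of writing. The helper same_ascii_name is hypothetical.

```python
def same_ascii_name(a, b):
    """Compare two ASCII names the way the DNS does: ignoring case."""
    return a.lower() == b.lower()


# Two different octet strings that already name the same thing today.
upper = b"EXAMPLE.COM"
lower = b"example.com"
already_equivalent = upper != lower and same_ascii_name(upper, lower)

# One internationalized name, two possible encodings.
name = "bücher.example"
ace_form = name.encode("idna")     # b'xn--bcher-kva.example'
utf8_form = name.encode("utf-8")   # transparent (UTF-8) form

# The encodings differ, but both decode to the same name.
round_trip_ok = (ace_form.decode("idna") == name
                 and utf8_form.decode("utf-8") == name)
```

So "more than one encoding" need not mean "more than one name", provided
the mapping back to the name is well defined.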

If the intention was to refer to the external representation of a name,
rather than the wire encoding, that would prohibit any solution that
involves a transition from ACE to transparent names, or simultaneous use
of ACE and transparent names. Either way, this requirement is unnecessary
and should be deleted.


> [19] To achieve interoperability, canonicalization MUST be done at a
> single well-defined place in the DNS resolution process. The protocol
> MUST specify canonicalization; it MUST specify exactly where in the
> DNS that canonicalization happens and does not happen; it MUST specify
> how additions to ISO 10646 will affect the stability of the DNS and
> the amount of work done on the root DNS servers.

The overspecification here is "at a *single* ... place". For example,
if canonicalization is specified by nameprep, it is idempotent, i.e.
nameprep(nameprep(x)) = nameprep(x). So doing it more than once only
hurts efficiency, not interoperability or any other requirement. It
doesn't even hurt efficiency very much, since the common case where a
name is already in the correct form can be optimised.
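
A quick way to check the idempotence claim on some samples, using the
nameprep function that ships in Python's encodings.idna module (an
implementation of the later RFC 3491 profile, so not necessarily
identical to the draft current at the time). The fast path below is my
own guess at the kind of optimisation meant, not something any proposal
specifies: it assumes a label made only of ASCII characters with no
upper-case letters is already in canonical form.

```python
from encodings.idna import nameprep


def prep_with_fast_path(label):
    """Canonicalize a label, skipping the work in the common case."""
    # Assumed fast path: lower-case ASCII is left unchanged by nameprep.
    if label.isascii() and label == label.lower():
        return label
    return nameprep(label)


samples = ["example", "Example", "Stra\u00dfe", "\uff25xample"]

for s in samples:
    once = nameprep(s)
    twice = nameprep(once)
    # Applying the canonicalization again changes nothing.
    assert once == twice
    # The fast path agrees with the full function.
    assert prep_with_fast_path(s) == once
```

A sample check is of course not a proof, but it shows why a second pass
by some other component is harmless for names like these.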

> ... The protocol MUST specify canonicalization; ...

This is meaningless without specifying what the goal of canonicalization
is. The minimum requirement is to ensure that characters that are
indistinguishable to users are treated the same, and so that is what
should be stated:

=> [19] Characters that appear absolutely indistinguishable to users
=> MUST be canonicalized. To achieve interoperability, canonicalization
=> MUST be done at a well-defined place or places in the DNS resolution
=> process, i.e. the proposal MUST specify exactly where in the DNS
=> canonicalization can and cannot happen.

[I changed "happens and does not happen" to "can and cannot happen",
because whether a particular piece of the infrastructure actually does
canonicalization will depend on which pieces have been upgraded. If they
haven't been upgraded, then they obviously won't do canonicalization,
regardless of whether the IDN proposal says that they should.
Note that this is a potential motivation for a proposal to specify that
canonicalization may be done in more than one place.]
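
For readers who want a concrete case of "absolutely indistinguishable"
characters: a precomposed letter and the same letter written with a
combining mark display identically but are different code point
sequences. The sketch below uses Unicode NFC from Python's unicodedata
module as a stand-in canonicalization; which form a proposal actually
uses is left open, and the function name canonical is illustrative.

```python
import unicodedata


def canonical(s):
    """Stand-in canonicalization: Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s)


composed = "caf\u00e9"       # e-acute as a single code point
decomposed = "cafe\u0301"    # 'e' followed by COMBINING ACUTE ACCENT

differ_raw = composed != decomposed
same_after = canonical(composed) == canonical(decomposed)
```

Without canonicalization somewhere in the path, these two spellings of
the same visible name would resolve differently.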


> [23] If other canonicalization is done, it MUST be done before the
> domain name is resolved.

It makes perfect sense to do canonicalization as part of resolution, not
before it. Also, canonicalizing after resolution is certainly feasible,
even if it is inefficient.

> ... Further, the canonicalization MUST be easily
> upgradable as new languages and writing systems are added.

It *will not* be easy to upgrade canonicalization, no matter where or how
it is done, because the incentive to upgrade in order to support scripts
that most people will consider completely obscure is much less than the
incentive to get IDN support working the first time round.

=> [23] Consideration MUST be given to the ease of upgrading
=> canonicalization as new languages and writing systems are added.
=> As long as a name containing new characters has been canonicalized
=> according to the latest canonicalization version, resolving that name
=> MUST NOT depend on further upgrades to the resolver or DNS servers.


> [25] If the charset can be normalized, then it SHOULD be normalized
> before it is used in IDN. Normalization SHOULD follow Unicode
> Technical Report #15.

Although this describes a perfectly reasonable approach, it does not
belong in a requirements document. [26] already captures part of the
requirement ("The protocol SHOULD avoid inventing a new normalization
form provided a technically sufficient one is available."), and the
modified [19] above captures the rest. So, delete this requirement.


[29]: service -> proposal.
[30]: protocol -> proposal.


> 3. Security Considerations
> 
> Any solution that meets the requirements in this document MUST NOT be
> less secure than the current DNS.

That is not necessarily achievable. The main issue is name spoofing using
look-alike characters: even if a proposal specifically tries to address
that (by registration procedures, for example), it can't absolutely
guarantee that there will not be cases of this that rely on IDNs.
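
To show why canonicalization alone cannot close this hole, here is a
sketch with one well-known look-alike pair: Latin "a" and Cyrillic "а"
(U+0430). NFKC plus case folding is used as a stand-in for whatever
canonicalization a proposal adopts; the function name canonical is
illustrative only.

```python
import unicodedata


def canonical(s):
    """Stand-in canonicalization: case folding followed by NFKC."""
    return unicodedata.normalize("NFKC", s.casefold())


latin_a = "a"
cyrillic_a = "\u0430"

# Both render alike in many fonts, yet they are distinct characters...
names = (unicodedata.name(latin_a), unicodedata.name(cyrillic_a))

# ...and they stay distinct after canonicalization.
still_distinct = canonical(latin_a) != canonical(cyrillic_a)
```

These two are not canonically equivalent, so no normalization form will
merge them; anything done about such pairs has to happen elsewhere (in
registration policy, for example).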

> Specifically, the mapping of
> internationalized host names to and from IP addresses MUST have the
> same characteristics as the mapping of today's host names.
> 
> Specifying requirements for internationalized domain names does not
> itself raise any new security issues. However, any change to the DNS MAY
> affect the security of any protocol that relies on the DNS or on
> DNS names. A thorough evaluation of those protocols for security
> concerns will be needed when they are developed.

That evaluation is needed for existing protocols, not just new protocols.

- -- 
David Hopwood <david.hopwood@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5  0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip


-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBO9e+oDkCAxeYt5gVAQEOCgf8Dgk0sC/8cl1ltpRRy/CZKwSdj4Cs89EV
tRqu4+DOBRBRE5x0s1ZkQKxCx2VjKu2fJ73lHLDsqiWgwolfOlO0cx4iovfpempy
K2yPytC+n37yUjaxX3iTJYe48KNuozldNjdtV1F4SXQCbKfNRDQhHVqQj/2CTzCy
PNYPTZpcFoL/gXLJ94QyBCAH24EIS7n3FuAY0rDt3IWyG1Lbdkr1SkpDzf7iToji
msuMl00YOn6u1TL3DO9qBBzejHVav5Bpw9L/NXGEIgtjZl1ShPgLRwFBztLC0IrV
lewoJInjswf4sCXW2yEBOeyZq99pzQ1FLZpwTdkXf4zkm2nHOSn+Wg==
=FDVk
-----END PGP SIGNATURE-----