
Re: [idn] Alternative Solutions



Kenneth Whistler <kenw@sybase.com> wrote:

> Has anyone considered trying bit and byte transforms directly on the
> UTF-8 encoding form, rather than on the Unicode code point?

My intuition is that this would result in much longer encodings than
any of the ACEs considered so far.  UTF-8 already spends two or three
bytes per character on most non-Latin scripts, before any ASCII
transform is applied.
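
A rough way to check that intuition is to count bytes.  The sketch
below is only illustrative: the sample strings are my own choice, not
taken from any ACE draft, and it measures raw encoded size only.  Any
base-32 style rendering would add a further 8 output characters per 5
input bytes on top of these numbers.

```python
# Sample labels are assumptions chosen for illustration only.
SAMPLES = {
    "latin": "example",
    # Greek "paradeigma", written with escapes to pin the code points
    "greek": "\u03c0\u03b1\u03c1\u03ac\u03b4\u03b5\u03b9\u03b3\u03bc\u03b1",
    # Devanagari "udaharan"
    "devanagari": "\u0909\u0926\u093e\u0939\u0930\u0923",
    # CJK "lizi"
    "cjk": "\u4f8b\u5b50",
}


def encoded_sizes(text):
    """Return (code points, UTF-8 bytes, UTF-16 bytes without a BOM)."""
    return (
        len(text),
        len(text.encode("utf-8")),
        len(text.encode("utf-16-be")),
    )


if __name__ == "__main__":
    for script, text in SAMPLES.items():
        print(script, encoded_sizes(text))
```

For the Devanagari and CJK samples UTF-8 is half again as large as
UTF-16, so a transform that starts from UTF-8 begins with more bytes
to encode.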

> How about some deterministic and quick compression run on the UTF-16
> encoding form of a Unicode string, and *then* rearranging the
> resulting bytes to accommodate ASCII restrictions?

You have just described RACE and LACE.

> Or has WALID actually gotten a patent on a generic concept without
> actually specifying a particular technique to accomplish it??

That's my impression.

Keith Moore <moore@cs.utk.edu> wrote:

> one way to do this might be to define a new RR type that specified
> that a single ASCII label and a single IDN label (both relative to the
> current zone) were equivalent.

In other words, instead of doing one lookup to map ####.%%%%.org to
foo.bar.org, you would first look up %%%%.org to get bar.org, then look
up ####.bar.org to get foo.bar.org.  This might be a good idea.  It
looks like more steps, but the single lookup of ####.%%%%.org can be
just as many steps in practice.
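
To make the label-at-a-time idea concrete, here is a toy sketch.  It
does no real DNS queries: the ZONES table, the resolve() function, and
the assumption that each zone holds a simple IDN-label to ASCII-label
map are all hypothetical stand-ins for the proposed RR type.

```python
# Hypothetical per-zone tables standing in for the proposed RR type.
# Each zone maps an IDN label (relative to that zone) to its ASCII
# equivalent.  "####" and "%%%%" are the placeholders used above.
ZONES = {
    "org": {"%%%%": "bar"},
    "bar.org": {"####": "foo"},
}


def resolve(name):
    """Map an IDN name to ASCII one label at a time, rightmost first.

    Returns the resolved name and the number of table lookups that
    produced a translation.
    """
    labels = name.split(".")
    resolved = labels.pop()      # assume the top-level label is ASCII
    steps = 0
    while labels:
        label = labels.pop()
        table = ZONES.get(resolved, {})
        if label in table:       # one lookup in the enclosing zone
            label = table[label]
            steps += 1
        resolved = label + "." + resolved
    return resolved, steps


if __name__ == "__main__":
    print(resolve("####.%%%%.org"))
```

Each translation is answered by the zone that owns the label, which is
the point of doing it this way.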

I still think you need two new RR types, one to map ####.bar.org to
foo.bar.org, and one to map foo.bar.org to ####.bar.org.  And there's
nothing to stop the administrator of the *.bar.org zone from making
those two pointers inconsistent, just like there's nothing to stop
people from making A and PTR records inconsistent today.  But no one can
usurp your authority.  If you want your own domains to be well-behaved,
then you make your pointers consistent.
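
A zone administrator who wants that consistency could check for it
mechanically.  The following is a minimal sketch under assumptions:
the two dictionaries are made-up zone contents standing in for the two
hypothetical RR types, and inconsistent_pairs() is a name I have
invented for illustration.

```python
# Made-up zone contents for the two hypothetical RR types.
idn_to_ascii = {
    "####.bar.org": "foo.bar.org",
    "@@@@.bar.org": "baz.bar.org",
}
ascii_to_idn = {
    "foo.bar.org": "####.bar.org",
    "baz.bar.org": "!!!!.bar.org",   # deliberately inconsistent
}


def inconsistent_pairs(forward, reverse):
    """Return forward entries whose reverse pointer is missing or differs."""
    return sorted(
        (idn, ascii_name)
        for idn, ascii_name in forward.items()
        if reverse.get(ascii_name) != idn
    )


if __name__ == "__main__":
    for idn, ascii_name in inconsistent_pairs(idn_to_ascii, ascii_to_idn):
        print("mismatch:", idn, "->", ascii_name)
```

This is the same kind of check people run today to compare A and PTR
records.  It can only report a mismatch, not prevent one.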

You can encourage consistency, but there's no way to guarantee it.  For
example, some name servers today return different responses depending
on the source address of the query.  The administrator of a zone can do
anything they want with that zone.

John C Klensin <klensin@jck.com> wrote:

> But, from reading the correspondence with Walid, and based on what
> I've been told several times, the main threat is to IDNA, not to the
> set of ACE proposals themselves.

That's my impression as well, in which case it doesn't matter whether
we use an old dumb ACE or a new clever ACE; the problem is the whole
IDNA approach.  And maybe the idea of using a DNS lookup instead of an
algorithm to do the transformation still doesn't go far enough to escape
the scope of the patent.  I don't know.

Keith Moore <moore@cs.utk.edu> wrote:

> are any characters in UTF-8 "RFC1035 non-compliant"?

This phrase is undoubtedly intended to refer to RFC 1035 section 2.3.1,
"Preferred name syntax", which says:

    However, when assigning a domain name for an object, the prudent
    user will select a name which satisfies both the rules of the domain
    system and any existing rules for the object, whether these rules
    are published or implied by existing programs.

    For example, when naming a mail domain, the user should satisfy both
    the rules of this memo and those in RFC-822.  When creating a new
    host name, the old rules for HOSTS.TXT should be followed.  This
    avoids problems when old software is converted to use domain names.

    The following syntax will result in fewer problems with many
    applications that use domain names (e.g., mail, TELNET).

The BNF that follows describes the syntax, but that syntax was relaxed
by RFC 1123 to allow labels to start with digits.
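
As a sketch of the difference, the two regular expressions below are
my own rendering of the label rule (letters, digits, and hyphens, no
trailing hyphen, at most 63 characters), with and without the RFC 1123
relaxation.  The names RFC1035_LABEL, RFC1123_LABEL, and check_label
are invented for this example.

```python
import re

# RFC 1035 preferred syntax: a label starts with a letter, ends with a
# letter or digit, has only letters, digits, and hyphens in between,
# and is at most 63 characters long.
RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

# RFC 1123 relaxation: the first character may also be a digit.
RFC1123_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def check_label(label):
    """Return (ok under RFC 1035 syntax, ok under RFC 1123 syntax)."""
    return (
        RFC1035_LABEL.fullmatch(label) is not None,
        RFC1123_LABEL.fullmatch(label) is not None,
    )


if __name__ == "__main__":
    for label in ("foo", "3com", "foo-", "ex\u00e4mple"):
        print(repr(label), check_label(label))
```

A label containing a non-ASCII character fails both patterns, which is
the sense in which the preferred syntax excludes it.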

> with the possible exception of length restrictions on labels, ACE
> isn't needed to make a domain name RFC 1035 compliant, it's needed to
> make the domain name compliant with other protocols that expect ASCII.

Right, the patent claim doesn't quite say what it means.  But I think
judges usually try to interpret language "reasonably" rather than
literally.

AMC