Re: [idn] I-D ACTION:draft-ietf-idn-idna-08.txt
John C Klensin <klensin@jck.com> wrote:
> Instead, what is needed is one very clear paragraph in the IDNA
> document (I think). That paragraph should say that IDNA is to be used
> (with nameprep, etc.) for the representation of non-ASCII domain names
> in labels associated with RRs of type <list 1> in Class=IN, that it
> MUST NOT (?) be used with RRs of type <list 2> in Class=IN, and that
> it SHOULD NOT be used with RRs of type <list 3> in Class=IN until and
> unless a standards-track specification is produced that specifies
> otherwise. It should say similar things about data fields (for MX,
> NS, CNAME, etc (?)) using similar lists.
But IDNA has to work with more than just DNS; it has to work with the
wide variety of other protocols that carry domain names. By logical
extension, your proposed paragraph needs to list every field/argument of
every protocol/interface where IDNA may/should-not/must-not be used.
Isn't that too much trouble (or even impossible)? Isn't it simpler to
design IDNA so that it can safely be used for any (textual) domain label
anywhere? That was our intention.
"Eric A. Hall" <ehall@ehsco.com> wrote:
> Resolvers, middle-boxes and replication masters all need to be able
> to convert between EDNS and ACE as part of the fallback process.
> Distributing the profile-specific prefix to every point where
> conversion might occur is a massive problem.
I was suggesting that conversion between ASCII and non-ASCII never
be done inside the infrastructure except possibly when it uses the
well-known standard profile; for application-specific profiles, I was
suggesting that conversion be done only at the edges. This model avoids
profile-agnostic conversion; only entities that know the proper profile
perform the conversion, which simplifies the security analysis.
Your model is based on profile-agnostic conversion happening inside the
infrastructure. Let's examine how that would benefit applications.
Applications interact with the infrastructure in basically two ways:
sending strings into the infrastructure, and receiving strings from the
infrastructure.
When sending strings that use an application-specific profile into the
infrastructure, the application must perform Stringprep itself, because
the infrastructure needs to compare the string but doesn't know the
profile. Performing Punycode and prepending the prefix is not any
extra effort for the application programmer; whether the program calls
Stringprep(profile) or ToASCII(profile,prefix), it's one function call
either way. So there's no benefit in this case.
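
As a rough illustration of the "one function call either way" point,
here is a minimal Python sketch. It assumes Python's standard
encodings.idna module, which implements the later published form of
IDNA (RFC 3490) with Nameprep as its only profile and "xn--" as its
prefix. The parameterized Stringprep(profile) and
ToASCII(profile,prefix) named above do not exist there, so Nameprep
stands in for an application-specific profile.

```python
from encodings import idna

label = "Bücher"

# Stringprep only: the application prepares the label for comparison.
prepared = idna.nameprep(label)

# Stringprep + Punycode + prefix: still a single call for the programmer.
ace = idna.ToASCII(label)

print(prepared)  # bücher
print(ace)       # b'xn--bcher-kva'
```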
Now let's consider applications receiving strings that use an
application-specific profile. If the infrastructure cannot do
profile-agnostic conversion, then the application might receive an ACE,
whereas if the infrastructure can do profile-agnostic conversion, then
it can ensure that the application never receives an ACE. Whether this
is a benefit depends on what the application does with the string. If
it compares the string, then it needs to call Stringprep at least, and
calling ToASCII is no more trouble, so there's no benefit. If the
application passes the string along to a non-human, ACEs are not a
problem, so there's no benefit. If the application displays the string
to a user, then the ACE will need decoding, whereas a non-ACE wouldn't.
That's the one case I can think of where applications could benefit from
profile-agnostic conversion inside the infrastructure.
Now let's consider the cost of your model. Profile-agnostic conversions
and comparisons can return wrong answers if the inputs are not prepared
using the proper profile (whether by accident or by malice). There is
nothing in the label itself to indicate the proper profile; you want to
use the same prefix regardless of which profile is needed. The entities
that depend on correct conversions and comparisons need to know the
proper profile, and you are assuming that will be implied by context,
like the DNS RR type. But IDNA is useless if it only works for DNS;
it also needs to work for mail headers and SMTP commands and URIs and
SSL certificates and so on. So before IDNA could be used securely in
a given protocol/interface/etc, one would need to wait for the proper
profile to be specified for that particular protocol/interface/etc. I'm
sure that would be fine with you, Eric, but it would defeat one of the
main design goals of IDNA, which is to allow applications to start using
it without waiting for standards to be updated.
I think it's simpler to have a single conversion and comparison rule for
all domain labels everywhere. In the few cases where applications need
to coerce other data types into domain names while preserving mixed-case
or non-normalized strings, they can define their own mapping function
(which does not use the IDNA prefix), and they'll just have to do the
conversion themselves with no help from the infrastructure.
> we are left with the application always performing conversion.
> That design blows up the architectural benefits from having the
> middle-boxes do it (in particular, having the caches learn the data so
> they can cache it).
I don't understand this. If the conversion to/from ACE is done only in
applications, then the strings on the wire are all ASCII, and caches
handle them just as they always have.
> <fooprep> FOO <barprep>
>
> Nobody has yet told me why this won't work.
I'm very leery of allowing <fooprep> on the left. If <fooprep> does
not include case-folding, then two queries for names that differ only
in upper/lower case could return different data. That's a fundamental
departure from the current model.
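
The case-folding concern can be sketched in Python. The helper
ace_without_casefold is hypothetical: it stands in for a <fooprep>
that skips case-folding and simply Punycode-encodes the label as
given. The "xn--" prefix and idna.ToASCII come from Python's standard
library (the RFC 3490 form of IDNA), not from the draft under
discussion.

```python
from encodings import idna

def ace_without_casefold(label: str) -> bytes:
    # Hypothetical <fooprep> with no case-folding: encode the label as-is.
    return b"xn--" + label.encode("punycode")

upper, lower = "Ärger", "ärger"

# Even after the ASCII case-insensitive matching DNS already does,
# the two spellings remain different owner names.
unfolded_keys = {ace_without_casefold(upper).lower(),
                 ace_without_casefold(lower).lower()}

# With Nameprep's case-folding, both spellings map to one owner name.
folded_keys = {idna.ToASCII(upper), idna.ToASCII(lower)}

print(len(unfolded_keys))  # 2
print(len(folded_keys))    # 1
```

With two distinct owner names, a zone could hold different data under
each one, which is the departure described above.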
As for <barprep> on the right, the same argument could be made for
reverse queries, though I admit that reverse queries are hardly ever
used. My main concerns were given above.
Would it work? Maybe it would. But is it a good idea? It still
seems to me that allowing multiple profiles for domain labels is more
complication than it's worth.
> AMC and paf seem to be intimating that the IDNA labels may not be
> reversible, which may require an additional fixup if so, and if it is
> possible.
Punycode is reversible. It can encode any sequence of integers, and the
input sequence can always be recovered exactly.
Nameprep is deliberately non-reversible. It does normalization and
case-folding.
ToASCII is non-reversible only because it calls Nameprep.
If you were to define Eric's-ToASCII, which does not call Nameprep, then
it would be reversible.
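
These reversibility properties can be checked with a short Python
sketch. It assumes the "punycode" codec and the encodings.idna module
from Python's standard library, which implement the later published
specifications (RFC 3492 and RFC 3490) rather than the drafts
discussed here.

```python
from encodings import idna

mixed = "Bücher"

# Punycode alone is reversible: the exact input, case included, comes back.
round_trip = mixed.encode("punycode").decode("punycode")
print(round_trip == mixed)  # True

# Nameprep is not reversible: distinct inputs collapse to one output.
print(idna.nameprep("Bücher") == idna.nameprep("bücher"))  # True

# ToASCII inherits that loss because it calls Nameprep first.
ace = idna.ToASCII(mixed)
recovered = idna.ToUnicode(ace)
print(recovered)  # bücher
```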
AMC