Re: [idn] I-D ACTION:draft-ietf-idn-idna-08.txt
on 6/7/2002 11:48 PM Adam M. Costello said the following:
> This will still be true for IDNs. No matter whether you ask for
> föo, Föo, or FÖO, you'll get the same records back.
Important footnote clarification. The i18n namespace is case-sensitive
because of the AMC-Z encoding, not because of nameprep. The original
capitalization has to be burned to suit the encoding.
As a result, all i18n domain names (unencoded) must be compared as
case-specific data, by requirement of the codec.
This applies to every domain name which can be encoded with IDNA
regardless of whether or not nameprep is used.
> You seem to be proposing a model in which there might be special kinds
> of labels that use different Stringprep profiles that don't do case
> folding.
Absolutely. If somebody needs an RR that preserves case, there's no reason
they shouldn't be able to do so. It's a choice between:
1) do I store the data in case-specific form and require
case-specific comparisons
or
2) do I burn character case on creation through mandatory
lowercasing and also require case-specific comparisons
There is no reason to require #2, just as there is no reason to always
require normalization. Furthermore, there are already well-known
data-types that require #1.
> But then queries for föo, Föo, and FÖO could return three
> different results. That would be a fundamental departure from today's
> model.
We just finished this argument. Labels are currently stored, transferred
and compared as octet-streams, with the exception being that ASCII A-Z is
compared case-insensitively. The difference here is that ASCII A-Z will
always be case-specific for i18n domain names (a requirement of the
codec). Otherwise, the behavior I am arguing for is *IDENTICAL* to the
current model, while you are arguing for mandatory lowercasing for storage
and transfer in addition to comparison.
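
To illustrate the comparison model described above, here is a minimal Python sketch. The function names are hypothetical. The only behavior assumed is the one stated in the text: labels are octet strings, and only ASCII A-Z is folded when comparing.

```python
def ascii_fold(label: bytes) -> bytes:
    """Fold only ASCII A-Z to a-z; every other octet is left untouched."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in label)


def labels_equal_current(a: bytes, b: bytes) -> bool:
    """Current model: octet comparison, ASCII letters case-insensitive."""
    return ascii_fold(a) == ascii_fold(b)


def labels_equal_case_specific(a: bytes, b: bytes) -> bool:
    """Case-specific model: plain octet comparison, no folding at all."""
    return a == b


if __name__ == "__main__":
    print(labels_equal_current(b"FOO", b"foo"))        # True
    print(labels_equal_case_specific(b"FOO", b"foo"))  # False
    # Non-ASCII octets are never folded, so these UTF-8 labels differ
    # under both models.
    print(labels_equal_current("föo".encode("utf-8"), "FÖO".encode("utf-8")))  # False
```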
> Let's examine how föo and FÖO can get compared. The ACE form of both
> is xx--fo-fka. But if we were to skip Nameprep when converting FÖO, we
> would get xx--FO-ohA.
>
> Now consider an entity that knows that föo and FÖO and xx--fo-fka and
> xx--FO-ohA are domain labels, but does not know that they are special
> labels that don't use Nameprep.
Why would it ask for a special RR that it doesn't know how to read? There
are loads of RRs out there that applications cannot parse. ping can't
process SOA RRs, why would it be expected to read the FOO RR?
People can already write RRs that suit specific application requirements,
and I'm asking that you consider this continued requirement. They are the
only users of those RRs, so what do we care what they do? Moreover, why do
we need an arbitrary rule that prohibits current practice?
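
For reference, the two ACE forms in the föo / FÖO example quoted above can be reproduced with Python's standard library. This is only a sketch under assumptions: it uses the "punycode" codec (RFC 3492, the published form of AMC-ACE-Z) and encodings.idna.nameprep, the "xx--" prefix is the placeholder used in this thread, and to_ace is a hypothetical helper name. The codec does not emit the optional mixed-case annotation, so the unprepared form ends in a lowercase "a" rather than the "A" shown in the quoted text.

```python
import encodings.idna

ACE_PREFIX = "xx--"  # placeholder prefix as used in this thread (assumption)


def to_ace(label: str, use_nameprep: bool = True) -> str:
    """Encode one label, optionally running Nameprep (which case-folds) first."""
    if use_nameprep:
        label = encodings.idna.nameprep(label)
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


if __name__ == "__main__":
    print(to_ace("föo"))                      # xx--fo-fka
    print(to_ace("FÖO"))                      # xx--fo-fka
    print(to_ace("FÖO", use_nameprep=False))  # xx--FO-oha
```

With Nameprep both spellings collapse to one ACE label; without it, the two labels differ in more than ASCII case, so a case-insensitive ASCII comparison alone would not match them.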
>>Is IDNA capable of producing identical output for two different inputs
>>(setting aside the issue of normalizations)?
>
> I'm not sure what you mean. Nameprep can of course produce identical
> output for two distinct inputs. But for two distinct outputs of
> Nameprep, ToASCII cannot produce the same output.
Is the encoding form guaranteed to always be reversible to the original
capitalization?
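
To make the question concrete, here is a small round-trip sketch using Python's "punycode" codec (RFC 3492) as a stand-in for the draft encoding. It does not settle what the draft guarantees; it only shows how that one codec behaves, and round_trip is a hypothetical helper name.

```python
import encodings.idna


def round_trip(label: str, use_nameprep: bool) -> str:
    """Encode a label to its ACE body and decode it back."""
    if use_nameprep:
        # Nameprep case-folds before encoding, so the original
        # capitalization is gone before the codec ever sees the label.
        label = encodings.idna.nameprep(label)
    return label.encode("punycode").decode("punycode")


if __name__ == "__main__":
    print(round_trip("FÖO", use_nameprep=False))  # FÖO
    print(round_trip("FÖO", use_nameprep=True))   # föo
```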
--
Eric A. Hall http://www.ehsco.com/
Internet Core Protocols http://www.oreilly.com/catalog/coreprot/