[idn] Re: Legacy charset conversion in draft-ietf-idn-idna-08.txt
Paul Hoffman / IMC <phoffman@imc.org> writes:
> At 3:25 PM +0200 5/27/02, Simon Josefsson wrote:
>>I think the third paragraph of the security consideration should more
>>clearly express that IDNA actually is vulnerable to the attack if
>>machines, like most machines on the Internet, use legacy encodings.
>
> It isn't clear what "the attack" is. There is clearly a problem for
> the user when System A transcodes text from Encoding X into Unicode
> differently than System B does, but I don't see what the security
> issue is. Could you provide some suggested wording for the security
> consideration?
The basic attack: Alice runs on a host that uses Latin-1 for
input/output and enters www.µbank.com (where µ is 8859-1 0xB5). The
domain is registered using U+00B5, but Alice's application transcodes
the string using U+03BC. Either Alice can't connect (if the other
domain doesn't exist) or she ends up talking to someone else (if the
other domain does exist).
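A minimal sketch of the mismatch described above, in Python. The "variant" transcoder is a hypothetical stand-in for an application that picks U+03BC; it is not taken from any real implementation. The sketch applies only the raw Punycode step per label and deliberately leaves out Nameprep and the ACE prefix, so it shows only that the two transcoding choices yield different code points, and therefore different encoded labels, before any normalization is applied.

```python
# Assumption: two applications receive the same Latin-1 bytes from the user
# but disagree on how 0xB5 maps to Unicode. Function names are hypothetical.

def transcode_standard(raw: bytes) -> str:
    # ISO-8859-1 maps each byte to the code point of the same value,
    # so 0xB5 becomes U+00B5 (MICRO SIGN).
    return raw.decode("iso-8859-1")

def transcode_variant(raw: bytes) -> str:
    # Hypothetical application that maps 0xB5 to U+03BC
    # (GREEK SMALL LETTER MU) instead.
    return raw.decode("iso-8859-1").replace("\u00b5", "\u03bc")

def raw_punycode_labels(name: str) -> list:
    # Raw Punycode per label only; no Nameprep, no ACE prefix.
    return [label.encode("punycode") for label in name.split(".")]

typed = b"www.\xb5bank.com"          # what Alice enters, as Latin-1 bytes
name_a = transcode_standard(typed)   # contains U+00B5
name_b = transcode_variant(typed)    # contains U+03BC

labels_a = raw_punycode_labels(name_a)
labels_b = raw_punycode_labels(name_b)

print(name_a == name_b)              # the Unicode strings differ
print(labels_a[1] == labels_b[1])    # so do the encoded second labels
```

Whether a later normalization step folds the two code points together is a separate question that this sketch does not address.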
There are many arguments that can be raised on the applicability of
the simple attack:
1 You shouldn't map ISO-8859-1 0xB5 into U+03BC; the application is
broken and should map it into U+00B5. My reply: This might be true,
but it doesn't follow from IDNA, as IDNA leaves the transcoding
issue open to the implementer. It seems that either IDNA needs to
reference mapping tables that MUST be used for legacy encodings, or
state that IDNA enables the attack.
2 So what if Alice talks to someone else? DNS can be spoofed anyway,
so this isn't a new problem. My reply: Such problems in DNS can be
solved with DNSSEC. However, even if DNSSEC is used, IDNA would
enable this new attack => same conclusion as in 1.
3 Still, so what if Alice talks to someone else? Alice should use TLS
or IPSEC or SSH or CMS or Kerberos or SASL or something else to
authenticate the endpoints and protect data. My reply: This is
where the subtle problems appear, and I think more investigation
is needed here. A (probably flawed) initial attempt:
1 TLS and PKIX certs do not support IDNA, so we must first assume
they are extended to support IDNA. The security implications might
be different depending on how IDNA is implemented, so we must study
each approach individually. Some approaches I can see:
1 Put IDNA strings in DN/subjectAltName of the PKIX cert. Alice
compares IDNA in cert with the IDNA used to contact the host.
While this appears to solve the problem, it really only moves the
transcoding problem elsewhere. Perhaps the CA had to convert an
ISO-8859-1 string it received in mail into the IDNA when
generating the cert. Perhaps the machine used to apply for the
certificate used ISO-8859-1 and had to convert it into an IDNA.
Unless you assume the whole world switches to Unicode the day IDNA
hits the street, the transcoding problem is present in at least
one step in the chain, and has to be solved there. The point is, the
mapping tables must be standardized or you open the door to attacks.
2 Put IDNA strings in DN/subjectAltName of the PKIX cert. Alice
compares decoded IDNA in cert with the name used to contact the
host. Decoding from Unicode into legacy encodings is tricky.
Alice must have mapping tables here as well.
3 Use UTF-8 strings in DN/subjectAltName of the PKIX cert. Alice
compares ToASCII(name-in-cert) with IDNA used to contact host.
This seems similar to 1, in that unless the whole world uses
UTF-8, the mapping has to be done somewhere and must be well
defined there to be secure.
4 Add {charset, string} elements indicating the character set
used, and the string in DN/subjectAltName of the PKIX cert. If
charset is the same as the charset that Alice uses, it will work
fine. However, in all other cases, mapping tables are needed, but
this time O(n^2) tables must exist.
2 SASL is just a framework, so it is each SASL mechanism that has to
be studied. Several SASL mechanisms (HMAC-MD5, DIGEST-MD5, SRP)
do not include names of the endpoints, so the identity of the
other end is only implicitly known after a successful authentication.
Thus it only makes man-in-the-middle attacks or password cracking
slightly easier, with no real security impact.
... etc, the cases for Kerberos, IPSEC and SSH seem to only repeat
the discussions above. I have not slept for a long time, so I'll
spare you the discussion and me the typing. :-)
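The O(n^2) remark in the {charset, string} approach can be made concrete with a small count. This is only a sketch under two assumptions that are not in the text above: that each direct conversion needs one table per ordered pair of distinct charsets, and that the alternative is to route every conversion through Unicode with one table in each direction per charset. The function names are hypothetical.

```python
# Assumption: one directed mapping table per (source, target) charset pair.

def direct_tables(n: int) -> int:
    # Every charset needs a table to every other charset: n * (n - 1).
    return n * (n - 1)

def pivot_tables(n: int) -> int:
    # Assumed alternative: each charset maps to and from Unicode only.
    return 2 * n

for n in (2, 10, 50):
    print(n, direct_tables(n), pivot_tables(n))
```

Under these assumptions the direct approach grows quadratically while the pivot approach grows linearly, but either way every table still has to be agreed on by all parties.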
Suggested modified security consideration below. It essentially says
that unless everyone switches to UTF-8, IDNA will enable new attacks
that have security implications.
--- draft-ietf-idn-idna-08.txt.orig Mon May 27 18:18:58 2002
+++ draft-ietf-idn-idna-08.txt Mon May 27 20:08:44 2002
@@ -690,10 +690,24 @@
are introduced by the encoding process or the use of these encoded
values, apart from those introduced by the ACE encoding itself.
-Domain names are used by users to connect to Internet servers. The
-security of the Internet would be compromised if a user entering a
-single internationalized name could be connected to different servers
-based on different interpretations of the internationalized domain name.
+Domain names are used by users to identify and connect to Internet
+servers. The security of the Internet is compromised if a user
+entering a single internationalized name is connected to different
+servers based on different interpretations of the internationalized
+domain name. When all systems use ASCII or Unicode, different
+interpretations are not allowed in this specification.
+
+When involved systems use non-ASCII and non-Unicode characters (such
+as ISO-8859-1 and ISO-2022-JP, which are common on the Internet),
+however, this specification leaves the transcoding problem up to the
+application. Thus there can not be any assurance that two
+applications will not implement different transcoding rules. When two
+applications implement different transcoding rules, they will
+(assuming both domains exists) contact different servers. Note that
+the problem can not just easily be solved by using a security protocol
+such as TLS to identify and authenticate to end points, unless these
+protocols have already solved the problem which IDNA is trying to
+solve.
Because this document normatively refers to [NAMEPREP], it includes the
security considerations from that document as well.