[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Heads-up: IAB statement on the use of wildcards

To: iesg@ietf.org
Subject: Heads-up: IAB statement on the use of wildcards
From: Harald Tveit Alvestrand <harald@alvestrand.no>
Date: Fri, 19 Sep 2003 15:07:14 -0700
Cc: ogud@ogud.com, okolkman@ripe.net, dmm@1-4-5.net, sra@hactrn.net

For your information:

The IAB is planning to release the statement below as an IAB statement today, Friday (if possible).

As IETF leadership, and as chairs of DNS-related WGs in the IETF, you should know ahead of time.

If you find *show-stopper* problems in the below text, yell, of course, but we'd REALLY like to get this puppy out the door....

                    Harald
-----------------------------------------------------------------------

Architectural concerns on the use of DNS wildcards

  There are many architectural assumptions regarding DNS behavior that
  are not specified in the IETF standards documents describing DNS, but
  which are deeply embedded in the behavior of Internet protocols and
  applications. These assumptions are inherent parts of the network
  architecture of which the DNS is one component.

  It has long been known that it is possible to use DNS wildcards in
  ways that violate these assumptions.

  Recent deployments of DNS wildcards with A records at high levels in
  the DNS tree have shown by experience that the cost of violating these
  assumptions is significant. In this document we provide an explanation
  of how DNS wildcards function, and many examples of how their
  injudicious use negatively impacts both individual Internet
  applications and indeed the Internet architecture itself.

  In particular, we recommend that DNS wildcards should not be used in a
  zone unless the zone operator has a clear understanding of the risks,
  and that they should not be used without the informed consent of those
  entities that are delegated within the zone.
    _________________________________________________________________

A brief primer on DNS wildcards

  The DNS "wildcard" mechanism has been part of the DNS protocol since
  the original specifications were written twenty years ago, but the
  capabilities and limitations of wildcards are sufficiently tricky that
  discussions of both the protocol details of precisely how wildcards
  should be implemented and the operational details of how wildcards
  should or should not be used continue to the present day. This section
  attempts to explain the essential details of how wildcards work, but
  readers should refer to the DNS specifications ([RFC 1035] et
  sequentia) for the full details.

  In essence, DNS wildcards are rules which enable an authoritative name
  server to synthesize DNS resource records on the fly. The basic
  mechanism is quite simple, the complexity is in the details and
  implications.

  The most basic and by far the most common operation in the DNS
  protocols is a simple query for all resource records matching a given
  query name, query class, and query type. Assuming (for simplicity)
  that all the software and networks involved are working correctly,
  such a query will produce one of three possible results:

  success
         If the system finds a match for all three parameters, it
         returns the matching set of resource records;

  no data
         If the system finds a match for the query name and query class
         but not for the query type, it returns an indication that the
         name exists but no data matching the given query type is
         present.

  no such name
         If the system fails to find a match for the given query name
         and query class, it returns an indication that the name does
         not exist.

  Ordinarily, matches for all three parameters must be exact. This is
  where wildcards come into the picture.

  A wildcard record is an otherwise ordinary DNS resource record whose
  leftmost (least significant) label consists of a single asterisk ("*")
  character, such as "*.bar.example". Conceptually, the asterisk matches
  one or more labels at the left (least significant) end of the DNS
  name.

  When wildcard records are present, the rules become more complicated.
  Specifically, if the query class matches, there is no exact match for
  the query name, and the closest match for the query name is a
  wildcard, the system in effect synthesizes a set of resource records
  matching the query name on the fly by treating the resource records
  present at the wildcard name as if they had been present at the query
  name. Thus, if the wildcard name has records matching the desired
  query type, the system will return those records, precisely as in the
  "success" case case above; otherwise, the system will return an
  indication that the name exists but no data matching the given query
  type is present, precisely as in the "no data" case above. The
  response is identical to that of a normal "success" response for the
  query name, so the resolver which issued the query can not tell that
  the results it got back were the result of wildcard expansion.

  Note that, in the case of a wildcard match, the "no such name" case
  cannot occur; the wildcard match eliminates this possibility. Note
  also that only the query name and query class matter for purposes of
  determining whether a wildcard matches: any record type can produce a
  wildcard match, regardless of whether or not the record type happens
  to match the query type.
    _________________________________________________________________

Problems with Wildcard Records

  One of the main known weaknesses and dangers of wildcard records is
  that they interact poorly with any use of the DNS which depends on "no
  such name" responses. The list of such uses turns out to be quite
  large, and will be discussed in some detail in a later section.

  Another known weakness and danger of wildcard records stems from the
  fact that the wildcard label will match anything at all, so long as no
  non-wildcard name within the zone is a closer match to the query name
  than the wildcard is. This doesn't sound like a major problem until
  one considers the number of conventions and, in some cases, protocols,
  which use labels at the left (least significant) ends of the names of
  resource records to distinguish between records associated with
  different services, rather than using different types of records. That
  is, in these cases, otherwise unrelated services use the same type of
  record and clients (or users) are expected to use the name
  corresponding to the particular service desired. This applies both to
  the ad-hoc naming conventions described in [RFC 2219] such
  www.foo.example and also to mechanisms such as the SRV record type
  [RFC 2782] in which the naming scheme is part of the formal protocol.
  When names of this type are covered by wildcards such as an address
  record named *.bar.example, such a wildcard would hand back the same
  address record regardless of the service name encoded in the query
  name, thus ftp.foo.bar.example, mail.foo.bar.example,
  ntp.foo.bar.example and so forth would all end up with the same
  synthesized address record. This problem is even worse in the SRV
  case, both because names such as _finger._tcp.foo.bar.example are part
  of the protocol and because SRV records include TCP and UDP port
  numbers, so the client will be confused not only about which host it
  should contact but also about the port on which it should contact that
  host. The only way to avoid these problems with names of this type is
  to add explicit records for such names to the DNS.

  Finally, the two factors listed above ("match anything" behavior, and
  poor interaction with anything that depends on "no such name"
  responses) interact with normal and predictable human errors to allow
  wildcards to have effects far beyond their intended scope. Properly
  speaking, a wildcard record's scope is limited to a single zone,
  since, by definition, a wildcard record never matches any name that
  really does exist in the zone, and thus will not match any
  (non-wildcard) delegation of a portion of the namespace from a parent
  zone to its child. (Wildcard NS records, while theoretically possible,
  have sufficiently bizzare semantics that it is probably best to limit
  their use to torture-tests of DNS software.) So, at first blush, it
  would seem that the administrator of a zone is free to use wildcards
  without worrying about effects which this might have on the zone's
  delegated children. Unfortunately, this turns out not to be the case,
  because DNS names are heavily exposed in user interfaces, and users,
  being humans, make mistakes. So, while delegating the bar.example zone
  will prevent a wildcard record *.example from affecting a user who
  typed foo.bar.example as foi.bar.example, it will not prevent the same
  wildcard record from affecting the same user when the error is
  foo.bat.example. Thus, from the users' point of view, some of the
  effects of wildcards do leak from a parent zone to its children. This
  is not a big deal if the parent and child zones are associated with a
  single organization, but it can become a real problem if the parent
  and child zones are associated with different organizations whose
  interests are not perfectly aligned.

  The above is probably not an exhaustive list. Even after twenty years
  of experience with the DNS, the effects of unexpected uses of
  wildcards can still be quite surprising, because the small but
  fundamental way in which they change the record lookup rules has a
  nasty way of violating implicit (or, sometimes, explicit) assumptions
  in deployed DNS-using software.

  For these reasons, almost all use of DNS wildcards has been limited to
  a relatively small number of reasonably well-understood roles, and
  most wildcard use has been limited to a single role: the MX records
  used in mail delivery.

  Since MX records are only used for electronic mail delivery, wildcard
  MX records are relatively safe, and since electronic mail for any
  particular DNS name is generally handed by the organization that is
  furthest down the delegation tree, wildcard MX records are most likely
  to appear in zones where their effects will not cross organizational
  boundaries. While the latter is not universally true, the primary use
  of wildcard records has been and remains wildcard MX records for
  handling an organization's own mail.

  Given these issues, it seems clear that the use of wildcards with
  record types that affect more than one protocol should be approached
  with caution, that the use of wildcards in situations where their
  effects cross organizational boundaries should also be approached with
  caution, and that the use of wildcards with record types that affect
  more than one protocol in situations where the effects cross
  organizational boundaries should be approached with extreme caution,
  if at all.
    _________________________________________________________________

Principles To Keep In Mind

  In reading the rest of this document, it may be helpful to bear in
  mind two basic principles of architectural design which have served
  the Internet well for many years:
    * The Robustness Principle: "Be conservative in what you do, be
      liberal in what you accept from others." [Jon Postel, RFC793]
    * The Principle Of Least Astonishment: A program should always
      respond in the way that is least likely to astonish the user.
      [Traditional, original source unknown]

  We will come back to these points after the next section.
    _________________________________________________________________

Problems encountered in a recent experiment with wildcards

  We have recently had the opportunity to observe the results of an
  experiment on use of wildcards in large top-level domains, with some
  rather undesirable and unintended consequences. This section attempts
  to detail some of the problems that network users and operators around
  the world encountered as a result of this deployment.

  We must emphasize that, technically, this was a legitimate use of
  wildcard records that did not in any way violate the DNS
  specifications themselves. One of our main points here is that simply
  complying with the letter of the protocol specification is not
  sufficient to ensure the operational stability of the applications
  which depend on the DNS: there are protocol features which simply are
  not safe to use in some circumstances.

  The specific change which this operator chose to make was to add a
  single wildcard address record at the zone apex of each of these two
  zones. As a direct result of this change, two things happened:
   1. the authoritative servers for these two zones no longer give out
      "no such name" responses for any possible name in these zones, and
   2. every possible name rooted in one of these zones which, until this
      change, did not exist at all, now has a synthesized address record
      pointing at a "redirection server" run by the operator of this
      zone.

  The implications of this simple change were many and varied. The list
  below is almost certainly incomplete:

Web Browsing

  Web browsers all over the world stopped displaying "page not found" in
  the local language and character set of the users when given incorrect
  URLs rooted under these TLDs. Instead, these browsers now display an
  English language search page from a web server run by the zone
  operator.

  It should be noted that the language tags in the HTTP protocol do not
  always match the locale used in the local browser. So, even though the
  global search page is dynamic and uses the information in the HTTP
  request to guess what language and script is to be used -- it will
  never be able to emulate what the user expected. There is, in short,
  not enough context in the HTTP protocol for the engine which generates
  the search page.

  In many situations, web browsers have been written to provide some
  assistance to the user, often based on local conventions, directories,
  and language, when a DNS lookup fails. All such systems are now
  disabled for URLs rooted under these TLDs, since DNS lookups no longer
  fail, even when the specified destination does not exist.

  Even if these were acceptable changes, the new mechanism has poor
  scaling properties, and unless the operator chooses to invest
  significant resources in maintaining a large, robust web server setup,
  the user experience is going to get even worse: instead of either a
  local language error message or an English search page, the user is
  going to get "attempting to connect..." followed by a long wait.

Email

  All mail to non-existent hostnames under these TLDs now flows to the
  registry operator's server, where the registry operator bounces it.
  Some operators find this intolerable and have changed their mail
  system configurations to bypass this "bounce service", but the vast
  majority of mail servers undoubtedly now route mail for nonexistent
  names under these TLDs to the bounce server rather than just bouncing
  it directly. This has a number of ramifications:
    * If operators choose to allow their mail to go to the bounce
      server, they now have an increased mail load handling additional
      routing of messages to the bounce server; if operators choose not
      to allow this to happen, they have an additional development and
      maintenance burden configuring their servers to prevent it.
    * Operators who allow mail to go to the bounce server are now
      dependent on the performance of the bounce server. If the bounce
      server ever slows or fails, mail that previously would bounce will
      now queue at the SMTP relay for that relay's queue time before
      bouncing back to the user. This creates a very poor user
      experience, since typographical errors that in the past would have
      bounced immediately may now go unnoticed for several days.
    * Operators who allow mail to go to the bounce server are also
      dependent on the correct operation of the bounce server. If the
      bounce server is buggy (which happened to be the case with this
      rollout), mail may not bounce at all: it may be reported to the
      user as having been delivered correctly while actually vanishing
      without a trace. This also creates a very poor user experience.
    * In some cases where the set of MX records associated with a
      particular DNS name included a misconfigured record pointing to a
      nonexistent hostname, installing these wildcard records was the
      last straw that broke a misconfigured-but-functional mail
      configuration: previously, the nonexistent hostname would have
      failed to resolve and been ignored, now it bounces.
    * The normal flow of data from a client in SMTP when one address has
      a typo is as follows:
        1. The client looks up the IP address of his outgoing SMTP proxy
           in DNS.
        2. The client opens a TCP connection to his outgoing SMTP proxy.
        3. The client sends information about himself to the SMTP proxy.
        4. The proxy accepts or rejects the client.
        5. The client sends information about the recipient to the SMTP
           proxy.
        6. The proxy look up the destination in DNS, and gets "no such
           name" back.
        7. The proxy sends information to the client that the address is
           wrong.
      With a wildcard for mistyped domain, the following happens:
        1. The client looks up the IP address of his outgoing SMTP proxy
           in DNS.
        2. The client opens a TCP connection to his outgoing SMTP proxy.
        3. The client sends information about himself to the SMTP proxy.
        4. The proxy accepts or rejects the client.
        5. The client sends information about the recipient to the SMTP
           proxy.
        6. The proxy looks up the destination in DNS, and gets "success"
           back.
        7. The proxy accepts the message and closes the connection to
           the client.
        8. The proxy opens a TCP connection to the bounce server.
        9. The proxy present himself to the bounce server.
       10. The bounce server indicates that the recipient address is not
           acceptable.
       11. The proxy generates an error message which is sent back to
           the sender's email address.
    * A different scenario happens if the SMTP client has been
      misconfigured with the incorrect name of the outgoing SMTP proxy.
      As the domain name resolves using a wildcard, the client will
      connect to the bounce server, and start to send mail to it. The
      result is that the bounce server (at the IP address of the
      wildcard) says that the recipient address is wrong even though it
      is in fact correct. The error presented to the user is incorrect,
      as it is the name of the outgoing proxy which was wrong and not
      the name of the recipient.

Informing Users of Errors

  Many application GUIs check domain names for validity before allowing
  the user to progress to the next step. Examples include email clients
  that directly check the domain of the email addresses resolves before
  sending, and network printer configuration tools that check that the
  print spooler name is valid before accepting the configuration.
  Previously the user would be prompted early that they had made an
  error in the domain name. In the case of email, the error may now not
  be noticed at the time of sending, but only when email later bounces.
  In the case of the printer configuration, the error may not be noticed
  during configuration, but only afterwards when printing fails to work,
  where the problem diagnosis is more difficult.

  Spam Filters
  Installing these wildcard records broke several simple spam filters
  commonly used to front end inbound mail servers, as well as more
  complex filtering that checks for the existence of a sending domain in
  order to screen out obviously bogus senders. This technique for spam
  has diminished as this filtering mechanism has increased, but one
  sample operator reports that it still equals about 10% of inbound mail
  attempts on their large shared MX cluster. ISPs who are aware of this
  problem will probably extend their filtering rules to have special
  knowledge of the address returned by these wildcard records, but will
  have to carry the cost of doing so, both in terms of code maintenance
  and increased execution time for their filtering.

Interactions with Other Protocols

  The wildcard address records trap lookups for any network service, but
  the number of protocols somewhere in use on the Internet (including
  private protocols used between two or more parties on ports which they
  may or may not have registered with IANA) is large enough that it
  simply is not possible for the zone operator (or anyone) to provide a
  redirection service for every protocol. In this particular example,
  the zone operator only provided handlers for HTTP (which they directed
  to a search page) and SMTP (which they attempted to bounce). All other
  protocols received at best TCP resets, or, in some cases, simply had
  their packets dropped. Any application that uses the DNS has (or
  should have) some way of handling "no such name" errors; in almost all
  cases the error message is sufficiently clear to an experienced user
  that it is immediately obvious when the application has failed because
  it was given an incorrect DNS name. With these wildcard records in
  place, however, incorrect DNS names which are matched by the wildcard
  record will not show up as DNS name errors at all, but instead will
  show up as mysterious connection failures or as unreachable
  destinations for all services that the zone operator does not
  redirect. Depending on the details of the application protocol and
  implementation involved, this change may also convert an obvious "hard
  failure" (incorrect name) into a soft failure which the application
  thinks it should retry, as seen above in the email case. This may
  result in very long delays, perhaps of days or weeks, before even
  trivial errors are brought to the user's attention. Transport
  protocols using UDP may also retry until the transport protocol retry
  limit is reached (especially if ICMP messages are being filtered at a
  firewall), which may be very considerably longer than the time it
  would have taken to return an error to the user indicating they
  mistyped the destination.

Automated Tools

  Automated or embedded tools which use HTTP but which do not have a
  user interface may also be confused by this change, since such tools
  may expect configuration failures to show up as DNS errors and may not
  realize that the HTTP response they have received from the zone
  operator's search page is not the page which the tool expected to
  reach. Such tools may fail in unpredictable ways, and may not be easy
  to upgrade.

Charging

  The current response from the service in question is just over 17
  KBytes of data because the client has to open a TCP connection and
  receive a not insignificant amount of data. A "no such data" response
  would have fitted in one packet. In the case of volume-based charging
  for Internet Access (as with most cellular data services) the
  recipient will have to pay additional charges.

Single Point of Failure

  Even for cases in which the redirection service works as intended,
  such a service creates a very large single point of failure. Single
  points of failure are obvious targets both for deliberate attacks and
  for the sort of accidental "attacks" caused by bugs and configuration
  errors which already generate much of the traffic at the DNS name
  servers for the root zone. Furthermore, the IP address associated with
  this single point of failure is a likely target both for routing
  attacks intended to redirect the IP address to some other server.

Privacy

  An interception service with this kind of scope raises significant
  privacy concerns, since traffic received by the interception service
  is, pretty much by definition, not going where its sender originally
  intended. The potential for abuse in this situation is very high, and
  makes the interception service an even more attractive target, this
  time for attackers who wish to gain control of it in order to practice
  such abuse.

Reserved Names

  This sort of wildcard usage is incompatible with any use of DNS which
  relies on reserving names in a registry with the express intent of not
  adding them to the DNS zone itself. An example of such a use is the
  JET-derived IDN approach of "registry restrictions" and "reserved
  names", which depends on the existence of names that are reserved and
  can be registered only by the holder of some related name, but which
  do not appear in the DNS. By some readings of the current ICANN IDN
  policy, support for that "reserved name" approach is required. To
  accomplish the goal of reduced consumer confusion, the reserved names
  must not be resolvable at all. This reserved name approach appears to
  be completely incompatible with this sort of wildcard usage: since the
  wildcard will always cause a result to be returned, even for a
  reserved name which does not appear in the zone, one can support
  either one or the other, but not both.

Undesirable Workarounds

  ISPs have responded to the deployment of these wildcards in a number
  of ways, all of which are both understandable and worrisome. Some ISPs
  have contemplated modifying their routing systems to drop all packets
  destined to the zone operator's redirection server into a black hole.
  Others have deployed patches to their DNS resolvers which attempt to
  reverse the effects of these wildcard records. Still other ISPs have
  considered using this as an opportunity to play the same game that the
  zone operator is playing, but for the ISP's own benefit. All of these
  responses are both understandable and predictable, but none of them
  are good. Even more worrisome is that different ISPs are taking
  different approaches to dealing with this, which may lead to a
  balkanization problem and create an ongoing headache for anyone having
  to deal with cross-network DNS or application debugging.
    _________________________________________________________________

Principles, Conclusions, and Recommendations

  The Robustness principle tells us that in some (not all) of the
  problems detailed above, both parties could be construed as being at
  fault. In some cases this is hardly surprising: spam filtering in
  particular, by its nature, tends to be extremely ad hoc and somewhat
  fragile. No doubt there are lessons here for all parties involved.

  The Principle of Least Astonishment suggests that the deployment of
  wildcards was disastrous for the users. It had widesweeping effects on
  other users of the Internet far beyond those enumerated by the zone
  operator, created several brand new problems, and caused other
  internet entities to make hasty, possibly mutually incompatible and
  possibly deleterious (to the internet as a whole) changes to their own
  operations in an attempt to react to the change.

  Note that these considerations apply to any wildcard deployment of
  this type. The list of problems encountered in this case clearly
  demonstrates that, although wildcard records are part of the base DNS
  protocol, there are situations in which it simply is not safe to use
  them. As noted in an earlier section, two warning flags suggesting
  that this type of wildcard deployment is dangerous were that
   a. it affected more than one protocol, and
   b. it was done high enough up in the DNS hierarchy that its effects
      were not limited to the organization that chose to deploy these
      wildcard records.

  Note also that a significant component of some of the listed problems
  was not precisely the wildcard-induced behavior per se so much as it
  was the abrupt change in the behavior of a long established
  infrastructure mechanism.

  In conclusion, we would like to propose a guideline for when wildcard
  records should be considered too risky to deploy, and make a few
  recommendations on how to proceed from here.

  Proposed guideline: If you want to use wildcards in your zone and
  understand the risks, go ahead, but only do so with the informed
  consent of the entities that are delegated within your zone.

  Generally, we do not recommend the use of wildcards for record types
  that affect more than one application protocol. At the present time,
  the only record types that do not affect more than one application
  protocol are MX records.

  For zones which do delegations, we do not recommend even wildcard MX
  records. If they are used, the owners of zones delegated from that
  zone must be made aware of that policy and must be given assistance to
  ensure appropriate behavior for MX names within the delegated zone. In
  other words, the parent zone operator must not reroute mail destined
  for the child zone without the child zone's permission.

  We hesitate to recommend a flat prohibition against wildcards in
  "registry"-class zones, but strongly suggest that the burden of proof
  in such cases should be on the registry to demonstrate that their
  intended use of wildcards will not pose a threat to stable operation
  of the DNS or predictable behavior for applications and users.

  We recommend that any and all TLDs which use wildcards in a manner
  inconsistent with this guideline remove such wildcards at the earliest
  opportunity.
    _________________________________________________________________

Acknowledgements

  The IAB gratefully acknowledges the kind assistance of David Schairer,
  John Curran, John Klensin, and Steve Bellovin for helpful suggestions
  and, in some cases, significant chunks of text. None of these
  contributors bear any responsibility for what the IAB has done with
  their contributions. We note that Leslie Daigle reclused herself from
  the process of producing this document.

Follow-Ups:
- Re: Heads-up: IAB statement on the use of wildcards
  - From: David Meyer <dmm@1-4-5.net>

Prev by Date: thinkingcat e-mail is busted
Next by Date: Re: draft-ietf-idwg-beep-tunnel-05.txt
Previous by thread: thinkingcat e-mail is busted
Next by thread: Re: Heads-up: IAB statement on the use of wildcards
Index(es):
- Date
- Thread