Re: known request routing



(With my apologies for not getting involved sooner and for anything I say 
that's already been raised.)

At 12:49 4/18/2001 -0400, Abbie Barbir wrote:

>hi all,
>
>I would like your feedback regarding the known request routing draft.
>We need to determine what is needed to move it to an RFC.
>
>Your comments are highly welcomed.

>                    Known CDN Request-Routing Mechanisms
>                draft-cain-cdnp-known-request-routing-01.txt
>1. Introduction
>
>    Request-Routing techniques are generally used to direct client requests
>    for objects to a surrogate or a set of surrogates that could best serve
>    that content. Request-Routing mechanisms could be used to direct client
>    requests to surrogates that are within a Content Deliver Network (CDN)
>    or to surrogates that are in a cooperating or peered CDN.

You're using a phrase that's defined in the "models" document, yet you 
don't refer the reader to that document before using the phrase.  Without 
that definition this paragraph is confusing.  There's a feeling of jumping 
right in too - no real introduction to what this document is all about and 
why it's here.  (You cover some of that in the abstract - it would be 
useful to expand on that in the introduction.)

Also, I disagree with the use of "generally".  The techniques *are* used to 
direct clients (hmm, are you directing the clients or their requests...) to 
a surrogate.  Er, that's what the definition says :)

At the base level an ordinary DNS server provides request routing.  So the 
mechanisms *are* used to direct requests to systems (surrogates in the 
specific case being covered in this field) that serve that content.

>    Request-Routing techniques can be thought off as agents that are
>    positioned in the communications path between a Content Source and the
>    CLIENT, and are responsible for determining which requests should be
>    redirected to a given surrogate that could serve that content. An
>    example of a Request-Routing system occurs when a Content Provider
>    relies on a Content Delivery Networks (CDNs) using DNS Request-Routing
>    to distribute some or all of its content.

I dislike the notion of being positioned "in the communications path".  And 
it doesn't really work for the DNS case, since there's no "between" as 
such. "as part of" might be better.  You mention "CLIENT" which is again 
part of the standard terminology... and don't note the reasoning for 
capitalization.

>    In general, Request-Routing techniques can be used as a vehicle to extend
>    the reach and scale of Content Delivery Networks (CDNs). There exist
>    multiple Request-Routing mechanisms. At a high-level, these may be
>    classified under: DNS Request-Routing, transport-layer Request-Routing,
>    and application-layer Request-Routing.

You're raising two points here; the first sentence seems to stand alone
and, given the existing definition in -model-05, doesn't seem to fit.  The
scope of this document appears to be a taxonomy of the mechanisms used in
"getting the request to the right box" today, yet there seems to be a jump
toward the inter-CDN routing issues here.  With the context I see in this
document that sentence would just be a copy of the definition, and is
therefore either redundant or else should be quoted (and earlier in the
introduction!).


>    In principle a request routing system uses a set of metrics in an
>    attempts to direct users to surrogate which can best serve the request.
>    For example, the choice of the surrogate could be based on network
>    proximity, bandwidth availability, surrogate load and availability of
>    content.
>
>    The memo is organized as follows: Section 2 provides a summary of known
>    DNS based Request-Routing techniques. Section 3 discusses transport-layer
>    Request-Routing methods. In section 4 application-layer Request-Routing
>    mechanisms are explored. Section 5 provides insight on combining the
>    various methods that were discussed in the earlier section in order to
>    optimize the performance of the Request-Routing System. Section 6
>    provides a summary of possible metrics and measurements techniques
>    that could be used by the Request-Routing system to choose a
>    given surrogate.
>
>
>2. DNS based Request-Routing Mechanisms
>
>    DNS based Request-Routing techniques are common due to the ubiquity
>    of DNS as a directory service. In DNS based Request-Routing
>    techniques, a specialized DNS server is inserted in the DNS resolution
>    process.

You may wish to declare the base case out of scope, but as I mentioned 
above given the definition of REQUEST-ROUTING a standard DNS lookup could 
be defined as request routing.

Also check on the use of "DNS as a directory service".
draft-alvestrand-directory-defs-02.txt might be useful.

>   The server is capable of returning either a different set of
>    A, NS or CNAME records based on user defined policies, or metrics
>    or combination of both.
>
>    The overall goal is to improve the performance and scalability of the
>    objects that are resolved by DNS system.

Objects or resources?  Not too sure that "scalability of resources" conveys 
the right meaning either.  "Performance and scalability in the delivery 
of..." perhaps?

>2.1 Single Reply
>
>    In this approach, the DNS server is authoritative for the entire DNS
>    domain or a sub domain.  The DNS server returns the IP address of
>    the best surrogate in an A record to the client site DNS server.  The IP
>    address of the surrogate could also be a virtual IP(VIP) address of
>    the best set of surrogates for the client site DNS server.

"client site DNS server" seems confusing, as it doesn't directly identify 
the problems that might be occurring in this model.  "system performing the 
DNS lookup (which may be a server upstream of the requesting client)" might 
be better, though probably complicates things with the use of the word client.


>2.2 Multiple Replies
>
>    In this approach, the Request-Routing DNS  server returns multiple
>    replies such as several A records for various surrogates. Common
>    implementations of client site DNS server's cycles through the multiple
>    replies in a Round-Robin fashion. The order in which the
>    records are returned can be used to direct multiple clients using a
>    single client site DNS server.
>
>2.3 Multi-Level Resolution
>
>    In this approach multiple Request-Routing DNS servers can be
>    involved in a single DNS resolution. The rational

rationale

>  of utilizing
>    multiple Request-Routing DNS servers in a single DNS resolution is
>    to allow one to distribute more complex decisions from a single
>    server to multiple, more specialized, Request-Routing DNS servers.
>    The most common mechanisms used to insert multiple Request-Routing
>    DNS servers in a single DNS resolution is the use of NS and
>    CNAME records.

"An example would be the case where a higher level DNS server operates 
within a territory, directing the DNS lookup to a more specific DNS server 
within that territory to provide a more accurate resolution."  (Or 
something along those lines.)
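
To make that example a little more concrete, here is a rough sketch in
Python.  It is a toy in-memory model only (no real DNS traffic), and every
name, territory and address in it is made up for illustration; it is not
meant to describe how any particular CDN does this.

```python
# Toy model of two-level Request-Routing DNS resolution (no real DNS traffic).
# All names, territories, and addresses below are made up for illustration.

# Higher level: maps the resolver's territory to a more specific name server.
TOP_LEVEL = {
    "eu": "ns.eu.cdn.example",
    "us": "ns.us.cdn.example",
}

# Lower level: each territory server knows the surrogates in its territory.
TERRITORY_SERVERS = {
    "ns.eu.cdn.example": {"london": "192.0.2.10", "paris": "192.0.2.20"},
    "ns.us.cdn.example": {"boston": "198.51.100.10", "denver": "198.51.100.20"},
}


def resolve(territory, city):
    """Return (delegated name server, surrogate address) for a resolver."""
    # Step 1: the higher-level server only picks the territory
    # (this stands in for an NS- or CNAME-style referral).
    name_server = TOP_LEVEL[territory]
    # Step 2: the territory server makes the finer-grained choice
    # (this stands in for the final A record).
    surrogates = TERRITORY_SERVERS[name_server]
    # If the city is unknown, fall back to the alphabetically first surrogate.
    fallback = surrogates[sorted(surrogates)[0]]
    return name_server, surrogates.get(city, fallback)
```

The point is only that each level makes a smaller, more specialized
decision than a single server would have to.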

>2.3.1 NS Redirection
>
>    A DNS server can use NS records to redirect the authority of the
>    next level domain to another Request-Routing DNS server. Thus, this
>    techniques allows multiple DNS server to be involved in the name
>    resolution process. For example, a client site DNS server resolving
>    a.b.c.com would eventually request a resolution of a.b.c.com from the
>    name server authoritative for c.com. The nameserver authoritative for
>    this domain might be a Request-Routing DNS server. In this case the
>    Request-Routing DNS server can either return a set of A records
>    or can redirect the resolution of the request a.b.c.com to the DNS
>    server that is authoritative for b.c.com using NS records.
>
>    One drawback of using NS records is that the number of Request-Routing
>    DNS servers is limited by the number of parts in the DNS name.  This
>    problem results from DNS policy that causes a client site DNS server
>    to abandon a request if no additional parts of the DNS name are resolved
>    in an exchange with an authoritative DNS server.
>
>    A second drawback is that the last DNS server can determine the TTL
>    of the entire resolution process. Basically, the last DNS server can
>    return in the authoritative section of its response its own NS record.
>    The TTL for this record is solely determined by the last DNS server.
>    The client will use this cached NS record for further request resolutions
>    until it expires.
>
>    Another drawback is that some implementations of bind voluntarily
>    cause timeouts  to simplify their implementation in cases in which a
>    NS level redirect points to a name server for which no valid A
>    record is returned or cached. This is especially a problem if the
>    domain of the name server does not match the domain currently resolved,
>    since in this case the A records, which might be passed in the DNS
>    response, are discarded for security reasons.

No reference to the latency implications of this?

>2.3.2 CNAME Redirection
>
>    Multi-level redirection using CNAMEs works in a similar fashion to
>    NS records redirection. In this scenario, the Request-Routing DNS
>    server returns a CNAME record to direct resolution to an entirely
>    new domain. In principle, the new domain might employ a new set of
>    Request-Routing DNS servers.
>
>    One disadvantage of this approach is the additional overhead of
>    resolving the new domain name. The main advantage of this approach
>    is that the number of Request-Routing DNS servers is independent
>    of the depth of the domain name.

"depth of the *initial* domain name"

>2.6 Anycast
>
>    To combine measurement and redirection, the Request-Routing DNS
>    server can advertise an anycast address as its IP address. The same
>    address, is used by multiple physical DNS servers. In this
>    scenario, the Request-Routing DNS server that is the closest to the
>    client site DNS server in terms of OSPF and BGP routing will receive
>    the packet containing the DNS resolution request. The server can use
>    this information to make a Request-Routing decision. Drawbacks of
>    this approach are listed below:
>
>        *  The DNS server may not be the closest server in terms of routing
>           to the client.
>
>        *  Typically, routing protocols are not load sensitive. Hence,
>           the closest server may not be the one with the least network
>           latency.
>
>        *  The server load is not considered during the Request-Routing
>           process.

You've kind of given a definition of what anycast is, but within the
specific example for request routing.  It might be wise to give a short
paragraph on what anycast is, and then apply that to the specific example
here.

[snip]

>2.8 DNS Request-Routing Limitations
>
>    Some limitations of DNS based Request-Routing techniques are described
>    below:
>
>        1.  DNS only allows resolution at the domain level. However, an
>            ideal request resolution system should service requests
>            per object level.
>
>        2.  In DNS based Request-Routing systems servers may be required to
>            return  DNS entries with a short time-to-live (TTL)values.
>            This may be needed in order to be able to react quickly in the
>            face of changing conditions. This in return may increase the
>            volume of requests to DNS servers.

In the last sentence, s/may/will ?
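
A back-of-envelope calculation shows why.  The sketch below assumes that
each caching resolver honours the TTL and sees steady demand for the name,
so it refetches roughly once per TTL; the function name and the figures are
mine, purely for illustration.

```python
def upstream_queries_per_day(ttl_seconds, resolvers):
    """Rough upper bound: each busy resolver refetches once per TTL."""
    seconds_per_day = 24 * 60 * 60
    return resolvers * seconds_per_day // ttl_seconds


# 1000 busy resolvers, one-day TTL versus a 20-second TTL.
print(upstream_queries_per_day(86400, 1000))  # 1000
print(upstream_queries_per_day(20, 1000))     # 4320000
```

Under those assumptions the query volume scales inversely with the TTL,
which is why I'd say "will" rather than "may".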

>        3.  DNS implementations sometimes do not always adhere to
>            DNS standards. For example, many implementations
>            do not honor the DNS TTL field.
>
>        4.  DNS Request-Routing is based only on knowledge of the local
>            DNS server, as client addresses are not relayed within DNS
>            requests.  This limits the ability of the system to determine
>            client's proximity to the surrogate.

This almost gets there.

"Proximity measurements used in DNS Request-Routing are based on the 
knowledge of where DNS servers are, not the location of clients; client 
addresses are not relayed within DNS requests..." perhaps
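
A toy illustration of the consequence, in Python.  The one-dimensional
"positions" are a made-up stand-in for whatever proximity metric is really
in use, and the surrogate names are hypothetical.

```python
# Made-up one-dimensional "positions"; stand-ins for any proximity metric.
SURROGATES = {"east": 10, "west": 90}


def closest_surrogate(position):
    """Pick the surrogate nearest to the given position."""
    return min(SURROGATES, key=lambda name: abs(SURROGATES[name] - position))


def dns_request_routing(resolver_position, client_position):
    """DNS-based choice: the client's position is never seen, so it is ignored."""
    return closest_surrogate(resolver_position)


# A client at position 85 that uses a resolver at position 15:
print(dns_request_routing(15, 85))  # east (chosen from the resolver's position)
print(closest_surrogate(85))        # west (what the client would have wanted)
```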

>        5.  DNS servers can request and allow recursive resolution of DNS
>            names. For recursive resolution of requests, the Request-Routing
>            DNS server will not be exposed to the IP address of the client
>            site DNS server. In this case, the Request-Routing DNS
>            server will be exposed to the address of the DNS server that
>            is recursively requesting the information. For example,
>            imgs.company.com might be resolved by a CDN, but the request
>            for the resolution might come from dns1.company.com as a result
>            of the recursion.

I think I know what's meant here, but it doesn't seem particularly 
clear.  It also seems that parts of the example are reversed.  ("The IP 
address of the client's DNS server will not be exposed to the R-R DNS 
server" maybe?)

>        6.  Users that share a single client site DNS server will be
>            redirected to the same set of IP addresses during the TTL
>            interval. This might lead to overloading of the surrogate
>            during a flash crowd.
>
>        7.  Some implementations of bind can cause DNS timeouts to occur
>            while handling exceptional situations.  For example, timeouts
>            can occur for NS redirections to unknown domains.

[snip]

>4. Application-Layer Request-Routing
>
>    Application-layer Request-Routing systems perform deeper examination
>    of client's packets beyond the transport layer header.

Is it appropriate to be discussing the examination of packets while 
discussing the application layer?

>  Deeper examination
>    of client's packets provides fine-grained Request-Routing control down
>    to the level of individual objects. The process could be performed in
>    real time at the time of the object request. Application-layer
>    Request-Routing systems can provide better control over the selection
>    of the best surrogate, due to their exposure to the client's IP address.

The IP address is available at lower levels of the stack, so suggesting 
that something working at the application layer would (only now) see the 
client's IP address seems wrong.

"Since the client's IP address is also known, application-layer R-R systems 
can provide better control over the selection of the best surrogate than 
DNS R-R systems."

This is, of course, also applicable to transport-layer R-R systems.

>4.1 Header Inspection
>
>    Applications such as HTTP [4], RTSP [3], and SSL [2] provide hints
>    in the initial portion of the session about how the client request
>    must be directed. These hints may come from the URL of the content
>    or other parts of the MIME request header such as Cookies.
>
>4.1.1 URL-Based Request-Routing
>
>    HTTP and RTSP content requests describe the requested content by its
>    URL. In many cases, this information is sufficient to disambiguate
>    the content and suitably direct the request. In most cases, it may be
>    sufficient to make Request-Routing decision just by examining the
>    prefix or suffix of the URL.

Hmm... please define "prefix" and "suffix".
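
For what it's worth, here is one possible reading, sketched in Python: I'm
guessing that "prefix" means the leading part of the URL path and "suffix"
means the file extension.  That is my assumption, not something the draft
says, and the rule tables and target names are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical rule tables; the target names are made up.
PREFIX_RULES = {"/video/": "streaming-surrogate"}
SUFFIX_RULES = {".gif": "image-surrogate", ".jpg": "image-surrogate"}
DEFAULT_TARGET = "origin"


def route_by_url(url):
    """Choose a target from the URL path alone (query string is ignored)."""
    path = urlsplit(url).path
    # "Prefix" read as: the leading part of the path.
    for prefix, target in PREFIX_RULES.items():
        if path.startswith(prefix):
            return target
    # "Suffix" read as: the file extension, compared case-insensitively.
    suffix = posixpath.splitext(path)[1].lower()
    return SUFFIX_RULES.get(suffix, DEFAULT_TARGET)
```

If the draft means something else (e.g. the host part as the prefix), it
would be worth saying so explicitly.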

>4.1.2 Mime Header-Based Request-Routing
>
>    This technique involves the task of using MIME-headers such as
>    Cookie, Language, and User-Agent, in order to select a surrogate.
>
>    Cookies are used to identify a customer or session by a web site.
>    Cookie-based Request-Routing provides content service differentiation
>    based on the client.

based on information set by the content provider.

Many clients may share the same cookie.

>  In addition, it is possible to direct a connection
>    from a multi-session transaction to be directed to the same server to
>    achieve session-level persistence.
>
>    The language header can be used to direct traffic to a language-specific
>    delivery node. The user-agent header helps identify the type of client
>    device. For example, a voice-browser, PDA, or cell phone can indicate
>    the type of delivery node that has content specialized to handle the
>    content request.

You've picked some very specific examples of headers that might be used 
here, and the examples are generally OK.  However, in this latter example 
some of those decisions could also be better made if correct "Accept" 
headers had been provided in the first place.  (I'm just trying to 
demonstrate that the examples might be a little too specific, and may leave 
folks thinking those are the 'only' headers being examined.)
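
A sketch of what I mean, in Python: if the selection is described as a
table of rules over arbitrary request headers, then Cookie, Language and
User-Agent are just three possible rows.  The rules, the node names and the
crude substring matching are all my own assumptions for illustration.

```python
# Hypothetical rule table: (header name, substring to look for, delivery node).
RULES = [
    ("Accept", "text/vnd.wap.wml", "wap-node"),
    ("Accept-Language", "fr", "french-node"),
    ("User-Agent", "VoiceBrowser", "voice-node"),
]


def select_node(headers, default="general-node"):
    """Return the node for the first rule that matches, else the default."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name, needle, node in RULES:
        # Crude case-insensitive substring match; real parsing would be stricter.
        if needle.lower() in lowered.get(name.lower(), "").lower():
            return node
    return default
```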

>4.2 Content Modification
>
>    This technique enables a content provider to take direct
>    control over Request-Routing decisions without the need for specific
>    switching devices or directory services in the path between the client
>    and the origin server. Basically, a content provider can directly
>    communicate to the client the best surrogate that can serve the request.
>    Decisions about the best surrogate can be made on a per-object basis
>    or it can depend on a set of metrics. The overall goal is to improve
>    scalability and the performance for delivering the modified content,
>    including all embedded objects.
>
>    In general, the method takes advantage of content objects that consist
>    of basic structure that includes references to additional, embedded
>    objects. For example, most web pages, consist of an HTML document that
>    contains plain text together with some embedded objects, such as GIF
>    or JPEG images. The embedded objects are referenced using embedded
>    HTML directives. In general, embedded HTML directives direct the
>    client to retrieve the embedded objects from the origin server.
>    A content provider can now modify references to embedded objects
>    such that they could be fetched from the best surrogate.
>
>    This technique is also known as URL rewriting. The basic types of URL
>    rewriting are discussed in the following subsections.

The premise of this section is that objects are embedded and treated 
differently to the thing they're embedded within.  I'm not sure I can come 
up with an example where the *REQUEST-ROUTING* issues are different for the 
embedded object over the "master".  E.g. in HTTP all the objects are 
independent.  (This is an over-simplification, and I'm well aware of the 
problems in persistent links to "special" URLs.)

>4.2.1 A-priori URL Rewriting
>
>    In this scheme, a content provider rewrites the embedded URLs
>    before the content is positioned on the origin server. In this
>    case, URL rewriting can be done either manually or by using a software
>    tools that parse the content and replace embedded URLs.
>
>    A-priori URL rewriting alone does not allow consideration of client
>    specifics for Request-Routing. However, it can be used in combination
>    with DNS Request-Routing to direct related DNS queries into the
>    domain name space of the service provider. Dynamic Request-Routing
>    based on client specifics are then done using the DNS approach.

This doesn't sound any different to section 2, except that it's on specific 
content.

>4.2.2 On-Demand URL Rewriting
>
>    On-Demand or dynamic URL rewriting, modifies the content when the
>    client request reaches the origin server. At this time, the identity
>    of the client is known and can be considered when rewriting the
>    embedded URLs. In particular, an automated process can determine,
>    on-demand, which surrogate would serve the requesting client best.
>    The embedded URLs can then be rewritten to direct the client to
>    retrieve the objects from the best surrogate rather than from
>    the origin server.

Hmm, OK, but this is really putting the smarts of the DNS stuff into the 
Web server process (or a device in the network in front of it).

>4.2.3 Content Modification Limitations
>
>    Content modification as a Request-Routing mechanism suffers from
>    the following limitations:
>
>        1.  The first request from a client to a specific site
>            must be served from the origin server.
>
>        2.  Content that has been modified to include references to
>            nearby surrogates rather than to the origin server should be
>            marked as non-cacheable. Alternatively, such pages can be marked
>            to be cacheable only for a relative short period of time.
>            Rewritten URLs on cached pages can cause problems, because they
>            can be outdated and point to surrogates that are no longer
>            available or no longer good choices.

This second point would seem to be a common "persistent link into a mirror" 
problem.

>        3.  On-demand URL rewriting (including content parsing,
>            information retrieval, and URL rewriting) has to be done in
>            real-time, which poses the question of performance and
>            processing capabilities.

But that's not a limitation of the Request-Routing model per se.  And given 
some of the other things that go on in page construction, I wonder whether 
it's really worth mentioning.

>5. Combination of Multiple Mechanisms
>
>    There are environments in which a combination of different
>    mechanisms can be beneficial and advantageous over using one of the
>    proposed mechanisms alone. The following example illustrates how the
>    mechanisms can be used in combination.
>
>    A basic problem of DNS Request-Routing is the resolution granularity
>    that allows resolution on a per-domain level only. A per-object
>    redirection cannot easily be achieved. However, content modification
>    can be used together with DNS Request-Routing to overcome this
>    problem. With content modification, references to different objects
>    on the same origin server can be rewritten to point into different
>    domain name spaces. Using DNS Request-Routing, requests for those
>    objects can now dynamically be directed to different surrogates.

Would benefit from some actual examples, perhaps.
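
Something along these lines, perhaps (a Python sketch only).  It rewrites
references to objects on the origin server so that different classes of
object point into different domain names, which DNS Request-Routing could
then resolve independently.  The host names and the suffix table are made
up, and a real rewriter would need a proper HTML parser rather than a
regular expression.

```python
import re
from urllib.parse import urlsplit, urlunsplit

ORIGIN_HOST = "www.example.com"
# Hypothetical host names, one per class of object, each of which would be
# resolved separately by DNS Request-Routing.
HOST_BY_SUFFIX = {
    ".gif": "images.cdn.example",
    ".jpg": "images.cdn.example",
    ".rm": "media.cdn.example",
}

URL_ATTR = re.compile(r'(src|href)="([^"]+)"')


def rewrite_html(html):
    """Point references to origin-hosted objects at per-class host names."""

    def replace(match):
        attr, url = match.group(1), match.group(2)
        parts = urlsplit(url)
        # Leave references to other hosts (and relative URLs) untouched.
        if parts.netloc != ORIGIN_HOST:
            return match.group(0)
        for suffix, host in HOST_BY_SUFFIX.items():
            if parts.path.lower().endswith(suffix):
                return f'{attr}="{urlunsplit(parts._replace(netloc=host))}"'
        return match.group(0)

    return URL_ATTR.sub(replace, html)
```

With this, a reference to http://www.example.com/logo.gif becomes
http://images.cdn.example/logo.gif, while a link to an .html page on the
origin is left alone.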

>6. Measurements
>
>    Request-Routing systems can use a variety of metrics in order
>    to determine the best surrogate that can serve a client's request.
>    In general, these metrics are based on network measurements and
>    feedback from surrogates. It is possible to combine multiple metrics
>    using both proximity and surrogate feedback for best surrogate
>    selection. The following sections describe several well known metrics
>    as well as the major techniques for obtaining them.

I'm curious... is there any reason why this isn't the second section?  It 
seems you introduce some things that might be beneficial to have earlier in 
the document, else have this as an appendix rather than as part of the 
actual reference material in the draft.

>6.1 Proximity Measurements
>
>    Proximity measurements can be used by the Request-Routing system to
>    direct users to the "closest" surrogate.  In a DNS Request-Routing
>    system, the measurements are made to the client's local DNS server.
>    However, in a client-side direction model, the IP address of the
>    client is directly exposed and therefore more accurate proximity
>    measurements can be obtained.

"client-side direction model"?

>    Proximity measurements can also be exchanged between the set
>    of surrogates and the requesting entity. In many cases, proximity
>    measurements are "one-way" in that they measure only either the
>    forward or reverse path of packets from the surrogate to the
>    requesting entity. This is important as many paths in the Internet
>    are asymmetric.
>
>    In order to obtain a set of proximity measurements, a network may
>    employ active probing techniques and/or passive measurement techniques.
>    The following sections describe these two techniques.
[snip]


>6.1.3 Metric Types
>
>    The following sections list some of the metrics, which can be used
>    for proximity calculations.
>
>        *  Latency: Network latency measurements metrics are used to
>           determine the surrogate (or set of surrogates) that has the
>           least delay to the requesting entity.  These measurements can
>           be obtained using either an active probing approach or a
>           passive network measurement system.

"Round Trip Time" would seem to be a common phrase here perhaps?

>        *  Packet Loss: Packet loss measurements can be used as a
>           selection metric.  A passive measurement approach can easily
>           obtain packet loss information from TCP header information.
>           Active probing can periodically measure packet loss from
>           probes.

Would this be better defined as "link quality"?  You can measure the link 
quality from packet loss...

>        *  Hop Counts: Router hops from the surrogate to the requesting
>           entity can be used as a proximity measurement.
>
>        *  BGP Information: BGP AS PATH and MED attributes can be used to
>           determine the "BGP distance" to a given prefix/length pair.
>           In order to use BGP information for proximity measurements, it
>           must be obtained at each surrogate site/location.
>
>
>6.2 Surrogate Feedback
>
>    The Request-Routing system can use feedback from surrogates in order
>    to select a "least-loaded" delivery node.  Feedback can be delivered
>    from each surrogate or can be aggregated by site or by location.
>
>
>6.2.1 Probing
>
>    Feedback information may be obtained by periodically probing a surrogate
>    by issuing an HTTP request and observing the
>    behavior. The problems with probing for surrogate information are:
>
>        1.  It is difficult to obtain "real-time" information.
>
>        2.  Non-real-time information may be inaccurate.
>
>
>    Consequently, feedback information can be obtained by agents that
>    reside on surrogates that can communicate a variety of metrics about
>    their nodes.

Probing in this way also has the inherent problem that you're dependent
upon the quality of the intermediate network, so you may actually be
measuring something that wouldn't be a "problem" (or not seeing a problem
when there is one somewhere else).

>6.2.2 Well Known Metrics
>
>    The following provides a brief summary of  several of the popular metrics
>    that is used for surrogate feedback:
>
>        *  Surrogate CPU Load.
>        *  Interface Load / Dropped packets.
>        *  Number of connections being served.
>        *  Storage I/O Load.