Re: comments on draft-ietf-cain-request-routing-req-00.txt




Dima,

	you are making some interesting points here; however,
I am not sure I can provide detailed comments. As you pointed
out, our current charter requires us to define the requirements.
Although we use BGP peering loosely as a model, we have not decided
to use BGP as the transport, or any other BGP feature for
that matter. So is there anything we have to put in the requirements
to address the issues you outlined before? Could you suggest
a paragraph?

Oliver

Dmitri Krioukov writes:
 > Oliver, Brad,
 > 
 > The traffic engineering (TE) extensions to IGPs (Brad's
 > multiple metrics, I assume) or the multiprotocol extension
 > to BGP treat the corresponding routing protocol as an
 > information distribution tool only. The protocol itself is
 > not used to avoid routing loops. Some external (and not
 > immediately obvious) methods are required for loop
 > avoidance. In the TE IGP extensions case, it's CSPF (see
 > draft-kompella-te-pathcomp-00.txt and I'd really like to
 > attract your attention to the very first sentence of
 > Section 5 there :); in the MBGP case, it depends on
 > usage -- in RFC2547, for example, the loop avoidance
 > problem is irrelevant to BGP since BGP is used as just a
 > pure transport tool distributing VPN-IPv4 reachability
 > information between PEs within (in most cases) the same AS.
 > 
 > The point I'm trying to make is that:
 > - if the RRS peering protocol (RPP) model is like that of BGP
 >   in RFC2547, then some loop avoidance mechanism explicitly
 >   external to RPP is required;
 > - if the BGP loop avoidance method (path-vector) is chosen
 >   instead and optimal metrics from the draft are defined
 >   as optimal attributes *and* used in the route selection
 >   algorithm (by "route selection" I mean what is described
 >   in Section 9.1 of RFC1771, for example), then I cannot
 >   see how one can claim that coming up with such an
 >   algorithm, one causing no global route selection inconsistencies
 >   (read: no route oscillations), is an easy task.
 > 
 > This is in addition to what I mentioned before about issues
 > related to exposure of RRS internals to RPP.
 > 
 > Or is this all now outside of the WG charter no longer
 > requiring the protocol specification and, hence, protocol
 > design considerations?
 > --
 > dima.
 > 
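
[Editorial sketch, not part of the original message.] The path-vector
loop avoidance method mentioned above can be illustrated roughly as
follows. This is a minimal sketch under stated assumptions: the names
Advertisement and accept are hypothetical, RPP defines no such message
format, and the only technique shown is the BGP-style rule of rejecting
an advertisement whose path already contains the receiver's own ID.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Advertisement:
    # Hypothetical message: advertised content plus the CDN IDs it has traversed.
    content: str
    path: List[str] = field(default_factory=list)


def accept(local_id: str, adv: Advertisement) -> Optional[Advertisement]:
    """Path-vector check: drop an advertisement that already passed through us."""
    if local_id in adv.path:
        # Our own ID is in the path, so accepting it would form a loop.
        return None
    # Prepend our ID before the advertisement is passed on to other peers.
    return Advertisement(adv.content, [local_id] + adv.path)


adv_from_b = Advertisement("a.b.c", ["CDN-B"])
via_a = accept("CDN-A", adv_from_b)       # accepted, path is now CDN-A, CDN-B
looped = accept("CDN-B", via_a)           # rejected, CDN-B is already in the path
```

Note that this only prevents loops. It says nothing about whether route
selection over several metrics converges, which is the harder question
raised in the message.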
 > > -----Original Message-----
 > > From: owner-cdn@ops.ietf.org [mailto:owner-cdn@ops.ietf.org]On Behalf Of
 > > Oliver Spatscheck
 > > Sent: Tuesday, January 30, 2001 8:57 AM
 > > To: Dmitri Krioukov
 > > Cc: Oliver Spatscheck; cdn@ops.ietf.org
 > > Subject: RE: comments on draft-ietf-cain-request-routing-req-00.txt
 > >
 > >
 > > Dmitri Krioukov writes:
 > >
 > >  >
 > >  > >  > 1) Support of various types of metric
 > >  > >  >
 > >  > >  >    Wouldn't it be easier to support just
 > >  > >  >    one standard metric or, at least, to fix
 > >  > >  >    supported metrics to some reasonable
 > >  > >  >    minimal set of required ones? CPGs are
 > >  > >  >    then responsible for mapping the internal
 > >  > >  >    proprietary set of metrics to this
 > >  > >  >    standard one (either metric or a set of
 > >  > >  >    metrics) and vice versa.
 > >  > >  >
 > >  > >
 > >  > > I think that is what we intend to propose. Maybe it didn't become
 > >  > > clear in the draft. I think the consensus was to require a minimal
 > >  > > set of metrics, say one, and to allow for custom metrics which
 > >  > > are on a peer to peer basis. The rationale was to be similar
 > >  > > to BGP community strings (peer to peer custom metric) while
 > >  > > preventing the shortfall of community strings that even for
 > >  > > common metrics every set of peers has to reinvent the wheel
 > >  > > (small set of required metrics).
 > >  >
 > >  > The BGP community is not a metric, it's an attribute -- quite a
 > >  > different thing from the protocol design and operation standpoint.
 > >  > It's also a separate document from the specification standpoint.
 > >
 > > I am not quite sure what the difference between a peer to peer
 > > attribute and a peer to peer metric is, since they are both
 > > only interpreted by the peers. However, if you want to call them
 > > attributes that works for me. Otherwise I think we agree
 > > here:
 > >
 > > - one (or as few as agreeable by consensus) required metric
 > > - the option to exchange peer to peer attributes (I'd rather
 > >   call them metrics since that is what they represent...)
 > >
 > >  > I guess I need to explain what I mean by these examples.
 > >  >
 > >  > Imagine there are two peering CDNs: CDN-A and CDN-B. CDN-A
 > >  > uses a DNS based RRS; CDN-B uses a L7 based RRS. A client
 > >  > enters CDN-A and asks for a.b.c/d. Since CDN-B RRS is L7
 > >
 > >                             ^^^^^^ On the RRS level the client
 > > 				   will ask for a.b.c not
 > > 				   a.b.c/d
 > >
 > >
 > >  > based, {a.b.c/d, surrogate ID set} was advertised from CPG-B
 > >  > to CPG-A via RPP. Since CDN-A RRS is DNS based only a.b.c is
 > >  > known to be served by CPG-A (! - the gateway to CDN-B) to the
 > >  > CDN-A RRS boxes. When the DNS request for a.b.c is received by CDN-A,
 > >  > proximity calculations between the CDN-A a.b.c surrogates and
 > >  > the client LDNS are triggered (assuming the simplest form
 > >  > of DNS based RRS is used), plus CPG-A sends a request to CPG-B
 > >  > to measure proximity between a.b.c (which is probably CPG-B
 > >  > surrogates having "all" the a.b.c content) and the client
 > >  > LDNS.
 > >
 > > Maybe I am missing something. Are you suggesting that CDN-A
 > > has to contact CDN-B for each DNS resolution? This is not
 > > very efficient especially for small objects.
 > >
 > >  > Clearly, CDN-B cannot perform L7 based measurements at
 > >  > this point. It can either perform some primitive proximity
 > >  > measurements and report the corresponding result ({surrogate ID,
 > >  > metric} pair) back to CPG-A or reply to CPG-A with the negative
 > >  > result. In the latter case, the default metric between CDN-A
 > >  > and CDN-B (most probably administratively configured) is used.
 > >  > CPG-A then injects the obtained results into the CDN-A RRS
 > >  > process and the routing decision is made.
 > >
 > > Are you assuming the RRS of CDN-A determines a surrogate? This will not
 > > work. If the openness of ISPs in the BGP routing arena is any
 > > indication, you cannot assume that CDN-A will be given enough
 > > information to find the appropriate surrogate in CDN-B's domain
 > > (this would be the same as giving your competing ISP your BGP
 > > policy and OSPF weights). The model I had in mind is rather that
 > > CDN-A's RRS will defer finding the right surrogate to CDN-B's RRS. All
 > > CDN-A's RRS has to decide is to use CDN-B.
 > >
 > >  > The case when a client enters CDN-B is much simpler. Since
 > >  > CDN-A uses a DNS based RRS, {a.b.c, surrogate ID set} was
 > >  > advertised from CPG-A to CPG-B. When a.b.c/d HTTP request
 > >  > is received by a CDN-B redirector (assuming HTTP redirection
 > >  > is used), it may be CPG-B's responsibility to come up with
 > >  > the CDN-A {a.b.c/d, surrogate ID subset} and to communicate the
 > >  > result to the redirector. The proximity calculations are
 > >  > then performed and the routing decision is made.
 > >  >
 > >  > The bottom line is that two RRSs of different type can
 > >  > peer at least somehow. Could you please elaborate on how
 > >  > it's possible when the RRS type is exposed in RPP?
 > >  >
 > >
 > > The problem with this simple approach is that the entire content
 > > of a.b.c has to be served by everybody CDN-A peers with,
 > > since CDN-A only makes an RRS decision based on the DNS name.
 > > So somewhere (between the RRP and CPP) the other CDNs have
 > > to be made aware that ALL or NONE of the content of a.b.c
 > > has to be carried in the surrogates. This becomes particularly
 > > troublesome if a push model is used (push is the most common model
 > > for on demand movies at this point), since now everybody who
 > > serves one movie in his domain a.b.c has to push it to all surrogates
 > > of all CDNs he peers with. On the other hand, if content is separated
 > > by domain name when a DNS based RRS is used (as stated in the requirements),
 > > this overhead can be eliminated.
 > >
 > > I agree that requiring content to be separated by domain name when
 > > DNS based RRSs are used is an ugly hack (which happens to be
 > > used by nearly all CDNs...) but I think it will be the most
 > > likely candidate for being adopted in the short term.
 > >
 > > Oliver
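
[Editorial sketch, not part of the original message.] The model Oliver
describes (one required metric, optional peer to peer attributes, and
CDN-A's RRS deciding only which CDN to use) might look roughly like the
code below. Everything here is an assumption made for illustration: the
names CdnAdvertisement and choose_cdn, the field layout, and the
convention that a lower metric is better are not taken from the draft.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CdnAdvertisement:
    cdn_id: str      # peer CDN able to serve the content
    dns_name: str    # content is separated by domain name (DNS based RRS)
    metric: int      # the single required metric; lower is better (assumption)
    custom: Dict[str, str] = field(default_factory=dict)  # peer to peer attributes


def choose_cdn(dns_name: str, adverts: List[CdnAdvertisement]) -> str:
    """Pick only the CDN; that CDN's own RRS later picks the surrogate."""
    candidates = [a for a in adverts if a.dns_name == dns_name]
    if not candidates:
        raise LookupError(f"no CDN advertises {dns_name}")
    # Custom attributes are ignored here; only peers that agreed on them would use them.
    return min(candidates, key=lambda a: a.metric).cdn_id


adverts = [
    CdnAdvertisement("CDN-A", "a.b.c", 20),
    CdnAdvertisement("CDN-B", "a.b.c", 10, {"peer-note": "example"}),
    CdnAdvertisement("CDN-B", "e.f.g", 1),
]
selected = choose_cdn("a.b.c", adverts)
```

No surrogate IDs appear in the advertisement, which matches the point
that CDN-A should not need CDN-B's internal information.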
 > --
 > > -----Original Message-----
 > > From: brad cain [mailto:bcain@mediaone.net]
 > > Sent: Tuesday, January 30, 2001 10:39 PM
 > > To: Dmitri Krioukov
 > > Cc: Oliver Spatscheck; cdn@ops.ietf.org
 > > Subject: Re: comments on draft-ietf-cain-request-routing-req-00.txt
 > >
 > >
 > >
 > >
 > > Dmitri,
 > >
 > > >
 > > > The BGP community is not a metric, it's an attribute -- quite a
 > > > different thing from the protocol design and operation standpoint.
 > > > It's also a separate document from the specification standpoint.
 > >
 > > Not true... communities are often used to convey BGP local preference
 > > which last time I checked is a metric
 > >
 > > Anyway, I don't think arguing this point is useful anymore... I
 > > think too many people agree that there needs to be
 > > one default metric and multiple "attributes" or metrics (depending
 > > on what word you would like to use)
 > >
 > >
 > > > It's also easy to name it if we take into consideration that
 > > > the final goal is fastest response.
 > >
 > > I don't think so... simple example: delay or throughput
 > >
 > > > > Asking five people what the one metric should be
 > > > > I got six answers.
 > > >
 > > > We need rough consensus, don't we. If we had needed running
 > > > consensus, we would've had very rough code. Remember ISO!
 > >
 > > We have rough consensus... one default generic metric... IGP
 > > protocols do not specify what the default metric MEANS -- why
 > > do you think we should?
 > >
 > >
 > > >
 > > > Imagine there are two peering CDNs: CDN-A and CDN-B. CDN-A
 > > > uses a DNS based RRS; CDN-B uses a L7 based RRS. A client
 > > > enters CDN-A and asks for a.b.c/d. Since CDN-B RRS is L7
 > > > based, {a.b.c/d, surrogate ID set} was advertised from CPG-B
 > > > to CPG-A via RPP.
 > >
 > > Your example mixes distribution and request routing and confuses
 > > some of the subtle issues.
 > >
 > > If a DNS based lookup is performed (the client is not inline
 > > with a proxy or a L7 switch) then a handoff will occur to the
 > > request routing system with surrogates closest to the user. If
 > > a handoff isn't possible because the CDN does not have a DNS
 > > based system then direct network information can be sent to
 > > other root systems to make a selection.
 > >
 > > My personal belief is that most decisions will be based not
 > > on URIs but on network proximity information anyway.
 > >
 > > However, we need to support a corner case of advertisement to
 > > L7 switches which will probably be used more in the intra
 > > domain case (though we are not solving it).
 > >
 > > > Since CDN-A RRS is DNS based only a.b.c is
 > > > known to be served by CPG-A (! - the gateway to CDN-B) to the
 > > > CDN-A RRS boxes. When the DNS request for a.b.c is received by CDN-A,
 > > > proximity calculations between the CDN-A a.b.c surrogates and
 > > > the client LDNS are triggered (assuming the simplest form
 > > > of DNS based RRS is used), plus CPG-A sends a request to CPG-B
 > > > to measure proximity between a.b.c (which is probably CPG-B
 > > > surrogates having "all" the a.b.c content) and the client
 > > > LDNS. Clearly, CDN-B cannot perform L7 based measurements at
 > > > this point.
 > >
 > > Yes... again CDN-B is both distribution and request routing.. if
 > > it is using L7 request routing then only a small set of clients
 > > can use its request routing system... the example is an
 > > ISP with a L7 switch which only services its own customers.
 > >
 > > > It can either perform some primitive proximity
 > > > measurements and report the corresponding result ({surrogate ID,
 > > > metric} pair) back to CPG-A or reply to CPG-A with the negative
 > > > result. In the latter case, the default metric between CDN-A
 > > > and CDN-B (most probably administratively configured) is used.
 > > > CPG-A then injects the obtained results into the CDN-A RRS
 > > > process and the routing decision is made.
 > > >
 > > > The case when a client enters CDN-B is much simpler. Since
 > > > CDN-A uses a DNS based RRS, {a.b.c, surrogate ID set} was
 > > > advertised from CPG-A to CPG-B. When a.b.c/d HTTP request
 > > > is received by a CDN-B redirector (assuming HTTP redirection
 > > > is used), it may be CPG-B's responsibility to come up with
 > > > the CDN-A {a.b.c/d, surrogate ID subset} and to communicate the
 > > > result to the redirector. The proximity calculations are
 > > > then performed and the routing decision is made.
 > > >
 > > > The bottom line is that two RRSs of different type can
 > > > peer at least somehow. Could you please elaborate on how
 > > > it's possible when the RRS type is exposed in RPP?
 > >
 > > I'm confused... doesn't this conflict with what you've said
 > > below [which i agree with]
 > >
 > > > > Please clarify for me why your protocol would not fulfill the
 > > > > requirements stated in the draft. Maybe we should change
 > > > > the requirements which are violated. We haven't proposed any protocol
 > > > > as of yet and a simple protocol for a complex set of requirements
 > > > > is always a good solution.
 > > >
 > > > See both above and below :)
 > > >
 > > > >  >    In its current form, the corresponding
 > > > >  >    part of the draft:
 > > > >  >    - significantly complicated the protocol
 > > > >  >      design
 > > > >
 > > > > Why?
 > > >
 > > > Because internals of the peering RRS (namely, the RRS type)
 > > > are exposed in the protocol.
 > >
 > > I don't know of any way around this... [but i will think
 > > about it]
 > >
 > > My sense is that if we took a vote right now, most people would
 > > want to only advertise network proximity information and not
 > > advertise URIs.  If this is true we can just create another
 > > address family for MBGP and use it.
 > >
 > > -brad
 > --
 > > -----Original Message-----
 > > From: brad cain [mailto:bcain@mediaone.net]
 > > Sent: Tuesday, January 30, 2001 10:55 PM
 > > To: Dmitri Krioukov
 > > Cc: cdn@ops.ietf.org
 > > Subject: Re: comments on draft-ietf-cain-request-routing-req-00.txt
 > >
 > >
 > >
 > >
 > > Dmitri,
 > >
 > > > If the intention was to collect everybody's
 > > > input with the purpose to refine it in the
 > > > future to a crystal set of absolute requirements,
 > > > then this is OK!
 > >
 > > First cut that's what we did.. the next draft will be
 > > slightly distilled but I don't think multiple metrics
 > > is something that will be going away
 > >
 > >
 > > >
 > > > Again, IETF is not ISO.
 > >
 > > So I guess you consider IS-IS, OSPF, and BGP bad protocol
 > > design because they all support multiple metrics?
 > >
 > >
 > > > >       2. not supporting multiple metrics in the protocol
 > > > >       is extremely shortsighted so we want to make it
 > > > >       a requirement.  again, ip routing protocols have
 > > > >       proved that this is invaluable.
 > > >
 > > > Could you please elaborate? Until now, I've been under the (now seemingly
 > > > erroneous) impression that the situation here was just the opposite.
 > >
 > > yes, people want multiple metrics
 > >
 > > > > > 2) Support of various types of RRSs
 > > > > >
 > > > > >    Interconnected RRSs perform the function
 > > > > >    of one global RRS. The function of *any*
 > > > > >    RRS is to intelligently map the {client
 > > > > >    IP address (prefix), URI} duplex to an ID
 > > > > >    (which boils down to an IP address) of some
 > > > > >    surrogate from the set of surrogates capable
 > > > > >    of processing request for that URI.
 > > > > >    "Intelligently" means that the minimal
 > > > > >    metric (which is a function of all three --
 > > > > >    client, URI and surrogate) is searched for.
 > > > >
 > > > > Not true... there are TWO different DEPLOYED
 > > > > types of architecture for request routing... BOTH
 > > > > must be supported..
 > > >
 > > > Yes, of course. I didn't get what's not true, though.
 > > > I'd even add that there are THREE different DEPLOYED
 > > > types of architecture that must be supported. The
 > > > third one is L3 based. It's used by some ISPs to
 > > > load balance their DNS servers, for example. The
 > > > same loopback IP address configured on every DNS
 > > > server is advertised in the backbone IGP. Client
 > > > requests are routed to the closest (in the IGP sense)
 > > > DNS server.
 > >
 > > That is INTRA-domain based request routing... inside of
 > > a CDN, there can be many types... we are only concerned
 > > with INTER-domain
 > >
 > > > >       1. "inline" case: this is the layer-7 or proxy
 > > > >       case where URIs will be advertised
 > > > >       2. the dns case: this is the common CDN case
 > > > >       where only DNS names are visible.  this must
 > > > >       be supported as well
 > > >
 > > > Yes, of course. I guess what I'm proposing is ***to
 > > > define the protocol requirements so that the resulting
 > > > protocol would be independent of the peering RRS types***.
 > > > This way, all (not yet even existing) RRS types would be
 > > > automatically supported.
 > >
 > > but to do that you need to have advertisements for BOTH network
 > > proximity AND "URI proximity"
 > >
 > > that is what we are arguing about
 > >
 > > and i do agree that both RRS can be supported with ONE protocol --
 > > it's just that the protocol must support multiple types of
 > > advertisements (ala MBGP)
 > >
 > > > > >    So, what the RRS peering protocol MUST
 > > > > >    do is to advertise URIs (possibly with some
 > > > > >    "default" metrics), plus to be capable of
 > > > > >    mapping {client, URI} requests to
 > > > > >    corresponding {surrogate, metric}
 > > > > >    responses.
 > > > >
 > > > > This is true for only #1 case above
 > > >
 > > > I understand that it would be only domain name
 > > > parts in the DNS based cases. Please see the
 > > > examples from my previous email to Oliver.
 > >
 > > Yes but the tricky part is in WHAT to base the
 > > advertisement on -- network proximity or URI proximity
 > > [see above]
 > >
 > >
 > > > ...along with the "route selection" rules the protocol will have
 > > > to define and implement. BTW, the route selection rules are a part
 > > > of the persistent BGP route oscillation problem, which was observed
 > > > recently in practice.
 > >
 > > True... this is the problem for policy based protocols... everyone
 > > wants their own policy which isn't necessarily compatible on a
 > > global scale...
 > >
 > > We can argue about routing research but my opinion on this
 > > matter is that the only way to solve it is to have coordinated
 > > routing policies (e.g. route servers).  There is no known "right
 > > way" to solve the policy coordination problem.
 > >
 > > > It also seems that you're mixing two protocol design models that
 > > > are quite different (if not incomparable from the protocol design
 > > > perspective) -- models for IGP and BGP.
 > > >
 > > > In the IGP model, metrics do exist. The only currently used IGP
 > > > that supports multiple metrics is (E)IGRP but all the EIGRP metrics
 > > > are mandatory.
 > >
 > > Not true... traffic engineering and QoS routing use multiple
 > > metrics
 > >
 > > > In the BGP model, there are no required or optional metrics.
 > >
 > > Most people consider AS_PATH a metric!
 > >
 > > > In fact, BGP does not use metrics at all. It just advertises NLRIs
 > > > along with a set of attributes (which can be either well-known or
 > > > optional (well-known attributes, in turn, can be either mandatory
 > > > or discretionary, and optional attributes can be either transitive
 > > > or non-transitive)). The only three attributes that have to be
 > > > present in any UPDATE message (well-known mandatory attributes)
 > > > are ORIGIN, AS_PATH and NEXT_HOP. Route selection decisions are
 > > > made based on the set of the route selection rules then.
 > >
 > > Attributes are often metrics (e.g. local pref)...
 > >
 > > You are just arguing terminology
 > >
 > > > Now, what model are we really talking about here?
 > >
 > > I think we are leaning towards a BGP model because of the policy
 > > issues.
 > >
 > >
 > > > >       3. two types of advertisements can easily be supported by
 > > > >       a type of "address family" (analogy: MBGP)
 > > > >       4. selection decisions are made on a per request routing
 > > > >       domain basis (barring loop prevention by for example a
 > > > >       path vector algorithm)
 > > > >
 > > > > In many ways the requirements can be supported by a simple
 > > > > "BGP-like" protocol... I think this is fairly straightforward
 > > > > and simple.
 > > >
 > > > Please keep in mind that loop prevention in path-vector protocols
 > > > is the worst possible. The average convergence time is much longer than
 > > > even in the distance-vector case.
 > >
 > > Yes but the alternatives aren't much better... link state is
 > > definitely out and distance vector has its own problems...
 > >
 > > However, CDNs are a bit different because:
 > > 	1. they are overlay networks which
 > > 	2. means that most request routing systems will only be peered in
 > > 	a two level hierarchy -- to do more doesn't make any sense
 > > 	3. most requests should resolve in one or two request routing
 > > 	hops.  this means that oscillations and other BGP problems
 > > 	will probably not surface.  and if they do they will easily
 > > 	be coordinated to be solved.
 > >
 > > > Moreover, it was shown back in 1996
 > > > that BGP can form persistent route oscillations. Persistent oscillations
 > > > were *observed in practice* in special conditions when route reflectors
 > > > or confederations were used along with MEDs. I can provide you with
 > > > the references if you're interested.
 > >
 > > I'm well aware of the problems with BGP... most don't have anything
 > > to do with path vector but have to do with incompatible policies on
 > > a global scale (see infocom 2001 paper).
 > >
 > > -brad
 >
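
[Editorial sketch, not part of the original thread.] The attribute
classification Dmitri gives in the quoted text (well-known mandatory or
discretionary, optional transitive or non-transitive) can be written
down as a small table. The function name missing_mandatory and the
table layout are hypothetical; the category assignments follow the
quoted text and the BGP specifications (LOCAL_PREF is well-known
discretionary, MULTI_EXIT_DISC is optional non-transitive, and
COMMUNITIES is optional transitive).

```python
from enum import Enum
from typing import Dict, Set


class Category(Enum):
    WELL_KNOWN_MANDATORY = "well-known mandatory"
    WELL_KNOWN_DISCRETIONARY = "well-known discretionary"
    OPTIONAL_TRANSITIVE = "optional transitive"
    OPTIONAL_NON_TRANSITIVE = "optional non-transitive"


# A small subset of BGP path attributes, for illustration only.
ATTRIBUTES: Dict[str, Category] = {
    "ORIGIN": Category.WELL_KNOWN_MANDATORY,
    "AS_PATH": Category.WELL_KNOWN_MANDATORY,
    "NEXT_HOP": Category.WELL_KNOWN_MANDATORY,
    "LOCAL_PREF": Category.WELL_KNOWN_DISCRETIONARY,
    "COMMUNITIES": Category.OPTIONAL_TRANSITIVE,
    "MULTI_EXIT_DISC": Category.OPTIONAL_NON_TRANSITIVE,
}


def missing_mandatory(present: Set[str]) -> Set[str]:
    """Return the well-known mandatory attributes absent from an UPDATE."""
    mandatory = {
        name for name, cat in ATTRIBUTES.items()
        if cat is Category.WELL_KNOWN_MANDATORY
    }
    return mandatory - present


incomplete = missing_mandatory({"ORIGIN", "AS_PATH", "LOCAL_PREF"})
complete = missing_mandatory({"ORIGIN", "AS_PATH", "NEXT_HOP"})
```

Whether an attribute such as LOCAL_PREF also counts as a "metric" is
exactly the terminology question argued in the thread; the table takes
no position on that.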