
Fwd: Re: Distribution CPG Protocol - Some Thoughts




Since we're approaching a stack overflow on the email nesting, I'll back 
out of the in-line comments. I think we're on the same page on terminology 
(or close enough, anyway), but I think I have a (slightly) different 
business model in mind. (My model may be too simple.) To test that 
assertion, let me try to restate Oliver's model:

CDNs advertise their surrogates to content providers. Content providers use 
some criteria (not yet discussed in detail) to identify surrogates that 
they would like to use. Content providers then advertise their content to 
that subset of surrogates. Surrogates use some criteria (best-fit is an 
example, but there are plenty of others) to determine which content they 
want to cache, and they retrieve the content from the origin (I suppose, 
either as-needed or as a pre-fetch.)
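
To make that restatement a bit more concrete, here is a rough sketch of the 
flow in Python. Everything in it is my own invention for illustration (the 
Surrogate class, the region and capacity fields, the "fits in free space" 
acceptance rule, the ACME names); none of it comes from a draft or an 
existing protocol, and the selection criteria in particular are exactly the 
part we haven't discussed yet.

```python
from dataclasses import dataclass, field


@dataclass
class Surrogate:
    """A surrogate as a CDN might advertise it (all fields are hypothetical)."""
    name: str
    region: str
    capacity: int                      # free cache space, arbitrary units
    cached: list = field(default_factory=list)

    def consider(self, content_id: str, size: int) -> bool:
        """The surrogate alone decides whether to cache the advertised content."""
        if size <= self.capacity:
            self.capacity -= size
            self.cached.append(content_id)
            return True
        return False


def select_surrogates(advertised, wanted_regions):
    """The provider picks a subset of the advertised surrogates.

    Region is a stand-in criterion; cost or performance could be used instead.
    """
    return [s for s in advertised if s.region in wanted_regions]


def advertise_content(content_id, size, chosen):
    """The provider advertises content to the subset; returns who accepted."""
    return [s.name for s in chosen if s.consider(content_id, size)]


# Step 1: the CDN advertises its surrogates to the content provider.
advertised = [
    Surrogate("acme-nyc", "us-east", 100),
    Surrogate("acme-lon", "eu", 10),
    Surrogate("acme-tok", "asia", 100),
]

# Step 2: the provider selects surrogates and advertises content to them.
chosen = select_surrogates(advertised, {"us-east", "eu"})

# Step 3: each chosen surrogate accepts or declines on its own criteria.
accepted = advertise_content("example.com/video", 50, chosen)
print(accepted)
```

Note that the provider only learns which surrogates accepted after the fact, 
and a surrogate it selected (acme-lon here) is free to decline.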

If that's a reasonable approximation, then the part I'm not sure about is 
the content providers identifying which surrogates they'd like to use. My 
assumption was that a provider would do that without any assistance from a 
distribution protocol. E.g. a provider looks over the terms and conditions, 
SLAs, scale, reach, etc. of any particular CDN and makes the choice to sign 
up with that CDN or not. "I've checked out ACME CDN and they've got caches 
in all the right places for me, the price is right, etc."

Of course, in both approaches there's always the issue of a surrogate 
deciding (or not) to actually cache content, so just because a content 
provider signs up with ACME CDN, there's no guarantee that the provider's 
content actually ends up in an ACME surrogate. But you can never use a 
protocol to fix that. (ACME's surrogate might blow a power supply.) The 
parties will have to address that in the terms and conditions and/or SLA.

The crux, I think, is whether a content provider needs information from a 
distribution protocol to make a decision to use a CDN or not. Presumably 
we'd agree that protocol-provided information is never sufficient, but is 
it necessary? Another way to look at the question might be: does a content 
provider need to make decisions on whether to use a CDN in real-time, or 
can that be handled off-line? If real-time is a requirement, then I agree 
that simply advertising content won't work. Since I couldn't come up with 
any glaringly obvious need for real-time (well, obvious to me, but since 
I'm pretty dense that may not mean much ;^), I wonder if anyone else has 
some specific examples. Concrete examples would also be a good start on 
understanding what, specifically, a surrogate needs to say in its 
advertisement.

Stephen


At 09:47 AM 2001-01-04 -0500, Oliver Spatscheck wrote:
>Stephen Thomas writes:
>  > At 11:43 AM 2000-12-28 -0500, Oliver Spatscheck wrote:
>  > >Stephen Thomas writes:
>  > >  >
>  > >  > On the assumption that the WG goes forward, here are some initial
>  > >  > thoughts on protocols. Some (most?) of this is possibly obvious, or
>  > >  > perhaps some (most?) is brain-damaged. I'm interested to hear either
>  > >  > way.
>  > >  >
>  > >  >
>  > >  > Distribution CPG Protocol. This has been likened to BGP several times,
>  > >  > so it seems like a good place to start is looking at what BGP offers
>  > >  > (and what it doesn't) that appear to be relevant to CDNs.
>  > >  >
>  > >  > First, BGP is an advertising protocol. BGP peers advertise autonomous
>  > >  > system paths that reach CIDR IP subnets. In our case (again, thinking
>  > >  > only of distribution), two options are available. Distribution CPGs
>  > >  > could advertise surrogates, or they could advertise content. If DCPGs
>  > >  > advertise surrogates, it would be up to the recipient CPG to arrange to
>  > >  > have content pushed to the surrogates. Alternatively, if DCPGs
>  > >  > advertised content, then it would be up to the recipient to arrange to
>  > >  > have its surrogates pull that content. I suppose there's no technical
>  > >  > reason to limit the protocol to one of these options, but, in the
>  > >  > interest of schedule and focus, at least picking one to start with
>  > >  > seems preferable. My own instinct is that advertising content works
>  > >  > better. There's sort of a one-to-many relationship (one content to many
>  > >  > surrogates) that makes it more natural. (In the reverse, you have to
>  > >  > worry about different content providers contending for the same
>  > >  > advertised surrogate space.)
>  > >  >
>  > >  > Proposal 1: The protocol should advertise the availability of content.
>  > >
>  > >
>  > >Actually I prefer the model of advertising surrogates. Advertising
>  > >content alone prevents the owner from making the decision of which
>  > >surrogates to use
>  >
>  > I think you can look at this either way. If a content owner doesn't like
>  > the surrogates operated by a particular CDN, it just makes sure that none
>  > of its advertisements are sent to that CDN.
>
>I'm starting to think we are interpreting the terminology differently. So
>before I try to counter your arguments, let me make sure we are talking about
>the same things. In my definition the advertiser of content has no knowledge
>of which surrogates the content will end up on (similar to BGP: if a route
>gets advertised to a peer, the AS advertising the route cannot be sure
>who will be using that route in the end).
>
>If I advertise surrogates the surrogate is limited in a similar fashion.
>
>To illustrate with the point you made above: if I only advertise content to
>a surrogate, how would I know which surrogates I advertise to? This information
>is needed to not advertise content to a surrogate as you suggested above.  The
>only way to gather that information is if somebody tells me about the presence
>of the surrogates or groups of surrogates (I call that advertising the
>surrogate).  I think making an intelligent decision on which surrogate to use
>is important for:
>
>A: cost (if settled peering is used the prices will be different)
>B: Performance
>C: Limited scope for low volume Web sites (we don't want to cache
>    spatscheck.com on all caches in this world ....)
>
>I think the cost factor is a major one here.  So far CDN peering is
>settled. So letting the person you pay pick who to use just doesn't seem
>right even if you trust them to account correctly. So looking at your and my
>argument I believe we need a more complex multi phase protocol.
>
>
>PhaseI: the surrogates (or groups of surrogates) advertise themselves
>         to the content provider.
>PhaseII: the content provider advertises the content to a subset of
>         surrogates.
>PhaseIII: the surrogates accept
>
>An alternative approach would be:
>
>PhaseI: content provider advertises content
>PhaseII: Surrogate advertises willingness to take content
>PhaseIII: content provider selects surrogates
>
>I am unsure which of the two should be used. Comments?
>It might also be that only phase one in either proposal uses
>a BGP-like protocol.
>
>  > >
>  > >I also disagree with this point. One of the main features of paths in
>  > >BGP is the detection of routing loops (routing itself depends more
>  > >heavily on policy ....). We have a similar problem. We have to eliminate
>  > >advertisement loops. It is also a good debugging tool. Debugging problems
>  > >in peered CDNs is one of the main challenges of CDN peering.
>  >
>  > Actually, path-vectors a la BGP are one of the least efficient (in terms
>  > of bandwidth, storage, and computation) ways to detect loops. Both RIP
>  > (distance-vector) and OSPF (link-state with reliable flooding) avoid loops
>  > quite nicely without paths. Of course, there are always trade-offs, and we
>  > might decide that we like path-vector better than the alternatives.
>  >
>
>Agreed here.
>
>
>Oliver

____________________________________________________________________
Stephen Thomas                                       +1 770 671 1888
TransNexus, Chief Technical Officer    stephen.thomas@transnexus.com