
RE: hard questions: request routing





Let me try to give an example of RR (request routing) loops. I think Eric
and I have slightly different scenarios in mind. (Standard disclaimer:
although I think the example below illustrates an important scenario, it
is certainly not the only one we consider for CDI.)

Assumptions:

	- We want to deliver content for www.foobar.com.

	- The content is available on CDN A, B and C and all of those
	  CDNs have reported readiness to deliver the content.

	- DNS based CNAME redirection is used.

	- CDN A is authoritative for www.foobar.com

	- All the CDNs have business agreements with each other.

	- CDN B and C both serve region X.

	- All CDNs constantly report delay metrics for each region
	  they serve (this metric needs updating since server and network
	  loads change even if all CDNs have the content ready).

Time 1: (stable state)

     DNS request from region X is received by A. At this point in time
     CDN B reports the lowest delay for region X and therefore, A
     redirects (using CNAME) the request from region X to CDN B.

     CDN B now assigns a surrogate to the DNS request and the content 
     will be served.

Time 2: (transient loop)

     CDN B has sent an update (delay increased) of its delay
     for region X to CDN C and CDN A. The update is still in transit.
     At this point, CDN B already knows that it is now slower
     for region X than CDN C. However, CDN C still believes that CDN B is
     faster for region X. So at this point CDN B has the path to
     the final CDN for region X as B,C; CDN C has the path as C,B;
     and CDN A has the path as A,B.

     If at this point a request is received by CDN A, it will be CNAME
     redirected to CDN B. CDN B will CNAME redirect it to CDN C, and CDN C
     will CNAME redirect it back to CDN B, and so on. This is a transient
     request routing loop. It will only get fixed after CDN C has received
     the new delay advertisement of CDN B.
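
To make the Time 1 / Time 2 scenario concrete, here is a minimal sketch
in Python. It is only an illustration under assumptions of mine: the
delay values, the `views` dictionaries and the function names are all
hypothetical, and each CDN is assumed to redirect to whichever CDN it
currently believes has the lowest delay for region X.

```python
# Hypothetical model: every CDN keeps its own (possibly stale) view of the
# delay each CDN reports for region X. All numbers are made up.

def next_hop(cdn, views):
    """Return the CDN that `cdn` believes has the lowest delay for region X."""
    view = views[cdn]
    return min(view, key=view.get)


def follow_redirects(start, views, max_hops=8):
    """Follow CNAME redirects from `start`.

    Returns (path, looped): the list of CDNs visited and whether a CDN
    was visited twice (or the hop limit was reached).
    """
    path = [start]
    current = start
    for _ in range(max_hops):
        target = next_hop(current, views)
        if target == current:
            return path, False      # current CDN assigns a surrogate itself
        path.append(target)
        if path.count(target) > 1:
            return path, True       # redirected back to a CDN already visited
        current = target
    return path, True               # hop limit reached, treat as a loop


# Time 1 (stable): everybody agrees that B is fastest for region X.
stable_views = {
    "A": {"B": 10, "C": 20},
    "B": {"B": 10, "C": 20},
    "C": {"B": 10, "C": 20},
}

# Time 2 (transient): B knows its delay rose to 30, but the update has not
# reached A and C yet, so their views are stale.
transient_views = {
    "A": {"B": 10, "C": 20},
    "B": {"B": 30, "C": 20},
    "C": {"B": 10, "C": 20},
}

print(follow_redirects("A", stable_views))     # (['A', 'B'], False)
print(follow_redirects("A", transient_views))  # (['A', 'B', 'C', 'B'], True)
```

Once C's view is updated to `{"B": 30, "C": 20}` (and A's likewise), the
same function returns the path A,C without a loop.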

As I said in my other email, for now I would not allow CDN B to redirect
any further, which solves the problem. However, it also restricts
the topology of peering arrangements.
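
A rough sketch of that restriction, again in Python and again with
made-up delay values and hypothetical names: only the authoritative CDN
picks a target, and the chosen CDN has to serve the request itself.

```python
# Hypothetical sketch: only the authoritative CDN may redirect; the CDN it
# picks must serve the request itself, so the path has at most two entries.

def route_single_redirect(authoritative, views):
    """Return the path [authoritative, chosen CDN] with no further hops."""
    view = views[authoritative]        # authoritative CDN's (possibly stale) view
    chosen = min(view, key=view.get)   # lowest reported delay for region X
    return [authoritative, chosen]


# Transient state: B's delay update has not yet reached A and C.
views_in_transit = {
    "A": {"B": 10, "C": 20},   # stale
    "B": {"B": 30, "C": 20},   # B knows it is slower, but may not redirect
    "C": {"B": 10, "C": 20},   # stale
}

single_hop_path = route_single_redirect("A", views_in_transit)
print(single_hop_path)  # ['A', 'B']
```

In this model the request still lands on B while A's view is stale, but
it cannot bounce between B and C. The price is that A can only hand
requests to CDNs that appear in its own view.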

Also, one argument why a delay metric alone might not fly is that
cost is a major factor for CDI. However, I would still be curious to
hear in more detail which common delay-based metric people have in mind.
(Or, to put it differently: it is hard to disagree that delay has to play
 a role. It is easier to argue about a particular proposal.)

Oliver