Re: ISP failures and site multihoming [Re: Enforcing unreachability of site local addresses]



On Thu, 20 Feb 2003, Iljitsch van Beijnum wrote:
> On Thu, 20 Feb 2003, Pekka Savola wrote:
> > > We are _very_ far from a situation where even the best ISP provides a
> > > service level that is better than the one you get from multihoming even
> > > if you consider failover delays.
> 
> > In some cases, this may be better.  In some others, not.
> 
> > It's not IMO necessary to get significantly better but "roughly equal".
> 
> Don't forget that failover is just one of the benefits of multihoming.
> Two important others are protection against losing service when ISPs
> go out of business 

Sure, independence is another issue -- not related to ISP failures, and 
there are ways to protect against this.

> and more optimal traffic flow.

I agree, but anything beyond really simple traffic flow balancing (read:
multiple addresses) is difficult to do, and people rarely actually
do it.
 
> > > And the single service provider thing doesn't scale anyway: the end
> > > result would have to be a single global ISP.
> 
> > It does scale, pretty well actually.  I'm not talking about your average
> > neighborhood ISPs with 100 customers, though.  Currently in the DFZ, there
> > are about 3500 (ONLY!) AS numbers which transit at least one other AS
> > number.
> 
> I don't see your point. What you were saying is that using a single
> reliable ISP would be better than multihoming.

Considering the tradeoffs, yes.

> Now obviously an ISP can
> only control the QoS parameters inside its own network so if you want to
> do reliable and high quality VoIP you need to use the same ISP
> end-to-end. In other words: just one ISP.

This seems like a ridiculous argument, or I'm missing something.  You could
just use VoIP up to the ISP, have the ISPs coordinate the parameters (if you
need any), etc.  This is no different from the multihomed scenario.
 
> > > Has the end-to-end principle failed to teach us anything? Reliability
> > > begins and ends in the end hosts. If each host is connected over two
> > > service providers there are four possible paths the hosts can switch
> > > between on a per-packet basis. Then the only problem becomes detecting
> > > failure. The end hosts are in an excellent position to do this without
> > > having to generate keepalive messages; a well designed protocol could
> > > switch to an alternate path within a few round trip times when a path
> > > failure occurs.
> 
> > Compare this to a solution where the site has two connections to the same
> > ISP, and you're left with major ISP backbone failure or upstream failure
> > (does any relevant ISP have only one upstream?).
> 
> So ISPs get to multihome but not end-users?

Yes.
 
> There are many ways in which an ISP network can fail, as the large scale
> Worldcom and AT&T outages six months ago illustrate. 

I'm not aware of the case, so I can't really comment.  Pointers?

> More common is the
> situation that an ISP network has trouble reaching a certain destination
> because the only link to that destination has failed (which in itself
> may not be their fault) or there is congestion somewhere. 

This seems no different from the case of BGP site multihoming --
unless you want to fine-tune your policy per destination, which is a
non-starter.

> And don't
> forget maintenance windows.

No real ISP has maintenance windows that seriously affect all
communications at once.
 
> > A solution without multi-connecting, i.e. only one L1 connection to one
> > ISP, is naturally out of the question.
> 
> Ok, so why is multihoming to a single ISP better than multihoming to
> several ISPs?

Fewer tradeoffs: you can protect against most failure modes without
having to have your own AS, your own IP addresses, etc. -- that is, it
doesn't require bloating the DFZ!
 
> > > Multi6 has been gravitating towards multi-address multihoming solutions
> > > for a while now, but unfortunately it seems impossible to move forward.
> 
> > Multi-address solutions solve certain problems well, but leave some
> > unsolved.
> 
> Like what?

Solves: operator independence

Doesn't solve: connection survivability, short-term failures, more than 
basic TE [mostly a non-issue IMO]

Combining multiconnecting to one ISP with a backup to a second one
gives you almost everything.

-- 
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings