
Re: ISP failures and site multihoming [Re: Enforcing unreachability of site local addresses]



On Fri, 21 Feb 2003, Kurt Erik Lindqvist wrote:

> > There is no technical reason why a single service provider network can
> > do better than a similar network that consists of several smaller

> See Abha and Craig's paper on convergence of BGP. Personally I would go
> for a large provider with multiple connections.

Based on this paper? What I see is rarely as bad as what they describe.
However, I had the chance to experiment a little with revoking a
longer prefix and then seeing how soon the shorter prefix would "catch" the
traffic a while back, and this was certainly interesting: the state goes
back and forth between "working" and "not working" several times over
the course of two minutes. But simple failover is usually pretty fast.
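
To make the "catch" part concrete, here is a minimal sketch of
longest-prefix matching, which is why traffic should fall back to the
shorter prefix once the longer one is gone. The prefixes are
documentation addresses and the next-hop labels "primary" and "backup"
are made up for illustration; this models only the lookup in a single
routing table, not the BGP convergence that caused the flapping.

```python
import ipaddress

def best_route(table, destination):
    """Return the next hop of the longest prefix in `table` covering `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None  # no covering prefix at all
    # The most specific (longest) prefix wins.
    _, next_hop = max(matches, key=lambda pair: pair[0].prefixlen)
    return next_hop

# Hypothetical table: a /25 over one path, a covering /24 over another.
table = {
    "203.0.113.0/24": "backup",
    "203.0.113.0/25": "primary",
}

before = best_route(table, "203.0.113.10")  # longer prefix still present
del table["203.0.113.0/25"]                 # the longer prefix is withdrawn
after = best_route(table, "203.0.113.10")   # shorter prefix takes over
```

In a real network each router makes this decision on its own view of
the table, and those views do not all change at the same moment.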

If failover times when a circuit goes down are your main concern,
connecting to the same ISP twice makes a lot of sense. (Use a
non-switched network for this, though, or you'll be at the mercy of the
hold time.) On the other hand, your failover time in case of an ISP
failure is much worse this way. Personally, I'd rather eat the two
minute failover and be protected against the two week one than the other
way around, even if the former happens once every few months and the
latter may not happen at all.

> Last fall I was invited to a conference in Sweden to debate multihoming
> and the enterprise. Before me was this enterprise IT manager who showed
> how much more resilient his network was with two BGP sessions. While he
> talked I checked his announcements just to find that one of the
> providers bought transit from the other. You can't buy clue.

You can buy a good book that explains it all.  :-)

Did you check to see if the second ISP also had additional upstreams?

> > But IGPs have the same
> > fundamental problem (although the details may differ). OSPF for
> > instance takes 40 seconds to detect a dead circuit.

> There was a fix proposed in San Diego (although for IS-IS), but that was
> voted down. There were pros and cons.

Just type:

 ip ospf hello-interval 1
 ip ospf dead-interval 3

But do it on ALL your boxes in the subnet or you'll live to regret it.
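
The reason for the warning: OSPF routers on the same subnet only become
neighbors if their hello and dead intervals agree, so a box left on the
old timers drops out. A rough sketch of that check follows, assuming
the 10/40 second values as the old defaults; the router names and the
helper functions are hypothetical, not anything a router provides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OspfTimers:
    hello: int  # seconds between hello packets
    dead: int   # seconds of silence before a neighbor is declared down

def can_be_neighbors(a, b):
    """OSPF neighbors must agree on both the hello and the dead interval."""
    return a.hello == b.hello and a.dead == b.dead

def left_behind(routers, wanted):
    """Return the names of routers whose timers differ from `wanted`."""
    return sorted(name for name, timers in routers.items() if timers != wanted)

OLD = OspfTimers(hello=10, dead=40)   # assumed defaults before tuning
FAST = OspfTimers(hello=1, dead=3)    # the values from the commands above

# Hypothetical subnet where one box was forgotten.
subnet = {"r1": FAST, "r2": FAST, "r3": OLD}

forgotten = left_behind(subnet, FAST)
```

Here `forgotten` names the router that would stop forming adjacencies
with the others until it gets the same two commands.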