Re: ISP failures and site multihoming [Re: Enforcing unreachability of site local addresses]



I'll take one particular issue, and Cc: to multi6 as I believe it is a
very important thing to consider.

On Fri, 14 Feb 2003, Alan E. Beard wrote:
Most of the end-user-network managers among my clients now multihome, and
will continue to require multihomed service in future. In every case
where the user's network is multihomed, the multiple independent
connections are seen as crucial for maintenance of high availability of
[Kurtis:]
I find this funny. A number of studies have shown that if this is what
you are after, multihoming and BGP is the wrong way to go - but never
mind.

Your comment may be true, but my clients are nonetheless unwilling to risk
the possibility of an extended network outage on a single ISP (while not
frequent, these events are far from unprecedented) rendering their online
customer-support environment unavailable for several hours, much less for
a day. Shorter outages (on the order of minutes in the single digits) are
tolerated, provided that such outages are infrequent.
This is a very problematic approach IMO.

Need more resiliency? Network outages unacceptable?

The right place to fix this is the network service provider, period.
Nothing else seems like a scalable approach.

Consider a case where many companies' _phone_ services have been moved
to VoIP. IP would be a critical service. Do the enterprises protect
against failures by getting more ISPs? Unscalable. No, the ISPs _must_
get better. Pick one well when choosing them.

When ISPs have SLAs, and a lot of customers for whom continued service is
of utmost importance, the networks *will* work. There is just no other
choice. If the mobile phone of the CTO, CEO or whoever rings after (1)5
minutes of network outage, things _will_ happen.

It just seems the mentality in some networks is that network outages are
OK, that networks don't have to be designed with multiple connections, etc.

That must change if we want to build a mission-critical IP infrastructure,
instead of making every site try to deal with the problems itself.

This is my view as an ISP and an end-user.

This highlights another issue with the solution to the multi6 problem:
convergence time. We need a solution that also improves this, besides
scaling in the DFZ.

- kurtis -