
Re: failure detection

To: Erik Nordmark <erik.nordmark@sun.com>
Subject: Re: failure detection
From: Paul Jakma <paul@clubi.ie>
Date: Mon, 29 Aug 2005 22:54:01 +0100 (IST)
Cc: marcelo bagnulo braun <marcelo@it.uc3m.es>, shim6 <shim6@psg.com>
In-reply-to: <431379EB.4030905@sun.com>
Mail-copies-to: paul@hibernia.jakma.org
References: <8622E6A4-B0D7-4C9B-B184-8EB2A7C2738E@muada.com> <Pine.LNX.4.63.0508141523170.7023@sheen.jakma.org> <efebcb5728efd81901d5357b3993b6db@it.uc3m.es> <Pine.LNX.4.63.0508171556080.5353@sheen.jakma.org> <efa6464a563345cc24542d6ab48f3538@it.uc3m.es> <Pine.LNX.4.63.0508171932550.5353@sheen.jakma.org> <0f13bcc353755a4b9b965267a6a7ffb1@it.uc3m.es> <Pine.LNX.4.63.0508181034240.5291@sheen.jakma.org> <d1bbabb2d2a04821223d24f940796d23@it.uc3m.es> <Pine.LNX.4.63.0508181513480.5291@sheen.jakma.org> <4eb5dc3a95d2217a22ab1d81e23fd10d@it.uc3m.es> <Pine.LNX.4.63.0508191456120.5291@sheen.jakma.org> <431379EB.4030905@sun.com>

On Mon, 29 Aug 2005, Erik Nordmark wrote:

When talking about actual uni-directional failures due to links or routers going bad, I'd agree.

But one assumption we've been making in the multi6/shim6 work is that we want to cover the SOHO multihoming case.


Indeed.

Using basically retail ISP service from more than one ISP, where the ISPs might apply ingress filtering, and the site can't get the ISP to relax their ingress filtering.


Right.

I think ISP ingress filtering should be assumed, it's unlikely to /not/ be in place these days.

If we don't do something in this case, then we'll have the routing (and selecting of exit ISP) be completely independent of the (hosts') source address selection,


But SAS is inherently tied in with routing. Let it do its magic.

and we will have a very high incidence of packets being dropped by the ingress filters.


Ie, you mean the case of:

                     ____ISP1
                    /
host---<SOHO router>
                    \____ISP2

Where the SOHO router is just a normal SOHO "router" (ie a wee DSL router) doing ye normal forward-by-destination, and the host is shim6'ing? You want to cover the case where the router is 'dumb' and still have things work, regardless of which ISP it forwards to?

Things that come to mind:

1. A dual connection SOHO DSL router likely would have to support
  source-prefix routing anyway, to allow non-shim6 IP to work

  - in which case, the shim6 host could use either prefix, and
    relying on some other prefix-availability information would be
    much better (eg, SOHO DSL often uses PPP, which includes its own
    keepalive information, which can be propagated via RAs, as
    per our other emails)

2. If the SOHO router doesn't support source-routing to deal with
  ingress filtering, we don't know what, if any, balancing it does.

  - best case: forwarding is pinned to one ISP while links don't
    change
    - one source always works (while links don't change)
    - other does not

  - worst case: it tries to load balance amongst links:
    - both sources the shim6 host could use will experience regular
      drops, lots of 'retraining' of the 'shim6' protocol and pretty
      pathological performance one suspects
    - such a router would suck for multihoming, regardless of shim6,
      hence why I suspect 1 would be more likely
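
To make case 2 concrete, here is a rough sketch of a destination-only router in front of two ingress-filtering ISPs. The prefixes (taken from the documentation range), the per-packet round-robin balancing and the helper names are all assumptions made up for illustration, not anything a real product is known to do:

```python
from ipaddress import ip_address, ip_network
from itertools import cycle

# Hypothetical PA prefixes (documentation range), one per ISP.
ISP_PREFIX = {
    "ISP1": ip_network("2001:db8:1::/48"),
    "ISP2": ip_network("2001:db8:2::/48"),
}

def passes_ingress_filter(source, isp):
    """An ISP only accepts packets sourced from the prefix it assigned."""
    return ip_address(source) in ISP_PREFIX[isp]

def delivery_rate(source, exit_choices, packets=100):
    """Fraction of packets that survive, given the router's exit choices.

    exit_choices is cycled per packet: one entry models pinned forwarding,
    two entries model naive round-robin load balancing.
    """
    exits = cycle(exit_choices)
    ok = sum(passes_ingress_filter(source, next(exits)) for _ in range(packets))
    return ok / packets

SRC1 = "2001:db8:1::10"  # host address from ISP1's prefix
SRC2 = "2001:db8:2::10"  # host address from ISP2's prefix

# Best case: forwarding pinned to ISP1, so one source works and one does not.
pinned = {s: delivery_rate(s, ["ISP1"]) for s in (SRC1, SRC2)}

# Worst case: round-robin across both ISPs, so both sources see regular drops.
balanced = {s: delivery_rate(s, ["ISP1", "ISP2"]) for s in (SRC1, SRC2)}
```

Under these assumptions the pinned router delivers everything from one source and nothing from the other, while the balancing router drops half the packets from either source.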

The other possibility for SOHO:


     host
      |
 -----------
   |     |
 SOHO1  SOHO2
   |     |
 ISP1   ISP2

In which case the /host/ must pick which router to send to. In which case everything is under its control - shim6 can easily leave source unspecified. IP output then does SAS using routing information as normal, picks a prefix and uses a source route to send it to the right SOHO router.

So, I think for the former case (one SOHO router), the router almost certainly will have to support source-prefix routing to be a viable consumer product. The shim6 layer can leave the source address unspecified: regardless of which prefix the underlying IP layer chooses (hopefully using RA information, which hopefully the SOHO router is good enough to update according to the state of its links to the ISPs), the SOHO router will send the packet to the right ISP.

In the latter case, it's under the host's control, and again the underlying IP SAS will do the right thing if shim6 leaves the address unspecified.
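
Roughly what I have in mind for both cases, as a sketch only. The prefixes, the route table and the function names are assumptions for illustration, and the source selection here is a toy stand-in, not the real default address selection rules:

```python
from ipaddress import ip_address, ip_network

# Hypothetical source-prefix route table: source prefix -> exit.
SOURCE_ROUTES = [
    (ip_network("2001:db8:1::/48"), "ISP1"),
    (ip_network("2001:db8:2::/48"), "ISP2"),
]

def exit_for_source(source):
    """Source-prefix routing: pick the exit that matches the packet's source."""
    addr = ip_address(source)
    for prefix, exit_name in SOURCE_ROUTES:
        if addr in prefix:
            return exit_name
    return None  # no matching prefix: an ingress filter would drop it anyway

def pick_source(host_addresses, usable_prefixes):
    """Toy source selection: first address whose prefix is still usable.

    usable_prefixes stands in for whatever the RAs currently advertise.
    """
    for candidate in host_addresses:
        if any(ip_address(candidate) in p for p in usable_prefixes):
            return candidate
    return None

HOST_ADDRESSES = ["2001:db8:1::10", "2001:db8:2::10"]

# Both links up: the first address is chosen and leaves via ISP1.
both_up = [prefix for prefix, _ in SOURCE_ROUTES]
src_normal = pick_source(HOST_ADDRESSES, both_up)

# ISP1 link down, its prefix withdrawn: the ISP2 address is chosen instead.
isp2_only = [ip_network("2001:db8:2::/48")]
src_failover = pick_source(HOST_ADDRESSES, isp2_only)
```

Whichever source the IP layer ends up with, `exit_for_source` sends the packet out through the ISP that will accept it, which is the whole point of leaving the choice to the lower layer.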

Some options for what to do are in draft-huitema-multi6-ingress-filtering-00 (now expired). Even if such a scheme is standardized and deployed, it might not be deployed everywhere. Which is why some of us see utility in having the capability to try all <source,dest> combinations in both directions of the communication. But I sure do hope that in normal deployment we don't have link/router failures that cause shim6 to have to try close to n^2 locator pairs before finding a working one.

Well, I've run an IPv4 site with PA prefixes and ISPs which did ingress filtering; unspecified addresses + source routing works nicely.

In the face of ingress filtering, the /only/ correct outward path is easily determined by the source. Given the source can easily be influenced by adding/removing addresses (or depreferring in IPv6), *IF* the upper layer leaves the address unspecified, that seems to me like the best way to tackle the problem. See:

	http://hibernia.jakma.org/~paul/rc.iprules

That was in production for several years for an *IPv4* multi-PA address SMTP and HTTP service site. It worked for inbound, and it worked for outbound (SMTP) because those services did not try to get clever about source-address selection.

It didn't try to do failover, because our links were reliable enough. If they had not been, that could have been done by adding/removing the relevant PA addresses based on whatever link/path availability information you had to hand (even a 'ping' script - there are /lots/ of possibilities).
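
A minimal sketch of that add/remove idea, which is not what rc.iprules itself does. The addresses (documentation ranges), the interface name and the `link_up` input are assumptions; the function only builds iproute2-style command strings and leaves running them, and deciding link health, to whatever you have to hand:

```python
def address_changes(pa_addresses, link_up, configured):
    """Return the address commands needed to match current link health.

    pa_addresses: link name -> (address with prefix length, device)
    link_up:      link name -> bool, from any availability check (eg ping)
    configured:   set of addresses currently on the host
    """
    cmds = []
    for link, (addr, dev) in pa_addresses.items():
        if link_up[link] and addr not in configured:
            # Link is back: restore the address so it can be selected again.
            cmds.append(f"ip addr add {addr} dev {dev}")
        elif not link_up[link] and addr in configured:
            # Link is down: remove the address so it is never the source.
            cmds.append(f"ip addr del {addr} dev {dev}")
    return cmds

# Hypothetical multi-PA host: one address per ISP on the same interface.
PA_ADDRESSES = {
    "isp1": ("192.0.2.10/24", "eth0"),
    "isp2": ("198.51.100.10/24", "eth0"),
}

cmds = address_changes(
    PA_ADDRESSES,
    link_up={"isp1": True, "isp2": False},
    configured={"192.0.2.10/24", "198.51.100.10/24"},
)
```

Here `cmds` holds a single command removing the ISP2 address. Applications that leave their source unspecified then simply stop using it.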

Eg, Solaris has the same kind of thing, but more sophisticated: IPMP.

No n^2 needed.

I think one can come up with strategies for the order in which locator pairs are tried that will work efficiently for the normal case of some source address dependent exit router selection (in the cases this is necessary to avoid ingress filtering dropping the packets), and link/router failures. For instance, if A has locators A1, A2 and B has B1, B2, B3, and A is communicating with B using <A1, B1> then when A suspects that the locator pair has stopped working it can try to maximize the probability of using a different path by trying combinations which change both the source and destination first, and then the pairs that only change one of them. For example, try in the order of

	<A2, B2>
	<A2, B3>
followed by
	<A1, B2>
	<A1, B3>
	<A2, B1>
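
The ordering described above can be sketched as follows. The function name is made up, and the tie-breaking among pairs that change the same number of locators is an assumption (it just keeps enumeration order, which happens to match the example):

```python
from itertools import product

def probe_order(sources, dests, current):
    """Order alternative locator pairs: change both locators first, then one."""
    cur_src, cur_dst = current
    candidates = [pair for pair in product(sources, dests) if pair != current]

    def changed(pair):
        # Number of locators that differ from the current pair (1 or 2).
        return (pair[0] != cur_src) + (pair[1] != cur_dst)

    # sorted() is stable, so pairs with equal scores keep enumeration order.
    return sorted(candidates, key=lambda pair: -changed(pair))

order = probe_order(["A1", "A2"], ["B1", "B2", "B3"], current=("A1", "B1"))
```

For the example locators, `order` starts with the two pairs that change both ends and finishes with the three that change only one.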


I contend you just need, worst case, B1...Bn, probe packets.

Just let the existing mechanisms for source selection decide the output path.
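
As a sketch of the difference in probe count: `select_source` below is a hypothetical stand-in for the stack's own source address selection, here assumed to return A2 because A1 has been removed or depreferred.

```python
def destination_only_probes(dests, select_source):
    """One probe per destination locator; the IP layer supplies the source."""
    return [(select_source(dst), dst) for dst in dests]

def exhaustive_probe_count(sources, dests):
    """Pairs left to try if every <source,dest> combination is a candidate."""
    return len(sources) * len(dests) - 1  # minus the pair that just failed

SOURCES = ["A1", "A2"]
DESTS = ["B1", "B2", "B3"]

# Assumption: A1's link is known to be down, so selection always yields A2.
probes = destination_only_probes(DESTS, select_source=lambda dst: "A2")
full_search = exhaustive_probe_count(SOURCES, DESTS)
```

With these inputs that is three probes, one per destination locator, against five remaining pairs for the exhaustive search, and the gap widens as either side gains locators.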

I will try to write up my view of things sometime, ;) just busy with other stuff at the moment.

regards,
--
Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
"`The best way to get a drink out of a Vogon is to stick
your finger down his throat...'"

- The Book, on one of the Vogon's social inadequacies.
