
Re: failure detection

To: marcelo bagnulo braun <marcelo@it.uc3m.es>
Subject: Re: failure detection
From: Paul Jakma <paul@clubi.ie>
Date: Thu, 18 Aug 2005 16:06:11 +0100 (IST)
Cc: shim6 <shim6@psg.com>
In-reply-to: <d1bbabb2d2a04821223d24f940796d23@it.uc3m.es>
Mail-copies-to: paul@hibernia.jakma.org
Mail-followup-to: paul@hibernia.jakma.org
References: <8622E6A4-B0D7-4C9B-B184-8EB2A7C2738E@muada.com> <Pine.LNX.4.63.0508141523170.7023@sheen.jakma.org> <efebcb5728efd81901d5357b3993b6db@it.uc3m.es> <Pine.LNX.4.63.0508171556080.5353@sheen.jakma.org> <efa6464a563345cc24542d6ab48f3538@it.uc3m.es> <Pine.LNX.4.63.0508171932550.5353@sheen.jakma.org> <0f13bcc353755a4b9b965267a6a7ffb1@it.uc3m.es> <Pine.LNX.4.63.0508181034240.5291@sheen.jakma.org> <d1bbabb2d2a04821223d24f940796d23@it.uc3m.es>

On Thu, 18 Aug 2005, marcelo bagnulo braun wrote:

Hi Paul,
My understanding is that you seem to be uncomfortable with the fact that n^2 probes may be needed when we try with different source and destination locators, right?

If needed == MUST or SHOULD, then not uncomfortable but vehemently opposed. It's not needed at all imho.

My particular reason is that I fear this risks precluding (*better*) mechanisms external to shim6 for detecting failures.

Suppose you have a multihomed site with ISPA and ISPB and that they have assigned PrefA and PrefB respectively.

Suppose you have Host1 in the multihomed site and that it is communicating with Host2 outside the multihomed site. For that communication, Host1 is using address PrefA:host1 both as locator and as ULID. I assume that host2 has a single address host1

Is host2 shim-aware? It must be for the scenario in the next sentence to work, particularly given RPF filtering at ISPs on "PA" customers.

Now, suppose that an outage in the link between the multihomed site and ISPA occurs.

Ok.

Host1 detects and needs to do something about it, how can he try with an alternative path?


Why must host1 detect this? Host2 could also ;).

If, as part of shim6 control, host1 had, /in advance/, informed host2 that prefB:host1 was /also/ a valid locator, then:

- if host2 was first to detect the problem, it could simply switch to
  using the prefB:host1 locator

- if host1 notices there is a problem with bidirectional
  communication, it can simply start using the PrefB based locator
  (which host2 already knows about and should accept)

This presumes host1 is actively managing its source address according to state specific to shim6. However, there is a *better* way:

- don't have shim6 pick the ISP dependent part of the locator
- let the OS's existing SAS mechanisms pick the source address
  according to routing policy, as per normal, for each and every
  packet
- let some external mechanism detect the local failure

Some (non-exhaustive) examples of such mechanisms:

- link status information

(eg PPP has its own heartbeat mechanism. PPP is often used for DSL connections, which will be a common case for the scenario described above for host1, today at least)

- Routing information

(eg, host1 /could/ get routing information from either or both of ISP1 and ISP2. Be it just a RIP default announced regularly on up to full *read-only* BGP feeds)

- Link probing / dead-gateway detection software

(At least one OS provides software to actively probe gateways and adjust routes as/when required)

- Well-known remote host probing software

And so on.

So, I'm not sure shim6 is at all the best place to take on detection of /local/ failures.

Ie, in your scenario above, what I'm saying is:

- Shim6 on host1 doesn't and shouldn't be involved in detecting local
  failures

- Only host2 needs to know and detect whether it must use PrefA or
  PrefB based locator for host1.

Ie, this should all be about reachability of the *remote* locator. Which local locator to use should simply *not* be in scope for shim6 - then things get easier, and you can avail of one or more existing mechanisms for detecting local failures.
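
A minimal sketch of that division of labour, in Python. The class and method names (RemoteLocatorSet, current, mark_unreachable) are hypothetical and not taken from any shim6 draft; the only point is that the per-peer state is a list of *remote* locators, with no local source address anywhere in it.

```python
class RemoteLocatorSet:
    """Locators a peer announced in advance (hypothetical shim state)."""

    def __init__(self, locators):
        if not locators:
            raise ValueError("need at least one remote locator")
        self._locators = list(locators)
        self._index = 0

    def current(self):
        """The remote locator currently used as destination."""
        return self._locators[self._index]

    def mark_unreachable(self):
        """Switch to the next announced locator and return it.

        Walking the whole list once costs n steps, n being the number
        of locators known for the peer.
        """
        self._index = (self._index + 1) % len(self._locators)
        return self.current()


# host2's view of host1, as in the scenario above.
host1 = RemoteLocatorSet(["PrefA:host1", "PrefB:host1"])
```

When host2 decides PrefA:host1 is dead, it calls host1.mark_unreachable() and carries on with PrefB:host1. How host2 reaches that decision is deliberately left out of the sketch.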

Well, it needs to retry using an alternative source address,

Indeed, but as per above, let /something else/ figure out which local path (and hence which source) is the best to use.

so that packets can be routed through ISPB (in the outgoing direction this is due to ingress filtering compatibility and in the incoming direction because of the usage of PA addressing)

The keyword here is "routed", ie (to paraphrase) choosing the local output to take. That implies routing, and that implies (to my mind) "Not shim6's job".

What shim6 /does/ need to concern itself with is which /remote/ locator address it needs to use.

This implies that when a host within a multihomed site needs to try alternative paths, it needs to use different source addresses, and of course different destination addresses in a more general scenario.

This implies that in order to explore all the possible paths, we need to make n^2 probes.

Yes, if you refuse to consider the option of leaving SAS to something else (a wealth of mechanisms exist /already/), then you'll become convinced you need n^2 probes. :)

Now, it is important to realize that n^2 is just an upper bound, and that n^2 probes will only be performed when all paths have failed except one and this is the last one you have tried with

Even better is to realise that local address selection simply does not need to be something shim6 has to concern itself with. Then the number of probes, in the worst case, goes down to n, where n is the number of locators you have to hand for the remote shim6.

Don't forget, n^2 is per side. Each side will have to do it. So you're looking at, worst case, for host1 with n locators and host2 with y locators, by my potentially incorrect algebra:

shim6 doing SAS and probing each side:

	y^2 + n^2

In the worst case, both y and n tend to infinity, so generally n = y, so the worst case therefore is:

	2*n^2

as n and y -> infinity.

If you figure out that the local source to use is better left to other, existing, mechanisms, and only probe the remote side, worst case is:

	n + y

similar to above, this worst case will tend towards:

	2*n

as n->infinity, y->infinity.
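
To make the two counts concrete, here is a small Python sketch that simply enumerates the probes for both strategies. The function names and the locator strings are made up for illustration. Counting pairs directly gives n*y per side, which coincides with the quadratic figure above when n = y.

```python
from itertools import product


def pair_probes(local, remote):
    """Every (source, destination) combination one side may have to try
    when the shim also picks the source address."""
    return list(product(local, remote))


def remote_only_probes(remote):
    """One probe per remote locator; the source is left to the OS."""
    return list(remote)


# Hypothetical locator sets, n = y = 3.
host1 = ["PrefA:host1", "PrefB:host1", "PrefC:host1"]
host2 = ["PrefX:host2", "PrefY:host2", "PrefZ:host2"]

# Worst case summed over both sides.
both_sides_pairs = len(pair_probes(host1, host2)) + len(pair_probes(host2, host1))
both_sides_remote = len(remote_only_probes(host2)) + len(remote_only_probes(host1))
```

With three locators each, both_sides_pairs is 18 (2*n^2) and both_sides_remote is 6 (2*n). The gap widens quickly as n grows.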

O(n^2) scaling is horrid, it causes big performance problems even with relatively low orders of N. Be it in run-time or space. (Eg, you may have to keep state for those probes. Imagine a shim6 'translator' shimming for a big network, or a tiny little embedded consumer device, eg a DSL/802.11 gateway device).

I strongly urge that n^2 probing /not/ be considered, and that we instead examine the possibility of *not* doing SAS in shim6. It just simplifies everything, allows *better* mechanisms (like routing, PPP link-status, etc.) to do their job, and scales linearly.

(which may occur very often according to Murphy's law :-)


I live in Ireland, so I run into Murphy a lot ;).

One of the main concerns of the people designing these mechanisms is how to achieve clever mechanisms to reduce as much as possible the number of probes. The idea is not to send the n^2 probes at once, but to perform some form of exploration phases in which different combinations are tried.

How about considering /not/ doing SAS in shim6 at all? For the reasons I describe above? :)

Even simpler.

sure, n^2 is just the worst case

My initial thought, as above, is simply that it is not required. We can get linear scaling if we just drop the desire for shim6 to do SAS. :)

no, i am saying that source address selection is influenced by RFC3484 policy table and that this table is the right place to express policy.


Agreed.

In addition, that SHIM can honor this table as much as possible, so that policy can be expressed when using the shim

Disagree. Shim6 shouldn't know /anything/ about this table. SAS shouldn't be its concern in the first place, imho.

no, shim can try to use first the addresses as expressed in the policy table


That's great, but you have 0 idea whether it's the right address.

Determining the right address involves:

- route lookup on the destination to find the local output
- lookup source for this output

Shim6 /could/ do this, but on the other hand it could just leave the local address in packets sent down the stack as unspecified, and the OS will just do it for shim6.

What is the benefit for shim6 to do this, that it MUST? (It doesn't need to. Any such optimisations are purely implementation specific).

Ergo: No need for SAS in shim6.

No, shim will try to honor the policy table (or any other tool to express policy we need to define). Obviously, if the path preferred by the policy is not available, then shim will have to use others of course

See, there's an existing way to send a packet to a remote host and have the source be set appropriately for the preferred path - don't specify the source address...

No complications needed.
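
As an illustration from user space, a minimal Python sketch using the ordinary sockets API. The function name os_selected_source and the default port are my own choices; connect() on a datagram socket sends nothing, it only makes the kernel do the route lookup and pick a source for that output, which getsockname() then reports.

```python
import ipaddress
import socket


def os_selected_source(dst, port=9):
    """Return the source address the OS would pick for packets to dst.

    The source is never specified here; the kernel chooses it from its
    routing table and source-address-selection policy.
    """
    version = ipaddress.ip_address(dst).version
    family = socket.AF_INET6 if version == 6 else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as s:
        # No packet is sent: a UDP connect() only binds the route/source.
        s.connect((dst, port))
        return s.getsockname()[0]
```

Calling os_selected_source() with one of the peer's locators returns whichever local address the routing table currently prefers for it, and raises OSError when there is no route at all. If the local routing changes (link down, default withdrawn), the next call simply returns a different source.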

That's entirely scalable and well within reason for deployment at 'enterprise' shim6 sites.

agree, but this is not enough to preserve established communications


Why exactly?

Are you considering the case that full BGP feed is injected to the hosts, so that the hosts can find out which path is available towards its final destination?

Well, in a BGP fed and shimmed network, I would imagine the shimming would be done just on one or two hosts (or routers) at the edge of the shimmed network. The BGP feed would be confined to those.

As far as I know, the ULID host and the shimming host need not be the same machine. Is my understanding correct?

when i said that is not specific i meant that other protocols may require similar procedures, but it is a key component of the shim protocol, so that is why it needs to be performed.


I don't think it needs to be, imho.

can you expand on that?

I'll try to write down my thought on a specific mechanism. It's been forming as part of these discussions.

As i pointed out in the example above, different source/destination address combinations are required to deal with failures in the edges (i.e. ISP - multihomed site links)

Yep. And how to do that tends to be described in routing tables :) (Of which, in many implementations, the source-address-selection policy is also described).

I'll try to write down my thought on a specific mechanism in a separate mail/doc. Bit busy this week though.

regards,
--
Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
We don't really understand it, so we'll give it to the programmers.
