
Re: failure detection

To: Paul Jakma <paul@clubi.ie>
Subject: Re: failure detection
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Fri, 19 Aug 2005 22:36:04 +0200
Cc: shim6 <shim6@psg.com>
In-reply-to: <Pine.LNX.4.63.0508191816080.5291@sheen.jakma.org>
References: <8622E6A4-B0D7-4C9B-B184-8EB2A7C2738E@muada.com> <Pine.LNX.4.63.0508141523170.7023@sheen.jakma.org> <efebcb5728efd81901d5357b3993b6db@it.uc3m.es> <Pine.LNX.4.63.0508171556080.5353@sheen.jakma.org> <efa6464a563345cc24542d6ab48f3538@it.uc3m.es> <Pine.LNX.4.63.0508171932550.5353@sheen.jakma.org> <0f13bcc353755a4b9b965267a6a7ffb1@it.uc3m.es> <Pine.LNX.4.63.0508181034240.5291@sheen.jakma.org> <d1bbabb2d2a04821223d24f940796d23@it.uc3m.es> <Pine.LNX.4.63.0508181513480.5291@sheen.jakma.org> <4eb5dc3a95d2217a22ab1d81e23fd10d@it.uc3m.es> <Pine.LNX.4.63.0508191456120.5291@sheen.jakma.org> <9F62897E-8A0C-4588-9C54-842E6C988A0F@muada.com> <Pine.LNX.4.63.0508191816080.5291@sheen.jakma.org>

On 19-aug-2005, at 20:57, Paul Jakma wrote:

- Use IPv6 RAs to advertise both prefixes, with a preferred lifetime
  set according to how quickly you want to switch, eg set it equal
  to twice the RA interval, or even equal to it.

RAs have very long lifetimes. I think the Cisco default is a week. You can't bring this down to a minute or less without all kinds of interesting side effects.

Anyway: Then with *one* message (or lack of, to be more specific), from your DSL router all X hosts on your network deprecate use of the ISP-B address and can start using others. Far more efficient than *all* your X hosts doing n^2 probes.

I agree that when certain information is available, it makes sense to distribute it locally rather than have every host go out and discover the same facts for itself. We'll have to come back to this at some point.

Yes, I do. As a BGP jockey, I'm kind of like the health inspector who never eats out... There is a lot going on that regular users don't really know about.

So the answer is:

- have users run around every restaurant to try to divine which one
  serves edible food (n^2 probing)

Well, that's better than coming back to the one that serves crappy food each time because you don't want to visit so many restaurants. :-)

rather than:

- fix the problems in internet routing


We await your suggestions...

The trouble is that you need aggregation to make routing scale, and with aggregation you lose all this interesting info that would have been useful. Routing can still tell you some interesting things when there are wide-spread catastrophes, but I'm not sure it's worth the trouble to optimize for that. (Or rather: I'm pretty sure it isn't.)

[test setup]

wow 2*4^2, ie 32 packets to complete probing (worst case). Imagine 50 such shim6 hosts on your network.

Well you really want to send at least 3 probes to account for random packet loss. :-)
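For reference, the arithmetic behind these figures can be written out as a small sketch. It assumes n addresses on each side, probing in both directions, and an optional repeat count per address pair; the function name and parameters are only illustrative and do not come from any shim6 draft.

```python
def worst_case_probes(n, directions=2, repeats=1):
    """Upper bound on probe packets for n addresses on each side.

    Assumes every source/destination pair is tried once per direction,
    and each probe is sent `repeats` times to ride out random loss.
    """
    return directions * n * n * repeats


per_host = worst_case_probes(4)                  # 2 * 4^2
with_retries = worst_case_probes(4, repeats=3)   # three probes per pair
fifty_hosts = 50 * per_host                      # 50 such hosts on one network

print(per_host, with_retries, fifty_hosts)
```

These are worst-case bounds only; they say nothing about how often a host would actually have to walk the whole list.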

A initiates a TCP session with destination address B1. Let's assume that the system chooses interface 1 for output and A1 as a source address, so the packets have address pair A1-B1.

Now it's entirely possible that B's default route is over ISP Q. So when B sends a reply to the A1-B1 session setup request, it sends a B1-A1 packet out on interface 2. Now either the site exit router will filter it, or it will end up at ISP Q or ISP R, which will filter it. This is the infamous ingress filtering problem that we have to figure out.

But it's easy. Shim6 is *not* TCP, it doesn't need to maintain any /specific/ consistency of addresses. Eg, in this example, why on earth is B replying with (B1,A1)? The reply from B (in my mind) would be (B3,A1).

The reason why it's not easy is that at this point, the shim hasn't been activated yet, we're just doing regular TCP. This is necessary to maintain backward compatibility.

And even if you activate the shim at this point, the two sides haven't been able to compare notes yet, so you can't start doing strange tricks yet, or at least you run into security complications.

BTW: Note that I strongly disagree with trying to solve all the internet's routing problems by making every end-host do n^2 probing.

Ah, good that you said so because we all thought you were supporting this.

So A tries:

A1-B2
A2-B1
A1-B3
A3-B1
A1-B4
A4-B1

and on and on and on, until it eventually determines that A4-B4 works.

You don't want this to happen. So what's the alternative? Give up after the second try? The fourth? The n^2/2th?

IMHO, you should only try:

(unspecified) -> B1
(unspecified) -> B3
(unspecified) -> B4


Ah, ok.

Don't forget that if the site exit router does its own version of the ingress filtering, it can send back ICMP messages so the host knows that this source address doesn't work and move on without much delay. So after a maximum of 3 messages with incorrect source addresses A knows it should use A4, and then it only has to do B1, B2, B3 and B4 to find the working A4-B4 pair.
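The counting argument above can be sketched roughly as follows. This is only an illustration under stated assumptions: four addresses on each side, A4-B4 as the only working pair, one direction only, and a site exit router that answers a bad source address with an ICMP error right away (modelled by the hypothetical `source_is_filtered` callback). None of these names come from a shim6 specification.

```python
from itertools import product


def exhaustive_probes(sources, dests, works):
    """Try every source/destination pair in order; return the probes sent."""
    sent = []
    for pair in product(sources, dests):
        sent.append(pair)
        if works(*pair):
            break
    return sent


def icmp_guided_probes(sources, dests, source_is_filtered, works):
    """Find a source the exit router accepts first, then walk destinations."""
    sent = []
    good_source = None
    for src in sources:
        if source_is_filtered(src):
            # Assumption: the router returns an ICMP error, so the host
            # learns quickly that this source is unusable and moves on.
            sent.append((src, dests[0]))
            continue
        good_source = src
        break
    if good_source is None:
        return sent
    for dst in dests:
        sent.append((good_source, dst))
        if works(good_source, dst):
            break
    return sent


A = ["A1", "A2", "A3", "A4"]
B = ["B1", "B2", "B3", "B4"]


def works(src, dst):
    return (src, dst) == ("A4", "B4")


def filtered(src):
    return src != "A4"


print(len(exhaustive_probes(A, B, works)))             # all 16 pairs tried
print(len(icmp_guided_probes(A, B, filtered, works)))  # 3 rejected + 4 dests
```

With these assumptions the guided walk needs 7 probes where the exhaustive one needs 16, which is the point being made: feedback on the source address removes one dimension of the search.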

Also, if the host has several sessions towards different destinations, it may observe that 2001:a900:456::1 isn't working. If it then has to choose between trying 2001:a900:789::1 and 3ffe:ffff:789::1, it will choose the latter because there is a chance the whole 2001:a900::/32 block is affected.
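A minimal sketch of that heuristic, using Python's standard `ipaddress` module. The /32 grouping matches the example above but is otherwise an assumption, and `order_candidates` is a made-up helper name, not something a shim implementation is known to provide.

```python
import ipaddress


def order_candidates(candidates, failed, prefix_len=32):
    """Sort candidate addresses so those outside a suspect block come first.

    A block is "suspect" if some address inside it is already known to
    be unreachable. The prefix length is an assumed grouping, not a rule.
    """
    suspect = [
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in failed
    ]

    def in_suspect_block(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in suspect)

    # sorted() is stable and False sorts before True, so addresses with
    # no known problem in their block keep their order and go first.
    return sorted(candidates, key=in_suspect_block)


failed = ["2001:a900:456::1"]
candidates = ["2001:a900:789::1", "3ffe:ffff:789::1"]
print(order_candidates(candidates, failed))
```

The suspect address is only moved to the back of the list, not dropped, since the failure may turn out to be limited to the one destination.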

So in reality having to test 2*n^2 will be extremely unlikely.

In my universe (which happens to correspond vaguely to how the internet works today ;) ), when AMS-IX dies, within 1 to 3 minutes or so, your ISP, in conjunction with other ISPs, starts propagating WITHDRAWs and UPDATEs and converges so that packets flow again.

The trouble is that many small ISPs around here connect to the rest of the world through one location in Amsterdam, so when there is a power failure at that location, not only their AMS-IX stuff goes down but also their transit. Last time there was an AMS-IX power failure (a month before the generator they were installing because of the one-but-last power failure went online) about 25% of all AMS-IX members were completely unreachable for me.

Remember that while all of this is going on, the transport protocol sees a black hole. So at any time, the transport can decide to time out. The shim doesn't do anything that actually _hurts_ regular transport protocols.

If you call the potential for a smallish network sending out near to 1k probe packets "doesn't hurt" for not much gain, sure.

Well, I send out that many packets in 10 seconds over my DSL line when everything works, so why not do the same when there is a problem? :-)

BTW, my ISP just had a very big DoS attack. The shim would have enabled me to keep working to the extent possible, routing can't really do anything in these cases.
