Re: failure detection
On Fri, 19 Aug 2005, Iljitsch van Beijnum wrote:
The 2*n^2 probing isn't necessary to detect failures (local or
otherwise), but to detect what's still working.
I don't agree.
And even if side A can easily detect a failure at site A (which
isn't a given, if my DSL line goes down my router knows it but my
hosts don't),
We already have protocols for this. IPv6 RAs are at least one
possibility, so:
- Use IPv6 RAs to advertise both prefixes, with a preferred lifetime
set according to how quickly you want to switch, eg set it equal
to twice the RA interval, or even equal to it.
When the DSL router detects a link has gone down, it simply stops
advertising the relevant prefix.
- We may need to invent protocols to carry valid-source prefix
information across multiple routers in a site (to update
prefix-advertisement information). I think that's a general IPv6 RA
problem though.
- Other stateful configuration protocols may be used, e.g. DHCPv6.
The server would need to become multi-address aware, which is an
implementation detail.
NB: The latter two points would not be required if we allow shim6 to
work in a 'split' mode. For then the "shimmed" network would only use
ULIDs which would always remain valid (no need to change).
Anyway: then with *one* message (or lack of one, to be more specific)
from your DSL router, all X hosts on your network deprecate use of the
ISP-B address and can start using others. Far more efficient than
*all* your X hosts doing n^2 probes.
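
A rough sketch of that RA-driven deprecation, in Python. It only models
the timer logic described above, not real Neighbor Discovery. The RA
interval, the two documentation prefixes and the names Host, receive_ra
and multicast_ra are all made up for illustration.

```python
from dataclasses import dataclass, field

RA_INTERVAL = 10                       # seconds between RAs (assumed value)
PREFERRED_LIFETIME = 2 * RA_INTERVAL   # "twice the RA interval", as suggested above


@dataclass
class Host:
    # prefix -> time after which addresses in that prefix are deprecated
    preferred_until: dict = field(default_factory=dict)

    def receive_ra(self, now, prefixes):
        """Refresh the preferred lifetime of every prefix listed in the RA."""
        for prefix in prefixes:
            self.preferred_until[prefix] = now + PREFERRED_LIFETIME

    def preferred(self, now):
        """Prefixes whose addresses may still be chosen as a source."""
        return sorted(p for p, t in self.preferred_until.items() if now < t)


ISP_A = "2001:db8:a::/64"   # hypothetical documentation prefixes
ISP_B = "2001:db8:b::/64"

hosts = [Host() for _ in range(50)]


def multicast_ra(now, prefixes):
    """One RA from the router reaches every host on the link."""
    for host in hosts:
        host.receive_ra(now, prefixes)


multicast_ra(0, [ISP_A, ISP_B])
multicast_ra(10, [ISP_A, ISP_B])
# The link to ISP-B fails at t=15: the router now omits that prefix.
multicast_ra(20, [ISP_A])
multicast_ra(30, [ISP_A])

print(hosts[0].preferred(29))   # both prefixes are still preferred
print(hosts[0].preferred(30))   # only the ISP-A prefix remains
```

With these assumed timers every host stops preferring the ISP-B prefix
20 seconds after the last RA that carried it, without any host sending
a single probe.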
Eg, imagine a site multihomed with 3 ISPs, and 50 hosts, which are
communicating with a similar network in site-B using shim6 (50 hosts
with 3 locators each). That's possibly 50*3^2 packets to send - 450
packets. Even though all that's needed is for the DSL router to
broadcast locally "seems I lost my connection to ISP-A" - one packet
(or even, lack of one packet).
How can you consider this probing to be at all sane? ;)
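
The arithmetic behind that comparison, as a small Python sketch. The
function name full_mesh_probes is invented here, and the count assumes
each host tries every (source, destination) locator pair exactly once.

```python
def full_mesh_probes(hosts, local_locators, remote_locators):
    """Worst-case probes if every host tries every locator pair once."""
    return hosts * local_locators * remote_locators


site_probes = full_mesh_probes(hosts=50, local_locators=3, remote_locators=3)
ra_messages = 1   # the single (missing) RA from the DSL router

print(site_probes)   # 450
print(ra_messages)   # 1
```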
how does side B learn this fact?
It starts receiving packets from Site-A with a new locator address.
A host doesn't necessarily know which exit path a router will choose.
That's true. We can fix that in a better way than n^2 probing surely?
So what happens when through means outside our view a packet gets a
destination address routed over ISP X, but a source address from
address space from ISP Y, and X filters Y's addresses?
See above.
- Host2's shim6 detects that Host1's valid locators have changed
- Maybe because it receives a packet from Host1 with a new
source
This doesn't allow for unidirectional reachability.
How so?
Host2 need not use the source address in its packets which Host1 is
using.
You want to specify that shim6 be able to work around /any/ kind of routing
failure, anywhere on any part of the internet affecting any path between
Host1 and Host2.
Yes, I do. As a BGP jockey, I'm kind of like the health inspector
who never eats out... There is a lot going on that regular users
don't really know about.
So the answer is:
- have users run around every restaurant to try to divine which one
serves edible food (n^2 probing)
rather than:
- fix the problems in internet routing
Maybe those are a bit more common, but it's not like failures in
the core never happen.
To be honest, I don't know, it's my gut feeling. But I'm not the one
arguing on a gut feeling to have the IETF specify a host protocol
that mandates n^2 probing.
If we don't know enough about internet failures to say whether it is
or is not worth building n^2 probing per-default into shim6, then we
shouldn't do it. It can always be 'added in' later, or done by
implementations if left out, but if you put it in and it gets
deployed - it's hard to take back.
Ingress filtering has the potential to create lots of
unidirectional reachability for a given address combination.
I don't buy that at all.
If your ISP-A filters out ISP-B sourced packets, then it won't be
routing ISP-B destined packets to you.
Nonsense. Congestion is rare these days, and the levels necessary
to break connectivity wholesale are almost unheard of.
No it's not. Go live at the edges. I experience congestion
*regularly* on my DSL link. I did say "at the edges".
And "break connectivity wholesale .. unheard of" is precisely part of
my point - TCP doesn't break. But what will shim do exactly in the
face of lossage?
- n^2 probing in shim6 is simply introducing huge expense in order
to solve a very uncommon problem
Yeah I don't get this point you're arguing so energetically.
:)
Quadratic scaling scares me - particularly when built into a
protocol. As does the SAS stuff, but you want that because of the n^2
probing issue I think.
Let's build a test network. (I'll be talking about hosts, but
obviously many aspects are site-wide.)
Host A has two interfaces, that both eventually connect to a router that
connects to two ISPs. So:
Addr A1: int 1 - ISP K
Addr A2: int 1 - ISP L
Addr A3: int 2 - ISP M
Addr A4: int 2 - ISP N
Same thing for its correspondent host B:
Addr B1: int 1 - ISP O
Addr B2: int 1 - ISP P
Addr B3: int 2 - ISP Q
Addr B4: int 2 - ISP R
Wow: 2*4^2, i.e. 32 packets to complete probing (worst case). Imagine 50
such shim6 hosts on your network.
Let's assume that each router will do source address based routing for the
two ISPs it connects to, but the ISPs all do ingress filtering.
Ok.
A initiates a TCP session with destination address B1. Let's assume
that the system chooses interface 1 for output and A1 as a source
address, so the packets have address pair A1-B1
Now it's entirely possible that B's default route is over ISP Q. So
when B sends a reply to the A1-B1 session setup request, it sends a
B1-A1 packet out on interface 2. Now either the site exit router
will filter it, or it will end up at ISP Q or ISP R, which will
filter it. This is the infamous ingress filtering problem that we
have to figure out.
But it's easy. Shim6 is *not* TCP, it doesn't need to maintain
any /specific/ consistency of addresses. Eg, in this example, why on
earth is B replying with (B1,A1)? The reply from B (in my mind) would
be (B3,A1).
No filtering problem at all. When this packet arrives at A, it
associates (B3,A1) with the right mapping. Simple to do, given its
own locator is in there. Why should A care whether B replies with the
same IP address?
You're thinking, it seems, as if shim6 is like TCP, where each
different tuple of (source,destination) *must* refer to a different
connection. But there is no need for such a restriction.
But let's assume we somehow fix this problem, and packets flow
without trouble between A1 and B1.
Yes we can easily fix it. The flows will look like:
A -> B using (A1,B1)
A <- B using (B3,A1) (note, B need not even use A1, it could use A2)
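
A minimal Python sketch of the matching idea: accept a packet for a
mapping if its addresses belong to the two locator sets, rather than
requiring the exact pair used at setup. It assumes both locator sets
are already known to A, and it says nothing about how the very first
reply would be matched before any exchange. ShimContext and
find_context are invented names, not taken from any shim6 draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShimContext:
    local_locators: frozenset
    peer_locators: frozenset

    def matches(self, src, dst):
        """True if the packet runs from any peer locator to any local one."""
        return src in self.peer_locators and dst in self.local_locators


def find_context(contexts, src, dst):
    """Return the first context an incoming packet belongs to, else None."""
    for ctx in contexts:
        if ctx.matches(src, dst):
            return ctx
    return None


# Host A's view of its association with host B (locator sets assumed known).
a_to_b = ShimContext(
    local_locators=frozenset({"A1", "A2", "A3", "A4"}),
    peer_locators=frozenset({"B1", "B2", "B3", "B4"}),
)
contexts = [a_to_b]

print(find_context(contexts, "B3", "A1") is a_to_b)   # True
print(find_context(contexts, "B3", "A2") is a_to_b)   # True
print(find_context(contexts, "C1", "A1"))             # None
```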
The TCP session continues for a bit, and at some point the shim
wakes up and decides that this is a long-term session that should
be protected from failures. So the shim layer on host A sends out a
packet with source A1 and destination B1 (= addresses from the TCP
session) which includes security stuff and the list of local
alternative locators: A2, A3 and A4. B also happens to implement
the shim, so it answers with some security stuff of its own and its
list of alternative locators: B2, B3 and B4.
Ok.
So now we're ready for the internet to fail.
Scenario 1: A's link to ISP K fails.
Since this is something A's router can detect, presumably any
packets from A1 to B1 will get back an ICMP message, and after a
few RTTs TCP becomes really unhappy. The shim may also observe that
there are packets going from A1 to B1, but there is nothing coming
in from B1 to A1. Maybe the shim decides to fire off a probe from
A1 to B1 for good measure. But eventually, it's clear that A1 to B1
doesn't work anymore.
Ok.
Now suppose that the reachability detection subsystem at A decides
to see if B2 works. If A sticks to source address A1, then the
packet will also incur an ICMP and not make it. So either A sees
the ICMP and selects a different source address, or it decides that
A1-B2 doesn't seem to work either and goes on to the next address
pair. For instance A could try A2-B1. And this one works!
So from now on any outgoing packets with addresses A1-B1 in them are
rewritten into A2-B1 and sent on their way.
Any complaints so far?
Well, I wouldn't have /shim/ in A picking A2, but I wouldn't preclude
it either. So no complaints.
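
The rewriting step in scenario 1 could look roughly like this in
Python. It covers outgoing packets only. Packet, Shim and switch_to
are hypothetical names for illustration, and the upper layer is
assumed to keep using the original A1-B1 pair throughout.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes


class Shim:
    def __init__(self, ulid_pair):
        self.ulid_pair = ulid_pair      # pair the transport layer sees
        self.locator_pair = ulid_pair   # pair actually put on the wire

    def switch_to(self, locator_pair):
        """Record the address pair that reachability detection found working."""
        self.locator_pair = locator_pair

    def outgoing(self, pkt):
        """Rewrite packets of this session; leave all other traffic alone."""
        if (pkt.src, pkt.dst) != self.ulid_pair:
            return pkt
        src, dst = self.locator_pair
        return replace(pkt, src=src, dst=dst)


shim = Shim(ulid_pair=("A1", "B1"))
shim.switch_to(("A2", "B1"))            # A1-B1 failed, A2-B1 works

sent = shim.outgoing(Packet("A1", "B1", b"data"))
print(sent.src, sent.dst)               # A2 B1
```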
Scenario 2: big failure, and everything is wiped out except A4-B4. (From
where I sit 99% of all traffic flows through Amsterdam, and most of that 99%
over the AMS-IX. A nice big power failure there really hurts my
connectivity.)
Ha, an AMS-IX outage would hurt me too ;). There are intra-Ireland
paths that actually go via European IXes, typically LINX but I've
seen AMS-IX paths too. A lot of European traffic goes via IXes.
BTW: Note that I strongly disagree with trying to solve all the
internet's routing problems by making every end-host do n^2 probing.
So A tries:
A1-B2
A2-B1
A1-B3
A3-B1
A1-B4
A4-B1
and on and on and on, until it eventually determines that A4-B4 works.
You don't want this to happen. So what's the alternative? Give up after the
second try? The fourth? The n^2/2th?
IMHO, you should only try:
(unspecified) -> B1
(unspecified) -> B3
(unspecified) -> B4
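
To put numbers on the difference, here is a small Python comparison of
the two probe lists. None stands for an unspecified source, i.e. the
host's normal source address selection picks one. The destination list
is the one given just above; both function names are invented for this
sketch.

```python
from itertools import product


def pair_probes(local, remote):
    """Every (source, destination) combination, as in full pair probing."""
    return list(product(local, remote))


def destination_only_probes(remote):
    """One probe per destination; the source is left unspecified (None)."""
    return [(None, dst) for dst in remote]


a_locators = ["A1", "A2", "A3", "A4"]
b_locators = ["B1", "B2", "B3", "B4"]

print(len(pair_probes(a_locators, b_locators)))        # 16 in one direction
print(destination_only_probes(["B1", "B3", "B4"]))
# [(None, 'B1'), (None, 'B3'), (None, 'B4')]
```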
In my universe (which happens to correspond vaguely to how the
internet works today ;) ), when AMS-IX dies, within 1 to 3 minutes or
so, your ISP, in conjunction with other ISPs, starts propagating
WITHDRAWs and UPDATEs, and routing converges so that packets flow again.
I much prefer that ISPs take care of the business of getting packets
from A to B than specify an n^2 end-host probing protocol within the
IETF.
What next? What if there are multiple failures between A4-B4, such
that somewhere between A4 and B4 there is one path which works and
one which does not, a failure we could work around by doing
source-specified hop-by-hop routing?
Let's have shim6 probe all the intermediary paths too. And let's
abolish the IETF routing area while we're at it :). (kidding, but you
see the stretched point I'm making I hope).
Remember that while all of this is going on, the transport protocol
sees a black hole. So at any time, the transport can decide to time
out. The shim doesn't do anything that actually _hurts_ regular
transport protocols.
If you call the potential for a smallish network sending out near to
1k probe packets "doesn't hurt" for not much gain, sure.
regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune:
Man usually avoids attributing cleverness to somebody else -- unless it
is an enemy.
-- Albert Einstein