survivability, rewriting
About the session survivability: I think Kurtis is right that we must
not let ourselves get caught up in unrealistic expectations. I believe
a useful lower limit would be five seconds. This gives just over two
round trips when both parties are using GSM/GPRS and/or satellite, a
few more for more reasonable link technologies. And two missed acks is
the absolute minimum we must require before even considering a rehoming
event, as a single missed ack can very easily happen for many reasons
that don't warrant rehoming. So if an application can't handle a 5
second gap in the communication, it shouldn't count on general
multihoming mechanisms to provide failover.
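
To make the two conditions above concrete, here is a minimal sketch of a gate that refuses to even consider rehoming until at least two acks have been missed and the link has been silent for five seconds. The class and method names are hypothetical, and requiring both conditions at once is my assumption about how they would combine, not something we have agreed on.

```python
from dataclasses import dataclass

MIN_MISSED_ACKS = 2      # from the text: absolute minimum before considering rehoming
MIN_GAP_SECONDS = 5.0    # from the text: proposed lower limit on survivable gaps


@dataclass
class RehomeGate:
    """Hypothetical helper: tracks ack history for one session."""
    missed_acks: int = 0
    last_ack_time: float = 0.0

    def on_ack(self, now: float) -> None:
        # Any ack proves the current path still works, so start over.
        self.missed_acks = 0
        self.last_ack_time = now

    def on_missed_ack(self) -> None:
        self.missed_acks += 1

    def may_consider_rehoming(self, now: float) -> bool:
        # Assumption: both conditions must hold before rehoming is considered.
        silent_for = now - self.last_ack_time
        return (self.missed_acks >= MIN_MISSED_ACKS
                and silent_for >= MIN_GAP_SECONDS)
```

A single missed ack never opens the gate, no matter how long ago the last ack arrived, which matches the point that one lost ack doesn't warrant rehoming.
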
I don't think this is too unreasonable. Many people have mentioned
VoIP, often in the same sentence with extremely unrealistic failover
expectations. I believe 5 seconds is workable for VoIP: this is
certainly close to the time a user will continue to shout "hello, are
you still there?" when using a cell phone with less than perfect
reception.
I maintain that having the transport layer provide hints to the
multihoming layer about when a rehoming would be desired is the right
approach. Yes, this will give us some trouble in the beginning, as
existing upper layer protocols don't provide these hints yet. So we
implement additional heuristics so the multihoming layer can rehome on
its own. But this is never going to be as efficient as having the
transport layer do it, as transport protocols have very good knowledge
about what's happening end-to-end. TCP, for instance, goes to great
lengths to determine when to retransmit and whether only a single
packet was lost or more. Streaming protocols, on the other hand,
have a pretty good idea when new data should be arriving, so here the
receiver is in a good position to send nacks or hints.
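
As an illustration of the streaming case, the sketch below shows a receiver that knows its expected inter-arrival time and passes a hint down once data is overdue by more than some tolerance. Everything here is hypothetical: the class, the `send_hint` callback standing in for whatever interface the multihoming layer would offer, and the idea of a fixed tolerance.

```python
from typing import Callable, Optional


class StreamReceiver:
    """Hypothetical receiver that hints to the multihoming layer."""

    def __init__(self, expected_interval: float, tolerance: float,
                 send_hint: Callable[[str], None]) -> None:
        self.expected_interval = expected_interval  # seconds between arrivals
        self.tolerance = tolerance                  # extra delay we accept
        self.send_hint = send_hint                  # assumed hint interface
        self.last_arrival: Optional[float] = None

    def on_data(self, now: float) -> None:
        self.last_arrival = now

    def poll(self, now: float) -> bool:
        """Send a hint and return True if new data is overdue."""
        if self.last_arrival is None:
            return False  # nothing received yet, so nothing to compare with
        overdue = now - self.last_arrival - self.expected_interval
        if overdue > self.tolerance:
            self.send_hint(f"data overdue by {overdue:.2f}s")
            return True
        return False
```

The hint is only advice: the multihoming layer stays free to verify reachability before it actually rehomes.
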
One thing we haven't discussed so far: if upper layers provide us with
a hint, what exactly does the multihoming layer do after receiving such
a hint? It would make sense to perform some kind of check to see what
kind of reachability exists, but this means we need some kind of
ping-like functionality. Is it reasonable to depend on such a mechanism
in this age of ICMP paranoia?
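
One way to picture such a check, without committing to ICMP, is to try every (local, remote) address pair with whatever probe is available and keep the pairs that answer. In this sketch the `probe` callable is a placeholder I made up; whether it would be ICMP echo, something carried in the multihoming protocol itself, or a transport-level exchange is exactly the open question.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple


def working_pairs(local_addrs: Iterable[str],
                  remote_addrs: Iterable[str],
                  probe: Callable[[str, str], bool]) -> List[Tuple[str, str]]:
    """Return the (source, destination) pairs for which the probe succeeds.

    `probe` is a hypothetical, caller-supplied reachability test.
    """
    return [(src, dst)
            for src, dst in product(local_addrs, remote_addrs)
            if probe(src, dst)]
```

Note that the number of probes grows with the product of the two address sets, so the check gets more expensive as either side adds providers.
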
About the rewriting: why again are we making life difficult for
ourselves? The obvious place to put an indication that the address may
be rewritten is... in the address. Is there any reason why we can't
have one or more special prefixes that indicate that a router should
fill in the source address?
However, this doesn't tell us what to do when rewriting isn't
permitted. Obviously we could come up with a multihoming mechanism
where rewriting is always allowed, trading off complexity in this area
against complexity in recognizing a correspondent and accepting some
types of spoofed packets. But "legacy" IPv6 also doesn't permit
rewriting, so we must be prepared to handle this. The obvious solution
is source address based routing, but it doesn't seem like everyone is
convinced.
I don't really see any workable alternatives, though. Even if we can do
ICMP or NAROS magic to make sure that new sessions magically use the
right source address so initially they're able to pass ingress
filtering, it's always possible that halfway through, the session is
rerouted over another ISP and the source address is filtered. This
means we can't reroute traffic based on changes in BGP the way we're
used to, the ultimate consequence of which is that we must hardcode all
routing decisions. In practice this probably means using one ISP as a
primary and the other one only as a backup.
Another problem is that if we depend on BGP to determine which ISP
provides the shortest path to a destination, this effectively blocks us
from using the other ISP to reach the same destination. For instance,
if X has ISPs A and B, and Y has ISPs C and D, then it's entirely
possible that X will use ISP A to reach both Y(C) and Y(D), so that
when something bad happens with ISP A both of Y's addresses become
unreachable. With source address based routing this isn't an issue as X
can reach each of Y(C) and Y(D) over both A and B by simply using a
different source address.
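
The X/Y example can be written out as a toy model. It is deliberately simplified: the BGP choice is assumed to stay pinned to one egress ISP per destination while the failure lasts, and the function names are mine.

```python
from itertools import product
from typing import Dict, Set

X_ISPS = ("A", "B")            # X's providers
Y_ADDRS = ("Y(C)", "Y(D)")     # Y's addresses, one per provider


def reachable_destination_routing(best_egress: Dict[str, str],
                                  failed: Set[str]) -> Set[str]:
    """Destination-based forwarding: one egress ISP per destination."""
    return {dst for dst, isp in best_egress.items() if isp not in failed}


def reachable_source_routing(failed: Set[str]) -> Set[str]:
    """Source address based routing: the source address picks the egress,
    so every (ISP, destination) pair is a candidate path."""
    return {dst for isp, dst in product(X_ISPS, Y_ADDRS) if isp not in failed}
```

With both destinations pinned to A, `reachable_destination_routing({"Y(C)": "A", "Y(D)": "A"}, {"A"})` is empty, whereas `reachable_source_routing({"A"})` still contains both of Y's addresses, reached over B.
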
I do agree that source address based routing (which is in effect a
limited form of source routing) doesn't mesh well with our hop by hop
forwarding paradigm, but again: I don't see any alternatives.