
Re: survivability, rewriting



I agree with this, and I'd add that many applications can survive glitches
much longer than 5 seconds, and even TCP resets, by putting some fairly
trivial retry logic in the right place. In the VoIP case, the limit is indeed
the social one; there is no technical reason a VoIP solution can't
keep trying to reconnect indefinitely.
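
To make that concrete, here is a minimal sketch of the kind of retry
logic I mean; the timings and the give-up policy are placeholders, not
a recommendation:

import socket
import time

def connect_with_retry(host, port, per_attempt_timeout=5.0,
                       retry_interval=2.0, give_up_after=None):
    """Keep trying to (re)connect across an outage.  give_up_after=None
    means retry indefinitely, which is fine for e.g. a VoIP client."""
    started = time.monotonic()
    while True:
        try:
            return socket.create_connection((host, port),
                                            timeout=per_attempt_timeout)
        except OSError:
            if give_up_after is not None and \
               time.monotonic() - started > give_up_after:
                raise
            time.sleep(retry_interval)   # sit out the glitch, then retry

A VoIP client would simply leave give_up_after unset and let the user
decide when to hang up.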

However, I suspect that this whole discussion belongs in a future
working group called multi6-ops. Once we get a basic mechanism agreed,
there will be many operational and implementation issues to be
followed up. Should we concentrate now on the basic mechanism?

   Brian 

Iljitsch van Beijnum wrote:
> 
> About the session survivability: I think Kurtis is right that we must
> not let ourselves get caught up in unrealistic expectations. I believe
> a useful lower limit would be five seconds. This gives just over two
> round trips when both parties are using GSM/GPRS and/or satellite, a
> few more for more reasonable link technologies. And two missed acks is
> the absolute minimum we must require before even considering a rehoming
> event, as a single missed ack can very easily happen for many reasons
> that don't warrant rehoming. So if an application can't handle a
> 5-second gap in communication, it shouldn't count on general
> multihoming mechanisms to provide failover.
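
As a sketch of the arithmetic behind that lower limit (the 2-second RTT
figure below is only an assumed example):

def rehoming_threshold(rtt_seconds, floor_seconds=5.0, min_missed_acks=2):
    """Never consider rehoming before at least `min_missed_acks` ack
    opportunities have passed, and never before the absolute floor."""
    return max(floor_seconds, min_missed_acks * rtt_seconds)

# With GSM/GPRS or satellite on both ends (RTT around 2 s, assumed):
#   rehoming_threshold(2.0) -> 5.0, i.e. the floor allows just over two
#   round trips; on a 50 ms terrestrial path the 5 s floor dominates.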
> 
> I don't think this is too unreasonable. Many people have mentioned
> VoIP, often in the same sentence with extremely unrealistic failover
> expectations. I believe 5 seconds is workable for VoIP: this is
> certainly close to the time a user will continue to shout "hello, are
> you still there?" when using a cell phone with less than perfect
> reception.
> 
> I maintain that having the transport layer provide hints to the
> multihoming layer about when rehoming would be desirable is the right
> approach. Yes, this will give us some trouble in the beginning, as
> existing upper layer protocols don't provide these hints yet. So we
> implement additional heuristics so the multihoming layer can rehome on
> its own. But this is never going to be as efficient as having the
> transport layer do it, as transport protocols have very good knowledge
> about what's happening end-to-end. TCP, for instance, goes to great
> lengths to determine when to retransmit and whether a single packet
> was lost or several. Streaming protocols, on the other hand,
> have a pretty good idea when new data should be arriving, so here the
> receiver is in a good position to send nacks or hints.
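
As a sketch of what such a hint interface might look like (every name
below is invented for illustration; none of this comes from any draft):

import time

class MultihomingLayer:
    """Sketch: a shim that takes explicit rehoming hints from transports
    and falls back to a crude idle-timer heuristic for the rest."""

    def __init__(self, idle_limit=5.0):
        self.idle_limit = idle_limit           # heuristic threshold (s)
        self.last_progress = time.monotonic()

    def transport_hint(self, reason):
        # e.g. TCP calls this after repeated RTO-driven retransmissions;
        # an RTP receiver calls it when expected media stops arriving
        self.consider_rehoming("transport hint: " + reason)

    def note_progress(self):
        # any end-to-end progress (acked data, arriving media) resets it
        self.last_progress = time.monotonic()

    def heuristic_tick(self):
        # fallback for upper layer protocols that provide no hints
        if time.monotonic() - self.last_progress > self.idle_limit:
            self.consider_rehoming("idle for more than %gs" % self.idle_limit)

    def consider_rehoming(self, why):
        # here the shim would start probing alternate locator pairs
        print("considering rehoming:", why)

The point is only that the transport's own loss detection drives the
hint, while the idle timer is the weaker fallback for hint-less upper
layers.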
> 
> One thing we haven't discussed so far: if upper layers provide us with
> a hint, what exactly does the multihoming layer do after receiving such
> a hint? It would make sense to perform some kind of check to see what
> kind of reachability exists, but this means we need some kind of
> ping-like functionality. Is it reasonable to depend on such a mechanism
> in this age of ICMP paranoia?
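
One possible shape for that check, as a sketch; the probe itself is
left as a placeholder precisely because it need not be ICMP echo:

def pick_working_pair(local_addrs, remote_addrs, probe, timeout=1.0):
    """Return the first (source, destination) address pair for which the
    probe succeeds, or None if nothing is reachable.

    probe(src, dst, timeout) is deliberately a placeholder: it could be
    an ICMPv6 echo, a small UDP exchange inside the multihoming shim
    itself, or anything else that demonstrates two-way reachability."""
    for src in local_addrs:
        for dst in remote_addrs:
            if probe(src, dst, timeout):
                return src, dst
    return None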
> 
> About the rewriting: why again are we making life difficult for
> ourselves? The obvious place to put an indication that the address may
> be rewritten is... in the address. Is there any reason why we can't
> have one or more special prefixes that indicate that a router should
> fill in the source address?
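
As an illustration only, assuming some well-known prefix were reserved
for this (the prefix below is made up, and both prefixes are assumed to
be /48s), the exit router's job would be roughly:

import ipaddress

# Hypothetical well-known prefix meaning "source may be rewritten";
# nothing like this is actually allocated, it is only for illustration.
REWRITABLE = ipaddress.ip_network("2001:db8:ffff::/48")

def maybe_rewrite_source(src, exit_prefix):
    """If the source falls in the 'rewritable' prefix, substitute the
    exit ISP's /48 (keeping subnet ID and interface ID); else leave it."""
    src = ipaddress.ip_address(src)
    if src in REWRITABLE:
        low_bits = int(src) & ((1 << 80) - 1)   # subnet ID + interface ID
        return ipaddress.IPv6Address(int(exit_prefix.network_address) | low_bits)
    return src

# e.g. maybe_rewrite_source("2001:db8:ffff:1::42",
#                           ipaddress.ip_network("2001:db8:aaaa::/48"))
# -> 2001:db8:aaaa:1::42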
> 
> However, this doesn't settle what we should do when rewriting isn't
> permitted. Obviously we could come up with a multihoming mechanism
> where rewriting is always allowed, trading off complexity in this area
> against complexity in recognizing a correspondent and accepting some
> types of spoofed packets. But "legacy" IPv6 also doesn't permit
> rewriting, so we must be prepared to handle this. The obvious solution
> is source address based routing, but it doesn't seem like everyone is
> convinced.
> 
> I don't really see any workable alternatives, though. Even if we can do
> ICMP or NAROS magic to make sure that new sessions magically use the
> right source address so initially they're able to pass ingress
> filtering, it's always possible that halfway through, the session is
> rerouted over another ISP and the source address is filtered. This
> means we can't reroute traffic based on changes in BGP the way we're
> used to, the ultimate consequence of which is that we must hardcode all
> routing decisions. In practice this probably means using one ISP as a
> primary and the other one only as a backup.
> 
> Another problem is that if we depend on BGP to determine which ISP
> provides the shortest path to a destination, this effectively blocks us
> from using the other ISP to reach the same destination. For instance,
> if X has ISPs A and B, and Y has ISPs C and D, then it's entirely
> possible that X will use ISP A to reach both Y(C) and Y(D), so that
> when something bad happens with ISP A both of Y's addresses become
> unreachable. With source address based routing this isn't an issue as X
> can reach each of Y(C) and Y(D) over both A and B by simply using a
> different source address.
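
A toy model of that X/Y example, with documentation prefixes standing
in for the prefixes delegated by A and B:

import ipaddress

# X's two source prefixes, one delegated by each upstream (documentation
# addresses; the real prefixes would come from ISPs A and B)
SOURCE_TO_EXIT = {
    "2001:db8:a::/48": "ISP A",   # addresses delegated by A exit via A
    "2001:db8:b::/48": "ISP B",   # addresses delegated by B exit via B
}

def exit_for(src):
    """Source address based routing: the exit follows the source prefix,
    so X reaches Y(C) or Y(D) over either A or B simply by choosing its
    own source address, regardless of BGP's best path toward Y."""
    src = ipaddress.ip_address(src)
    for prefix, isp in SOURCE_TO_EXIT.items():
        if src in ipaddress.ip_network(prefix):
            return isp
    return "default exit (BGP best path)"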
> 
> I do agree that source address based routing (which is in effect a
> limited form of source routing) doesn't mesh well with our hop-by-hop
> forwarding paradigm, but again: I don't see any alternatives.