
Re: failure detection



Paul Jakma wrote:

Just not sending RAs (or not including the affected prefix in RAs) would achieve the same effect though, wouldn't it?

I'm just curious what would be wrong with a setup like:

- valid lifetime: very high (say weeks)
- preferred: low, eg 10s or 15s
- RA interval: very low, 5s or so.

If the link goes down, just stop sending RAs/including the prefix in RAs. In 15s hosts start preferring other prefixes. For datagramme protocols with no kernel flow state (eg, potentially, shim6), the next packet after the preferred lifetime times out would use a new source address.

The problem is that, as the preferred/valid lifetimes are defined and used in RFC 2462, a host would interpret the prefix becoming deprecated as
	There is some graceful renumbering going on. Make sure that new
	communication avoids the deprecated address. But existing
	communication can continue to use the deprecated address until
	the valid lifetime expires.
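
To make the RFC 2462 reading concrete, here is a minimal sketch of how a host would classify an address under the lifetimes from the proposed setup (preferred 15s, valid a couple of weeks). The names addr_state and usable_for are made up for illustration, and the rule is simplified: RFC 2462 only says a deprecated address should be avoided for new communication when a non-deprecated alternative exists.

```python
from enum import Enum


class AddrState(Enum):
    PREFERRED = "preferred"
    DEPRECATED = "deprecated"
    INVALID = "invalid"


# Illustrative values taken from the proposed setup (assumptions, not defaults).
PREFERRED_LIFETIME_S = 15
VALID_LIFETIME_S = 14 * 24 * 3600  # "weeks"


def addr_state(age_s, preferred_s=PREFERRED_LIFETIME_S, valid_s=VALID_LIFETIME_S):
    """State of an address age_s seconds after the last RA refreshed its lifetimes."""
    if age_s < preferred_s:
        return AddrState.PREFERRED
    if age_s < valid_s:
        return AddrState.DEPRECATED
    return AddrState.INVALID


def usable_for(state, new_communication):
    """Simplified rule: deprecated addresses stay usable for existing communication only."""
    if state is AddrState.PREFERRED:
        return True
    if state is AddrState.DEPRECATED:
        return not new_communication
    return False


if __name__ == "__main__":
    # 20s after the router stopped advertising the prefix.
    state = addr_state(20)
    print(state.value, usable_for(state, new_communication=True),
          usable_for(state, new_communication=False))
```

So 20 seconds after the RAs stop, the address is merely deprecated: existing communication is still expected to keep using it, which is not the signal a failure needs to send.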


The behavior we'd want in multihoming is instead
	The address prefix has failed (at least for external
	communication). If there is a mechanism (like shim6) which can
	be used to quickly failover to some other address, then it makes
	sense to invoke it. Also, new communication which picks a source
	address should avoid using the failed prefix. But for existing
	communication when there isn't an easy way to switch to another
	address, it isn't clear what to do. (In some cases it might make
	sense to reset the ULP connection and recreate it, which will
	make it get a non-failed source address, but in other cases it
	would be better to wait for the failed address to start working
	again.)

If we overload the "failed" semantics onto the "deprecated" notion, then during graceful renumbering there will be a storm of shim6 traffic as everybody tries to switch over immediately, even though they might have a week during which the deprecated address will continue to work.
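
A rough sketch of that effect, assuming a hypothetical separate "failed" signal next to "deprecated" (nothing like it is defined today, and the function names are made up for illustration):

```python
def should_trigger_failover(signal, has_failover, overload_deprecated):
    """Decide whether one existing flow invokes a shim6-like failover.

    signal is "deprecated" or "failed"; "failed" is a hypothetical signal.
    overload_deprecated=True models treating "deprecated" as meaning "failed".
    """
    if not has_failover:
        return False
    if signal == "failed":
        return True
    if signal == "deprecated":
        return overload_deprecated
    return False


def failover_count(n_flows, signal, overload_deprecated):
    """Number of flows (all assumed failover-capable) that switch right away."""
    return sum(
        should_trigger_failover(signal, True, overload_deprecated)
        for _ in range(n_flows)
    )


if __name__ == "__main__":
    # Graceful renumbering: the prefix is deprecated but still works.
    print(failover_count(1000, "deprecated", overload_deprecated=True))
    print(failover_count(1000, "deprecated", overload_deprecated=False))
```

With the overloaded meaning, every failover-capable flow reacts to a plain renumbering event; with separate signals, none of them need to.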

   Erik