Re: Comments on draft-ietf-shim6-failure-detection
I would therefore argue that the important issue is not action
in multiple layers, but rather the avoidance of race conditions; a
well-defined communication mechanism between the IP and
transport/application layer can help with this.
Hmm. Yes. "Link Up" ... "Link Temporary Problem" (here shim
is exploring other alternatives) ... "Link Up" ...
But it is less clear which protocol(s) should discover end-to-end
connectivity problems or recover from them. One answer is that this
is clearly within the domain of multihoming protocol. By performing
testing and failure detection of the used path and switching to a new
path if necessary, the transport and application protocols can work
unchanged.
I am not clear that the "multi-homing protocol" necessarily has the right
information to do testing and failure detection correctly.
For example, it does not make sense to diagnose a "connectivity problem"
on a time scale less than RTO.
Yes, very small timescales would indeed be problematic.
Having said that, a lot of the discussion around this is
centered on path failures. I see path failure recovery
as a necessary component, but I would argue that local
failures (such as the little green light in the interface
card going blank) are likely to be more common. They are also
treated in a very different way. You KNOW you have a
problem and often have a good idea also about what other
things might be working (e.g., the interface whose green
light is still on). For local failures, even a sub-second time
scale is achievable (depending, of course, on how fast
the green light reacts).
But for the rest, we can only do operations that are relatively
slow. My view of what's achievable is somewhere between
RTO and the time TCP gives up. Interestingly, on this timescale
TCP has probably already slowed down.
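To make the timescale concrete, here is a rough sketch in Python. It
assumes the standard TCP retransmission timer calculation (RTO =
SRTT + max(G, 4 * RTTVAR), with a one-second floor) as the lower
bound, and takes the "time TCP gives up" as a parameter, since that
value is implementation dependent. The function names and the
example numbers are illustrative only, not something the draft
defines.

```python
def rto(srtt, rttvar, granularity=0.0, min_rto=1.0):
    """Retransmission timeout in seconds, standard TCP style.

    srtt and rttvar are the smoothed RTT and RTT variance in seconds.
    The one-second floor is the commonly recommended minimum.
    """
    return max(min_rto, srtt + max(granularity, 4 * rttvar))


def in_detection_window(detect_after, srtt, rttvar, tcp_give_up):
    """True if a failure-detection delay lies between RTO and the
    point where TCP abandons the connection.

    tcp_give_up is an assumed, implementation-dependent value.
    """
    return rto(srtt, rttvar) <= detect_after <= tcp_give_up


# Hypothetical path: 100 ms SRTT, 50 ms RTTVAR, TCP gives up at 120 s.
too_fast = in_detection_window(0.5, 0.1, 0.05, 120.0)
reasonable = in_detection_window(10.0, 0.1, 0.05, 120.0)
too_slow = in_detection_window(300.0, 0.1, 0.05, 120.0)
```

With these assumed numbers, half a second is below RTO and so too
aggressive, ten seconds falls inside the window, and five minutes
is too late to help TCP.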
One can also envision that applications would be able to tell the IP
or transport layer that the current connection is unsatisfactory and
an exploration for a better one would be desirable. This would
require an API to be developed, however.
The application layer does have the ability to diagnose connectivity
problems on the order of seconds, through keep-alives. The IP layer
generally does not have the ability to detect whether a connection is
"satisfactory" since it does not have access to the TCB, only
knowledge of potential causes of connectivity problems (such as path
changes or missing routes), which it can provide to the transport layer
or to applications.
We do have direct information in some cases, see above.
But in general... I think we should stay away from
trying to define "satisfactory", and simply work on
a binary model where there's either connectivity or
there isn't.
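As a rough illustration of what I mean by a binary model, here is
a small Python sketch. The names (PathState, has_connectivity) and
the ten-second default are made up for this example and are not
from the draft. A local failure gives an immediate answer, and
everything else falls back to a slow timeout.

```python
from dataclasses import dataclass


@dataclass
class PathState:
    # Direct local knowledge, e.g. the interface's link indication.
    local_link_up: bool
    # Time since anything was last heard from the peer on this path.
    seconds_since_last_reply: float


def has_connectivity(state, timeout=10.0):
    """Binary verdict: either there is connectivity or there isn't.

    timeout is a hypothetical value, meant to sit somewhere between
    RTO and the time TCP gives up.
    """
    if not state.local_link_up:
        # Local failure: we KNOW there is a problem, no need to wait.
        return False
    # Path failure can only be inferred from silence, which is slow.
    return state.seconds_since_last_reply < timeout


link_down = has_connectivity(PathState(False, 0.2))
quiet_path = has_connectivity(PathState(True, 30.0))
healthy = has_connectivity(PathState(True, 0.2))
```

Note that there is no notion of "satisfactory" here at all: the
result is just True or False.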
--Jari