
Re: Comments on draft-ietf-shim6-failure-detection



On 31-okt-2005, at 7:04, Jari Arkko wrote:

I am not clear that the "multi-homing protocol" necessarily has the right
information to do testing and failure detection correctly.

Since the multihoming / shim layer is the only one equipped to take action to repair such failures, I don't see how this could work elsewhere. (There are some corner cases, such as SCTP or applications that can sometimes repair broken connectivity themselves.)

For example, it does not make sense to diagnose a "connectivity problem"
on a time scale less than RTO.  Yet only the transport layer typically
possesses the RTO estimate.

RTO == ? "RT" must be "round trip", but what does the "O" stand for?

Trying to repair failures on a subsecond scale has many undesirable side effects. Until these are studied we should stay well away from this.
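
If RTO in the quoted text means the transport layer's retransmission timeout (an assumption; the quoted text does not expand the term), the following is a minimal sketch of how such an estimate is commonly maintained, using the smoothing scheme from RFC 6298. The class name, parameter names, and default values are illustrative only and are not taken from the draft under discussion.

```python
class RtoEstimator:
    """Illustrative retransmission timeout estimator in the style of RFC 6298."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4           # variance multiplier

    def __init__(self, clock_granularity=0.001, min_rto=1.0, initial_rto=1.0):
        # All values are in seconds; the defaults here are assumptions.
        self.g = clock_granularity
        self.min_rto = min_rto
        self.srtt = None      # smoothed round-trip time
        self.rttvar = None    # round-trip time variation
        self.rto = initial_rto

    def update(self, rtt_sample):
        """Fold one RTT measurement into the estimate and return the new RTO."""
        if self.srtt is None:
            # First measurement: seed both estimators from the sample.
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt_sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_sample
        self.rto = max(self.min_rto,
                       self.srtt + max(self.g, self.K * self.rttvar))
        return self.rto


if __name__ == "__main__":
    est = RtoEstimator()
    for sample in (0.10, 0.12, 0.30, 0.11):
        print(f"sample={sample:.2f}s rto={est.update(sample):.3f}s")
```

With the one-second floor used as the default here, the timeout never drops below a second no matter how short the measured round trips are, which fits the caution about sub-second repair above.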

Similarly, if the cause of the connectivity loss is a route flap, then
only the routing layer might have knowledge of the loss of the route, and
only if it is participating in the routing mesh.

And if a failure occurs because the company went bankrupt, only the CTO knows the details... But who cares? The point is that there is no longer any connectivity, and it's impossible to make sure that information about this finds its way to the hosts involved, so these hosts must discover it by themselves.

For example, in ad hoc
networks, missing routes are a frequent contributor to packet loss, so
that integration of the routing and transport layers is required to be
able to respond appropriately.

What would a host do differently if it had this knowledge?

The application layer does have the ability to diagnose connectivity
problems on the order of seconds, through keep-alives.

Keepalives are bad; they waste bandwidth. Applications shouldn't use them.