
Failure Detection (was Re: soft state (was Re: shim6 and bit errors in data packet headers



Hi Erik, Iljitsch,

My understanding is that the two mechanisms that Erik proposes seem to be the base mechanisms for detecting that a pair of addresses is no longer working, that is:

No it isn't hard to avoid. What we need is more or less exactly what we have in NUD in RFC 2462 but applies end to end (and as a result using a different timeout behavior etc):
- A probe using the current address pair which is sent when a ULP packet is sent, but only if there is no e2e reachability confirmation in the last N seconds.
- Optional ULP positive advice which suppresses the need to probe since it provides an e2e reachability confirmation
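
To make the two points above concrete, here is a minimal sketch of how I read them. It is an illustration only: the class and method names are hypothetical, N is left as a parameter, and probing when no confirmation has been seen yet is my assumption, not something stated above.

```python
class ReachabilityState:
    """Tracks e2e reachability confirmations for the current address pair."""

    def __init__(self, n_seconds):
        # N: how long a confirmation stays valid (value is an assumption).
        self.n_seconds = n_seconds
        self.last_confirmation = None

    def confirm(self, now):
        # Called on a probe reply or on optional ULP positive advice.
        self.last_confirmation = now

    def should_probe_on_send(self, now):
        # Called when a ULP packet is sent: probe only if there was no
        # e2e reachability confirmation in the last N seconds.
        if self.last_confirmation is None:
            return True  # assumption: no confirmation yet means probe
        return (now - self.last_confirmation) > self.n_seconds
```

Note that ULP positive advice simply feeds confirm(), which is how it suppresses the need to probe.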



IMHO this is enough, as Erik says, and these two should be the building blocks of failure detection.


However, my understanding is that a mechanism along the lines of what Iljitsch suggests (based on observing the existence of a bidirectional flow of packets) could be an interesting optimization in the bidirectional case.

My understanding of how such mechanism would work is the following:

The shim layer observes the amount of traffic exchanged during the last T seconds, where Tx is the number of packets transmitted and Rx the number of packets received during that interval.
If Tx>0 and Rx>0, then no problem
If Tx=0, then no problem
If Tx>0 and Rx=0, then perform a reachability test to verify the current locator pair and, if needed, a path exploration exchange
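
The three rules above can be sketched as follows. This is only an illustration of the decision, under the assumption that the shim keeps per-context packet counters for the last T seconds; the names Action and check_window are hypothetical.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    REACHABILITY_TEST = "reachability_test"


def check_window(tx, rx):
    """Decide what to do given packet counts over the last T seconds.

    tx: packets transmitted, rx: packets received (both >= 0).
    """
    if tx > 0 and rx == 0:
        # We are sending but hearing nothing back: verify the current
        # locator pair (path exploration may follow if the test fails).
        return Action.REACHABILITY_TEST
    # Either traffic flows in both directions (tx > 0 and rx > 0),
    # or we are not sending at all (tx == 0): no problem.
    return Action.NONE
```

For example, check_window(5, 0) asks for a reachability test, while check_window(5, 3) and check_window(0, 0) do nothing.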


Upon receiving a path exploration message from the peer, the node must perform a reachability test to verify the current locator pair.


So, imho there are at least two questions to answer in order to see if this is interesting:
1- Is this really an optimization? i.e. does it provide some form of improvement (and is it worth it)?
2- How frequent do we expect this case to be, i.e. is it frequent enough to be worth optimizing?


My opinions on these two are the following:

About 1: I guess this mechanism would improve the efficiency of the solution, because it would reduce the number of reachability tests exchanged to verify the current locator pair in the case of a bidirectional UDP flow (or any other bidirectional flow where the ULP does not provide positive feedback).

About 2: this mechanism optimizes the case of bidirectional flows that belong to ULPs that do not provide positive feedback (e.g. UDP).
However, this mechanism only works when at least one path with bidirectional connectivity is available. It fails when the only paths available are two different unidirectional paths. This means that if we want to support such a case (i.e. if we want to be able to preserve the communication in this case), then we need to be able to detect and overrule such a case.


Regards, marcelo