
Re: Question about REAP state transition (draft-ietf-shim6-failure-detection-09)



On 12 dec 2007, at 17:15, Alberto García wrote:

First, my understanding of the specification is that Probe Exploring
messages carry neither "probe received information source address"
nor "probe received information destination address", etc.

The specification requires probe packets to contain the source and destination addresses used to send the current probe, as well as the "probe nonce" and "probe data" fields. A probe is allowed, but not required, to also contain the source/dest/nonce/data values from probes that were sent earlier. The reason to include them is that the receiver then knows which address pairs the sender tried in previous packets; if the probes sent with those address pairs didn't make it to the receiver, that's a hint that those address pairs won't work in the opposite direction either.

It's also allowed, but not required, to copy back the source/dest/nonce/data fields from one or more received probes. This tells the receiver which of the probes it sent previously were successful, so it can select a working address pair. Without this information, it's probably hard for a receiver to draw definitive conclusions about which address pairs work.
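
To make the two optional kinds of report concrete, here is a minimal Python sketch. It is only an illustration of the idea described above, not the wire format from the draft: the class names, field names, and the two helper functions are all made up for this example, and the addresses are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass(frozen=True)
class ProbeReport:
    """One source/dest/nonce/data set (hypothetical representation)."""
    src: str
    dst: str
    nonce: int
    data: int = 0


@dataclass
class Probe:
    # Required: the address pair and nonce of the probe being sent now.
    current: ProbeReport
    # Optional: reports about probes this sender sent earlier.
    sent: List[ProbeReport] = field(default_factory=list)
    # Optional: reports copied back from probes this sender received.
    received: List[ProbeReport] = field(default_factory=list)


def confirmed_pairs(my_sent: List[ProbeReport], incoming: Probe) -> List[Tuple[str, str]]:
    """Address pairs of my own probes that the peer says it received."""
    mine = {(r.src, r.dst, r.nonce) for r in my_sent}
    return [(r.src, r.dst) for r in incoming.received
            if (r.src, r.dst, r.nonce) in mine]


def lost_pairs(arrived_nonces: Set[int], incoming: Probe) -> List[Tuple[str, str]]:
    """Address pairs the peer says it tried earlier but that never reached me."""
    return [(r.src, r.dst) for r in incoming.sent
            if r.nonce not in arrived_nonces]
```

With this, a host that sent probes on two pairs and then receives a probe whose "received" list echoes one of them can pick that pair as working, and can treat the peer's earlier probes that never arrived as a hint about broken pairs.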

Would you say that we should require that at least the set of source/dest/nonce/data fields from the last successfully received probe is included in outgoing probes?

If this is true, I wonder if it is possible that the exploration process could fail to find two valid unidirectional paths (one in each direction) even if
they exist. Suppose that a node A in Exploring state receives a Probe
Exploring, so it moves to Inbound_OK.

Ok.

For the next Probes it sends, it
includes the information about the valid locators for its incoming paths (B to A), but it is not able to find a working path from A to B for some time.

Hm, if there is a unidirectional path, wouldn't A find it at some point?

And if A includes source/dest/nonce/data fields from B's probe then B knows which address pair from B to A works, so both unidirectional paths will be found.

The trouble would be when A doesn't include information from probes it received from B. By the time B sees a successful packet from A, B has already moved on to a new pair, so both ends think there is working connectivity, but there is only connectivity from A to B and not from B to A.

In B, the Retransmission Timer of B expires because a valid path from A to B
was not found, so B starts testing other paths that are not working.

Retransmission timer = a new probe is sent? Note that we don't explicitly use this terminology in REAP. Think of each probe as a new probe without relationship to earlier ones.

Then, A
stops receiving data from B, so the Send timer expires (I don't find any
reason why all the possible paths should be explored in less than Send
Timeout time, so A could not test all possible paths from A to B in this
time).

The Send Timeout keeps track of when to send a keepalive. This is part of the failure detection mechanism, which runs when data is flowing. Only when there has been no incoming traffic (but there is outgoing traffic) is the reachability exploration procedure started, and that procedure doesn't use the Send Timeout. In practice, data traffic will continue to be generated concurrently with the reachability exploration, but the two types of packets don't interact.

Then, A falls to the Exploring state, and (in the supposition of the
previous paragraph) forgets about the working path from B to A.

A host should only enter the Exploring state when it has been sending outgoing packets but isn't seeing incoming packets. So it SHOULD be impossible to get into the Exploring state when there is working reachability in both directions.
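
As a rough illustration of that condition, here is a small Python sketch. The class, its methods, and the `silence_limit` parameter are hypothetical names for this example, not timers or constants taken from the draft; the only thing it models is "outgoing packets, but nothing incoming since".

```python
from typing import Optional


class FailureDetector:
    """Toy model: explore only if we sent something and heard nothing back."""

    def __init__(self, silence_limit: float):
        # Hypothetical parameter: how long to wait for any incoming packet.
        self.silence_limit = silence_limit
        # Time of the first outgoing packet since the last incoming one.
        self.waiting_since: Optional[float] = None

    def on_send(self, now: float) -> None:
        # Start waiting only if we are not already waiting.
        if self.waiting_since is None:
            self.waiting_since = now

    def on_receive(self, now: float) -> None:
        # Any incoming packet shows the peer can reach us.
        self.waiting_since = None

    def should_explore(self, now: float) -> bool:
        return (self.waiting_since is not None
                and now - self.waiting_since >= self.silence_limit)
```

An idle host never triggers exploration here, and neither does a host that keeps receiving packets; only sending without hearing anything for `silence_limit` does.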

Maybe A now
sends a probe to B over a working path, but the same thing happens at B (it now tries different paths from B to A that are not valid, so A tries other paths from A to B, abandoning the good one...). In this case, two valid unidirectional paths existed, one from A to B and another from B to A, but the protocol could not find them, because they were not known to both nodes at the
same time.
Am I missing something?

I'm thinking something like this could happen if one end doesn't copy back the other's probe information, but not in any other circumstance.
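
A toy simulation of this, in Python, may help. It is a deliberately simplified model and not the REAP state machine: the function name, the round-based timing, and the rule that a host only settles on an outgoing pair once the peer reports it are all assumptions made for the example.

```python
from itertools import cycle


def explore(pairs_ab, pairs_ba, works, copy_back, rounds=20):
    """Each round both hosts send one probe on their current address pair.

    works: set of (src, dst) pairs that actually deliver packets.
    Returns (pair A settled on for A->B, pair B settled on for B->A);
    an entry is None if that host never learned a working pair.
    """
    locked = {"A": None, "B": None}
    candidates = {"A": cycle(pairs_ab), "B": cycle(pairs_ba)}
    last_heard = {"A": None, "B": None}  # pair of the last probe each host received

    for _ in range(rounds):
        # Decide what each host sends before delivering anything.
        sent = {h: locked[h] or next(candidates[h]) for h in ("A", "B")}
        for host, peer in (("A", "B"), ("B", "A")):
            pair = sent[host]
            if pair not in works:
                continue  # probe is lost
            if copy_back and last_heard[host] is not None:
                # The probe copies back the peer's pair that reached the
                # sender, so the peer learns which of its pairs works.
                locked[peer] = last_heard[host]
            last_heard[peer] = pair
    return locked["A"], locked["B"]


ab = [("A1", "B1"), ("A2", "B1")]
ba = [("B1", "A1"), ("B1", "A2"), ("B2", "A1")]
works = {("A2", "B1"), ("B2", "A1")}  # one working path in each direction

print(explore(ab, ba, works, copy_back=True))
print(explore(ab, ba, works, copy_back=False))
```

In this model, with copying back enabled both hosts settle on their working pair after a few rounds; without it, neither host ever learns which of its outgoing pairs works, even though probes do occasionally get through in both directions.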