
About draft-ietf-shim6-failure-detection-07 {2}



Hi,

I have another question regarding the failure detection draft. This is about the end of an exploration. I think it is useful to have a notion of
- when an exploration process begins.
- when an exploration process terminates.

* Motivation :
- An implementation might want to have a specific thread/process for each exploration. It is useful to know when the exploration thread may be terminated.
- We need to store information about each probe sent/received (because reports are included in the probes). It is useful to know when the memory used by this information can be released. This would ensure that only information relative to the current exploration is stored in memory. Also, a context in operational state would need less space than a context in course of exploration, which is more scalable if lots of contexts are present.

* The problem :

While it is rather evident when we must trigger an exploration (Send timer expiry or an incoming Probe Exploring), the end of an exploration is less certain. More precisely, *one of the peers* is not certain, while the other is. Here is an example situation with the currently defined scheme for the end of an exploration:

Peer A                                        Peer B
     |                                             |
   State:                                        State:
 Inbound_OK                                    Exploring
     |                                             |
     |           Probe Inbound OK                  |
     |-------------------------------------------->|
     |                                           State:
     |                                         Operational
     |           Probe Operational                 |
     |       /-------------------------------------|  path working, but probe lost
     |                                             |  (because of congestion for example)
     |           Probe Inbound OK                  |
     |-------------------------------------------->|
     |                                             |
     |           Probe Operational                 |
     |<--------------------------------------------|
     |                                             |  Now B can forget its received/sent probe reports,
   State:                                          |  because A is in state operational, but B has no
 Operational                                       |  way to know it.

As illustrated in this scenario, when A receives a Probe Operational, it knows for sure that B is operational, and therefore that the exploration process is terminated. But B won't receive such a probe. B enters the Operational state when it receives a Probe Inbound OK. This means that B knows the conversation will work from now on, but it does not mean that A will never again ask B to send its list of sent/received probe reports, as happens in the above scenario (because the first Probe Operational was lost).

* One possible (and simple :-) ) solution :

The only problem is that B needs to know for sure that A is operational, so just let B know by having A send another Probe Operational. Of course, with such a rule, we could end up with A and B sending Probes Operational to each other indefinitely. This can be solved simply by putting something different in the last Probe Operational, such as a flag. It can also be solved by sending no probe reports (psent=0 and precvd=0). This may seem to contradict my previous mail, but it doesn't: we are here in the special case where A *knows* that B is in the operational state, so B no longer needs any probe information; its only need is now to be sure that A is also in the operational state. In fact, the only thing B will do when receiving the last probe is check whether it is an Operational probe or an Inbound OK probe:
   - in the first case: stop the exploration process gracefully
   - in the second case: the last Probe Operational from B has been lost, so send another one.

This results in the following scenario (very similar, but much simpler and more efficient from an implementation point of view, IMHO):

Peer A                                        Peer B
     |                                             |
   State:                                        State:
 Inbound_OK                                    Exploring
     |                                             |
     |           Probe Inbound OK                  |
     |-------------------------------------------->|
     |                                           State:
     |                                         Operational
     |           Probe Operational                 |
     |       /-------------------------------------|  path working, but probe lost
     |                                             |  (because of congestion for example)
     |           Probe Inbound OK                  |
     |-------------------------------------------->|
     |                                             |
     |           Probe Operational                 |
     |<--------------------------------------------|
     |                                             |
   State:                                          |
 Operational                                       |
     |                                             |
     |           Probe Operational|final           |
     |-------------------------------------------->|
     |                                             |  Now B can forget its received/sent probe reports.

In fact, the final flag is not necessary, because there is no ambiguity. Here is what the difference in the state machine would be:

event: reception of a probe message with State Operational
- if in state Operational: just update timers (see draft) -> *end of exploration*
- if in state Inbound_OK: goto Operational, update timers AND send a Probe Operational
- if in state Exploring: goto Operational, update timers AND send a Probe Operational (hmm, in fact I guess this case should not occur unless one of the peers is buggy, but this response to such an unlikely event seems OK).
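
To make the proposed rule concrete, here is a minimal sketch in Python that replays the lost-probe scenario. It is only an illustration under stated assumptions: the `Context` class, its `exploring` flag, the `probe_reports` list and the method names are hypothetical and do not come from the draft, timer handling is left as a stub, and the Inbound OK handler simply mirrors the behaviour shown in the diagrams. A peer that leaves Inbound_OK on a Probe Operational is also assumed to end its own exploration, since it then knows the other side is operational.

```python
from enum import Enum


class State(Enum):
    OPERATIONAL = "Operational"
    EXPLORING = "Exploring"
    INBOUND_OK = "Inbound_OK"


class Context:
    """Hypothetical per-peer context; names are illustrative, not from the draft."""

    def __init__(self, state):
        self.state = state
        self.exploring = True            # an exploration is in progress
        self.probe_reports = ["report"]  # stand-in for stored sent/received reports

    def update_timers(self):
        pass  # timer handling is described in the draft and omitted here

    def end_exploration(self):
        # Point at which the exploration thread and report memory can be released.
        self.exploring = False
        self.probe_reports.clear()

    def on_probe_inbound_ok(self):
        """The peer hears us: become Operational and answer Probe Operational."""
        self.state = State.OPERATIONAL
        self.update_timers()
        return [State.OPERATIONAL]       # probes to send back

    def on_probe_operational(self):
        """Reception of a probe whose State is Operational (proposed rule)."""
        self.update_timers()
        if self.state is State.OPERATIONAL:
            # Both sides are known to be operational: end of exploration,
            # and nothing is sent back, so there is no endless exchange.
            self.end_exploration()
            return []
        # Inbound_OK (or, unexpectedly, Exploring): go Operational and tell the peer.
        self.state = State.OPERATIONAL
        self.end_exploration()           # assumption: the peer is known operational
        return [State.OPERATIONAL]


def run_scenario():
    a, b = Context(State.INBOUND_OK), Context(State.EXPLORING)
    b.on_probe_inbound_ok()              # B answers, but this answer is lost
    to_a = b.on_probe_inbound_ok()       # A retransmits Probe Inbound OK
    to_b = a.on_probe_operational()      # A receives B's second answer
    final = b.on_probe_operational()     # B receives A's Probe Operational
    return a, b, to_a, to_b, final
```

Calling `run_scenario()` leaves both contexts in the Operational state with their stored reports released, and B's last handler returns an empty list, which is why no explicit final flag is needed to stop the exchange.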

What is your opinion ?
Thanks for any comments,

Sébastien.


--
Sébastien Barré
Researcher,
CSE department, UCLouvain, Belgium