
RE: Question about REAP state transition (draft-ietf-shim6-failure-detection-09)



Some comments inline.

|  -----Original Message-----
|  From: Iljitsch van Beijnum [mailto:iljitsch@muada.com]
|  Sent: Saturday, 15 December 2007 14:41
|  To: Alberto García
|  CC: shim6@psg.com
|  Subject: Re: Question about REAP state transition
|  (draft-ietf-shim6-failure-detection-09)
|  
|  On 12 dec 2007, at 17:15, Alberto García wrote:
|  
|  > First, my understanding of the specification is that Probe Exploring
|  > messages carry neither "probe received information source address"
|  > nor "probe received information destination address", etc.
|  
|  In the specification it's required for probe packets to contain the
|  source and destination addresses that are used to send the current
|  probe as well as "probe nonce" and "probe data" fields. It is allowed,
|  but not required, for a probe to also contain the source/dest/nonce/
|  data values from probes that were sent earlier. The reason to include
|  this is because that way, the receiver knows which address pair
|  combinations were tried by the sender in previous packets, and if the
|  probes with those address pairs didn't make it to the receiver, that's
|  a hint that those address pairs won't work in the opposite direction
|  either.
|  
|  It's also allowed but not required to copy back the source/dest/nonce/
|  data fields from one or more received probes. This tells the receiver
|  which probes it sent previously were successful so it can select a
|  working address pair. Without this information, it's probably hard for
|  a receiver to draw definitive conclusions about which address pairs
|  work.
|  
|  Would you say that we should require that at least the set of source/
|  dest/nonce/data fields from the last successfully received probe is
|  included in outgoing probes?
|  

Hmm. My understanding of the protocol was that it is important for Probes
sent by nodes in the Inbound_OK or Operational state (at least; maybe also
in the Exploring state) to inform the other side which incoming paths were
successful. Otherwise, it is not easy for the other side to assume that any
outgoing path is successful, especially since we are considering the case in
which bidirectional communication is not working.
I also think that this information must be included in the Probes sent
when the node is in the Exploring state, if it knows it (because it was
previously in the Inbound_OK state).
Maybe this should be stated more explicitly.
However, if this is the case, I don't fully understand why the Send timer is
used in the Inbound_OK state (as I comment at the end of this email).
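
To make the "copy back" idea concrete, here is a minimal sketch in Python.
It assumes a simplified in-memory probe; the names ProbeReport, Probe and
build_probe are hypothetical and not taken from the draft, and the real
message layout is not modelled. The only point is that the last received
report is attached regardless of the sender's state.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ProbeReport:
    """One source/dest/nonce/data tuple (hypothetical, simplified)."""
    src: str
    dst: str
    nonce: int
    data: int = 0


@dataclass
class Probe:
    state: str                  # "Operational", "Exploring" or "Inbound_OK"
    current: ProbeReport        # mandatory: the pair this probe is sent on
    sent: List[ProbeReport] = field(default_factory=list)      # optional
    received: List[ProbeReport] = field(default_factory=list)  # optional


def build_probe(state: str,
                current: ProbeReport,
                last_received: Optional[ProbeReport],
                earlier_sent: Sequence[ProbeReport] = ()) -> Probe:
    # Assumption argued for in this message: copy back the last probe that
    # was successfully received, even when the sender is back in Exploring.
    received = [last_received] if last_received is not None else []
    return Probe(state, current, list(earlier_sent), received)
```

With this, a node that has fallen back to Exploring still tells its peer
which B-to-A pair worked, as long as it remembers last_received.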

|  > If this is true, I wonder if it is possible that the exploration
|  > process could not find two valid unidirectional paths (one in each
|  > direction) even if they exist. Suppose that a node A in Exploring
|  > state receives a Probe Exploring, so it moves to Inbound_OK.
|  
|  Ok.
|  
|  > For the next Probes it sends, it
|  > includes the information about the valid locators for its incoming
|  > paths (B to A), but it is not able to find a working path from A to B
|  > for some time.
|  
|  Hm, if there is a unidirectional path, wouldn't A find it at some point?
|  And if A includes source/dest/nonce/data fields from B's probe then B
|  knows which address pair from B to A works, so both unidirectional
|  paths will be found.
|  
|  The trouble would be when A doesn't include information from probes it
|  received from B, so that by the time that B sees a successful packet
|  from A, B has already moved on to a new pair so both ends think there
|  is working connectivity but there is only connectivity from A to B but
|  not from B to A.
|  
|  > In B, the Retransmission Timer of B expires because a valid path

OK, "Initial Probe Timeout".
Regardless of the name, probes could be sent one after the other, with some
interval in between, so it would take some time to test all possible paths.
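
As a rough worked example of that timing argument, the sketch below
assumes a constant interval between probes and one probe per address pair
(both simplifications of mine; the draft's actual probe timing is not
modelled). The function names and the numbers in the usage note are
illustrative only.

```python
def full_sweep_seconds(n_local: int, n_remote: int,
                       probe_interval: float) -> float:
    """Time to send one probe on every (source, destination) address pair,
    assuming a constant interval between consecutive probes."""
    return n_local * n_remote * probe_interval


def sweep_fits_in_send_timeout(n_local: int, n_remote: int,
                               probe_interval: float,
                               send_timeout: float) -> bool:
    """True if every pair can be probed once before Send Timeout expires."""
    return full_sweep_seconds(n_local, n_remote, probe_interval) <= send_timeout
```

For instance, with 4 local and 4 remote locators and a hypothetical
1-second interval, one sweep takes 16 seconds, which would not fit in a
hypothetical 15-second Send Timeout.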

|  > from A to B
|  > was not found, so B starts testing other paths that are not working.
|  Retransmission timer = a new probe is sent? Note that we don't
|  explicitly use this terminology in REAP. Think of each probe as a new
|  probe without relationship to earlier ones.
|  
|  > Then, A
|  > stops receiving data from B, so the Send timer expires (I don't find
|  > any reason why all the possible paths should be explored in less than
|  > Send Timeout time, so A could not test all possible paths from A to B
|  > in this time).
|  
|  The Send Timeout keeps track of when to send a keepalive. This is part

Umm, that is the Keepalive timer, isn't it? The Send Timeout keeps track of
when you should consider that the incoming path is no longer valid: 
"6.  Send Timeout seconds after the transmission of a data packet with
       no return traffic on this context, a full reachability
       exploration is started."

And it is running when the node is in the Inbound_OK state (i.e. it thinks that
the incoming path is working; but after Send Timeout seconds without receiving
anything, it gives up and moves back to the Exploring state).
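
The transitions I am describing can be sketched as follows. This is only
my reading of the behaviour discussed in this thread, not the draft's full
state machine: the class ReapContext and its method names are hypothetical,
and only the two events discussed here are modelled.

```python
from enum import Enum


class State(Enum):
    OPERATIONAL = "Operational"
    EXPLORING = "Exploring"
    INBOUND_OK = "Inbound_OK"


class ReapContext:
    """Hypothetical, reduced model of one end of a REAP context."""

    def __init__(self) -> None:
        self.state = State.OPERATIONAL
        self.last_inbound_pair = None  # (src, dst) of last probe received

    def on_probe_exploring(self, pair) -> None:
        # A probe from the peer arrived, so the incoming path works.
        self.last_inbound_pair = pair
        if self.state is State.EXPLORING:
            self.state = State.INBOUND_OK

    def on_send_timeout(self) -> None:
        # No return traffic for Send Timeout seconds: (re)start exploration.
        if self.state in (State.OPERATIONAL, State.INBOUND_OK):
            self.state = State.EXPLORING
        # Assumption: last_inbound_pair is deliberately kept, so later
        # probes can still report it to the peer.
```

Whether last_inbound_pair should survive the fall back to Exploring is
exactly the open question; the sketch keeps it.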

|  of the failure detection mechanism, which runs when data is flowing.
|  Only when there has been no incoming traffic (but there is outgoing
|  traffic) the reachability exploration procedure is started, which
|  doesn't use the Send Timeout. In practice, data traffic will continue

The exploration process does use the Send timeout.

|  to be generated concurrently with the reachability exploration, but
|  the two types of packets don't interact.
|  
|  > Then, A falls to the Exploring state, and (in the supposition of the
|  > previous paragraph) forgets about the working path from B to A.
|  
|  A host should only enter the Exploring state when it has been sending
|  outgoing packets but isn't seeing incoming packets. So it SHOULD be
|  impossible to get into the Exploring state when there is working
|  reachability in both directions.
  
The problem is that reachability in the two directions may not be
established during the same period (this period being related to the Send
Timeout duration). So I think it IS possible to get into the Exploring state
even though valid paths exist in both directions.

|  > Maybe now A
|  > sends a probe to B through a working path, but in B the same happens
|  > (it now tries different paths from B to A that are not valid, so A
|  > tries other paths from A to B, abandoning the good one...). In this
|  > case, two valid unidirectional paths existed, one from A to B and the
|  > other from B to A, but the protocol could not find them, because they
|  > were not known at the same time by both nodes.
|  > Am I missing something?
|  
|  I'm thinking something like this could happen if one end doesn't copy
|  back the other's probe information, but not in any other circumstance.

That's my point. It should be stated that this information should be
included in all Probes.

However, another question then arises: what difference does it make to move
from the Inbound_OK state to the Exploring one? Since the Probes should
contain the same information (even if the state is back to Exploring), why
move to the Exploring state when the Send timer expires? (Or, what is done
differently depending on whether the Send timer has not yet expired in the
Inbound_OK state, or has expired and we have moved to the Exploring
state?)

Regards,
Alberto