
Re: soft state (was Re: shim6 and bit errors in data packet headers)




On 10/05/2005, at 23:28, Erik Nordmark wrote:

I mean, in this case, probably the reachability test will be performed, because an outage will be detected, and if no reply is received, the path exploration process will begin, sending packets with alternative locators, right?
I mean, failure detection timers should be much shorter than session lifetime timers, right?

Why would we want to couple the state management aspects of shim6 with the shim6 test protocol? To me any such coupling seems undesirable, especially since the parameters for the test protocol (how quickly to detect failures) might be a function of upper layer advice, as well as upper layer hints of "working" or "not working".



Well, i guess that the situation where one of the nodes has lost the shim state can be seen as a form of failure, and my assumption is that the failure detection mechanisms will likely detect it first.


I think that the protocol behaviour would be something like this.

A communication is established between node A and node B
Later on, a shim context is created between those two nodes.
The parameters for that context are:
  ULIDs: IPA1 and IPB1
  Locators: for IPA1 (IPA1,...,IPAn)
            for IPB1 (IPB1,...,IPBm)
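
As a rough illustration of the context parameters listed above, the state could be sketched like this. The class name, field names and the helper method are hypothetical (nothing here is defined by shim6), and the only assumption is that the locator pair in use starts out equal to the ULID pair.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShimContext:
    """Hypothetical per-context state, as seen from node A."""
    ulid_local: str              # e.g. "IPA1"
    ulid_peer: str               # e.g. "IPB1"
    locators_local: List[str]    # (IPA1, ..., IPAn)
    locators_peer: List[str]     # (IPB1, ..., IPBm)
    current_local: str = ""      # locator currently used as source
    current_peer: str = ""       # locator currently used as destination

    def __post_init__(self) -> None:
        # Assumption: the communication starts with the ULIDs as locators.
        self.current_local = self.current_local or self.ulid_local
        self.current_peer = self.current_peer or self.ulid_peer

    def uses_alternative_locators(self) -> bool:
        # True once the locator pair in use differs from the ULID pair.
        return (self.current_local, self.current_peer) != (
            self.ulid_local,
            self.ulid_peer,
        )
```

With this sketch, uses_alternative_locators() is the condition that separates the cases where the ULIDs are still the locators from the cases where an alternative locator pair is in use.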

Suppose that for some reason node B loses the shim context (and only the shim context, i.e. the application and transport state about ongoing communications is preserved).

I guess that at this point we have several scenarios to consider:

Scenario a): the communication between A and B is still using IPA1 and IPB1 as locators.
This scenario has two subcases:
Scenario a.1) The communication is bidirectional and, e.g.,
TCP is providing acks of the progress of the communication.
This means that no periodic reachability test
nor any other shim signaling is being exchanged.
In this scenario, a loss of shim context would remain
undetected until there is a failure and node A detects it
and tries to explore alternative paths. This is so because
data packets will carry ULIDs and will be passed successfully
to the upper layers. Once there is a failure,
reachability test packets won't be recognized as belonging
to any existing shim context and the problem can be detected.


Scenario a.2) The communication is unidirectional.
In this case, periodic reachability tests need to be
performed in order to verify that the path is still working.
If node B loses its shim state, it won't recognize
the reachability test packets, and the loss of context can
be detected.



Scenario b) the communication between A and B is using alternative locators.
In this case, when node B loses the context, data packets won't be properly delivered at node B, because they won't be properly demuxed.
At this point, the reachability test will be performed to verify the locator pair being used.



I don't know if i am missing something, but AFAICS, all the situations where the shim context is lost result in a reachability test exchange, and that is why i was wondering if it wouldn't make sense to define a "no-context" error message as a reply to a reachability test request packet.
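
To make the idea concrete, here is a minimal sketch of how the node receiving a reachability test request could answer. The message layout, the type strings and the function name are all hypothetical, and it assumes the request carries a context tag and some random nonce.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ReachabilityRequest:
    """Hypothetical reachability test request; not a defined shim6 format."""
    context_tag: int
    src_locator: str
    dst_locator: str
    nonce: int


def handle_reachability_request(contexts: Dict[int, object],
                                req: ReachabilityRequest) -> dict:
    # Echo what identifies the request, so that the sender can match
    # the answer against the request it actually sent.
    echoed = {
        "nonce": req.nonce,
        "context_tag": req.context_tag,
        "locator_pair": (req.src_locator, req.dst_locator),
    }
    if req.context_tag in contexts:
        # Context still present: ordinary reply.
        return {"type": "REACHABILITY_REPLY", **echoed}
    # Context lost: answer with the proposed no-context error instead
    # of silently ignoring the request.
    return {"type": "NO_CONTEXT_ERROR", **echoed}
```

Note that the error is only ever produced as an answer to a request; the node that lost the state never sends it on its own.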



again i think that this is very much related to the failure detection mechanism... i mean, in this case, the node that still has the state (both for the shim and for TCP) will detect an outage and perform the reachability test process and eventually the path exploration process...
I am wondering if all these issues could be addressed by defining an error message as a possible response for the reachability test request message, if you see what i mean.

Even if we decide to couple the state management with the test protocol there would still be an issue. The test protocol is there to discover a working locator pair. Just because a locator pair is working doesn't mean that the peer has any context state for any particular shim6 context.



agree, but as i describe above, i think that the situation where context is lost will result in a reachability test.


imho, this is so, because the failure detection mechanisms will react fast.

Thus you'd need a "context alive" probe in addition to the locator pair test message.


I don't think so... imho, it would be enough if a node that receives a reachability test request packet for a context for which it has no state can reply with a no-context error. imho i cannot see the need for a periodic context alive probe protocol (yet)


I think we can come up with a way to do state management which doesn't require such added complexity.

I think that the problems with potential attacks using an error message to cause nodes involved in a shim communication to discard their state could be avoided by defining an error message that cannot be sent spontaneously by one of the hosts (or an attacker) but that needs to be sent as a reply to a reachability test request message (including some random seq number or similar).

Or sent in response to a data packet, but including enough of the data packet (the locator pair and context tag) that it is hard for an off-path attacker to guess.



But i fail to understand how the node that has lost the state can identify that a data packet belongs to a non existent shim state....


I mean, i guess that a first element that is relevant here is where we are going to carry the context tag.
If the context tag is carried in an extension header or dest option, then i can see that a node that receives a packet with one of those can easily detect that there is no context associated. (note that in this case, the context loss is only detected in the case where the locators used for the communication differ from the ULIDs, i.e. the extension header or dst option is included in the packet)


If the context tag is included in the flow label, then i don't see how a node that receives the data packet can determine that the packet is associated with a shim context that is no longer there. At this point, i guess that, as you mentioned in a previous mail, the data packet would be silently discarded, right?


i am not sure what you have in mind when stating that the error message could only be generated by a mitm (considering that one of the nodes has lost its state, so no previous cookie is available) but i think that defining an error message that can only be issued as a reply to a previous packet could do the trick

I meant sent in response to a packet, so that only an attacker on the path can construct a forged error message, as above.



I think that at this point it is clear to me that if we define a no-context error message, this message should be defined as a reply to a packet that refers to that context, and it should include enough information about this initial packet to verify that it is a reply to that packet.


The no-context error message cannot be issued spontaneously by a node.
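
A minimal sketch of the check on the node that still has the state, assuming the request carries a random nonce and the error echoes the nonce, the context tag and the locator pair. The class, method and field names are hypothetical.

```python
import secrets
from typing import Dict, Tuple


class Prober:
    """Hypothetical sender side of the reachability test exchange."""

    def __init__(self) -> None:
        # nonce -> (context tag, source locator, destination locator)
        self.outstanding: Dict[int, Tuple[int, str, str]] = {}

    def send_request(self, context_tag: int, src: str, dst: str) -> dict:
        # A random nonce is hard for an off-path attacker to guess.
        nonce = secrets.randbits(64)
        self.outstanding[nonce] = (context_tag, src, dst)
        return {"nonce": nonce, "context_tag": context_tag,
                "locator_pair": (src, dst)}

    def accept_no_context_error(self, err: dict) -> bool:
        expected = self.outstanding.get(err.get("nonce"))
        if expected is None:
            # Spontaneous error, or a wrong nonce: ignore it.
            return False
        echoed = (err.get("context_tag"),
                  *err.get("locator_pair", (None, None)))
        if echoed != expected:
            # Does not refer to the packet we actually sent.
            return False
        # Each outstanding request can be answered only once.
        del self.outstanding[err["nonce"]]
        return True
```

Only a node that saw the request (the peer, or an attacker on the path) can build an error that passes this check; what to do with the context once the error is accepted is left open here.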


Regards, marcelo


   Erik