
Re: soft state (was Re: shim6 and bit errors in data packet headers)



Hi Erik,

FWIW, I am neither proposing nor defending this approach, just trying to explore what it would look like...

On 09/05/2005, at 19:37, Erik Nordmark wrote:

marcelo bagnulo braun wrote:


But the state can be rebuilt using the initial exchange again, right?

How will A know in a timely manner that B has lost state, if B never responds?


I mean, I am thinking of the following scenario (which I think Brian suggested some time ago):
Node A and Node B start a communication using IPA1 and IPA2 respectively.
A while after, they decide to create a SHIM context for this communication. For that, they perform the initial exchange and then continue the communication for a while.
At this point I can think of 3 scenarios:
1- The "normal" situation, i.e. there is no abrupt loss of context for external reasons. In this case, each packet received corresponding to that context would extend the lifetime of the context information. If no packets are exchanged for a given period T, then both ends discard the state. I guess T should be defined in a conservative way, i.e. long enough.
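
The lifetime rule in scenario 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, method names, and the 600-second value of T are hypothetical and do not come from any shim6 draft.

```python
class SoftStateContext:
    """Hypothetical shim context whose lifetime is extended by received packets."""

    def __init__(self, lifetime_t: float, now: float) -> None:
        self.lifetime_t = lifetime_t   # the period T (assumed value, in seconds)
        self.last_received = now       # time of the last packet received for this context

    def on_packet_received(self, now: float) -> None:
        # Each received packet belonging to the context extends its lifetime.
        self.last_received = now

    def expired(self, now: float) -> bool:
        # The state is discarded once nothing has been received for T.
        return now - self.last_received >= self.lifetime_t


# Example with an assumed T of 600 seconds.
ctx = SoftStateContext(lifetime_t=600.0, now=0.0)
ctx.on_packet_received(now=500.0)           # traffic at t=500 refreshes the lifetime
still_alive = not ctx.expired(now=900.0)    # 400 s idle, below T
discarded = ctx.expired(now=1100.0)         # 600 s idle, T reached
```

Note that the timer here is driven only by what this node receives, so each end decides on its own when to discard the state.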

Such an approach isn't robust in the presence of packet loss.
A could be sending a few packets, but B might never receive them due to random packet loss. So you end up with B discarding the state while A thinks B should still have the state, since A sent packets to B.

I am not sure about that...

I mean, in this case the reachability test will probably be performed, because an outage will be detected, and if no reply is received, then the path exploration process will begin, sending packets with alternative locators, right?

I mean, failure detection timers should be much shorter than session lifetime timers, right?

My concern now, as I think you suggest below, is that before trying to reestablish the context (if no no-context-available error message is defined), all alternative paths need to be explored. That is a time-consuming task at the very least, and it will delay the reestablishment process for a probably unacceptable period.


FWIW the CLOSE exchange in draft-ietf-hip-base-02.txt avoids this problem by basing the "should I time this out" decision on the receipt of packets, and not on packets being sent.

I guess this heavily depends on the failure detection mechanism... From previous discussions on this, I think that the node that detects the outage and performs the reachability test will be the node that has packets to send...


I think you'd need a similar approach in a semi-hard-state approach to shim6.

2- One of the nodes reboots and all state is lost. In this case, I don't think it makes much sense to rebuild the context, because the upper layer state has also been lost: there is no upper layer that wants to preserve a communication, so there is nothing to resume the SHIM context for.

The problem, if you do nothing, is that the upper layer would not find out in a timely manner. For instance, TCP would not receive a reset packet when the peer has rebooted, because the shim layer would pass up a TCP packet with a bad checksum (due to not having replaced the locators with the ULIDs).
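
The checksum failure can be illustrated with a small sketch. The TCP checksum covers an IPv6 pseudo-header that includes the source and destination addresses, so a segment checksummed over the ULIDs will not verify when the receiver sees a different locator in the address field. The addresses below are made-up documentation-prefix examples and the all-zero 20-byte segment is only a placeholder, not a real TCP header.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src: str, dst: str, segment: bytes) -> int:
    """Checksum over the IPv6 pseudo-header (addresses, length, next header 6) plus the segment."""
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", len(segment), 6)
    )
    return internet_checksum(pseudo + segment)


segment = bytes(20)                                   # placeholder segment, checksum field zeroed
ulid_a, ulid_b = "2001:db8:a::1", "2001:db8:b::1"     # hypothetical ULIDs
loc_a = "2001:db8:a2::1"                              # hypothetical alternative locator for A

sent = tcp_checksum(ulid_a, ulid_b, segment)              # what the sender's TCP computed
seen_without_shim = tcp_checksum(loc_a, ulid_b, segment)  # what a receiver without shim state checks
checksum_fails = sent != seen_without_shim
```

With these example addresses the two values differ, so the receiving TCP would drop the segment rather than answer it with a reset.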


So if you do nothing you are most likely making the notification of an upper layer state loss a lot worse than it is without shim6.


right

Again, I think that this is very much related to the failure detection mechanism... I mean, in this case, the node that still has the state (both for the shim and for TCP) will detect an outage and perform the reachability test process and eventually the path exploration process...

I am wondering if all these issues could be addressed by defining an error message as a possible response to the reachability test request message, if you see what I mean.

I think that the problems with potential attacks using an error message to cause nodes involved in a shim communication to discard their state could be avoided by defining an error message that cannot be sent spontaneously by one of the hosts (or an attacker) but that needs to be sent as a reply to a reachability test request message (including some random sequence number or similar).
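
A rough sketch of that idea, assuming the reachability test request carries a random nonce and the error reply must echo it. The class, the method names, and the 8-byte nonce size are hypothetical; no message format is defined here.

```python
import secrets


class ReachabilityProber:
    """Hypothetical prober that only trusts errors echoing one of its own nonces."""

    def __init__(self) -> None:
        self.outstanding: set[bytes] = set()

    def send_probe(self) -> bytes:
        # The nonce would travel in the reachability test request.
        nonce = secrets.token_bytes(8)
        self.outstanding.add(nonce)
        return nonce

    def accept_no_context_error(self, echoed_nonce: bytes) -> bool:
        # An error is believed only as a reply to a probe we actually sent,
        # and each nonce is accepted at most once.
        if echoed_nonce in self.outstanding:
            self.outstanding.discard(echoed_nonce)
            return True
        return False


prober = ReachabilityProber()
nonce = prober.send_probe()
spontaneous = prober.accept_no_context_error(b"forged")   # no matching probe
genuine = prober.accept_no_context_error(nonce)           # echoes our nonce
replayed = prober.accept_no_context_error(nonce)          # same nonce again
```

A sender that cannot see the probe has to guess the nonce, while a node on the path can still read and echo it.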

3- There is some abrupt loss of context state in one of the nodes (and only the context state). (I am not sure how likely this is. I guess it could happen when, because of the soft state, one of the nodes discards the state before its peer, and after that the peer wants to resume the communication associated with this context. I guess such a situation becomes less likely as the inactive time required to discard the context grows.) In any case, suppose that node A suddenly loses its state about the SHIM context. At this point, node B would detect that the context has been lost. How this is detected remains to be analyzed in depth.
- If the communication was still using the ULIDs as locators,
then data packets will flow without problems and no problem
is detected. I guess that for this case, a SHIM keepalive
may be needed to detect this case
- If the communication is using locators that differ from ULIDs
then, i guess that upper layers will send an error message
back, like port unreachable or similar, so that the context
loss can be detected

NO, NO, NO.
The upper layer packets will be silently discarded with a checksum error.

right


- If SHIM signaling packets are sent by node B, the absence
of replies would indicate that the context has been lost
At this point, when node B detects that the context state has been lost, node B can try to perform the initial exchange again, using the same information that was used during the initial exchange the first time. I think this would permit node A to rebuild the lost state.
I think this soft state could work without requiring the error message.

And I think it is at least very hard. The protocol would be more robust with an error message.


And since we are not trying to build a security protocol - we just don't want to make things less secure than they are today - an error message which will only be believed when generated by an on-path node, should be ok even if it makes the peers re-establish the shim6 state.


I am not sure what you have in mind when stating that the error message could only be generated by a MITM (considering that one of the nodes has lost its state, so no previous cookie is available), but I think that defining an error message that can only be issued as a reply to a previous packet could do the trick.


Regards, marcelo


But in any case, note that through error message spoofing the attacker may manage to force the context to be re-initiated while the communication still flows (in the case that the ULIDs are still being used as locators). That would allow an attacker that has a compatible CGA parameter data structure, for instance, to place himself in the middle of an ongoing communication.

But getting a compatible CGA structure is hard.

   Erik