
Re: address pair exploration, flooding and state loss



marcelo bagnulo braun wrote:

but this would require including the following information in all data packets of a context session:
- some indication that this is a shim data packet (note that until now, data packets exchanged within a shim session that used ULIDs as locators were not necessarily tagged as shim packets, i.e. they were simple IP packets. This would need to change if these data packets are to be used to detect context loss)
- some context identifier in order to properly identify the involved context
- some random nonce in order to limit TCP RST-type attacks to on-path (MITM) attackers


Note that the flow label option can carry a context id, and perhaps a shim flag bit, but I guess it cannot carry the random nonce.
The extension header can carry all three of them, but it may then need to be carried in all shim data packets.

I'm not sure what type of state loss you are trying to guard against, because there are at least two.


1. B crashes and reboots, hence it loses all shim6 as well as transport and application state.
2. The shim6 layer on B garbage collects some context too early - when there is still some use of it.


In #1 it is key that we can get A to realize that the context has been lost on B, so that we can get to the point where the shim on B can pass up the packets to the ULPs. This will cause the ULPs to generate errors (e.g. TCP RST) and we are back to what we have today without a shim.

For #2 I think we should figure out how an implementation can minimize the occurrence. But in many implementations I suspect we can't completely get rid of such occurrences when there are long-lived UDP "sessions" in the applications; the kernel TCP/UDP/IP stack wouldn't be aware of such "sessions".

Above you seem to be concerned about case #2, with the added twist that the context state has been established but the original ULIDs still work as locators.
I think I commented before that the heuristics on B for deferred context establishment might recover from this. If the heuristic is to try to set up a shim6 context after N packets, then after N packets have passed, B would try to set up a context with A, at which point A would see that it already has a context with B.
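
A minimal sketch of that heuristic, assuming a simple per-peer packet counter on B. The class ShimHost, its method names, the returned strings and the threshold value are all hypothetical and only illustrate the reasoning; none of this is taken from a shim6 specification.

```python
from dataclasses import dataclass, field


@dataclass
class ShimHost:
    """Toy model of a host with a shim layer (hypothetical, for illustration)."""
    name: str
    threshold: int = 3                                   # the "N" in the text (assumed value)
    contexts: set = field(default_factory=set)           # peers we hold context state for
    packet_counts: dict = field(default_factory=dict)    # packets seen per peer without context

    def on_data_packet(self, peer):
        """Count a data packet exchanged with peer; attempt setup after N packets."""
        if peer.name in self.contexts:
            return None                                  # context exists, nothing to do
        count = self.packet_counts.get(peer.name, 0) + 1
        self.packet_counts[peer.name] = count
        if count < self.threshold:
            return None                                  # still deferring establishment
        result = peer.on_context_request(self)
        self.contexts.add(peer.name)                     # we now hold state for the peer
        self.packet_counts.pop(peer.name, None)
        return result

    def on_context_request(self, initiator):
        """Handle a setup attempt from initiator."""
        if initiator.name in self.contexts:
            # We already have a context: the initiator must have lost its state.
            return "peer-lost-context"
        self.contexts.add(initiator.name)
        return "context-created"


def simulate(threshold, packets):
    """A keeps its context with B; B has garbage collected its own."""
    a = ShimHost("A")
    b = ShimHost("B", threshold=threshold)
    a.contexts.add("B")
    return [b.on_data_packet(a) for _ in range(packets)]
```

With a threshold of 3, the third packet triggers B's setup attempt and A notices the existing context; before that point neither side knows anything is wrong.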


Should the <A1, B1> locator pair fail before those N packets, then if A detects the failure it will start probing additional locator pairs. As you've pointed out earlier, it makes sense for this probing to also be able to detect when the context is unknown to B.

So I think the only case which might be problematic is when
 - B garbage collects the context state while it is still used
 - <A1, B1> stops working before N packets have been exchanged after the
   garbage collection
 - A doesn't detect that there was a failure (because A isn't trying to
   send something)

In this case B wouldn't be able to recover, because it doesn't have any context state and hence doesn't know the alternate locator pairs.
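
The case analysis above can be written down as a small decision function. This is only a sketch of the argument; the function name, parameters and returned strings are made up for illustration.

```python
def recovery_path(packets_before_failure, n_threshold, a_is_sending):
    """Classify what happens after B garbage collects a context too early.

    packets_before_failure: packets exchanged after the garbage collection
        before <A1, B1> stops working, or None if the pair keeps working.
    n_threshold: the N of B's deferred context establishment heuristic.
    a_is_sending: whether A is trying to send (and so can detect the failure).
    """
    if packets_before_failure is None or packets_before_failure >= n_threshold:
        # B reaches N packets and tries to set up a context with A again.
        return "B re-establishes context"
    if a_is_sending:
        # A detects the failure and probes additional locator pairs.
        return "A probes alternate pairs"
    # B has no context state, so it does not know any alternate locator pair.
    return "no recovery"
```

Only the last branch is the problematic combination: early garbage collection, failure before N packets, and a silent A.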

I don't think this case is common enough to warrant sticking a lot of extra bits in each data packet.
I wonder if it is even worthwhile to include one extra bit in the data packets (this would be a bit which asserts that the sender has shim6 state for the context).
It's hard to find a free bit, and the "usual" :-) place to steal one is the nexthdr field, which would raise a bunch of issues about firewall uniformity. (If a TCP packet is 6 before the context state is set up, and e.g. 250 after it has been set up, then firewalls might drop the packets after the setup even though the initial TCP packets made it through.)
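
To make the firewall concern concrete, here is a toy filter that only knows about protocol 6 (TCP). The value 250 is just the example number used above, not an assigned protocol number, and the function names are invented for this sketch.

```python
IPPROTO_TCP = 6            # IANA protocol number for TCP
EXAMPLE_SHIM_TCP = 250     # the "e.g. 250" above; purely illustrative


def firewall_allows(next_header, allowed=frozenset({IPPROTO_TCP})):
    """A firewall that passes only the next-header values it was configured for."""
    return next_header in allowed


def flow_outcome(next_headers):
    """Return, per packet of one flow, whether the firewall passes it."""
    return [firewall_allows(nh) for nh in next_headers]


# Two packets before the context is set up, two packets after it.
outcome = flow_outcome([IPPROTO_TCP, IPPROTO_TCP, EXAMPLE_SHIM_TCP, EXAMPLE_SHIM_TCP])
```

The first two packets pass and the last two are dropped, even though all four belong to the same TCP connection.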


So I'd punt on this rare case. If B drops context state too early then it shouldn't be surprised if failover to a different locator pair doesn't work.

   Erik