Re: address pair exploration, flooding and state loss
On 27/05/2005, at 22:57, Erik Nordmark wrote:
marcelo bagnulo braun wrote:
In #1 it is key that we can get A to realize that the context has
been lost on B, so that we can get to the point where the shim on B
can pass up the packets to the ULPs. This will cause the ULPs to
generate errors (e.g. TCP RST) and we are back to what we have today
without a shim.
besides at this point, there is no possible recovery, since ULP state
also has been lost
Yes, but "do no harm" presumably includes "don't slow down current
failure detection".
The recovery, once the RST comes back, could be to open a new
connection, or something else in the application layer. (Think of an
application with the same behavior as wget -c)
I think there is a significant quality difference between
1. The box reboots in 30 seconds. The first TCP retransmission after
30 seconds results in a RST coming back. This triggers application
recovery.
2. The box reboots in 30 seconds. TCP retransmissions for the next 10
minutes are silently dropped because the retransmissions arrive at the
peer's TCP with a bad checksum (due to "missing" shim6 rewrite).
Thus any RST doesn't arrive until 10 minutes or so later!
agree
so the conclusion is that reboots and lost state have to be detected at
least as fast as what is available today
right, the only point that i wonder w.r.t. this is if it is wise to
base the context loss detection procedure on the heuristics to
establish shim context... i mean, the heuristics for establishing
shim context may greatly vary, i guess. For instance, i think it may
be a possibility that some heavily loaded servers use the policy of
never initiating the shim session establishment procedure, and only
accept establishment requests from clients. In such a case, they
wouldn't detect context loss.
So we can point out in our RFCs why such a behavior would be
suboptimal.
Ensuring that one end can quickly detect when the peer has lost the
context state, even when the state isn't used (when the ULIDs are used
as locators), is far from inexpensive.
maybe, but it does constrain the possible heuristics used for
establishing shim sessions.
Maybe it would be enough to mention that the heuristics for
establishing shim sessions are also used to recover from those
situations where the context has been lost. However, i guess that we need
to take into account the cases where the heuristics for establishing shim
sessions are not useful for recovering from lost state, and provide a
worst case recovery for those cases.
In any case, if the procedure you described in the previous mail for
including both the context identifier and the nonce in a compact way
in 20 bits works, we could stuff all we need in all data packets, i guess,
so if the flow label approach is used, all data packets of
established shim sessions can be identified as such
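As a rough illustration of fitting both values into the 20-bit flow label, here is a minimal Python sketch. The 12/8 bit split and the names pack_flow_label and unpack_flow_label are assumptions made for this example only; the thread does not say how the 20 bits would be divided.

```python
from typing import Tuple

FLOW_LABEL_BITS = 20                     # width of the IPv6 flow label field
TAG_BITS = 12                            # assumed width of the context identifier part
NONCE_BITS = FLOW_LABEL_BITS - TAG_BITS  # assumed: the remaining 8 bits carry the nonce


def pack_flow_label(context_tag: int, nonce: int) -> int:
    """Combine a context identifier and a nonce into one 20-bit value."""
    if not 0 <= context_tag < (1 << TAG_BITS):
        raise ValueError("context_tag does not fit in TAG_BITS")
    if not 0 <= nonce < (1 << NONCE_BITS):
        raise ValueError("nonce does not fit in NONCE_BITS")
    return (context_tag << NONCE_BITS) | nonce


def unpack_flow_label(label: int) -> Tuple[int, int]:
    """Split a 20-bit flow label back into (context_tag, nonce)."""
    if not 0 <= label < (1 << FLOW_LABEL_BITS):
        raise ValueError("label is not a 20-bit value")
    return label >> NONCE_BITS, label & ((1 << NONCE_BITS) - 1)
```

With such a small field, any split trades context identifier space against nonce space, which is one reason the exact layout is left open here.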
No, that isn't sufficient, because there is nothing in a received
packet which identifies it to the receiver (which has lost context) as
a shim6 packet. Any packet can have a non-zero flow label, so that
isn't a useful indication.
So you'd need to add at least one bit to every data packet to be able
to do this.
ok, now i am confused
AFAIU until now, if we want to detect context loss based on the
reception of data packets, we need to have some flag in the data
packet indicating that this packet belongs to an existing shim session. If we
eliminate this bit, then we cannot detect context loss from data
packets, right? or have i lost track of our reasoning?
If B receives a packet from A1 to B1 with flow label 12345 and a
nexthdr of UDP, then if B has no context for <A1, B1, 12345>, how can
B tell whether this is due to
- there never having been a shim6 context - A might not be shim6
capable for instance
- B having had such a context but having garbage collected it too early
right, this is what i have in mind.
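The ambiguity described above can be sketched in a few lines of Python. This is only an illustration: the context table keyed on (source, destination, flow label) and the function name classify_without_flag are assumptions for the example, not anything defined in the thread.

```python
from typing import Set, Tuple

# Hypothetical key for B's context table: (source, destination, flow label).
ContextKey = Tuple[str, str, int]


def classify_without_flag(contexts: Set[ContextKey], key: ContextKey) -> str:
    """Classify a received packet using only the address pair and flow label."""
    if key in contexts:
        return "context found"
    # Without a context, B sees only a packet with some flow label. A peer
    # that never used shim6 and a peer whose context B discarded look the same.
    return "ambiguous: no context ever existed, or context was lost"


# B has no state at all, for example after a reboot.
b_contexts: Set[ContextKey] = set()
result = classify_without_flag(b_contexts, ("A1", "B1", 12345))
```

Both cases in the list above end in the same branch, which is why a non-zero flow label alone cannot trigger context loss detection.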
so, so far what we have is:
- before a rehoming event, packets may not be identified as belonging
to a shim session. If this is the case, the data packets associated with
this session are not useful to detect context loss, so alternative
mechanisms, like using the heuristics for establishing shim sessions, are
used to recover from these situations
- after a rehoming event, context loss is detected upon the reception
of any packet associated with the shim session, whether signalling
packet or data packet. for that, data packets need to carry at least
one bit that identifies them as belonging to a shim session.
right?
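A minimal Python sketch of the receiver-side decision summarized above, assuming data packets carry a one-bit shim marker after rehoming. The names Action, handle_packet and shim_flag are hypothetical, and what B actually sends to A on detecting loss is not specified in this thread.

```python
from enum import Enum
from typing import Set, Tuple

# Hypothetical key for the receiver's context table.
ContextKey = Tuple[str, str, int]


class Action(Enum):
    USE_CONTEXT = "rewrite locators using the stored context"
    PASS_TO_ULP = "deliver unchanged to the upper layer protocol"
    REPORT_CONTEXT_LOSS = "tell the peer that the context is gone"


def handle_packet(contexts: Set[ContextKey], key: ContextKey, shim_flag: bool) -> Action:
    """Decide what to do with a received packet, given the assumed shim bit."""
    if key in contexts:
        return Action.USE_CONTEXT
    if shim_flag:
        # The sender believes a shim session exists, but we hold no state:
        # this is the context loss case, detectable from any packet.
        return Action.REPORT_CONTEXT_LOSS
    # No flag and no context: indistinguishable from plain IPv6 traffic, so
    # loss before rehoming has to be handled by other means.
    return Action.PASS_TO_ULP
```

For example, handle_packet(set(), ("A1", "B1", 12345), True) yields Action.REPORT_CONTEXT_LOSS, while the same packet without the flag is simply passed up to the ULP.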
regards, marcelo
Erik