Re: soft state (was Re: shim6 and bit errors in data packet headers)
Iljitsch van Beijnum wrote:
This is the debate about positive vs. negative advice from the ULPs.
Hm, in a binary world saying no or not saying yes are the same thing...
Not sure if there is a big difference here.
Ah - but this is not binary.
A ULP can do three things at any given time:
- provide no advice
- provide positive advice (things are making progress)
- provide negative advice (I see some problems/retransmissions)
Thus the lack of positive advice is not the same as negative advice.
You are advocating that the ULPs provide negative advice.
Actually I want the shim to monitor ULP progress rather than depend on
the actual ULP to provide feedback. The problem with that would be that
in many cases (= 99% of the time when UDP is the payload), the "real"
ULP is implemented in the application.
Adding code in the shim that parses ULP headers to determine "progress"
doesn't make an implementation perform better, and requiring that the
shim understands this for all possible ULPs (think raw sockets) doesn't
make it easier to deploy the shim.
A technique based on positive advice from the ULP is robust against the
case when the ULP doesn't provide any advice, since this would trigger
the shim to do its own data-driven probes.
So we can handle the UDP case just fine in this approach. The positive
advice from the ULP is a performance optimization; when the advice
arrives in the shim it removes the need for the shim to probe.
But things are problematic on B, because there isn't an (efficient)
strategy for the TCP on B to generate negative advice - it doesn't
run a retransmit timer.
Ah, but if A can detect the failure in this case, then that would be
good enough if A can tell B about it at some point.
Yes, but this assumes that packets from A to B in fact get delivered.
Since the failure could be for the A->B direction, the B->A direction,
or both, there are cases when B would not be informed that A is seeing
problems.
A long time ago in a galaxy far away, I thought it would be important
to handle the case where only one end has the knowledge to repair the
failure. However, I don't think this applies to the shim as all
addresses for both sides are communicated during the shim negotiation
so the possibility of a successful repair shouldn't depend on which end
detects the failure.
I agree that we can have either or both ends try all the N*M locator
pairs, i.e. that the technique works. But the issue is how efficient it
would be and whether positive vs. negative advice makes a difference
here on what we can do in the shim.
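To give a feel for the size of that search space, here is a minimal sketch that enumerates the N*M locator pairs. The function name and the addresses (taken from the 2001:db8::/32 documentation prefix) are made up for illustration; nothing here is taken from a shim6 specification.

```python
from itertools import product

def locator_pairs(local_locators, peer_locators):
    """Return every (local, peer) locator pair that could be explored."""
    return list(product(local_locators, peer_locators))

# Hypothetical locators for hosts A and B.
a_locators = ["2001:db8:a::1", "2001:db8:b::1"]
b_locators = ["2001:db8:c::2", "2001:db8:d::2", "2001:db8:e::2"]

pairs = locator_pairs(a_locators, b_locators)
print(len(pairs))  # N*M = 2*3 pairs
```

Either end can build the same list once both locator sets have been exchanged; the open question in this thread is how quickly the shim starts walking it.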
So IMO, it is good enough when just one side is able to detect the
failure.
Thus when something fails it will always be up to A to initiate the
exploration of alternate locator pairs. Also, the time at which the
exploration of alternates starts is a function of the retransmit
behavior of the ULP, which makes it harder to tightly control the
failover time.
No, in my plan the shim wouldn't know about retransmissions, it only
looks for return traffic. So either the timeout is relatively long to
accommodate ULPs that don't send traffic in the low-traffic direction
very often (I think streaming A/V protocols send an ack every 10
seconds or so) or relatively short but then there would be almost
continuous reachability probes in at least one direction.
But then the best you can do is determined by the ULP's (re)transmission
behavior, which is why I say that you can't control the failover time in
the shim.
If the ULP retransmits 10 times with binary exponential backoff starting
with a timeout of 4 seconds, and it has been told to send negative
advice after consuming half the retransmits, then the shim will see
negative advice after 4+8+16+32+64 = 124 seconds.
But once the ULP has a better RTT estimate it might retransmit with a
timeout starting at 500 ms or less. In that case the negative advice
will arrive at the shim a lot sooner.
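The arithmetic for both cases can be sketched as follows. The function name and parameters are hypothetical, and the sketch assumes the ULP reports negative advice when the timeout of its fifth transmission expires, with each timeout double the previous one.

```python
def negative_advice_delay(initial_timeout, timeouts_before_advice):
    """Seconds until the ULP gives negative advice under binary exponential backoff."""
    return sum(initial_timeout * 2 ** i for i in range(timeouts_before_advice))

slow = negative_advice_delay(4, 5)     # 4+8+16+32+64
fast = negative_advice_delay(0.5, 5)   # 0.5+1+2+4+8
print(slow, fast)
```

With the same retransmit budget, the delay scales directly with the ULP's initial timeout, which is the part the shim has no say over.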
With positive advice the shim can always operate with "if I send a
packet and >10 seconds have passed since I last had positive advice or a
successful probe, then send a probe" logic.
In this case the shim will always start looking for alternatives after
10 seconds.
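That rule could look roughly like this. It is only a sketch: the class, its method names, and the explicit `now` arguments are assumptions made for illustration, not part of any shim6 design.

```python
PROBE_AFTER = 10.0  # seconds; the example threshold used in this thread

class ShimContext:
    """Per-peer shim state tracking the last sign that the path works."""

    def __init__(self, now):
        # Time of the last positive advice or successful probe.
        self.last_good = now

    def positive_advice(self, now):
        self.last_good = now

    def probe_succeeded(self, now):
        self.last_good = now

    def should_probe_on_send(self, now):
        """Called when a ULP packet is sent; True means also send a probe."""
        return now - self.last_good > PROBE_AFTER

ctx = ShimContext(now=0.0)
early = ctx.should_probe_on_send(now=5.0)         # within 10 s: no probe
late = ctx.should_probe_on_send(now=10.5)         # silent for >10 s: probe
ctx.positive_advice(now=11.0)                     # ULP reports progress
after_advice = ctx.should_probe_on_send(now=15.0)
print(early, late, after_advice)
```

A ULP that never gives advice simply leaves `last_good` to be refreshed by probe replies, which is the fallback described earlier for the UDP case.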
What kind of failover time do you imagine, BTW?
10 seconds for sending a probe might not be a bad default. If we think
we can send a small number of probes in parallel (3 or so) with binary
exponential backoff for the probes, we might be able to recover from the
failure in one RTT after those 10 seconds, but if there are lots of
address pairs to try and more than one has failed, it can take a lot
longer. If the shim itself is conservative and has a 4 second probe
timeout with exponential backoff, then it would
- send 3 probes at time 10 seconds
- send 3 other probes at time 14 seconds
- send 3 more at time 22 seconds
But the shim can probably use an initial timeout based on the RTT
measured when the shim state was set up, so that things can be sped up.
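The conservative schedule above can be reproduced with a short sketch. The function and parameter names are hypothetical; it assumes each round sends the same number of parallel probes and that the timeout doubles after every round.

```python
def probe_schedule(start, initial_timeout, rounds, probes_per_round=3):
    """Return a list of (send_time_seconds, probes_sent), one entry per round."""
    schedule = []
    send_time = start
    timeout = initial_timeout
    for _ in range(rounds):
        schedule.append((send_time, probes_per_round))
        send_time += timeout   # wait out the current timeout
        timeout *= 2           # binary exponential backoff
    return schedule

conservative = probe_schedule(start=10, initial_timeout=4, rounds=3)
print(conservative)
```

Passing a smaller `initial_timeout`, for instance one derived from the RTT measured at setup, compresses the whole schedule without changing the 10 second starting point.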
With such a strategy the shim implementation can do a check after
sending a ULP packet: "how long ago was the last positive advice?"
After every packet...?
Yes, but a node which implements the mandatory NUD in RFC 2461 already
has that test in the code path, so it might very well be possible to
implement the shim6 liveness check without adding an extra test to the
code path.
Erik