
Re: Shim6 failure recovery after garbage collection



Scott Leibrand wrote:
In a private discussion at IETF in Dallas, I was discussing with someone
the impact on a content provider's server of implementing shim6.  I
expressed the opinion that it would be nice if a heavily loaded server
could aggressively garbage collect shim6 state after initial context
establishment, and rely on the client to perform failure and reachability
detection and initiate context re-establishment if a failure is detected.

If such behavior is supported by the protocol, we're much more likely to
be able to convince potential implementors to turn on basic shim6 support
by default, with the understanding that implementors can aggressively
discard shim6 context state (and not initiate shim6 context negotiation)
if the implementation doesn't have multiple locators or otherwise doesn't
need the capabilities shim6 provides.  This would (properly, IMO) push the
responsibility for context tracking, failure detection and reachability
exploration to the multihomed host that stands to benefit most from shim6
and wants to run it in the first place.

Scott,

One disadvantage of this is that there are some failures for which there is no recovery.
Imagine client C and server S.
C establishes a shim6 context with the server S, and sets up some TCP connection(s). S then discards the shim6 state.

C sends the HTTP request, and gets the TCP ACK back.
S starts sending the (possibly very large) HTTP response.
Somewhere partway through, packets stop getting through.

The problem is that S doesn't have any alternate locators for C.
And C will not see a problem, since its TCP is happy as a clam: it doesn't have any unacknowledged data. Perhaps the application (if Content-Length was set) can wonder why things stalled, but having to push the failure recovery to every application would be undesirable.
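
A toy model of the scenario above, as a sketch only. The class, field, and locator names are made up for illustration, and it assumes that unacknowledged TCP data is the only signal a host has that something is wrong:

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    # Locators of the peer learned from the shim6 context.
    # Empty if the context has been discarded.
    peer_locators: list = field(default_factory=list)
    # Bytes sent by this host that the peer has not yet acknowledged.
    unacked_bytes: int = 0

    def notices_failure(self) -> bool:
        # Assumption: a host only notices trouble when its own data
        # goes unacknowledged.
        return self.unacked_bytes > 0

    def can_recover(self, failed_peer_locator: str) -> bool:
        # Recovery needs both a reason to look for a new path and an
        # alternate locator for the peer to try.
        alternates = [l for l in self.peer_locators if l != failed_peer_locator]
        return self.notices_failure() and bool(alternates)


def scenario(server_keeps_context: bool):
    # C is multihomed (C1, C2); S has a single locator S1.
    c = Host("C", peer_locators=["S1"])
    s = Host("S", peer_locators=["C1", "C2"] if server_keeps_context else [])

    # C's request was ACKed, so C has nothing outstanding.
    c.unacked_bytes = 0
    # S is mid-response when the path towards C1 fails.
    s.unacked_bytes = 64_000

    return c.can_recover("S1"), s.can_recover("C1")


if __name__ == "__main__":
    print("S discarded context:", scenario(server_keeps_context=False))
    print("S kept context:     ", scenario(server_keeps_context=True))
```

In this model neither side can recover once S has discarded its context: C has no reason to act, and S has nowhere else to send.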


Taking a step back, why do we think that having a few bytes of shim6 context on the server is a problem? An HTTP server is likely to have many TCP connections in TIME_WAIT state for every client IP address, but it will have at most one shim6 context per client. And with TCP, a shim6 implementation can probably discard the shim6 state when the last TCP socket closes (i.e. long before the TIME_WAIT connections can be discarded).
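
One way to picture "discard the state when the last TCP socket closes" is a per-peer reference count. This is a rough sketch, not taken from any shim6 implementation: the table and method names are hypothetical, and the context is keyed by a single peer address purely for simplicity.

```python
class Shim6ContextTable:
    """Hypothetical per-peer context table with socket reference counts."""

    def __init__(self):
        # peer address -> {"locators": [...], "sockets": open socket count}
        self._contexts = {}

    def socket_opened(self, peer, locators=()):
        # One context per peer, shared by all of its TCP sockets.
        ctx = self._contexts.setdefault(
            peer, {"locators": list(locators), "sockets": 0}
        )
        ctx["sockets"] += 1

    def socket_closed(self, peer):
        ctx = self._contexts.get(peer)
        if ctx is None:
            return
        ctx["sockets"] -= 1
        if ctx["sockets"] <= 0:
            # Last socket is gone: drop the context right away, without
            # waiting for any TIME_WAIT period.
            del self._contexts[peer]

    def __len__(self):
        return len(self._contexts)


if __name__ == "__main__":
    table = Shim6ContextTable()
    table.socket_opened("client-A", locators=["A1", "A2"])
    table.socket_opened("client-A")
    table.socket_closed("client-A")
    print(len(table))  # context still held by the remaining socket
    table.socket_closed("client-A")
    print(len(table))  # context discarded
```

However many sockets a client opens, the table holds a single entry for it, and that entry goes away as soon as the last socket does.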

Having an implementation of shim6 and running it with realistic (web) server workloads would help answer the question of the impact on the memory footprint, but from my thinking about how to implement shim6, I don't see it as being significant.

   Erik