Tony Li wrote:
Between the PMTUD problem, and the "elephant in the room" (state signaling that needs to be handled *somewhere*), I am coming to a rather uncomfortable conclusion:

I guess my instinct is that this problem can only be generically solved at transport level and above, as RFC 4821 begins to recognize.

Agreed. If we knew that 4821 was going to make it into a predominance of hosts, this wouldn't be nearly the issue that it is...

But it wouldn't, if we simply placed a (recursive) requirement on tunnels to deliver adequate MTU at the innermost level.

True, but it's not a reasonable operational requirement due to the large deployed base of 1500B MTU media.

I can't disagree with the principle. But to my taste, we're headed towards complexity instead of simplicity.

Agreed. I hope that we can do better.

The big difference, the reason why RRG has to solve this if we want to tunnel, is that we're making tunneling a first class part of the architecture. If LISP were to be deployed ubiquitously, for example, we'd end up with something like 99.99% of all Internet core traffic being tunneled and being susceptible to the PMTUD problems that we've discussed. This would include tunneling a large percentage of the folks without their knowledge and consent. This is very different than inflicting pain on a small subset of sophisticated folks who are using a specialized application.
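To make the MTU arithmetic in the exchange above concrete, here is a rough sketch of how much room is left for the innermost packet once tunnels are nested over 1500B media, and how large the underlying MTU would have to be to honor a recursive "deliver adequate MTU" requirement. The per-layer overhead of 36 bytes (outer IPv4 header, UDP header, 8-byte shim) is an assumption chosen for illustration, and the function names are hypothetical; real encapsulations differ.

```python
# Illustrative sketch only. The overhead figure is an assumption, not a
# statement about any particular tunneling proposal.
LINK_MTU = 1500  # bytes; the widely deployed media MTU mentioned above

# Assumed per-tunnel overhead: outer IPv4 (20) + UDP (8) + 8-byte shim header.
ASSUMED_TUNNEL_OVERHEAD = 20 + 8 + 8


def inner_mtu(link_mtu, overheads):
    """MTU left for the innermost packet after subtracting each tunnel layer."""
    remaining = link_mtu
    for overhead in overheads:
        remaining -= overhead
    return remaining


def required_link_mtu(target_inner_mtu, overheads):
    """Link MTU needed so the innermost packet still sees target_inner_mtu."""
    return target_inner_mtu + sum(overheads)


one_layer = [ASSUMED_TUNNEL_OVERHEAD]
two_layers = [ASSUMED_TUNNEL_OVERHEAD, ASSUMED_TUNNEL_OVERHEAD]

print(inner_mtu(LINK_MTU, one_layer))       # 1464
print(inner_mtu(LINK_MTU, two_layers))      # 1428
print(required_link_mtu(1500, two_layers))  # 1572
```

Under these assumed numbers, every added tunnel layer either shrinks what the end hosts can send or pushes the required media MTU above 1500B, which is the operational objection raised above.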
With or without LISP itself, the real problems that need to be solved are intrinsically "transport"-ish. Which means: if we want to have a "better than we got here" solution, which scales, works well, and has longevity (even across subsequent generations of network protocols), the place to tackle the problems is at the transport layer. (ugh.)
The "elephant in the room": While LISP et al. appear to solve some problems, when you look at it just the right way, the real problem is - how do you multihome, and detect, handle, and respond to failures on one of many paths available (i.e. "route around" the outage), in a scalable fashion? In BGP with PI, i.e. using BGP for multihoming, the state is carried via BGP. In LISP, the presumption is that the mapping stuff handles the state - but does that differ much from the BGP solution, other than moving the problem somewhere else? The state is still needed, and needs to be propagated in a timely fashion.
Is it (socially/politically/procedurally) possible for RRG to collectively say, we want the problem to be fixed, we are willing to work on fixing it, but it's not a routing problem?
(Consider the can of worms opened. Sorry.)

Brian Dickson

--
to unsubscribe send a message to rrg-request@psg.com with the
word 'unsubscribe' in a single line as the message text body.
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg