[RRG] LISP-NERD reachability and MTU detection

To: Routing Research Group list <rrg@psg.com>
Subject: [RRG] LISP-NERD reachability and MTU detection
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Sat, 15 Dec 2007 17:53:25 +0100

After Eliot's presentation today, I started thinking that LISP-NERD could benefit greatly from something like the shim6 REAP reachability evaluation mechanism. However, with shim6 we have the limitation that we have no bits to play with for data packets belonging to sessions that haven't encountered any failures. Not so with NERD: here we have additional bits in every packet. I'm assuming that we can use a few of those for reachability and MTU detection. It would work like this:

A LISP-NERD ITR chooses an ETR/locator and assumes it's reachable. It sets a "please respond" code point in the NERD header and starts a timer. The ETR receiving the packet sees the "please respond" code point and sends back info to the originating ITR that could encompass current locator preference information (for traffic engineering), the up/down status of other ETRs, the maximum packet size the ETR is prepared to receive, and possibly (if the ETR supports reassembly) the maximum packet size the ETR is prepared to reassemble from fragments.

The ITR receives the message and updates its mapping cache accordingly.

If there is no response before the timer expires, the ITR switches to a different ETR.
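
The steps above could be sketched roughly as follows. This is only an illustration of the ITR-side soft state, not a proposal for an actual encoding: the class name, field names and the two-second timeout are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

PROBE_TIMEOUT = 2.0  # seconds; arbitrary value chosen for this sketch


@dataclass
class MappingEntry:
    """Hypothetical ITR soft state for one destination prefix."""
    locators: list                      # candidate ETRs, in preference order
    current: int = 0                    # index of the ETR in use
    confirmed: bool = False             # has the current ETR answered?
    deadline: Optional[float] = None    # set while a reply is outstanding
    max_size: dict = field(default_factory=dict)  # ETR -> advertised max packet size

    def check_timer(self, now):
        """No reply before the timer expired: switch to a different ETR."""
        if self.deadline is not None and now >= self.deadline:
            self.current = (self.current + 1) % len(self.locators)
            self.confirmed = False
            self.deadline = None

    def next_packet(self, now):
        """Return (etr, please_respond) for an outgoing packet."""
        self.check_timer(now)
        ask = not self.confirmed and self.deadline is None
        if ask:
            self.deadline = now + PROBE_TIMEOUT  # start the timer
        return self.locators[self.current], ask

    def on_reply(self, etr, max_size=None):
        """The ETR answered: stop the timer and update the mapping cache."""
        if etr != self.locators[self.current]:
            return  # stale reply from an ETR we already abandoned
        self.confirmed = True
        self.deadline = None
        if max_size is not None:
            self.max_size[etr] = max_size
```

Note that all of this state lives on the ITR; the ETR only has to react to the code point in the packet it just received.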

When mapping state is created and outgoing traffic is flowing, the ITR may observe return traffic (if the same ITR and ETR function as ETR and ITR, respectively, for traffic in the other direction) and deduce that there is adequate reachability. If the ITR doesn't see any return traffic, on the other hand, it sets the "please respond" code point in the LISP header periodically and awaits replies. Again, if none are forthcoming, it switches to another ETR.

Because LISP packets contain a nonce, ITRs can correlate incoming responses to their response requests with the original packets, so they are in the position to do RFC 4821 path MTU discovery without help from ICMP messages. (They may be limited somewhat because they can't decide on the packet size on their own unless we add extra stuff here.)
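
To make the nonce correlation a bit more concrete, here is a minimal sketch of how an ITR might track probes. The names search_low and search_high are borrowed from RFC 4821; the class, the initial bounds and the 4-byte nonce are assumptions for the example only. Because the ITR can't pick packet sizes itself, it opportunistically treats a passing data packet of a useful size as the probe.

```python
import os


class PmtuProbeState:
    """Hypothetical per-ETR probe state on the ITR."""

    def __init__(self, search_low=1280, search_high=9000):
        self.search_low = search_low    # largest size known to get through
        self.search_high = search_high  # largest size still worth trying
        self.outstanding = {}           # nonce -> size of the probe packet

    def maybe_probe(self, size):
        """Return a nonce if this packet should carry "please respond"."""
        if self.search_low < size <= self.search_high and not self.outstanding:
            nonce = os.urandom(4)  # nonce length is arbitrary in this sketch
            self.outstanding[nonce] = size
            return nonce
        return None

    def on_reply(self, nonce):
        """A reply carrying our nonce proves that size made it to the ETR."""
        size = self.outstanding.pop(nonce, None)
        if size is not None:
            self.search_low = max(self.search_low, size)

    def on_timeout(self, nonce):
        """No reply: assume the size is too big. A real implementation
        would retry first, since a single loss may just be congestion."""
        size = self.outstanding.pop(nonce, None)
        if size is not None:
            self.search_high = min(self.search_high, size - 1)
```

Replies with an unknown nonce are simply ignored, so the ITR never raises its estimate based on something it didn't ask for.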

In my book, this is a big win, because it means that the ETRs can be completely stateless so it's easy for ISPs to run them for their customers, and on the ITRs the state required for reachability detection is extremely basic: simpler than shim6 and nowhere near what's in TCP. It's also soft state that can be discarded and recreated without penalty when all ETRs for a prefix (or at least the one the ITR will be selecting as the one to use the next time around) are up.

If an ITR notices that there is reachability for small packets, it can then keep a copy of a large packet that it sends with reply requested, and if there is no reply, or if there is an ICMP packet too big, it can generate an ICMP too big towards the source of the original packet. It doesn't actually know the packet size that can be used on the link in the former case, but it could use heuristics or stick to a conservative value.
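
As a rough sketch of that last decision: the function below picks the MTU value the ITR would put in the too big message it sends back to the source. The function name, the 1280-byte fallback and the idea of subtracting the encapsulation overhead from the MTU reported in a received ICMP message are assumptions of mine, not something taken from the LISP drafts.

```python
CONSERVATIVE_MTU = 1280  # assumed fallback when no ICMP message tells us more


def too_big_for_source(saved_packet_len, encap_overhead, icmp_next_hop_mtu=None):
    """Return the MTU to report to the source host.

    saved_packet_len  -- length of the original (inner) packet the ITR kept
    encap_overhead    -- bytes the ITR adds when encapsulating
    icmp_next_hop_mtu -- MTU from a received ICMP too big, or None if the
                         probe simply went unanswered
    """
    if icmp_next_hop_mtu is not None:
        # The ICMP message describes the outer packet; the source host
        # needs the size that still fits after encapsulation.
        reported = icmp_next_hop_mtu - encap_overhead
    else:
        # No reply at all: we don't know the real limit, so stay conservative.
        reported = CONSERVATIVE_MTU
    # Never report a value that would let the same packet through again.
    return min(reported, saved_packet_len - 1)
```

A smarter heuristic could replace the fixed fallback, for instance the largest size for which a reply was already seen.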

It would of course also be possible to advertise ETR MTU sizes in the mapping database (but that doesn't tell us the path MTU).

The question is whether we want to support ITRs and/or ETRs sitting in front of / behind 1500 byte MTU links. Having ITRs with this limitation is probably doable because as per the above, the ITRs SHOULD be able to generate the too big messages back to the source hosts and if end-users deploy ITRs in their own networks, they'll quickly discover that they'll have to un-break PMTUD. For ISPs we'll probably have to mandate the larger packet size because the correlation between the deployment of a new box and the start of problems won't be as obvious or easy to reverse.

For ETRs, having an incoming MTU of 1500 means that unacceptable PMTUD blackholes will happen, or ITRs have to fragment packets and the ETR has to reassemble them (for DF=1 or IPv6). I'm assuming this is unacceptable but I'm certainly interested to hear from vendors about this.

Follow-Ups:
- RE: [RRG] LISP-NERD reachability and MTU detection
  - From: "Templin, Fred L" <Fred.L.Templin@boeing.com>
- Re: [RRG] LISP-NERD reachability and MTU detection
  - From: Eliot Lear <lear@cisco.com>
- Re: [RRG] LISP-NERD reachability and MTU detection
  - From: Dino Farinacci <dino@cisco.com>
