
Re: [RRG] LISP-NERD reachability and MTU detection

To: Iljitsch van Beijnum <iljitsch@muada.com>
Subject: Re: [RRG] LISP-NERD reachability and MTU detection
From: Dino Farinacci <dino@cisco.com>
Date: Sun, 16 Dec 2007 10:23:45 -0800
Cc: Routing Research Group list <rrg@psg.com>
In-reply-to: <EAB3BF96-D438-459E-A753-F9D72B1FE5B6@muada.com>
References: <EAB3BF96-D438-459E-A753-F9D72B1FE5B6@muada.com>

A LISP-NERD ITR chooses an ETR/locator and assumes it's reachable. It sets a "please respond" code point in the NERD header and starts a timer. The ETR receiving the packet sees the "please respond" message and sends back info to the originating ITR that could encompass current locator preference information (for traffic engineering), the up/down status of other ETRs, the maximum packet size the ETR is prepared to receive, and possibly (if the ETR supports reassembly) the maximum packet size the ETR is prepared to reassemble from fragments.

Iljitsch, you really don't want to do this. In a very well-behaved scenario, this will cause way too much control traffic going to the site. Before we put the loc-reach-bits into LISP, this was an obvious first thought, but it was discarded because polling typically doesn't scale well when you are being polled from a million places. Could you imagine if everyone polled the DNS root servers *in addition to* sending queries for name translations!
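To put a rough number on the scaling concern, here is a back-of-the-envelope sketch. The "million places" figure is from the text above; the 30-second poll interval is purely an assumed value for illustration, and `polls_per_second` is a hypothetical helper.

```python
def polls_per_second(num_itrs, poll_interval_s):
    """Aggregate probe rate arriving at one site if every ITR polls it.

    Assumes each ITR sends exactly one probe per interval and that
    probes are spread evenly over time (both are simplifications).
    """
    return num_itrs / poll_interval_s


# One million ITRs (from the text), 30 s interval (assumed).
rate = polls_per_second(1_000_000, 30)
print(f"{rate:.0f} probes/s arriving at the site's ETRs")
```

This is control traffic on top of whatever data traffic and mapping requests the site already receives.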

Why not just use TTL timeouts and have the ITR, when it needs the mapping, send a query or Data Probe and get a new Reply back with updated information?
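A minimal sketch of that TTL-driven alternative, under stated assumptions: `MapCache` and the `resolver` callable are hypothetical names standing in for the ITR's mapping cache and for "send a query or Data Probe and get a Reply back". The point is only that a refresh happens on demand when an entry has expired, not on a polling schedule.

```python
import time


class MapCache:
    """Toy EID-prefix -> locator-set cache with per-entry TTLs (illustrative only)."""

    def __init__(self, resolver, clock=time.monotonic):
        # resolver is a hypothetical callable: eid_prefix -> (locators, ttl_seconds)
        self._resolver = resolver
        self._clock = clock
        self._entries = {}  # eid_prefix -> (locators, expiry_time)

    def lookup(self, eid_prefix):
        now = self._clock()
        entry = self._entries.get(eid_prefix)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: no control traffic at all
        # Missing or expired: ask the mapping system once, then cache the reply.
        locators, ttl = self._resolver(eid_prefix)
        self._entries[eid_prefix] = (locators, now + ttl)
        return locators
```

With this shape, an ITR that is not sending traffic toward a prefix generates no control traffic for it.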

And to be clear, what problem are you trying to solve? Are you trying to get EID-to-RLOC record changes to ITRs as soon as possible?

Are you trying to get each ITR a copy of their own mapping with different preference and weight information contained in the record?

If there is no response before the timer expires, the ITR switches to a different ETR.

This is polling for reachability. I don't think you can do much better than what we have spec'ed out in the main LISP spec with the loc-reach-bits design.

When mapping state is created and outgoing traffic is flowing, the ITR may observe return traffic (if the same ITR and ETR function as ETR and ITR, respectively, for traffic in the other direction) and deduce that there is adequate reachability. If the ITR doesn't see any return traffic, on the other hand, it sets the "please respond" code point in the LISP header periodically and awaits replies. Again, if none are forthcoming, it switches to another ETR.
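The proposed behaviour in the paragraph above can be sketched as a small state machine. This is only an illustration under assumptions: `ProbingItr`, its method names, and the 3-second timeout are all hypothetical, and the actual "please respond" encoding is not modelled.

```python
class ProbingItr:
    """Toy per-prefix ITR state for the probe-and-failover proposal (illustrative only)."""

    def __init__(self, etrs, probe_timeout=3.0):
        self.etrs = list(etrs)
        self.current = 0
        self.probe_timeout = probe_timeout  # assumed value, not from any spec
        self.probe_sent_at = None           # None means no probe outstanding

    @property
    def active_etr(self):
        return self.etrs[self.current]

    def on_return_traffic(self, from_etr):
        # Return traffic or an explicit reply from the active ETR clears the probe.
        if from_etr == self.active_etr:
            self.probe_sent_at = None

    def send_probe(self, now):
        # Stands in for setting the "please respond" code point on an outgoing packet.
        if self.probe_sent_at is None:
            self.probe_sent_at = now

    def tick(self, now):
        # Called periodically; fails over if the outstanding probe has timed out.
        if (self.probe_sent_at is not None
                and now - self.probe_sent_at >= self.probe_timeout):
            self.current = (self.current + 1) % len(self.etrs)
            self.probe_sent_at = None
        return self.active_etr
```

Note that the only state kept is one timestamp per active ETR, which is the "extremely basic" soft state the proposal argues for; the cost is the probe traffic itself.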

That is what the loc-reach-bits do in LISP. Not only does return traffic tell you the encapsulator is up, but that encapsulator is telling you the reachability status for the other ETRs in the site.
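As a rough illustration of how an ITR might consume loc-reach-bits piggybacked on return traffic, the sketch below treats them as a bitmask. It assumes bit i corresponds to the i-th locator in the cached mapping record; the exact header layout is defined in the LISP spec and is not reproduced here. The function names are hypothetical.

```python
def reachable_locators(locators, loc_reach_bits):
    """Return the locators whose bit is set in loc_reach_bits.

    Assumes bit i (least significant bit = 0) maps to locators[i].
    """
    return [loc for i, loc in enumerate(locators) if loc_reach_bits & (1 << i)]


def pick_locator(locators, loc_reach_bits, current):
    """Keep the current locator if it is still marked up, else fall back."""
    up = reachable_locators(locators, loc_reach_bits)
    if current in up:
        return current
    return up[0] if up else None


locs = ["192.0.2.1", "198.51.100.1", "203.0.113.1"]
# Return traffic carries bits 0b101: the second locator is marked down.
print(pick_locator(locs, 0b101, current="198.51.100.1"))
```

No probe or timer is involved: the information rides on data packets that are flowing anyway.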

Or are you trying to solve the locator reachability problem when the xTRs are placed in PEs?

Because LISP packets contain a nonce, ITRs can correlate incoming responses to their response requests with the original packets, so they are in the position to do RFC 4821 path MTU discovery without the help from ICMP messages. (They may be limited somewhat because they can't decide on the packet size on their own unless we add extra stuff here.)

In my book, this is a big win, because it means that the ETRs can be completely stateless so it's easy for ISPs to run them for their customers and on the ITRs the state required for reachability detection is extremely basic: simpler than shim6 and nowhere near what's in TCP. It's also soft state that can be discarded and recreated without penalty when all ETRs for a prefix (or at least the one the ITR will be selecting as the one to use the next time around) are up.
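A minimal sketch of the nonce-correlated probing idea, in the spirit of RFC 4821: replies raise the known-good size, timeouts lower the upper bound, and no ICMP is consulted. The binary search is just one possible search strategy, and `PmtuProber`, the 32-bit nonce width, and the 1280/9000 bounds are all assumptions. It also ignores the limitation noted above, that an ITR cannot freely choose packet sizes.

```python
import secrets


class PmtuProber:
    """Toy probe-size search keyed by nonce (illustrative only)."""

    def __init__(self, low=1280, high=9000):
        self.low = low      # largest size known to get through (assumed floor)
        self.high = high    # largest size still considered possible (assumed ceiling)
        self.pending = {}   # nonce -> size of the probe carrying that nonce

    def next_probe(self):
        """Return (nonce, size) for the next probe, or None once converged."""
        if self.low >= self.high:
            return None
        size = (self.low + self.high + 1) // 2
        nonce = secrets.randbits(32)  # nonce width is an assumption
        self.pending[nonce] = size
        return nonce, size

    def on_reply(self, nonce):
        # A reply echoing the nonce proves a packet of that size was delivered.
        size = self.pending.pop(nonce, None)
        if size is not None:
            self.low = max(self.low, size)

    def on_timeout(self, nonce):
        # No reply: treat the probe size as too large for the path.
        size = self.pending.pop(nonce, None)
        if size is not None:
            self.high = min(self.high, size - 1)
```

A real implementation would also have to distinguish loss due to size from ordinary packet loss, which this sketch does not attempt.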

It might be better, if you had xTRs deployed at PEs, for them to ping each other to determine whether they are reachable; then they can continue with the loc-reach-bits algorithm. That would scale much better but still gives me the heebie-jeebies. ;-)

Maybe simply multi-hop BGP sessions among them that carry no routes could work, but I still think too much coordination would be required of competing ISPs.

For ETRs, having an incoming MTU of 1500 means that unacceptable PMTUD blackholes will happen, or ITRs have to fragment packets and the ETR has to reassemble them (for DF=1 or IPv=6). I'm assuming this is unacceptable but I'm certainly interested to hear from vendors about this.

It's amazing how people are so fascinated with this MTU issue. So what's wrong with fragmentation?

As long as the ITRs fragment before encapsulation, the host and not the ETR will reassemble.
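To illustrate fragmenting before encapsulation, the sketch below computes inner IPv4 fragment sizes so that each fragment, once encapsulated, still fits the path MTU. Because each fragment is encapsulated as an ordinary packet, the ETR simply decapsulates and the destination host reassembles. The 36-byte encapsulation overhead in the example is an assumed figure, `fragment_sizes` is a hypothetical helper, and this only applies to IPv4 packets with DF=0.

```python
def fragment_sizes(inner_len, path_mtu, encap_overhead, ip_header_len=20):
    """Total lengths of the inner IPv4 fragments an ITR would encapsulate.

    Every fragment except the last carries a payload that is a multiple
    of 8 bytes, as IPv4 fragment offsets require.
    """
    budget = path_mtu - encap_overhead          # room left for the inner packet
    if inner_len <= budget:
        return [inner_len]                      # fits as-is, no fragmentation
    per_frag = (budget - ip_header_len) // 8 * 8
    if per_frag <= 0:
        raise ValueError("path MTU too small to carry any payload")
    payload = inner_len - ip_header_len
    sizes = []
    while payload > 0:
        chunk = min(per_frag, payload)
        sizes.append(ip_header_len + chunk)
        payload -= chunk
    return sizes


# A 1500-byte host packet, 1500-byte path MTU, 36 bytes of assumed overhead.
print(fragment_sizes(1500, 1500, 36))
```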

Dino
