Re: [RRG] LISP-NERD reachability and MTU detection



A LISP-NERD ITR chooses an ETR/locator and assumes it's reachable. It sets a "please respond" code point in the NERD header and starts a timer. The ETR receiving the packet sees the "please respond" code point and sends back info to the originating ITR that could encompass current locator preference information (for traffic engineering), the up/down status of other ETRs, the maximum packet size the ETR is prepared to receive, and possibly (if the ETR supports reassembly) the maximum packet size the ETR is prepared to reassemble from fragments.

Iljitsch, you really don't want to do this. Even in a very well-behaved scenario, this will cause way too much control traffic going to the site. Before we put the loc-reach-bits into LISP this was an obvious first thought, but it was discarded because polling typically doesn't scale well when a site is being polled from a million places. Could you imagine if everyone polled the DNS root servers *in addition to* sending queries for name translations?

Why not just use TTL timeouts and have the ITR, when it needs the mapping, send a query or Data Probe and get a new Reply back with updated information?
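
To make the TTL idea concrete, here is a minimal sketch of an ITR-side map cache that only goes back to the mapping system once an entry has timed out. The `resolve` callback is hypothetical: it stands in for whatever query or Data Probe exchange produces a new Reply, and it is assumed to return a locator list plus a TTL in seconds. None of the names below come from the LISP spec.

```python
import time
from typing import Callable, Dict, List, Tuple


class MapCache:
    """EID-prefix -> locator list, refreshed only when the entry's TTL has run out."""

    def __init__(self,
                 resolve: Callable[[str], Tuple[List[str], float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve  # hypothetical: returns (locators, ttl_seconds)
        self._clock = clock      # injectable so the cache can be tested without sleeping
        self._entries: Dict[str, Tuple[List[str], float]] = {}  # prefix -> (locators, expiry)

    def lookup(self, eid_prefix: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(eid_prefix)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: no control traffic at all
        # Missing or expired: one query, one reply, new expiry time.
        locators, ttl = self._resolve(eid_prefix)
        self._entries[eid_prefix] = (locators, now + ttl)
        return locators
```

The point of the design is that control traffic is driven by the ITR actually needing the mapping, not by a timer per ETR.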

And to be clear, what problem are you trying to solve? Are you trying to get EID-to-RLOC record changes to ITRs as soon as possible?

Are you trying to get each ITR a copy of their own mapping with different preference and weight information contained in the record?

If there is no response before the timer expires, the ITR switches to a different ETR.

This is polling for reachability. I don't think you can do much better than what we have spec'ed out in the main LISP spec with the loc-reach-bits design.

When mapping state is created and outgoing traffic is flowing, the ITR may observe return traffic (if the same ITR and ETR function as ETR and ITR, respectively, for traffic in the other direction) and deduce that there is adequate reachability. If the ITR doesn't see any return traffic, on the other hand, it sets the "please respond" code point in the LISP header periodically and awaits replies. Again, if none are forthcoming, it switches to another ETR.

That is what the loc-reach-bits do in LISP. Not only does return traffic tell you the encapsulator is up, but that encapsulator is telling you the reachability status for the other ETRs in the site.
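
As a rough illustration of how an ITR could consume those bits, the sketch below assumes that bit i of the loc-reach-bits value carried in a received packet corresponds to the i-th locator in the ITR's cached, ordered locator list for that site. That bit-to-locator layout is an assumption made for this example; the main LISP spec is the authority on the real encoding.

```python
from typing import List


def reachable_locators(locators: List[str], reach_bits: int) -> List[str]:
    """Return the locators whose bit is set in reach_bits.

    Assumption for this sketch: bit i (least significant bit first)
    describes locators[i] in the cached mapping.
    """
    return [loc for i, loc in enumerate(locators) if (reach_bits >> i) & 1]


# Example: three ETRs at a site, the middle one reported down.
site = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
usable = reachable_locators(site, 0b101)  # ["192.0.2.1", "192.0.2.3"]
```

One data packet from any encapsulator at the site refreshes the status of all of them, which is why no separate polling is needed while return traffic is flowing.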

Or are you trying to solve the locator reachability problem when the xTRs are placed in PEs?

Because LISP packets contain a nonce, ITRs can correlate incoming responses to their response requests with the original packets, so they are in the position to do RFC 4821 path MTU discovery without the help from ICMP messages. (They may be limited somewhat because they can't decide on the packet size on their own unless we add extra stuff here.)

In my book, this is a big win, because it means that the ETRs can be completely stateless so it's easy for ISPs to run them for their customers and on the ITRs the state required for reachability detection is extremely basic: simpler than shim6 and nowhere near what's in TCP. It's also soft state that can be discarded and recreated without penalty when all ETRs for a prefix (or at least the one the ITR will be selecting as the one to use the next time around) are up.
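
The nonce-correlation idea can be sketched as a small amount of ITR-side state: remember which packet size went out under which nonce, raise the confirmed size when a response echoes that nonce, and lower the ceiling when the timer fires instead. This is only an illustration in the spirit of RFC 4821, not an implementation of it. The class and method names, the 32-bit nonce width, and the default bounds are all assumptions, and the sketch also assumes the ITR is free to pick the probe size, which the proposal itself notes may not be the case.

```python
import secrets
from typing import Dict, Optional, Tuple


class PmtuProber:
    """Binary search for the largest packet size that draws a response."""

    def __init__(self, low: int = 1280, high: int = 1500) -> None:
        self.confirmed = low   # largest size known to have been answered
        self.ceiling = high    # largest size still worth trying
        self._pending: Dict[int, int] = {}  # nonce -> size of the packet carrying it

    def next_probe(self) -> Optional[Tuple[int, int]]:
        """Return (nonce, size) for the next probe, or None once the search is done."""
        if self.confirmed >= self.ceiling:
            return None
        size = (self.confirmed + self.ceiling + 1) // 2  # round up so the search terminates
        nonce = secrets.randbits(32)  # assumed nonce width
        self._pending[nonce] = size
        return nonce, size

    def on_reply(self, nonce: int) -> None:
        """A response echoed this nonce: a packet of that size got through."""
        size = self._pending.pop(nonce, None)
        if size is not None:
            self.confirmed = max(self.confirmed, size)

    def on_timeout(self, nonce: int) -> None:
        """No response before the timer expired: treat that size as too large."""
        size = self._pending.pop(nonce, None)
        if size is not None:
            self.ceiling = max(self.confirmed, min(self.ceiling, size - 1))
```

All of this state is soft: if it is thrown away, the ITR simply starts again from the lower bound.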

It might be better, if you had xTRs deployed at PEs, to have them ping each other to determine whether they are reachable; then they can continue with the loc-reach-bits algorithm. That would scale much better but still gives me the heebie-jeebies. ;-)

Maybe simply a multi-hop BGP session that transmits no routes among each other could work, but I still think there is too much coordination required of competing ISPs.

For ETRs, having an incoming MTU of 1500 means that unacceptable PMTUD blackholes will happen, or ITRs have to fragment packets and the ETR has to reassemble them (for DF=1 or IPv=6). I'm assuming this is unacceptable but I'm certainly interested to hear from vendors about this.

It's amazing how people are so fascinated with this MTU issue. So what's wrong with fragmentation?

As long as the ITRs fragment before encapsulation, the host and not the ETR will reassemble.
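
The arithmetic behind fragmenting before encapsulation can be sketched as follows. The inner IPv4 packet is split so that each fragment, once its own inner header and the encapsulation overhead are added, still fits the path MTU between ITR and ETR. The 36-byte overhead in the example is an assumed figure for illustration only, and this applies only to inner IPv4 packets with DF=0.

```python
from typing import List


def inner_fragment_payloads(payload_len: int, path_mtu: int,
                            encap_overhead: int, inner_hdr: int = 20) -> List[int]:
    """Payload sizes of the inner IPv4 fragments an ITR would emit.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the IPv4 fragment offset is counted in 8-byte units.
    """
    room = path_mtu - encap_overhead - inner_hdr
    per_frag = (room // 8) * 8
    if per_frag <= 0:
        raise ValueError("path MTU too small for this encapsulation overhead")
    sizes = []
    remaining = payload_len
    while remaining > per_frag:
        sizes.append(per_frag)
        remaining -= per_frag
    sizes.append(remaining)
    return sizes


# A full-size 1500-byte inner packet (20-byte header + 1480 bytes of payload),
# a 1500-byte path MTU, and an assumed 36 bytes of encapsulation overhead.
frags = inner_fragment_payloads(1480, 1500, 36)  # [1440, 40]
```

Each fragment is an ordinary inner IP fragment addressed to the destination host, so the ETR just decapsulates and forwards, and reassembly happens at the host.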

Dino
