
Re: [RRG] LISP-NERD reachability and MTU detection

To: Dino Farinacci <dino@cisco.com>
Subject: Re: [RRG] LISP-NERD reachability and MTU detection
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Mon, 17 Dec 2007 15:35:24 +0100
Cc: Routing Research Group list <rrg@psg.com>
In-reply-to: <EC1BB972-6F21-4CBC-B827-BB1840C25AE8@cisco.com>
References: <EAB3BF96-D438-459E-A753-F9D72B1FE5B6@muada.com> <EC1BB972-6F21-4CBC-B827-BB1840C25AE8@cisco.com>

On 16 dec 2007, at 19:23, Dino Farinacci wrote:

A LISP-NERD ITR chooses an ETR/locator and assumes it's reachable. It sets a "please respond" code point in the NERD header and starts a timer. The ETR receiving the packet sees the "please respond" message and sends back info to the originating ITR

Iljitsch, you really don't want to do this. Even in a very well-behaved scenario, this will cause way too much control traffic going to the site. Before we put the loc-reach-bits into LISP this was an obvious first thought, but it was discarded because polling typically doesn't scale well when being polled from a million places. Could you imagine if everyone polled the DNS root servers *in addition to* sending queries for name translations!

Well, it's one thing or the other: either you make available information about the currently reachable locator set for any given EID in the mapping system = the mapping system is just as volatile as BGP, or you do reachability testing of some kind between ITRs and ETRs. I'd say that if you can talk to millions of places, you can also respond to reachability probes from millions of places. However, this does suggest that something LISP-like isn't the best choice for the highest traffic destinations connected to the internet.

Why not just use TTL timeouts and have the ITR, when it needs the mapping, send a query or Data Probe and get a new Reply back with updated information?

I'm not sure how long you want to make these TTLs, but obviously that would be longer than a 10 second or so polling interval, which means that it's going to take you much longer to detect and recover from failures.
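To put rough numbers on that difference, here is a minimal sketch. It assumes a simple model in which an ETR can fail immediately after a successful check, so the failure is only noticed once the next check has been sent and has timed out. The 10 second polling interval comes from the text above; the 300 second TTL and the 2 second response timeout are purely illustrative assumptions, as is the function name.

```python
def worst_case_detection_seconds(check_interval: float, response_timeout: float) -> float:
    """Upper bound on the time between an ETR failing and the ITR noticing.

    Model (an assumption, not taken from any LISP draft): the ETR dies just
    after a successful check, the next check is sent check_interval seconds
    later, and the ITR gives up on it after response_timeout seconds.
    """
    return check_interval + response_timeout


# 10 s polling interval from the discussion; the timeout is hypothetical.
polling_bound = worst_case_detection_seconds(check_interval=10, response_timeout=2)

# Hypothetical 5 minute mapping TTL with the same timeout.
ttl_bound = worst_case_detection_seconds(check_interval=300, response_timeout=2)

print(f"polling: up to {polling_bound} s, TTL refresh: up to {ttl_bound} s")
```

Under these assumed values the TTL-based approach needs minutes rather than seconds to notice a dead ETR, which is the gap the transport, application or user would have to survive.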

And to be clear, what problem are you trying to solve? Are you trying to get EID-to-RLOC record changes to ITRs as soon as possible?

When I talk to an ETR, I want to know if it's alive fast enough to fail over to another one before the transport, app or user stops trying.

Are you trying to get each ITR a copy of their own mapping with different preference and weight information contained in the record?

If ETRs are sending packets to ITRs anyway it makes sense to add this information, yes. Not sure if I'd want to work very hard to make it available if it wasn't "free" to do so, though.

If there is no response before the timer expires, the ITR switches to a different ETR.

This is polling for reachability. I don't think you can do much better than what we have spec'ed out in the main LISP spec with the loc-reach-bits design.

You assume that each ETR knows whether the other ETRs for that EID are reachable. On Friday, Eliot said he'd like to see ETR functionality in Linksys boxes, so if a user has both cable and DSL connectivity (we called this a "basement multihomer" in multi6) the one box would handle RLOCs from both, so this requirement is easy to meet. However, if ISPs run ETRs for the benefit of their customers, this means that a single ETR needs to keep track of very many other ETRs in different administrative domains, and this wouldn't work very well. Also, the fact that ETR A can reach ETR B doesn't mean that a given ITR can also reach it, especially if ETRs are located at end-user sites where last kilometer and POP outages will happen rather than in ISP datacenters with multiple connections where uptimes are high.
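The alternative discussed in this thread, where each ITR tests its own path to an ETR and moves on when the timer expires, can be sketched roughly as follows. This is only an illustration of the idea, not code from any LISP or NERD specification: the function name, the `probe` callback and the 2 second timeout are all hypothetical, and the probe is simulated rather than sent on the wire.

```python
from typing import Callable, Optional, Sequence


def select_reachable_etr(
    locators: Sequence[str],
    probe: Callable[[str, float], bool],
    timeout: float = 2.0,
) -> Optional[str]:
    """Return the first locator that answers a probe, or None.

    `probe(rloc, timeout)` is a hypothetical callback standing in for
    "send a packet with the 'please respond' code point set and wait up to
    `timeout` seconds for the ETR's answer". It returns True if a response
    arrived before the timer expired.
    """
    for rloc in locators:
        if probe(rloc, timeout):
            return rloc
        # Timer expired without a response: fall through to the next ETR.
    return None


# Simulated network in which only the second ETR is reachable from this ITR.
# The addresses are documentation addresses chosen for illustration.
reachable_from_this_itr = {"192.0.2.2"}


def fake_probe(rloc: str, timeout: float) -> bool:
    return rloc in reachable_from_this_itr


chosen = select_reachable_etr(["192.0.2.1", "192.0.2.2"], fake_probe)
print(chosen)
```

The point of the sketch is that the result depends on what this particular ITR can reach, not on what one ETR believes about another.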

For ETRs, having an incoming MTU of 1500 means that unacceptable PMTUD blackholes will happen, or ITRs have to fragment packets and the ETR has to reassemble them (for DF=1 or IPv6). I'm assuming this is unacceptable but I'm certainly interested to hear from vendors about this.

It's amazing how people are so fascinated with this MTU issue. So what's wrong with fragmentation?

It's basically part of our internet engineering taboos at this stage, just like packet reordering. See http://citeseer.ist.psu.edu/335647.html

As long as the ITRs fragment before encapsulation, the host and not the ETR will reassemble.

You can't fragment IPv6 packets or IPv4 packets with DF=1. So what would have to happen is that the LISP tunnel does the fragmenting/reassembly. In this case it would be helpful to know what size packets an ETR is willing to reassemble.
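As a rough sketch of what tunnel-level fragmentation with a known reassembly limit might look like: the ITR splits the inner packet into pieces that fit the path MTU after encapsulation, and refuses outright if the packet is larger than what the ETR has said it will reassemble. Everything here is an assumption made for illustration, including the function names, the 36 byte encapsulation overhead and the 2000 byte reassembly limit; no LISP draft is being quoted.

```python
from typing import List


def tunnel_fragments(
    packet: bytes,
    path_mtu: int,
    encap_overhead: int,
    etr_max_reassembly: int,
) -> List[bytes]:
    """Split an inner packet into pieces that fit the tunnel path.

    Each piece, once encap_overhead bytes of outer headers are added,
    fits within path_mtu. All parameters are hypothetical inputs that a
    real ITR would have to learn somehow.
    """
    if len(packet) > etr_max_reassembly:
        raise ValueError("packet exceeds what the ETR is willing to reassemble")
    chunk = path_mtu - encap_overhead
    if chunk <= 0:
        raise ValueError("path MTU too small for the encapsulation overhead")
    return [packet[i:i + chunk] for i in range(0, len(packet), chunk)]


def reassemble(fragments: List[bytes]) -> bytes:
    """ETR side: concatenate in-order fragments back into the inner packet."""
    return b"".join(fragments)


# A 1500 byte inner packet over a 1500 byte path with an assumed 36 bytes
# of outer headers no longer fits in one piece.
inner = bytes(1500)
pieces = tunnel_fragments(inner, path_mtu=1500, encap_overhead=36, etr_max_reassembly=2000)
print([len(p) for p in pieces])
```

A real mechanism would also need fragment identifiers, ordering and loss handling; the sketch only shows why the ITR benefits from knowing the ETR's reassembly limit up front.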

I think the reason why Fred is "fascinated" with the MTU issue is because he's been trying to solve this for the general tunneling case and has found that to be quite hard. For me, I've often been bitten by PMTUD black holes, both as a user and as someone who had to make ISP infrastructure work without overloading the support lines. In addition, I would very much like us to move towards a situation where 1500 bytes is no longer the de facto IP MTU but people can use something larger if their hardware supports it.

We have both the potential to do very bad things (trigger broken PMTUD) and very useful things (give people an incentive to deploy jumbo frames, create the first MTU-robust tunneling mechanism) here, so we should aim to get things right the first time rather than repeat the mistakes made with RFC 1191.
