Re: [RRG] LISP-NERD reachability and MTU detection
A LISP-NERD ITR chooses an ETR/locator and assumes it's reachable.
It sets a "please respond" code point in the NERD header and starts
a timer. The ETR receiving the packet sees the "please respond"
message and sends back info to the originating ITR that could
encompass current locator preference information (for traffic
engineering), the up/down status of other ETRs, the maximum packet
size the ETR is prepared to receive, possibly (if the ETR supports
reassembly) the maximum packet size the ETR is prepared to
reassemble from fragments.
Iljitsch, you really don't want to do this. Even in a very well-behaved
scenario, this will cause way too much control traffic going to the
site. Before we put the loc-reach-bits into LISP, this was an
obvious first thought, but it was discarded because polling typically
doesn't scale well when a site is polled from a million places. Could
you imagine if everyone polled the DNS root servers *in addition to*
sending queries for name translations!
Why not just use TTL timeouts and have the ITR, when it needs the
mapping, send a query or Data Probe and get a new Reply back with
updated information?
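
A minimal sketch of the TTL approach, assuming a per-prefix mapping cache
on the ITR. The names MapCache and resolve are hypothetical; resolve
stands in for whatever sends the query or Data Probe and returns the
locators plus a TTL in seconds. It is not taken from the LISP spec.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Mapping:
    locators: List[str]
    expires_at: float  # clock value after which the entry is stale


class MapCache:
    """EID-prefix to locator cache refreshed only on demand (no polling)."""

    def __init__(self,
                 resolve: Callable[[str], Tuple[List[str], float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve  # hypothetical: sends query, returns (locators, ttl)
        self._clock = clock
        self._entries: Dict[str, Mapping] = {}

    def lookup(self, eid_prefix: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(eid_prefix)
        if entry is None or entry.expires_at <= now:
            # Missing or timed out: ask again only because traffic needs it.
            locators, ttl = self._resolve(eid_prefix)
            entry = Mapping(list(locators), now + ttl)
            self._entries[eid_prefix] = entry
        return entry.locators
```

The point of the design is that control traffic is driven by demand for
the mapping, not by a timer per ITR/ETR pair.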
And to be clear, what problem are you trying to solve? Are you trying
to get EID-to-RLOC record changes to ITRs as soon as possible?
Are you trying to get each ITR a copy of their own mapping with
different preference and weight information contained in the record?
If there is no response before the timer expires, the ITR switches
to a different ETR.
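
For concreteness, the probe-and-timer behaviour described above could be
sketched roughly as follows. This is only an illustration: ProbeState,
the 3-second timeout and the round-robin choice of the next ETR are
assumptions, and the nonce is used the way it is described later in this
message, to match a response to the request that triggered it.

```python
import secrets
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ProbeState:
    locators: List[str]
    timeout: float = 3.0            # assumed value, not from any spec
    current: int = 0                # index of the ETR in use
    nonce: Optional[int] = None     # nonce of the outstanding request
    deadline: Optional[float] = None

    def send_probe(self, now: float) -> Tuple[str, int]:
        """Set 'please respond' on the next packet and start the timer."""
        self.nonce = secrets.randbits(32)
        self.deadline = now + self.timeout
        return self.locators[self.current], self.nonce

    def on_response(self, nonce: int) -> bool:
        """Cancel the timer if the response matches the outstanding nonce."""
        if self.nonce is not None and nonce == self.nonce:
            self.nonce = None
            self.deadline = None
            return True
        return False

    def on_tick(self, now: float) -> str:
        """If the timer expired, switch to the next ETR; return the one in use."""
        if self.deadline is not None and now >= self.deadline:
            self.current = (self.current + 1) % len(self.locators)
            self.nonce = None
            self.deadline = None
        return self.locators[self.current]
```

Note that every ITR talking to the site keeps one of these running,
which is where the control traffic concern comes from.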
This is polling for reachability. I don't think you can do much better
than what we have spec'ed out in the main LISP spec with the loc-reach-
bits design.
When mapping state is created and outgoing traffic is flowing, the
ITR may observe return traffic (if the same ITR and ETR function as
ETR and ITR, respectively, for traffic in the other direction) and
deduce that there is adequate reachability. If the ITR doesn't see
any return traffic, on the other hand, it sets the "please respond"
code point in the LISP header periodically and awaits replies.
Again, if none are forthcoming, it switches to another ETR.
That is what the loc-reach-bits do in LISP. Not only does return
traffic tell you the encapsulator is up, but that encapsulator is
telling you the reachability status for the other ETRs in the site.
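
As a rough sketch of how an ITR might consume the loc-reach-bits from
return traffic: the bit ordering here (bit i, counting from the least
significant bit, refers to the i-th locator in the mapping) is an
assumption made for illustration, and the function names are hypothetical.

```python
from typing import List, Optional


def reachable_locators(locators: List[str], loc_reach_bits: int) -> List[str]:
    """Keep the locators whose bit is set in the received loc-reach-bits.

    Assumption: bit i (LSB first) corresponds to locators[i].
    """
    return [loc for i, loc in enumerate(locators) if (loc_reach_bits >> i) & 1]


def pick_locator(locators: List[str], loc_reach_bits: int) -> Optional[str]:
    """Choose the first locator reported up, or None if all are reported down."""
    up = reachable_locators(locators, loc_reach_bits)
    return up[0] if up else None
```

For example, with three locators and bits 0b101 the middle locator is
reported down, so the ITR would only consider the first and the third.
No probe packets are sent; the information rides on data traffic.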
Or are you trying to solve the locator reachability problem when the
xTRs are placed in PEs?
Because LISP packets contain a nonce, ITRs can correlate incoming
responses to their response requests with the original packets, so
they are in a position to do RFC 4821 path MTU discovery without
help from ICMP messages. (They may be limited somewhat because
they can't decide on the packet size on their own unless we add
extra stuff here.)
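
A sketch of the kind of search RFC 4821 style probing implies, under
simplifying assumptions: probe_ok is a hypothetical callback that sends a
packet of the given size with a response request and returns True if a
response with the matching nonce comes back before the timer expires. It
also assumes the ITR can pick probe sizes freely, which, as noted above,
it cannot do without extra machinery.

```python
from typing import Callable


def search_pmtu(probe_ok: Callable[[int], bool], low: int, high: int) -> int:
    """Binary-search the largest packet size in [low, high] that gets through.

    Assumes `low` is known to work and that success is monotonic in size
    (if a size fails, every larger size fails too).
    """
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always progresses
        if probe_ok(mid):
            low = mid                 # mid got through: search above it
        else:
            high = mid - 1            # mid was lost: search below it
    return low
```

A lost probe is treated as "too big", so no ICMP message is needed, but
a probe lost for other reasons would make the estimate too small.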
In my book, this is a big win, because it means that the ETRs can be
completely stateless so it's easy for ISPs to run them for their
customers and on the ITRs the state required for reachability
detection is extremely basic: simpler than shim6 and nowhere near
what's in TCP. It's also soft state that can be discarded and
recreated without penalty when all ETRs for a prefix (or at least
the one the ITR will be selecting as the one to use the next time
around) are up.
If you had xTRs deployed at PEs, it might be better to have them ping
each other to determine whether they are reachable; then they can
continue with the loc-reach-bits algorithm. That would scale much
better but still gives me the heebie-jeebies. ;-)
Maybe simply a multi-hop BGP session that transmits no routes among
each other could work, but I still think there is too much
coordination required of competing ISPs.
For ETRs, having an incoming MTU of 1500 means that unacceptable
PMTUD blackholes will happen, or ITRs have to fragment packets and
the ETR has to reassemble them (for DF=1 or IPv6). I'm assuming
this is unacceptable but I'm certainly interested to hear from
vendors about this.
It's amazing how people are so fascinated with this MTU issue. So
what's wrong with fragmentation?
As long as the ITRs fragment before encapsulation, the host and not
the ETR will reassemble.
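
To make the arithmetic concrete, here is a sketch of how an ITR could
size fragments of the inner packet so that each one still fits the path
after encapsulation. It only covers inner IPv4 packets with DF=0. The
function name is hypothetical, and the encapsulation overhead is a
parameter because it depends on the header format; the 36 bytes used in
the note after the code is an assumed figure, not something stated in
this thread.

```python
from typing import List


def inner_fragment_sizes(inner_len: int, path_mtu: int,
                         encap_overhead: int, inner_hdr: int = 20) -> List[int]:
    """Return total lengths of the inner IPv4 fragments (header included).

    Each fragment plus encap_overhead must fit within path_mtu, and every
    fragment except the last must carry a multiple of 8 payload bytes.
    """
    budget = path_mtu - encap_overhead        # room left for one inner packet
    if inner_len <= budget:
        return [inner_len]                    # fits as is, no fragmentation
    per_frag = (budget - inner_hdr) // 8 * 8  # payload bytes per fragment
    if per_frag <= 0:
        raise ValueError("path MTU too small for any fragment")
    payload = inner_len - inner_hdr
    sizes = []
    while payload > per_frag:
        sizes.append(inner_hdr + per_frag)
        payload -= per_frag
    sizes.append(inner_hdr + payload)         # last fragment takes the remainder
    return sizes
```

With a 1500-byte inner packet, a 1500-byte path MTU and an assumed 36
bytes of overhead, this gives fragments of 1460 and 60 bytes. Each is
encapsulated separately, the ETR just decapsulates, and the destination
host reassembles.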
Dino
--
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg