
RE: [RRG] Re: LISP PMTU & fragmentation problems

To: "Dino Farinacci" <dino@cisco.com>, "Robin Whittle" <rw@firstpr.com.au>
Subject: RE: [RRG] Re: LISP PMTU & fragmentation problems
From: "Templin, Fred L" <Fred.L.Templin@boeing.com>
Date: Sun, 9 Mar 2008 17:39:06 -0700
Cc: "Routing Research Group" <rrg@psg.com>
In-reply-to: <7EC200C2-CCE0-4E12-8E1F-76CD52BA6240@cisco.com>
References: <47D16205.1080406@firstpr.com.au> <7EC200C2-CCE0-4E12-8E1F-76CD52BA6240@cisco.com>

Dino,

I have to agree that the new LISP text reads like a
"slacker's guide" to MTU handling. First, it only does
the splitting for IPv4 packets that had DF=0 from the
original host, and there really aren't very many of those
in the wild these days. Second, for IPv6 and IPv4 with DF=1,
the original source will be told a degenerate MTU that goes
against the principle of least surprise ("I expected 1500,
but only got 1464"), plus it *always* results in packet loss
until MTU discovery has converged. Finally, there is no
provision for discovering MTUs larger than 1500 even though
the core may soon transition to all-GigE.

You have said that you believe the core is composed almost
entirely of links that can do quite a bit more than 1500.
Why not trust your judgement and just run SEAL? That way, you
get to enjoy the larger MTUs and will have no fragmentation
unless a degenerate link shows up on the path, in which case
SEAL will detect and correct it.

Thanks - Fred
fred.l.templin@boeing.com 

>-----Original Message-----
>From: Dino Farinacci [mailto:dino@cisco.com] 
>Sent: Saturday, March 08, 2008 8:47 PM
>To: Robin Whittle
>Cc: Routing Research Group
>Subject: [RRG] Re: LISP PMTU & fragmentation problems
>
>> Short version:    lisp-06's new material on Path MTU limits includes
>>                  some text on how to resolve the problems if they
>>                  are deemed to be bad enough to need a solution
>>                  within LISP.  I can't understand this text in any
>>                  way which makes practical sense.
>
>Well, I'm sorry about that. The text is written in such simple and  
>precise terms, I am surprised you wouldn't understand it.
>
>>                  Also, it would be good to publish the research
>>                  which indicates that 1500 byte MTU limits are
>>                  relatively rare.
>
>There wasn't research. It was a survey. I asked 10 people, and all 10  
>made this statement. So there isn't much to publish.
>
>>                  If the problems of PMTU are deemed not worth
>>                  solving within LISP, then LISP would be deployed
>>                  on the assumption that all transit links would
>>                  be capable of some much higher than 1500 byte
>>                  PMTU.
>
>That is correct.
>
>>                  This would tend to constrain the locations of ITR
>>                  and ETR functions to be at or near border routers,
>>                  in order that they have unfettered access to
>>                  jumboframe capable links to the core of the
>>                  Internet.
>
>Right, or you do fragmentation. We did have an out.
>
>>                  This would seem to be a major restriction on the
>>                  ability of operators to place ITRs and ETRs wherever
>>                  they like.
>
>For Loc/ID purposes (there are several other reasons to use LISP than
>what is intended by this venue), we want to strongly suggest that xTRs
>be placed on CE (CPE) routers. We think that is the best balance of
>tradeoffs.
>
>>                  Likewise, it would seem to reduce the number of
>>                  devices which could do ITR or ETR functions and
>>                  thereby lead to bottlenecks and to these devices
>>                  needing to be large and expensive.
>
>No, not bottlenecks, easier deployability.
>
>You make it sound as if adding LISP to a router will overload it.
>You capacity-design a router to deal with the input rate and density
>of the box and the amount of work you have to do. ITRs won't attract
>more traffic when they are at the CE. They get traffic based on who
>wants to send data to external destinations.
>
>>
>> L = 1500
>> H =   36
>> S = 1464
>>
>>> 1.  Define an architectural constant S for the maximum size of a
>>>    packet, in bytes, an ITR would receive from a source inside of
>>>    its site.
>>
>> S is 1464 bytes.  But an ITR could receive a packet of any length
>> from a source inside its site.  So this sentence makes no proper
>> sense to me.
>
>It means that if the packet from the source is >= 1464, the packet  
>will be fragmented.
>
>>> When an ITR receives a packet of size greater than L on a
>>> site-facing interface and that packet needs to be encapsulated, it
>>> resolves the MTU issue by first splitting the original packet into
>>> 2 equal-sized fragments.  A LISP header is then pre-pended to each
>>> fragment.
>>
>> I guess you meant 'S' (1464) rather than 'L' (1500).
>
>The total packet size after the ITR is finished with it is L.
>
>> However, all this assumes that the ITR has a 1500 byte PMTU to the
>> ETR.  In many cases, the PMTU will be a lot higher.  So the above
>> algorithm does not allow the ITR to send longer packets without
>> fragmentation.
>
>That is correct. But maybe, for a source site that talks to a
>destination site of the same ISP that advertises support for larger
>MTUs, the ITR could be configured with an L value of 4470 or 9182.
>
>> IPv6 or IPv4 with DF = 1
>> ------------------------
>>
>>> ...  the ITR will drop the packet when the size is greater than L,
>>> and sends an ICMP Too Big message to the source with a value of S,
>>> where S is (L - H).
>>
>> This makes no sense to me.  It would make more sense if this
>> occurred when the packet length was greater than S (1464 bytes for
>> IPv4 or 1444 for IPv6).
>
>You want the ITR to tell the host to send a size so when the outer IP
>header, UDP header, and LISP header are prepended, the size of the
>packet is L.
>
>Dino
>
>--
>to unsubscribe send a message to rrg-request@psg.com with the
>word 'unsubscribe' in a single line as the message text body.
>archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg
>
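
The size arithmetic debated in the quoted message can be sketched as
follows. This is a minimal illustration, not text from the LISP draft:
the function name `itr_decision`, the `Decision` type, and the action
labels are hypothetical. It assumes the threshold is applied to the
encapsulated size (inner packet plus H compared against L, which is the
same as comparing the inner packet against S); that is only one reading
of the exchange above. L = 1500 and H = 36 come from the thread; the
IPv6 overhead of 56 is inferred from the 1444-byte figure quoted for
IPv6.

```python
from dataclasses import dataclass
from typing import Optional

L = 1500      # maximum encapsulated packet size (from the thread)
H_IPV4 = 36   # encapsulation overhead for IPv4 (from the thread)
H_IPV6 = 56   # inferred: 1500 - 1444, the IPv6 figure quoted above


@dataclass
class Decision:
    # Hypothetical labels: "encapsulate", "split", or "drop_and_signal".
    action: str
    # MTU value reported to the source in the ICMP Too Big message, if any.
    advertised_mtu: Optional[int] = None


def itr_decision(inner_size: int, is_ipv6: bool, df_set: bool,
                 l_max: int = L) -> Decision:
    """Decide what an ITR does with a packet arriving from inside its site.

    Assumption: the packet is too big when inner_size + H > L,
    i.e. when inner_size > S, where S = L - H.
    """
    h = H_IPV6 if is_ipv6 else H_IPV4
    s = l_max - h
    if inner_size <= s:
        # Fits after encapsulation: prepend the headers and forward.
        return Decision("encapsulate")
    if not is_ipv6 and not df_set:
        # IPv4 with DF=0: split the original packet in two, then
        # encapsulate each fragment.
        return Decision("split")
    # IPv6, or IPv4 with DF=1: drop and report S to the source.
    return Decision("drop_and_signal", advertised_mtu=s)
```

With these values a 1500-byte IPv4 packet with DF=1 is dropped and the
source is told 1464, which is the case Fred objects to. Passing
`l_max=9182`, one of the larger L values suggested for jumbo-capable
paths, lets the same packet through unfragmented.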


