
[RRG] MTU/fragmentation AGAIN



Sorry about that.

As I said a few days ago, Fred's fragmentation model makes a lot of sense if we want to allow fragmentation in the first place. Let's see where we stand if we don't want to allow fragmentation.
If the path between the ITR and ETR supports 1500 bytes +  
encapsulation there shouldn't be any issues with path MTU discovery  
that aren't there without the encapsulation, too. (I'm using LISP  
terminology but this applies to all map/encap schemes.)
There are two possible other cases:

- the path doesn't support 1500+E and the ITR knows this
- the path doesn't support 1500+E but the ITR doesn't know this

In the first case, the ITR can simply return a "too big" message, and hosts/sites that implement PMTUD correctly won't have a problem. Hosts/sites that don't will experience a PMTUD black hole. I'll get back to this.
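
A minimal sketch of that first case, assuming the ITR already knows the usable MTU towards the ETR and that packets may not be fragmented (DF=1). The function name, the return convention and the 36-byte value for E (outer IPv4 + UDP + an 8-byte shim header) are assumptions for illustration, not taken from any map/encap specification.

```python
from typing import Optional

# Assumed encapsulation overhead E: 20 (outer IPv4) + 8 (UDP) + 8 (shim header).
ENCAP_OVERHEAD = 36


def itr_check(inner_len: int, path_mtu_to_etr: int,
              overhead: int = ENCAP_OVERHEAD) -> Optional[int]:
    """Return None if the packet fits after encapsulation; otherwise return
    the MTU value the ITR would report to the source in a too big message."""
    if inner_len + overhead <= path_mtu_to_etr:
        return None  # fits: encapsulate and forward
    # Largest inner packet that still fits once E has been added.
    return path_mtu_to_etr - overhead
```

With these assumed numbers, a 1500-byte packet over a 1500-byte path would be answered with a too big message advertising 1464 bytes, while the same packet over a 1536-byte path would simply be forwarded.
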
In the latter case, the ITR would have to do PMTUD towards the ETR and  
after that send too bigs based on that result to the source hosts that  
send packets through the ITR. This is unwanted complexity for the ITR  
and it means that a relatively high number of packets that flow  
through the ITR could be dropped during this process. As such, I'd say  
that it's probably unacceptable to design a network such that this  
situation is common. For instance, having an ITR on network A that supports  
1500+E and an ETR on network B that supports 1500+E but then have the  
interconnection between these networks happen over an internet  
exchange with a 1500-byte MTU would trigger PMTUD and lost packets on  
ALL sessions between these ISPs, which I wouldn't find acceptable. In  
other words: internet exchanges used between two 1500+E networks that  
do map/encap must be upgraded to support at least 1500+E.
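
The internet exchange example can be sketched as a simple check: the usable MTU between ITR and ETR is the smallest link MTU along the way, so a single 1500-byte hop is enough to break 1500+E. The link MTU values and the 36-byte E below are hypothetical.

```python
from typing import Sequence


def path_supports_full_mtu(link_mtus: Sequence[int], overhead: int = 36,
                           inner_mtu: int = 1500) -> bool:
    """True if every link between ITR and ETR can carry inner_mtu + overhead."""
    # The path MTU is bounded by the smallest link MTU on the path.
    return min(link_mtus) >= inner_mtu + overhead


# Hypothetical path: network A, internet exchange, network B.
via_plain_ix = [1536, 1500, 1536]
via_upgraded_ix = [1536, 1536, 1536]

print(path_supports_full_mtu(via_plain_ix))     # exchange is the bottleneck
print(path_supports_full_mtu(via_upgraded_ix))  # all hops carry 1500+E
```
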
The reason PMTUD works so badly today is most likely that it doesn't  
have to work: all hosts connected to the internet that advertise a  
TCP MSS of 1500 minus overhead can successfully receive 1500-byte IPv4  
packets with DF=1. However, if a site wants to deploy an ITR and/or  
ETR, and suddenly, PMTUD black holes happen, the site has a very  
strong incentive to make the changes necessary to make those black  
holes go away. In the case of an ITR, this means making sure the  
source host receives the too big messages. In the case of an ETR, this  
means announcing a smaller TCP MSS.
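
As a rough illustration of the ETR side, the MSS that hosts behind an ETR would announce follows from the MTU the ETR can receive, minus E, minus the inner IPv4 and TCP headers. This is a sketch under assumed numbers (36-byte E, 20-byte IPv4 and TCP headers without options); the function name is made up for the example.

```python
IPV4_HEADER = 20
TCP_HEADER = 20  # without TCP options


def advertised_mss(etr_mtu: int, overhead: int = 36) -> int:
    """MSS for hosts behind an ETR whose incoming path carries etr_mtu bytes."""
    # The inner packet can be at most etr_mtu - E, and never more than 1500.
    inner_mtu = min(1500, etr_mtu - overhead)
    return inner_mtu - IPV4_HEADER - TCP_HEADER
```

Under these assumptions, an ETR limited to 1500 bytes would lead to an MSS of 1424 instead of the usual 1460, and an ETR that supports 1500+E keeps 1460.
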
So I'd say that SITES should be able to deploy xTRs even though they  
can't support a 1500+E MTU. (There needs to be a way to communicate  
the ETR MTU limitation back to ITRs, though.) However, the same is not  
true for ISPs: in that case, the ISP operators create the problem, but  
site operators / host admins (for a large number of sites) need to fix  
the problem. This is something that isn't going to work in practice.  
So ISPs MUST support 1500+E both on their xTRs and on peering links to  
other ISPs that run xTRs.