[RRG] MTU/fragmentation AGAIN
- To: Routing Research Group list <rrg@psg.com>
- Subject: [RRG] MTU/fragmentation AGAIN
- From: Iljitsch van Beijnum <iljitsch@muada.com>
- Date: Sun, 23 Dec 2007 00:51:54 +0100
Sorry about that.
As I said a few days ago, Fred's fragmentation model makes a lot of
sense if we want to allow fragmentation in the first place. Let's see
where we stand if we don't want to allow fragmentation.
If the path between the ITR and ETR supports 1500 bytes +
encapsulation there shouldn't be any issues with path MTU discovery
that aren't there without the encapsulation, too. (I'm using LISP
terminology but this applies to all map/encap schemes.)
There are two possible other cases:
- the path doesn't support 1500+E and the ITR knows this
- the path doesn't support 1500+E but the ITR doesn't know this
In the first case, the ITR can simply return a too big message, and
hosts/sites implementing PMTUD correctly won't have a problem. Hosts/
sites that don't will experience a PMTUD black hole. I'll get back to
this.
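
To make the first case concrete, here is a minimal sketch of the ITR's decision when it already knows the path MTU towards the ETR. The function name, the return values and the 36-byte value for E (20 bytes outer IPv4 + 8 UDP + 8 LISP header) are illustrative assumptions, not something any map/encap draft mandates.

```python
from typing import Optional, Tuple

# Assumed encapsulation overhead E for illustration only:
# 20 (outer IPv4) + 8 (UDP) + 8 (LISP header) = 36 bytes.
ENCAP_OVERHEAD = 36


def itr_handle(inner_len: int, path_mtu: int,
               overhead: int = ENCAP_OVERHEAD) -> Tuple[str, Optional[int]]:
    """Decide what an ITR does with an inner packet of inner_len bytes
    when it knows the MTU of the path to the ETR (no fragmentation)."""
    # Largest inner packet that still fits once the encapsulation is added.
    inner_limit = path_mtu - overhead
    if inner_len <= inner_limit:
        return ("forward", None)
    # Otherwise drop the packet and tell the source host the usable size.
    return ("too_big", inner_limit)


if __name__ == "__main__":
    print(itr_handle(1500, 1536))  # path supports 1500+E
    print(itr_handle(1500, 1500))  # path does not support 1500+E
```

With a 1500-byte path the host is told to use 1464 bytes, which only helps if the too big message actually reaches it.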
In the latter case, the ITR would have to do PMTUD towards the ETR and
after that send too bigs based on that result to the source hosts that
send packets through the ITR. This is unwanted complexity for the ITR
and it means that a relatively high number of packets that flow
through the ITR could be dropped during this process. As such, I'd say
that it's probably unacceptable to design a network such that this
situation is common. For example, having an ITR on network A that supports
1500+E and an ETR on network B that supports 1500+E but then having the
interconnection between these networks happen over an internet
exchange with a 1500-byte MTU would trigger PMTUD and lost packets on
ALL sessions between these ISPs, which I wouldn't find acceptable. In
other words: internet exchanges used between two 1500+E networks that
do map/encap must be upgraded to support at least 1500+E.
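
The second case can be sketched as a per-ETR cache in the ITR that starts out optimistic and only learns the real limit after a packet has been lost. The class and method names, the "etr-b" label and the 36-byte E are hypothetical; the link MTUs mirror the internet exchange example above.

```python
class EtrMtuCache:
    """Hypothetical per-ETR path MTU state kept by an ITR."""

    def __init__(self, local_mtu, overhead):
        self.local_mtu = local_mtu  # MTU of the ITR's own network
        self.overhead = overhead    # encapsulation overhead E (assumed)
        self.learned = {}           # ETR -> path MTU learned via PMTUD

    def inner_limit(self, etr):
        # Until a too big arrives, assume the whole path matches the local MTU.
        return self.learned.get(etr, self.local_mtu) - self.overhead

    def on_outer_too_big(self, etr, reported_mtu):
        # A router on the ITR-to-ETR path rejected an encapsulated packet.
        current = self.learned.get(etr, self.local_mtu)
        self.learned[etr] = min(current, reported_mtu)

    def handle(self, etr, inner_len):
        limit = self.inner_limit(etr)
        if inner_len <= limit:
            return ("forward", None)
        return ("too_big", limit)


def path_mtu(link_mtus):
    """The path MTU is the smallest MTU of any link on the path."""
    return min(link_mtus)


if __name__ == "__main__":
    links = [1536, 1500, 1536]  # network A, internet exchange, network B
    cache = EtrMtuCache(local_mtu=1536, overhead=36)
    # The first 1500-byte packet is encapsulated, sent, and lost at the exchange.
    print(cache.handle("etr-b", 1500))
    cache.on_outer_too_big("etr-b", path_mtu(links))
    # Only now can the ITR send a too big to the source host.
    print(cache.handle("etr-b", 1500))
```

Every session crossing the exchange pays for at least one lost packet before the ITR can tell the host anything, which is the cost described above.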
The reason why PMTUD works so badly today is most likely that it
doesn't have to work well: all hosts connected to the internet that
advertise a TCP MSS of 1500 minus header overhead can successfully
receive 1500-byte IPv4 packets with DF=1. However, if a site wants to
deploy an ITR and/or ETR and PMTUD black holes suddenly appear, the
site has a very
strong incentive to make the changes necessary to make those black
holes go away. In the case of an ITR, this means making sure the
source host receives the too big messages. In the case of an ETR, this
means announcing a smaller TCP MSS.
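
As a rough worked example of the ETR-side fix, the MSS that hosts behind the ETR would need to announce can be computed from the path MTU. The 36-byte E and the option-free 20-byte IPv4 and TCP headers are assumptions for illustration.

```python
IPV4_HEADER = 20  # bytes, no IP options
TCP_HEADER = 20   # bytes, no TCP options


def tcp_mss(path_mtu, encap_overhead=0):
    """Largest TCP payload that fits in one unfragmented IPv4 packet
    after leaving room for the encapsulation overhead."""
    return path_mtu - encap_overhead - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    print(tcp_mss(1500))      # ordinary 1500-byte path, no encapsulation
    print(tcp_mss(1500, 36))  # same path behind an ETR, assumed E = 36
    print(tcp_mss(1536, 36))  # path upgraded to 1500+E
```

On a path that supports 1500+E the hosts can keep announcing the usual 1460; on a plain 1500-byte path they would have to drop to 1424 under these assumptions.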
So I'd say that SITES should be able to deploy xTRs even though they
can't support a 1500+E MTU. (There needs to be a way to communicate
the ETR MTU limitation back to ITRs, though.) However, the same is not
true for ISPs: in that case, the ISP operators create the problem, but
site operators / host admins (for a large number of sites) need to fix
the problem. This is something that isn't going to work in practice.
So ISPs MUST support 1500+E both on their xTRs and on peering links to
other ISPs that run xTRs.