
[RRG] MTU/fragmentation AGAIN

To: Routing Research Group list <rrg@psg.com>
Subject: [RRG] MTU/fragmentation AGAIN
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Sun, 23 Dec 2007 00:51:54 +0100

Sorry about that.

As I said a few days ago, Fred's fragmentation model makes a lot of sense if we want to allow fragmentation in the first place. Let's see where we stand if we don't want to allow fragmentation.

If the path between the ITR and ETR supports 1500 bytes plus the encapsulation overhead (1500+E below), there shouldn't be any issues with path MTU discovery that aren't there without the encapsulation, too. (I'm using LISP terminology but this applies to all map/encap schemes.)

There are two other possible cases:

- the path doesn't support 1500+E and the ITR knows this
- the path doesn't support 1500+E but the ITR doesn't know this

In the first case, the ITR can simply return a too big message, and hosts/sites implementing PMTUD correctly won't have a problem. Hosts/sites that don't will experience a PMTUD black hole. I'll get back to this.
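To make the first case concrete, here is a minimal Python sketch of the check an ITR could make when it already knows the path MTU towards the ETR. The names `Verdict` and `itr_check` are made up for illustration, and the 36-byte overhead in the usage lines is only an assumed value for E, not something taken from a particular map/encap spec.

```python
from typing import NamedTuple, Optional


class Verdict(NamedTuple):
    action: str                   # "encapsulate" or "too_big"
    reported_mtu: Optional[int]   # MTU to put in the too big message, if any


def itr_check(packet_len: int, path_mtu: int, encap_overhead: int) -> Verdict:
    """Decide what an ITR does with a packet when fragmentation is not allowed.

    packet_len     -- size of the original (inner) packet in bytes
    path_mtu       -- MTU the ITR knows for the path to the ETR
    encap_overhead -- E, the bytes added by encapsulation (assumed value)
    """
    if packet_len + encap_overhead <= path_mtu:
        return Verdict("encapsulate", None)
    # The source host must shrink its packets so that packet + E fits.
    return Verdict("too_big", path_mtu - encap_overhead)


# Illustrative numbers only: E = 36 is an assumption.
print(itr_check(1500, 1536, 36))  # path supports 1500+E
print(itr_check(1500, 1500, 36))  # path does not support 1500+E
```

With these assumed numbers the second call reports an MTU of 1464 back to the source host, which is the value a correctly working PMTUD implementation would then use.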

In the latter case, the ITR would have to do PMTUD towards the ETR and after that send too bigs based on that result to the source hosts that send packets through the ITR. This is unwanted complexity for the ITR and it means that a relatively high number of packets that flow through the ITR could be dropped during this process. As such, I'd say that it's probably unacceptable to design a network such that this situation is common. I.e., having an ITR on network A that supports 1500+E and an ETR on network B that supports 1500+E but then have the interconnection between these networks happen over an internet exchange with a 1500-byte MTU would trigger PMTUD and lost packets on ALL sessions between these ISPs, which I wouldn't find acceptable. In other words: internet exchanges used between two 1500+E networks that do map/encap must be upgraded to support at least 1500+E.
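The extra state this second case puts on the ITR can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the class `EtrMtuCache`, its method names, the documentation addresses and the 36-byte value for E are all hypothetical, and a real ITR would also need timers, probing and validation of the too big messages it receives.

```python
from typing import Dict, Optional


class EtrMtuCache:
    """Per-ETR path MTU state an ITR would need in the 'ITR doesn't know' case."""

    def __init__(self, default_path_mtu: int, encap_overhead: int) -> None:
        self.default_path_mtu = default_path_mtu  # what the ITR assumes at first
        self.encap_overhead = encap_overhead      # E (assumed value)
        self._path_mtu: Dict[str, int] = {}       # ETR address -> learned path MTU

    def on_too_big_from_path(self, etr: str, reported_mtu: int) -> None:
        """A router between ITR and ETR reported a smaller MTU; remember it."""
        current = self._path_mtu.get(etr, self.default_path_mtu)
        self._path_mtu[etr] = min(current, reported_mtu)

    def admit(self, etr: str, packet_len: int) -> Optional[int]:
        """Return None to encapsulate, or the MTU to report to the source host."""
        limit = self._path_mtu.get(etr, self.default_path_mtu) - self.encap_overhead
        return None if packet_len <= limit else limit


# Both networks support 1500+E, but the exchange in between only does 1500.
cache = EtrMtuCache(default_path_mtu=1536, encap_overhead=36)
etr = "192.0.2.1"
print(cache.admit(etr, 1500))          # None: packet is sent, then lost at the exchange
cache.on_too_big_from_path(etr, 1500)  # too big arrives from the exchange router
print(cache.admit(etr, 1500))          # 1464: only now can the source host be told
```

Every packet that is admitted before the too big message comes back is lost, and this happens for each ETR reached over the exchange, which is the packet loss described above.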

The reason why PMTUD works so badly today is most likely because it doesn't have to: all hosts connected to the internet that advertise TCP MSS of 1500 - overhead can successfully receive 1500-byte IPv4 packets with DF=1. However, if a site wants to deploy an ITR and/or ETR, and suddenly, PMTUD black holes happen, the site has a very strong incentive to make the changes necessary to make those black holes go away. In the case of an ITR, this means making sure the source host receives the too big messages. In the case of an ETR, this means announcing a smaller TCP MSS.
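The MSS arithmetic behind that last point can be written out as a small Python sketch. It assumes IPv4 and TCP headers of 20 bytes each without options, and again uses 36 bytes purely as an assumed value for E; the function name `tcp_mss` is made up for this example.

```python
def tcp_mss(mtu: int, encap_overhead: int = 0,
            ip_header: int = 20, tcp_header: int = 20) -> int:
    """TCP MSS that keeps a full-sized segment within the given MTU.

    encap_overhead is E, the bytes added by the map/encap tunnel
    (0 when there is no encapsulation on the path).
    """
    return mtu - encap_overhead - ip_header - tcp_header


print(tcp_mss(1500))                     # 1460: today's usual "1500 - overhead"
print(tcp_mss(1500, encap_overhead=36))  # 1424: behind an ETR with a 1500-byte path
```

A host behind an ETR that announces the smaller value never receives TCP segments that need more than 1500 bytes after encapsulation, so for TCP the black hole goes away without relying on too big messages.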

So I'd say that SITES should be able to deploy xTRs even though they can't support a 1500+E MTU. (There needs to be a way to communicate the ETR MTU limitation back to ITRs, though.) However, the same is not true for ISPs: in that case, the ISP operators create the problem, but site operators / host admins (for a large number of sites) need to fix the problem. This is something that isn't going to work in practice. So ISPs MUST support 1500+E both on their xTRs and on peering links to other ISPs that run xTRs.
