Re: [RRG] MTU/fragmentation AGAIN



Templin, Fred L wrote:
Brian,

Thanks for the thoughts, but you are not saying anything
here that isn't already well understood and/or has been
said many times in the past. First, path MTU from A->B
does not necessarily have any relation to path MTU from
B->A; it is well understood that the paths can easily
be asymmetric and so the MTU discovery is simplex in
both the A->B and B->A directions.
Yep, I'm well aware. I worked with Geoff Huston on some asymmetric BGP stuff in the late 1990s, involving simplex satellite links from Canada to Australia, between Teleglobe and Telstra. Now *that* was asymmetric. :-)
but this gives me an idea for a minor change which
entails respecifying the sprite-reply to be minimum-sized
rather than requiring perishable padding for the
sprite-request. Thanks for that.

Glad to help. :-)
About MTU changes due to path changes, that is a problem
that needs to be dealt with in any case and not just the
map-and-encaps case. The two ways of detecting this are
to begin receiving packet-too-bigs when none were
received previously, or to proactively probe the path,
or both. (In the map-and-encaps case, sprite-mtu probing
can be used for proactive probing by the ITR.)
The problem I see there, specific to map-and-encaps, is the detection of path MTU changes. The xTRs are the only things that get the PTBs, but the hosts are the ones that need to do PMTUD.

Probing can detect the new MTU, but what triggers the probing?
About end-to-end MTU involvement from TCPs, RFC4821 already
provides a published standards-track specification for just
such a method.

Thanks - Fred
fred.l.templin@boeing.com
Thanks. I wasn't aware of 4821.

There were some interesting tidbits in there that piqued my interest.

One of those was "IP Host Fragmenting" - avoiding middle-of-the-path fragmentation.

It could be very useful as an implementation scheme when probe packets are lost
due to MTU exceeded, where no PTB ICMP is received.

Specifically, the probe packet could be sent again as two fragments, with DF=1 set. That guarantees no further fragmentation, validates the PMTU, and avoids having to do anything other than retransmit an existing packet.
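
A minimal sketch of the two-fragment retransmit described above, in Python. It only computes the fragment layout and sends nothing. The function name `split_probe`, the 20-byte IPv4 header, and the `dont_fragment` field are assumptions for illustration; the one hard constraint it encodes is that IPv4 fragment offsets count 8-byte units, so the first fragment's payload must be a multiple of 8.

```python
IPV4_HEADER_LEN = 20  # assumption: IPv4 header without options


def split_probe(total_len, header_len=IPV4_HEADER_LEN):
    """Describe how a lost probe of total_len bytes splits into two fragments.

    Hypothetical helper: returns fragment descriptors only, no packets.
    """
    payload = total_len - header_len
    if payload < 16:
        raise ValueError("payload too small to split on an 8-byte boundary")
    # Fragment offsets are in 8-byte units, so round the first half down.
    first = (payload // 2) // 8 * 8
    second = payload - first
    return [
        {"offset_units": 0, "payload_len": first,
         "total_len": header_len + first,
         "more_fragments": True, "dont_fragment": True},
        {"offset_units": first // 8, "payload_len": second,
         "total_len": header_len + second,
         "more_fragments": False, "dont_fragment": True},
    ]


for frag in split_probe(1500):
    print(frag)
```

For a 1500-byte probe this yields payloads of 736 and 744 bytes, with the second fragment at offset 92 (in 8-byte units).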

Do you know if anyone is working on implementations of 4821? I'd like to pass this
idea along to them...

Brian
-----Original Message-----
From: Brian Dickson [mailto:briand@ca.afilias.info]
Sent: Wednesday, January 02, 2008 12:20 PM
To: Iljitsch van Beijnum
Cc: Routing Research Group list
Subject: Re: [RRG] MTU/fragmentation AGAIN

Iljitsch van Beijnum wrote:
Sorry about that.

As I said a few days ago, Fred's fragmentation model makes a lot of sense if we want to allow fragmentation in the first place. Let's see where we stand if we don't want to allow fragmentation.

Hear, hear! :-)
If the path between the ITR and ETR supports 1500 bytes + encapsulation there shouldn't be any issues with path MTU discovery that aren't there without the encapsulation, too. (I'm using LISP terminology but this applies to all map/encap schemes.)

There are two possible other cases:

- the path doesn't support 1500+E and the ITR knows this
- the path doesn't support 1500+E but the ITR doesn't know this
Just to dig a bit deeper, is my understanding on PMTUD correct, as follows?

The MTU to be discovered from A->B is not necessarily related to the MTU to be discovered from B->A. A needs to know MTU-A for A->B, and vice versa, to avoid fragmentation. If A can do PMTUD, A does not (or should not) depend on B in any way to accomplish this.

So, PMTUD can be considered two simplex problems - A->B and B->A.

Okay so far?

But, with encaps/decaps, we have A->ITR->path-1->ETR->B,
which can change to A->ITR->path-2->ETR->B, at any time.
And it is possible that MTU(path-1) != MTU(path-2).

Does this do bad things to PMTUD?
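
To make the concern concrete, here is a small Python sketch of how a PMTU that host A learned over path-1 can go stale when the ITR->ETR traffic moves to path-2. The overhead of 36 bytes and the path MTUs of 1536 and 1500 are made-up values, and `inner_mtu` and `is_stale` are hypothetical helper names; the real E depends on the map/encap scheme.

```python
ENCAP_OVERHEAD = 36  # hypothetical E in bytes; depends on the encapsulation used


def inner_mtu(path_mtu, overhead=ENCAP_OVERHEAD):
    """Largest inner packet the ITR can encapsulate without exceeding path_mtu."""
    return path_mtu - overhead


def is_stale(cached_pmtu, current_path_mtu, overhead=ENCAP_OVERHEAD):
    """True if the host's cached PMTU no longer fits the current ITR->ETR path."""
    return cached_pmtu > inner_mtu(current_path_mtu, overhead)


cached = inner_mtu(1536)  # what host A learned while path-1 was in use
print(cached, is_stale(cached, 1536), is_stale(cached, 1500))
# prints: 1500 False True
```

The host sees no change on its own links, so nothing on the host side signals that the cached value has become too large.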

If we are able to produce a newer, better PMTUD that requires deployment of new TCP stacks, and that achieves 99.9% of the desired benefit (no fragmentation) when deployed only on the server side of things, would that be worth spending time on?

Here's what I'm thinking:

For TCP, all the window-size stuff is based on bandwidth*delay: the number of packets and the rate of transmission.

If the slow-start and back-off stuff also used MTU as a parameter, PMTUD could become an implicit part of TCP.

Shrinking the packet size shrinks the effective window; growing the packet size grows the window. Adapt the packet size first, then the rate, to achieve the optimum TCP window.

And if the MTU is exceeded, packet loss would result in lowering of the MTU until the right MTU was used, without having (or needing!) any ICMP or fragmentation.

It's a bit wasteful in the presence of true PMTUD, but in its absence, it makes things "just work".
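
A rough Python sketch of the loss-driven idea, under simplifying assumptions: the sender treats a lost packet as "too big" and narrows in on the largest size that gets through, with no ICMP involved. The function `search_mtu` and the `delivers` callback are hypothetical stand-ins for a TCP stack observing acknowledgements, and the bounds of 576 and 1500 are illustrative. A real stack could not treat every loss as an MTU signal, since congestion also drops packets.

```python
def search_mtu(delivers, low=576, high=1500, rounds=16):
    """Binary-search the largest packet size that gets through.

    delivers(size) -> bool stands in for "the packet was acknowledged".
    A False result is treated as an MTU-exceeded loss.
    """
    if delivers(high):
        return high
    # Invariant: `low` is assumed to work, `high` is known to fail.
    for _ in range(rounds):
        if high - low <= 1:
            break
        mid = (low + high) // 2
        if delivers(mid):
            low = mid
        else:
            high = mid
    return low


path_mtu = 1464  # hidden from the sender in this simulation
print(search_mtu(lambda size: size <= path_mtu))
# prints: 1464
```

Rerunning the search after a burst of losses would pick up a path change of the kind described earlier, at the cost of a few deliberately lost packets.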

Thoughts?

Brian Dickson

--
to unsubscribe send a message to rrg-request@psg.com with the
word 'unsubscribe' in a single line as the message text body.
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg


