
Re: [RRG] PMTUD, Sprite & IPTM; Outer src-addr = sending host's addr

To: Robin Whittle <rw@firstpr.com.au>
Subject: Re: [RRG] PMTUD, Sprite & IPTM; Outer src-addr = sending host's addr
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Sat, 24 Nov 2007 23:10:22 +0100
Cc: Routing Research Group list <rrg@psg.com>
In-reply-to: <474792F8.2060006@firstpr.com.au>
References: <4746809F.5020604@firstpr.com.au> <47005293-509C-4E35-8D06-4E0F69A514C1@muada.com> <474792F8.2060006@firstpr.com.au>

On 24 nov 2007, at 3:56, Robin Whittle wrote:

An ITR can't set its inbound MTU to the lowest value of any ETR it
might need to send packets to.

Hm, you're right: in a point-to-point tunnel this is trivial, but in a point-to-multipoint tunnel the MTU will be different for different destinations. This complicates matters, but I would still like to stick to the basic principle that the MTU the tunnel presents to encapsulated packets is a simple reflection of the path MTU discovered through the sending of tunneled packets.
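
To make the point-to-multipoint case concrete, here is a minimal Python sketch of a per-ETR MTU table. It is only an illustration: the class name, the 1500-byte starting guess, and the 20-byte encapsulation overhead (a bare IPv4 outer header) are assumptions, not part of any proposal discussed in this thread.

```python
ENCAP_OVERHEAD = 20      # assumed: bare IPv4 outer header, no options
DEFAULT_PATH_MTU = 1500  # assumed starting guess before anything is discovered


class TunnelMtuTable:
    """Hypothetical per-ETR table: the tunnel MTU mirrors the discovered path MTU."""

    def __init__(self):
        self._path_mtu = {}  # ETR address -> path MTU discovered for outer packets

    def record_path_mtu(self, etr, mtu):
        # Called when tunneled packets to this ETR reveal the path MTU.
        self._path_mtu[etr] = mtu

    def tunnel_mtu(self, etr):
        # MTU presented to encapsulated packets headed for this ETR.
        return self._path_mtu.get(etr, DEFAULT_PATH_MTU) - ENCAP_OVERHEAD


table = TunnelMtuTable()
table.record_path_mtu("etr-a", 1400)
print(table.tunnel_mtu("etr-a"), table.tunnel_mtu("etr-b"))
```

The only state is one number per destination, so the "simple reflection" principle survives; what changes is that the tunnel no longer has a single MTU.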

is different from what you wrote.  So is Fred's, I think.  My page
above contains a complete description of what I think is the
problem, and the best solution.  I would really appreciate you or
anyone else writing a detailed critique of this.

I'm sorry, but I'm not prepared to do that at this time. I've been discussing tunnel MTU issues for some time recently, and my conclusion is that you either need to have a rather involved scheme that is supported on both ends (hard for existing tunneling mechanisms but doable for something new) or it's necessary to reject a good number of seemingly reasonable use cases to keep things workable. There doesn't seem to be much useful middle ground here.

If you like, I can send you copies of the ~200 message exchange between a number of people that led up to Fred's sprite MTU proposal.

So I suggest sending it if it is shorter
than some assumed limit (eg. 1280) and fragmenting it if it is
longer - irrespective of whether its do not fragment bit is set.

That is a really bad solution, because this guarantees a good
amount of fragmenting.

As I point out in my proposal, fragmentation is only performed for
those initial long packets in a potential stream to an ETR which the
ITR hasn't sent packets to recently.

Obviously this would be a fraction of all packets under normal circumstances, but it would still mean that routine packets (i.e., 1500-byte ones) would be fragmented, which I don't like.

With IPv4, this is rather problematic because of the small
ID space.

This is only likely to be for a handful of packets, though I suppose
one could construct a worst-case scenario of a sudden burst of long
packets which would need to be fragmented until the ITR could figure
out the true MTU.

Right. So you'd probably have to be prepared to deal with this even though it wouldn't be an issue in practice most of the time.

I haven't looked closely at the fragmentation reassembly problems of
IPv4.  Can you point to some references?

I don't have any references, but in short, the issue is that you have a 16 bit ID space with a reassembly timeout of something like a few minutes. This means you can only send 65536 packets during that "few minute" window or you'll incorrectly reassemble fragments from different packets if you lose a fragment. This is especially problematic if the fragmented packets belong to a tunnel because in that case the IP source/dest addresses are always the same.
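
The arithmetic behind that limit can be sketched in a few lines of Python. The timeout values and the 1500-byte packet size below are assumptions chosen for illustration; the function name is hypothetical.

```python
ID_SPACE = 2 ** 16  # the IPv4 Identification field is 16 bits wide


def max_safe_rate(reassembly_timeout_s, packet_bytes=1500):
    """Return (packets/s, bits/s) above which IDs repeat inside the timeout."""
    pps = ID_SPACE / reassembly_timeout_s
    return pps, pps * packet_bytes * 8


# Assumed reassembly timeouts, in seconds.
for timeout in (30, 60, 120):
    pps, bps = max_safe_rate(timeout)
    print(f"timeout {timeout:>3}s: {pps:8.1f} packets/s, {bps / 1e6:6.2f} Mbit/s")
```

With a 120-second timeout the ceiling is about 546 fragmented packets per second between one source/destination pair, which is roughly 6.5 Mbit/s of 1500-byte packets. An ITR-to-ETR tunnel aggregates many flows under one address pair, so it could reach that rate easily.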

It also costs you lots of CPU and could even allow for CPU
exhaustion attacks.

Yes, but I think it is better than dropping longish packets just
because we assume some too low PMTU of 1280 or whatever, when in
fact, within a second or two, the ITR will probably be able to
establish that the real PMTU is 1500 or somewhat less.

Who said anything about preemptively dropping packets? Just send 1500-byte packets + an outer header with DF set and you'll get a "too big". After that, you know the path MTU and you can in turn send too bigs to the source of the original packets.
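
As a rough illustration of that sequence, the following Python sketch simulates an ITR sending encapsulated packets with DF set over a path of links. Everything here is hypothetical: the function names, the 20-byte outer header, and the list of link MTUs are assumptions, and a real ITR would keep this state per ETR and parse actual ICMP messages.

```python
OUTER_HEADER = 20  # assumed overhead: a bare IPv4 outer header, no options


def send_df(outer_size, link_mtus):
    """Simulate a DF packet crossing the path.

    Returns None if delivered, else the MTU a router would report
    in its "too big" message.
    """
    for mtu in link_mtus:
        if outer_size > mtu:
            return mtu
    return None


def itr_forward(inner_size, state, link_mtus):
    """Handle one inner packet at the ITR and describe the outcome."""
    inner_mtu = state.get("inner_mtu")
    if inner_mtu is not None and inner_size > inner_mtu:
        # Path MTU already known: answer the source directly.
        return f"too big to source, MTU {inner_mtu}"
    reported = send_df(inner_size + OUTER_HEADER, link_mtus)
    if reported is None:
        return "delivered"
    # First oversized packet is lost, but the ITR now knows the path MTU.
    state["inner_mtu"] = reported - OUTER_HEADER
    return f"dropped in path, learned inner MTU {state['inner_mtu']}"


state = {}
path = [1500, 1500, 1500]  # assumed link MTUs between ITR and ETR
for size in (1500, 1500, 1480):
    print(size, "->", itr_forward(size, state, path))
```

In this sketch the first 1500-byte packet is lost in the path, the second draws a "too big" from the ITR itself, and a 1480-byte packet then gets through. No packet is dropped preemptively on the basis of a guessed MTU.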

Yes, this allows for PMTUD black holes, but those are subject to the "so don't do that and the problem goes away" doctrine. ISPs generally get this, unlike enterprise people and ignorant consumers who can't live without their firewalls.

Still, we can predict that there will be such large packets early on
in many communications.  Simply dropping them doesn't seem right to
me.  Dropping them with a too-low PMTU value being sent to the
sending host would screw up that host's later packets, making them
shorter than they really need to be.  I think fragmenting them at
first is the best approach.

If we mandate that *TRs support 1500-byte user traffic without fragmentation this wouldn't be any issue in practice.

Later, if more such packets need to be sent, the ITR and ETR can
work on determining the real PMTU.  I do this with probe packets,
rather than traffic packets.

Even more overhead...

Yes.  However I can't see a way of probing the PMTU in any other
way.  ICMP can't be relied upon, and if I tried to use only traffic
packets, I would have to risk those packets not arriving.  Instead,
IPTM fragments the traffic packets and sends its own probe packets.
This means there is no fancy overhead in traffic packets - they are
not intended to be used for PMTUD at all.

I REALLY don't like this: generating signalling traffic when there is no data traffic is a very bad precedent. However, we probably need to probe for reachability in some way or another; if we can do the MTU stuff along with that, it may be tolerable.

Iljitsch

--
to unsubscribe send a message to rrg-request@psg.com with the
word 'unsubscribe' in a single line as the message text body.
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg
