
Re: [RRG] PMTUD, Sprite & IPTM; Outer src-addr = sending host's addr



Hi Iljitsch,

I will respond to what you wrote about MTUs, jumbo frames and ITR-ETR
placement in a separate message to follow.

You wrote, initially quoting me:

>>> Second, tunnel endpoints should simply implement two sides of
>>> path MTU discovery: they should discover the maximum packets
>>> they can successfully send to the other endpoint, and they
>>> should set their own inbound MTU to that value minus the size of
>>> the header that's added.
>
>> This is what Fred's and my approaches are trying to achieve.
>
> Doesn't look that way to me...

OK - I missed the part: "they should set their own inbound MTU to
that value ...".

An ITR can't set its inbound MTU to the lowest value of any ETR it
might need to send packets to.  My IPTM proposal:

   http://www.firstpr.com.au/ip/ivip/pmtud-frag/

is different from what you wrote.  So is Fred's, I think.  My page
above contains a complete description of what I think is the
problem, and the best solution.  I would really appreciate you or
anyone else writing a detailed critique of this.

>> However, when sending the first packet to an ETR, there is no
>> time for the ITR to muck around testing the PMTU limit before
>> sending the packet.  So I suggest sending it if it is shorter
>> than some assumed limit (eg. 1280) and fragmenting it if it is
>> longer - irrespective of whether its do not fragment bit is set.
>
> That is a really bad solution, because this guarantees a good
> amount of fragmenting.

As I point out in my proposal, fragmentation is only performed for
those initial long packets in a potential stream to an ETR to which
the ITR hasn't sent packets recently.  I think it is better to do
this than to drop the packet and send back a Packet Too Big message
to the sending host with a too-low (assumed, not tested) PMTU value,
which would screw up that host's attempt to find the MTU of the
path to the destination.  The final PMTU is likely to be pretty
close to 1500 once the ITR has probed the ETR.
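
To make the first-packet rule concrete, here is a minimal sketch.  It
is only an illustration, not text from the IPTM page: the names
ASSUMED_LIMIT, ENCAP_OVERHEAD and first_packet_action are made up
here, the 20 byte overhead assumes a bare IPv4 outer header, and
comparing the encapsulated length (rather than the inner length)
against the limit is my assumption.

```python
ASSUMED_LIMIT = 1280   # assumed-safe size before any probing (the "eg. 1280" above)
ENCAP_OVERHEAD = 20    # hypothetical: bare IPv4 outer header, no extra tunnel header


def first_packet_action(inner_len):
    """What the ITR does with a traffic packet while the PMTU to the
    ETR has not been probed yet.  The DF bit is deliberately not an
    input: long packets are fragmented whether or not it is set."""
    outer_len = inner_len + ENCAP_OVERHEAD
    if outer_len <= ASSUMED_LIMIT:
        return "send"        # short enough: encapsulate and send whole
    return "fragment"        # too long for the assumed limit: fragment, don't drop


print(first_packet_action(576))    # a short packet goes out whole
print(first_packet_action(1480))   # a long one is fragmented for now
```

Nothing is dropped and no Packet Too Big message is generated at this
stage, so the sending host never sees the assumed value.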

> With IPv4, this is rather problematic because of the small
> ID space.

This is only likely to affect a handful of packets, though I suppose
one could construct a worst-case scenario of a sudden burst of long
packets which would need to be fragmented until the ITR could figure
out the true MTU.

I haven't looked closely at the fragmentation reassembly problems of
IPv4.  Can you point to some references?


> It also costs you lots of CPU and could even allow for CPU
> exhaustion attacks.

Yes, but I think it is better than dropping longish packets just
because we assume some too-low PMTU of 1280 or whatever, when in
fact, within a second or two, the ITR will probably be able to
establish that the real PMTU is 1500 or somewhat less.  Then the ITR
can switch over to dropping packets which need to go to that ETR
when they are longer than this value after encapsulation, sending a
Packet Too Big message with the correct PMTU value the sending host
should assume.
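
A rough sketch of that second phase, once the ITR has a probed PMTU
for the ETR.  Again this is only an illustration: ENCAP_OVERHEAD and
action_after_probing are hypothetical names, and the 20 byte overhead
assumes a bare IPv4 outer header.

```python
ENCAP_OVERHEAD = 20    # hypothetical: bare IPv4 outer header


def action_after_probing(inner_len, probed_pmtu):
    """What the ITR does with a traffic packet once the PMTU to the
    ETR is known.  Returns (action, mtu_to_report)."""
    if inner_len + ENCAP_OVERHEAD <= probed_pmtu:
        return ("send", None)
    # The sending host must be told the largest inner packet that
    # still fits after encapsulation, not the tunnel PMTU itself.
    return ("drop_and_send_ptb", probed_pmtu - ENCAP_OVERHEAD)


print(action_after_probing(1480, 1500))   # fits exactly after encapsulation
print(action_after_probing(1500, 1500))   # too long: drop, report 1480
```

The value reported to the sending host is the probed PMTU minus the
encapsulation overhead, so its later packets fit after encapsulation
without being shorter than they need to be.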

I am just repeating what is on my web page.  This encapsulation,
PMTUD and fragmentation stuff is really difficult.  I would really
appreciate detailed critiques of Fred's and my proposals - rather
than critiques based on too quick a read of what we wrote.

> A few lost packets here or there because of PMTUD aren't the end
> of the world; just keep things simple.

I agree to some extent.  I don't think it is acceptable for an ITR
to generally drop all first packets to an ETR, since they are
typically short DNS, TCP open or other initial packets, and dropping
them would delay the establishment of a communication for seconds.
But I guess it is a different class of packets we are discussing
here.

They would be the first substantial data carrying packets in the
flow, since the earlier, shorter ones would have been ignored by
Sprite or IPTM.

Still, we can predict that there will be such large packets early on
in many communications.  Simply dropping them doesn't seem right to
me.  Dropping them with a too-low PMTU value being sent to the
sending host would screw up that host's later packets, making them
shorter than they really need to be.  I think fragmenting them at
first is the best approach.

>> Later, if more such packets need to be sent, the ITR and ETR can
>> work on determining the real PMTU.  I do this with probe packets,
>> rather than traffic packets.
>
> Even more overhead...

Yes.  However, I can't see any other way of probing the PMTU.  ICMP
can't be relied upon, and if I tried to use only traffic packets, I
would have to risk those packets not arriving.  Instead, IPTM
fragments the traffic packets and sends its own probe packets.  This
means there is no fancy overhead in traffic packets - they are not
intended to be used for PMTUD at all.
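
One way probing with separate probe packets could be organised is a
binary search over probe sizes.  This is only a sketch under my own
assumptions and not what the IPTM page specifies: probe_pmtu is a
made-up name, and probe_arrives stands in for "send a probe of this
size and see whether the ETR acknowledges it".  It also assumes a
probe of size low always gets through.

```python
def probe_pmtu(probe_arrives, low=1280, high=1500):
    """Return the largest size in [low, high] for which
    probe_arrives(size) is True, assuming probe_arrives(low) holds."""
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always makes progress
        if probe_arrives(mid):
            low = mid                 # a probe of this size reached the ETR
        else:
            high = mid - 1            # probe lost: the limit is below mid
    return low


# Stand-in for a real path whose limit is 1472 bytes.
print(probe_pmtu(lambda size: size <= 1472))
```

Only probe packets are put at risk here.  Traffic packets keep being
fragmented until the search has finished.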

  - Robin


--
to unsubscribe send a message to rrg-request@psg.com with the
word 'unsubscribe' in a single line as the message text body.
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg