
Re: mech-v2: static MTU scope; feedback sought



Pekka,

See below for my thoughts on this:

Pekka Savola wrote:

> Hi,
>
> There's still one (the most difficult, IMO) issue in the transmech-v2
> update, triggered by Dave Thaler's comment:
>
> http://ops.ietf.org/lists/v6ops/v6ops.2003/msg01717.html
>
> That is, a node with a v6-in-v4 tunnel, a v4 interface, and a v6-in-v6
> MIPv6 tunnel: using the default MTU of 1280 on the v6-in-v4 tunnel
> causes v6 fragmentation on the v6-in-v6 tunnel for packets larger than
> 1240 bytes.
>
> On the other hand, there have been comments to the contrary -- that a
> mismatched MTU/MRU is a bad thing, as pointed out by itojun. I remember
> the message but have been unable to find the one referred to in:
>
> http://ops.ietf.org/lists/v6ops/v6ops.2003/msg01718.html
>
> Now, I'd like to move on here. I personally don't see much of a problem
> with mismatched MRU/MTU values as long as they are below 1500, but I'd
> like to see whether any evidence indicates otherwise.


I agree that it is OK to have the MRU slightly larger than the MTU. Suppose
the tunnel interface uses the IPv6 minimum MTU of 1280 bytes, but it is
configured over an end-to-end L2 VPN interface. The VPN interface will insert
extra bytes for security headers, and the far end of the tunnel will require
an MRU > 1280 bytes on the physical interface(s) that might receive the
packet. Steve Deering gave a good analysis supporting 1500 bytes as the MRU
for the physical interfaces at the far end of the tunnel in a November 1997
message to the IPng list. See:

http://www.cs-ipv6.lancs.ac.uk/ipv6/mail-archive/IPng/1997-12/0052.html
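The overhead arithmetic behind Dave's scenario is worth making explicit. A minimal sketch (my own illustration, not anything from the draft) of how one level of v6-in-v6 encapsulation eats into the 1280-byte default tunnel MTU:

```python
# Sketch: each v6-in-v6 encapsulation adds one 40-byte IPv6 header,
# so inner packets must be 40 bytes smaller to fit without fragmentation.

IPV6_HDR = 40          # bytes added per v6-in-v6 encapsulation
TUNNEL_MTU = 1280      # static default MTU on the v6-in-v4 tunnel

def inner_mtu(outer_mtu, *overheads):
    """Largest inner packet that fits through the tunnel after the
    given encapsulation overheads are added."""
    return outer_mtu - sum(overheads)

# A MIPv6 v6-in-v6 tunnel running over the 1280-byte configured tunnel:
print(inner_mtu(TUNNEL_MTU, IPV6_HDR))   # 1240 -- larger packets fragment
```

The same arithmetic, run in the other direction, is why the far end's MRU must exceed the nominal tunnel MTU when a VPN layer adds its own headers.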

> In Dave's scenario, the obvious easy fix is to use dynamic MTU
> determination instead of the static one, but this avoids the real
> question of when it would be OK to raise the MTU with static MTU
> configuration.
>
> Moreover, I believe the static MTU case must also be able to react to
> received ICMPv6 Packet Too Big messages (e.g., you configure the
> interface MTU to 1500, but someone sends you a message that 1280 is
> the maximum -- do you have to react to it?). I assume yes, but I guess
> someone might argue that this "very minimal v6 PMTU detection
> mechanism" would be out of scope.
>
> So, to sum up my questions here, asking for feedback:
>
> - do you have references/evidence of problems caused by mismatched
>   MTU/MRU that should be considered here?


Another reference, perhaps, would be PPP where (I believe) it is recommended to use a slightly larger MRU than the MTU. I am interested in operational experience others may have, however.

> - MUST a static tunnel MTU case also implement an algorithm to react
>   to received ICMPv6 Packet Too Big messages, e.g. to lower the MTU?
>   (or is this really a subset of dynamic tunnel MTU?)  RFC 2463
>   section 3.2 seems to give the impression that this would have to be
>   done, but I'm not sure.



Well, the text you are citing says only that "An incoming Packet Too Big message MUST be passed to the upper-layer process"; I don't see anything there about changing the tunnel MTU based on ICMPv6 messages.
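For concreteness, the "very minimal v6 PMTU detection" under debate amounts to something like the following sketch. This is purely hypothetical driver logic of my own (class and method names are mine, not from any implementation): a statically configured tunnel that still clamps its MTU downward, but never below the IPv6 minimum link MTU.

```python
# Hypothetical sketch of a static-MTU tunnel reacting to Packet Too Big.

IPV6_MIN_MTU = 1280

class TunnelIf:
    def __init__(self, configured_mtu):
        self.mtu = configured_mtu        # statically configured, e.g. 1500

    def on_packet_too_big(self, reported_mtu):
        # Clamp downward only; never below the IPv6 minimum link MTU,
        # since reports smaller than 1280 must be handled by IPv6
        # fragmentation at the tunnel endpoint instead.
        new_mtu = max(reported_mtu, IPV6_MIN_MTU)
        if new_mtu < self.mtu:
            self.mtu = new_mtu

tun = TunnelIf(1500)
tun.on_packet_too_big(1280)
print(tun.mtu)   # 1280
```

Whether this counts as "static MTU plus a MUST" or as a degenerate case of dynamic tunnel MTU is exactly the open question.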

> - if people agree with this MUST, and no strong evidence of MTU/MRU
>   mismatch problems is found, I believe the "MUST default to 1280"
>   might be too strong -- the worst that could happen is a) a blackhole,
>   if a router is able to forward a packet but the on-link recipient
>   cannot receive it, or b) v4 fragmentation, if too high a value is
>   picked (bad).



My concern here is for nodes with long, thin (i.e., slow) physical links. Suppose a tunnel interface is configured over a physical interface that takes the recommendations of BCP 48 and configures a small IPv4 MTU, e.g., 296 bytes. The tunnel interface will still have to configure at least the minimum IPv6 MTU of 1280 bytes, but the 296-byte IPv4 MTU of the underlying physical interface will be opaque to the upper layers. In this case, two possibilities exist when the tunnel interface sends encapsulated packets larger than 296 bytes:

 1) The tunnel interface sends encapsulated packets with the DF bit
   NOT set in the IPv4 header, and the IPv4 stack performs intentional
   IPv4 fragmentation to a maximum fragment size of 296 bytes.

 2) The tunnel interface sends encapsulated packets with the DF bit
   SET, and the IPv4 stack sends back a locally-generated ICMPv4
   "fragmentation needed" message with MTU = 296.

My thoughts on 1) are that any host-based IPv4 fragmentation should
produce a controlled and minimal number of IPv4 fragments and should
not severely impact performance. This is quite different from unmitigated
IPv4 fragmentation caused by middleboxes in the network, since there is
no way of knowing whether the middleboxes will incur slow-path
processing or otherwise experience performance degradation. On the
other hand, I have heard that there are devices such as security gateways,
NATs, etc. that drop incoming IPv4 fragments (perhaps only sending
the first fragment on to the final destination), i.e., black holes may
result. What are the operational experiences with this?
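To see why the number of host-generated fragments stays controlled, here is a rough sketch (my own arithmetic, not from any spec text) of possibility 1) for a minimum-sized 1280-byte IPv6 packet over the 296-byte BCP 48 link:

```python
# Sketch: count the IPv4 fragments produced when a tunnel encapsulates
# a 1280-byte IPv6 packet over a link with a 296-byte IPv4 MTU.

import math

IPV4_HDR = 20   # basic IPv4 header, no options

def ipv4_fragments(payload_len, link_mtu):
    """Number of IPv4 fragments: each non-final fragment carries a
    payload that must be a multiple of 8 bytes."""
    per_frag = (link_mtu - IPV4_HDR) // 8 * 8   # 272 bytes for MTU 296
    return math.ceil(payload_len / per_frag)

print(ipv4_fragments(1280, 296))   # 5 fragments
```

Five fragments per maximum-sized packet is the "controlled and minimal" cost; the open question above is whether middleboxes let those fragments through at all.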

My thoughts on 2) are that the locally-generated ICMPv4 "fragmentation
needed" messages will be passed to the configured tunnel device driver
in the kernel, but what actions should be taken? Should the driver:

 a) reduce the configured tunnel's MTU to 296 - 20 = 276 bytes (i.e.,
    less than the IPv6 minimum MTU)?
 b) keep the MTU in, e.g., a neighbor cache entry to provide a maximum
    fragment size for IPv6 fragmentation?
 c) translate the ICMPv4 "fragmentation needed" into an ICMPv6 "packet
    too big" and pass it up to the application?
 d) other?

I can imagine either a) or b) as acceptable answers. I have a hard time
believing c) could be good, because it might cause TCPs to configure
a too-small MSS. But, I'd like to hear operational experiences and/or
other alternatives I may be missing.
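To make option b) concrete, here is a hypothetical sketch of my own (the cache and function names are illustrative, not from any implementation): the reported IPv4 MTU is remembered per tunnel endpoint as a maximum fragment size, while the tunnel keeps advertising the IPv6 minimum MTU to upper layers.

```python
# Hypothetical sketch of option b): remember the ICMPv4-reported MTU in
# a per-endpoint cache instead of lowering the tunnel's advertised MTU.

IPV4_HDR = 20
IPV6_MIN_MTU = 1280

frag_size_cache = {}   # tunnel endpoint IPv4 address -> max fragment size

def on_frag_needed(endpoint, reported_ipv4_mtu):
    # Option a) would instead clamp the tunnel MTU itself (here to 276,
    # below the IPv6 minimum); option b) only records the value for use
    # as a fragmentation limit on packets to this endpoint.
    frag_size_cache[endpoint] = reported_ipv4_mtu - IPV4_HDR

def tunnel_mtu(endpoint):
    # Upper layers (and TCP MSS negotiation) keep seeing at least the
    # IPv6 minimum MTU, avoiding the too-small-MSS problem of option c).
    return IPV6_MIN_MTU

on_frag_needed("192.0.2.1", 296)
print(frag_size_cache["192.0.2.1"], tunnel_mtu("192.0.2.1"))   # 276 1280
```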

> - do you think it would be OK to recommend dynamic MTU instead in the
>   double encapsulation scenario described by Dave?


Possibly; the only concern is trust in ICMPv4 "fragmentation needed" messages generated by middleboxes. Perhaps it would be OK if the ICMPv4s were accepted only from the localhost and from the far end of the tunnel?

Fred
ftemplin@iprg.nokia.com