Re: transmech MTU comments
Erik,
I would like to discuss some aspects of the current text in [MECH],
section 3:
3.2.1. Static Tunnel MTU

A node using a static tunnel MTU MUST limit the size of the IPv6
packets it tunnels to 1280 bytes i.e., treat the tunnel interface as
having a fixed interface MTU of 1280 bytes. An implementation MAY
have a configuration knob which can be used to set a larger value of
the tunnel MTU than 1280 bytes, but if so the default MUST be 1280
bytes. A larger fixed MTU should not be configured unless it has
been administratively ensured that the decapsulator can reassemble
packets of that size. Care should be taken when manually configuring
large tunnel MTUs to only do so when the MTU of the IPv4 path to the
tunnel endpoint is large to avoid causing excessive fragmentation.

When using the static tunnel MTU the Don't Fragment bit MUST NOT be
set in the encapsulating IPv4 header. As a result the encapsulator
should not receive any ICMPv4 "packet too big" message as a result of
the packets it has encapsulated.

The last paragraph above implies one of two things: either every link
on the path has an MTU at least as large as the static tunnel MTU, or
nodes with constricting links will use IPv4 fragmentation to split the
packet into pieces small enough to traverse the constricting link. The
former will not be true in general, because certain bandwidth-constrained
links will choose IPv4 interface MTUs smaller than 1280 bytes if the
recommendations in BCP 48, 50, and 71 are followed. Nor can we be
assured that all forwarding nodes correctly implement IPv4 fragmentation.
So we have a very real possibility of black holes here.
3.2.2. Dynamic Tunnel MTU

The dynamic MTU determination is OPTIONAL. However, if it is
implemented, it SHOULD have the behavior described in this document.

The fragmentation inside the tunnel can be reduced to a minimum by
having the encapsulator track the IPv4 Path MTU across the tunnel,
using the IPv4 Path MTU Discovery Protocol [RFC1191] and recording
the resulting path MTU. The IPv6 layer in the encapsulator can then
view a tunnel as a link layer with an MTU equal to the IPv4 path MTU,
minus the size of the encapsulating IPv4 header.

Note that this does not eliminate IPv4 fragmentation in the case when
the IPv4 path MTU would result in an IPv6 MTU less than 1280 bytes.
(Any link layer used by IPv6 has to have an MTU of at least 1280
bytes [RFC2460].) In this case the IPv6 layer has to "see" a link
layer with an MTU of 1280 bytes and the encapsulator has to use IPv4
fragmentation in order to forward the 1280 byte IPv6 packets.

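
As a concrete illustration of the two quoted paragraphs above, here is a
minimal Python sketch of the arithmetic. The function and constant names
are hypothetical, and the 20-byte figure assumes an encapsulating IPv4
header without options, matching the "- 20" used in the draft's algorithm.

```python
IPV4_HEADER_LEN = 20       # assumed: IPv4 header with no options
IPV6_MIN_LINK_MTU = 1280   # minimum link MTU for IPv6 [RFC2460]


def tunnel_ipv6_mtu(ipv4_path_mtu: int) -> int:
    """MTU that the IPv6 layer "sees" for the tunnel (hypothetical helper).

    It is the IPv4 path MTU minus the encapsulating header, but never
    less than 1280 bytes.
    """
    return max(ipv4_path_mtu - IPV4_HEADER_LEN, IPV6_MIN_LINK_MTU)


def needs_ipv4_fragmentation(ipv4_path_mtu: int, ipv6_packet_len: int) -> bool:
    """True when the encapsulated packet does not fit the IPv4 path MTU."""
    return ipv6_packet_len + IPV4_HEADER_LEN > ipv4_path_mtu
```

With an IPv4 path MTU of 576, for example, the IPv6 layer still sees 1280
bytes, so a 1280-byte IPv6 packet has to be fragmented at the IPv4 layer.
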
But, shouldn't the encapsulator send a "packet too big" to the source
in this case even if the MTU it reports is less than 1280 bytes? In
response, the source should then include a fragment header in the
packets it sends as a signal to the encapsulator that IPv4
fragmentation is permissible (see RFC 2460, section 5). More
discussion on this below:
The encapsulator SHOULD employ the following algorithm to determine
when to forward an IPv6 packet that is larger than the tunnel's path
MTU using IPv4 fragmentation, and when to return an IPv6 ICMP "packet
too big" message per [RFC1981]:
    if (IPv4 path MTU - 20) is less than 1280
        if packet is larger than 1280 bytes
            Send IPv6 ICMP "packet too big" with MTU = 1280.
            Drop packet.
        else
            Encapsulate but do not set the Don't Fragment
            flag in the IPv4 header. The resulting IPv4
            packet might be fragmented by the IPv4 layer on
            the encapsulator or by some router along
            the IPv4 path.
        endif
I believe the above "else" case should be re-worded as follows:
        else
            if packet does not contain a fragment header
                Send IPv6 ICMP "packet too big" with MTU
                = (IPv4 path MTU - 20). Drop packet.
            else
                Encapsulate and fragment the packet using IPv4
                fragmentation with a maximum fragment size
                of (IPv4 path MTU - 20). The lower 16 bits of
                the Identification field in the fragment header
                are used as the Identification field for each IPv4
                fragment header, and the Don't Fragment field
                is not set.
            endif
        endif
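
To make the proposed branch easier to check, here is a minimal Python
sketch that transcribes the wording above literally. It covers only the
case where (IPv4 path MTU - 20) is less than 1280. The names `Decision`
and `decide` are hypothetical, and passing `fragment_id=None` is my
stand-in for "packet does not contain a fragment header".

```python
from dataclasses import dataclass
from typing import Optional

IPV4_HEADER_LEN = 20   # the "- 20" in the algorithm
IPV6_MIN_MTU = 1280


@dataclass
class Decision:
    action: str                              # "packet_too_big" or "fragment"
    reported_mtu: Optional[int] = None       # MTU carried in the ICMPv6 error
    max_fragment_size: Optional[int] = None  # limit used for IPv4 fragmentation
    ipv4_id: Optional[int] = None            # IPv4 Identification value
    dont_fragment: bool = False              # DF is never set in this branch


def decide(ipv4_path_mtu: int, packet_len: int,
           fragment_id: Optional[int] = None) -> Decision:
    """Apply the proposed rules to one IPv6 packet (illustrative only)."""
    limit = ipv4_path_mtu - IPV4_HEADER_LEN
    if limit >= IPV6_MIN_MTU:
        raise ValueError("only the (IPv4 path MTU - 20) < 1280 case is sketched")
    if packet_len > IPV6_MIN_MTU:
        # Unchanged from the draft: report 1280 and drop.
        return Decision("packet_too_big", reported_mtu=IPV6_MIN_MTU)
    if fragment_id is None:
        # Proposed: report the real (sub-1280) MTU and drop.
        return Decision("packet_too_big", reported_mtu=limit)
    # Proposed: the encapsulator fragments, reusing the low 16 bits of the
    # IPv6 fragment header Identification as the IPv4 Identification.
    return Decision("fragment", max_fragment_size=limit,
                    ipv4_id=fragment_id & 0xFFFF, dont_fragment=False)
```

For example, `decide(576, 1000)` yields a "packet too big" reporting 556,
while `decide(576, 1000, fragment_id=0x12345678)` yields a "fragment"
decision with IPv4 Identification 0x5678.
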
First, sending the "packet too big" with an MTU size less than 1280
seems to me to be consistent with the expectation specified in
RFC 2460, section 5. This is what an "IPv6-to-IPv4 translator" is
supposed to do, and from the perspective of the original IPv6 host
it makes no difference whether the node that sends the "packet too
big" is a translator or an IPv6-in-IPv4 tunnel endpoint.
As to fragmenting the packet in the encapsulator instead of
just sending it with the DF bit not set: the encapsulator has no
way of knowing whether there are forwarding nodes in the IPv4
path with broken, non-existent, or slow-path IPv4 fragmentation
implementations, so the only safe option is for the tunnel
encapsulator itself to do the fragmentation.
As to the setting of the fragment ID field, my suggested text
above reflects my best understanding of the normative references,
but I believe we have the following problem. What if the original
IPv6 source wanted to do host-based IPv6 fragmentation (e.g., for
large UDP packets) even though the IPv6 path MTU was less than 1280
bytes?
The source would send a series of N IPv6 fragments, each of which
would have the same value in the fragment ID field. But then, the
tunnel encapsulator would use IPv4 fragmentation to split each of the
N IPv6 fragments into M IPv4 fragments again using the *same*
fragment ID value! We would then have a collision in the decapsulator's
IPv4 reassembly buffer, since there would be no way of knowing to
which one of the N IPv6 fragments a particular IPv4 fragment belonged!
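
The collision can be shown with a small Python sketch. It assumes the
decapsulator keys its IPv4 reassembly buffers on (source, destination,
protocol, Identification), as RFC 791 describes; the address strings and
function names are hypothetical placeholders.

```python
from collections import defaultdict

IPPROTO_IPV6 = 41  # IPv4 protocol number for encapsulated IPv6


def ipv4_fragments(ipv6_fragment_index: int, ipv6_frag_id: int, m: int):
    """Split one IPv6 fragment into m IPv4 fragments.

    Returns (reassembly_key, origin) pairs, where origin records which
    IPv6 fragment and which piece each IPv4 fragment really came from.
    """
    ipv4_id = ipv6_frag_id & 0xFFFF  # lower 16 bits, per the suggested text
    key = ("encapsulator", "decapsulator", IPPROTO_IPV6, ipv4_id)
    return [(key, (ipv6_fragment_index, piece)) for piece in range(m)]


def reassembly_buffers(n: int, m: int, ipv6_frag_id: int):
    """Group all N*M IPv4 fragments the way the decapsulator would."""
    buffers = defaultdict(list)
    for i in range(n):  # all N IPv6 fragments carry the same fragment ID
        for key, origin in ipv4_fragments(i, ipv6_frag_id, m):
            buffers[key].append(origin)
    return buffers


buffers = reassembly_buffers(n=3, m=2, ipv6_frag_id=0xABCD1234)
```

All six IPv4 fragments land under a single reassembly key, although they
belong to three different IPv6 fragments.
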
So, either my understanding of the normative references is wrong,
or the normative references themselves are wrong. Can you help?
Fred
ftemplin@iprg.nokia.com