Fred Baker wrote:
On May 26, 2006, at 1:21 AM, Elwyn Davies wrote:
I would suggest that rather than making this into a separate draft we can look at improving the text in the security overview (if necessary) given that we are probably going to have to do *another* round on this one.
That seems rational to me.
Me too. As Pekka said any normative changes to the IPv6 protocol would probably have to go elsewhere anyway.
A quick comment:
- no overlapping fragments and
Overlapping fragments is the main problem IMO. I would have liked this to be banned so that implementations can just drop such packets. Of course some hosts might still accept them, and e.g. a firewall would need to keep some state to detect overlap.
- non-last fragments to be close to the guaranteed minimum MTU.
I think the latter point wants to read "non-last fragments close to the PATH MTU, and therefore at least as large as the largest fragment size less than or equal in size to the minimum MTU."
The draft doesn't talk about overlap, but talks about this latter point. Unless overlap is forbidden, a minimum size alone is not sufficient. You might have an initial fragment containing most of the datagram (of sufficient size), and then have a second overlapping one that overwrites parts of the header.
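To make the overlap check concrete, here is a minimal Python sketch of what a receiver or firewall might do. It assumes each fragment is described by a hypothetical (offset, length) pair in bytes; the function name and representation are illustrative only and not taken from any draft or implementation.

```python
def has_overlap(fragments):
    """Return True if any two (offset, length) byte ranges overlap.

    `fragments` is an iterable of (offset, length) tuples; this
    representation is an assumption made for illustration.
    """
    end = 0  # highest byte position covered so far
    for offset, length in sorted(fragments):
        if offset < end:
            # This fragment starts inside data already covered.
            return True
        end = max(end, offset + length)
    return False
```

With this check, a large first fragment followed by a small one that rewrites bytes inside it, e.g. `[(0, 1232), (8, 64)]`, is flagged, while contiguous fragments such as `[(0, 1232), (1232, 200)]` are not. Note that the check needs the offsets of all fragments seen so far, which is the state a firewall would have to keep.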
Whatever minimum size we set, one may add so many extension headers that e.g. the transport header does not go into the first fragment anyway.
If a minimum fragment size is specified, then I think it should be about half the minimum MTU. Rather than using the MTU for all fragments but the last, it might make sense to distribute the data evenly so that all fragments are roughly the same size. See below.
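As a rough illustration of the even-split idea, the following Python sketch computes fragment payload sizes that use the fewest fragments possible while keeping them roughly equal. The function name and parameters are hypothetical. It assumes that non-last fragments must carry a multiple of 8 octets (the IPv6 fragment offset is counted in 8-octet units) and that `max_payload` is the room left for fragment data after the headers.

```python
def even_fragment_sizes(payload_len, max_payload):
    """Split payload_len bytes into the fewest fragments, evenly sized.

    Non-last fragments are multiples of 8 octets and never exceed
    max_payload. Both parameter names are illustrative assumptions.
    """
    unit = 8
    cap = max_payload - max_payload % unit  # largest 8-aligned size
    if payload_len <= 0 or cap <= 0:
        raise ValueError("payload_len and max_payload must allow data")
    count = -(-payload_len // cap)    # fewest fragments (ceiling division)
    units = -(-payload_len // unit)   # total 8-octet units, last may be partial
    base, extra = divmod(units, count)
    # The first `extra` fragments carry one more 8-octet unit than the rest.
    sizes = [(base + (1 if i < extra else 0)) * unit for i in range(count)]
    sizes[-1] -= units * unit - payload_len  # remove padding from the last one
    return sizes
```

For example, with 1232 bytes of room per fragment (the 1280-byte minimum MTU less a 40-byte IPv6 header and an 8-byte fragment header, assuming no other extension headers), `even_fragment_sizes(2500, 1232)` gives `[840, 832, 828]`, whereas filling every fragment but the last to the MTU would give 1232, 1232 and 36 bytes. The fragment count is the same in both cases.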
There is a discussion of fragmentation and reassembly in RFC 1812 section 4.2.2.7 that may be useful to reference, or at least learn lessons from. In part, this results from the behavior of NIC cards in the late 1980s that couldn't reliably receive datagrams back to back for very long due to chip or memory issues, and in part it is due to brain-dead behaviors in early end-station OSs. One issue was that many systems sent packets at the minimum MTU size (then 576 bytes, derived from the memory structure of the Fuzzball and BBN Routers) rather than at the path MTU, which meant that there were greater opportunities for per-datagram errors because there were more of them. There were also various strategies on how to fragment - should the first datagram be the smallest, the last, or should the fragmenting system try to make them all approximately the same size? It makes some specific recommendations:
4.2.2.7 Fragmentation: RFC 791 Section 3.2
Fragmentation, as described in [INTERNET:1], MUST be supported by a router.
When a router fragments an IP datagram, it SHOULD minimize the number of fragments. When a router fragments an IP datagram, it SHOULD send the fragments in order. A fragmentation method that may generate one IP fragment that is significantly smaller than the other MAY cause the first IP fragment to be the smaller one.
DISCUSSION
There are several fragmentation techniques in common use in the Internet. One involves splitting the IP datagram into IP fragments with the first being MTU sized, and the others being approximately the same size, smaller than the MTU. The reason for this is twofold. The first IP fragment in the sequence will be the effective MTU of the current path between the hosts, and the following IP fragments are sized to minimize the further fragmentation of the IP datagram. Another technique is to split the IP datagram into MTU sized IP fragments, with the last fragment being the only one smaller, as described in [INTERNET:1].
A common trick used by some implementations of TCP/IP is to fragment an IP datagram into IP fragments that are no larger than 576 bytes when the IP datagram is to travel through a router. This is intended to allow the resulting IP fragments to pass the rest of the path without further fragmentation. This would, though, create more of a load on the destination host, since it would have a larger number of IP fragments to reassemble into one IP datagram. It would also not be efficient on networks where the MTU only changes once and stays much larger than 576 bytes. Examples include LAN networks such as an IEEE 802.5 network with a MTU of 2048 or an Ethernet network with an MTU of 1500.
One other fragmentation technique discussed was splitting the IP datagram into approximately equal sized IP fragments, with the size less than or equal to the next hop network's MTU. This is intended to minimize the number of fragments that would result from additional fragmentation further down the path, and assure equal delay for each fragment.
This latter point is what I'm referring to above.
Also note that as mentioned above, some implementations send fragments out of order (e.g. Linux has been known to do this). The reason is that you don't know the total size of the datagram until you receive the last fragment. Receiving the last fragment first means that you can allocate the right amount of memory immediately.
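The allocation argument can be sketched in a few lines of Python. The class below is purely hypothetical and does not model any real stack. It grows its buffer whenever a fragment lands beyond the current end, and it learns the total size only from the fragment whose "more fragments" flag is clear. It does not check for completeness or overlap.

```python
class ReassemblyBuffer:
    """Toy reassembly buffer that counts how often it has to grow."""

    def __init__(self):
        self.total = None       # datagram size, unknown until the last fragment
        self.buf = bytearray()
        self.resizes = 0        # number of times the buffer had to grow

    def add(self, offset, data, more_fragments):
        end = offset + len(data)
        if not more_fragments:
            # Only the last fragment reveals the total size.
            self.total = end
        needed = self.total if self.total is not None else end
        if len(self.buf) < needed:
            self.buf.extend(bytes(needed - len(self.buf)))
            self.resizes += 1
        self.buf[offset:end] = data
```

Feeding three fragments in order grows the buffer three times in this sketch, while feeding the last fragment first sizes it once and the remaining fragments are simply copied into place.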
Stig
Routers SHOULD generate the least possible number of IP fragments. Work with slow machines leads us to believe that if it is necessary to fragment messages, sending the small IP fragment first maximizes the chance of a host with a slow interface of receiving all the fragments.
I think the NIC card issue (where should the smallest fragment be?) is historical and not to be worried about. But the matter of generating (by whatever algorithm) the least number of fragments that can represent a transport-level message is, I think, important.