A while ago I wrote a few articles on how to interpret fragmented packets in Wireshark (part 1 and part 2). Those articles took the fact that fragmentation exists for granted, so a reader asked me on LinkedIn: why should I care about fragmentation?

Some engineers are quite lucky and have never had to troubleshoot MTU and fragmentation issues, perhaps thanks to good design or simply because the devices coped with it. It can be a tricky problem to find, especially when all you have to go on is end-users complaining about poor performance from time to time.

So why is fragmentation undesirable?

First of all, it adds transmission overhead, effectively reducing the efficiency of the communication (throughput is finite, and increasing the amount of encapsulation decreases the actual useful data rate). When a packet is fragmented, each fragment needs its own L2 and L3 headers so that it can be transported to its destination.

There's another type of overhead as well: the device doing the fragmenting uses extra CPU and memory, as does the destination of the traffic, which has to buffer all fragments until it can reassemble the complete packet. If fragments are lost in transit, those resources might be tied up unnecessarily for a while.

But these are not such huge issues: fragmenting a packet does not require any new information and can be done efficiently by a router. Similarly, reassembly is generally done by an end-point (client or server), which in theory has enough processing power and memory at its disposal.

If reassembly has to be done by a router, things change, as it interferes with the router's primary goal: moving packets quickly and efficiently. The router has to store the fragments in its buffers (Cisco mentions that it takes the biggest buffer available, since it doesn't know the total size of the fragmented packet), tying up shared resources.

Fragments lost in transit can cause an even bigger drop in efficiency than described above, because when a single fragment is lost the whole original packet needs to be retransmitted (retransmission being functionality that lives at L4 or L7, depending on the application).

Let's do some maths

Picture a 10000-byte packet that needs to be sent over a link with a 1500B MTU. It will be fragmented into at least 7 packets (it could be worse, as seen in IP Fragmentation in Wireshark). Taking the unfragmented packet as the 100% mark (total wire size with L2 and L3 headers: 10000+20+18 = 10038B), this fragmentation results in a total of 10266B (10000+7*20+7*18) transmitted, which is about 97.8% as efficient. Not the end of the world, right? Right.

If any one of those 7 fragments gets lost in transit, the whole 10000B needs to be retransmitted, resulting in another 7 packets. That takes our efficiency down to about 48.9%. Ouch.

And this is the optimal fragmentation scenario; it can get even worse if en route to the destination there's a stretch of network that only does 1450B instead of 1500B. That means every full-size 1480B fragment gets split in two again, taking us from 7 fragments to 13 (only the smaller final fragment still fits), as the sketch below shows.
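
If you want to play with the numbers, here's a quick back-of-the-envelope sketch in Python of the calculation above, assuming a 20B IPv4 header (no options), 18B of Ethernet framing (header plus FCS), and fragment payloads aligned to 8 bytes as the offset field requires:

```python
import math

IP_HDR = 20        # IPv4 header, no options
ETH_OVERHEAD = 18  # Ethernet header + FCS

def fragment_count(payload, mtu):
    """Fragments needed for `payload` bytes of IP payload at a given MTU.
    Every fragment except the last must carry a multiple of 8 bytes."""
    per_fragment = (mtu - IP_HDR) // 8 * 8
    return math.ceil(payload / per_fragment)

def wire_size(payload, fragments):
    """Total bytes on the wire: payload plus one set of headers per fragment."""
    return payload + fragments * (IP_HDR + ETH_OVERHEAD)

payload = 10000
baseline = wire_size(payload, 1)          # 10038B if sent unfragmented
frags = fragment_count(payload, 1500)     # 7 fragments at a 1500B MTU
fragmented = wire_size(payload, frags)    # 10266B on the wire

print(f"{frags} fragments, efficiency {baseline / fragmented:.1%}")   # ~97.8%
print(f"one fragment lost: {baseline / (2 * fragmented):.1%}")        # ~48.9%

# Re-fragmentation over a 1450B MTU: the six full-size 1480B fragments
# split in two, the smaller last fragment still fits.
tail = payload - (frags - 1) * 1480
refrag = (frags - 1) * fragment_count(1480, 1450) + fragment_count(tail, 1450)
print(f"after a 1450B MTU hop: {refrag} fragments")                   # 13
```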

Of course, there are mechanisms such as PMTUD (Path MTU Discovery) which, as far as I can tell, very few applications actually use. And to make things even worse, even when they do, "security" policies often block the ICMP Unreachable - Fragmentation Needed packets that have to make it back to the source for PMTUD to work.
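
As an illustration, here's a minimal Linux-only sketch of asking the kernel for the path MTU it currently knows towards a destination. The numeric constants are the Linux values, used as fallbacks because not every Python build exposes them, and 192.0.2.1 is just a placeholder address:

```python
import socket

# Linux socket option values from <linux/in.h>, used as fallbacks in case
# this Python build doesn't expose them as socket.* constants.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)
IP_MTU = getattr(socket, "IP_MTU", 14)

with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
    # Set the DF bit on outgoing packets so routers reply with
    # ICMP "Fragmentation Needed" instead of fragmenting for us.
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    s.connect(("192.0.2.1", 9))  # placeholder destination (TEST-NET-1, discard port)
    print("path MTU the kernel knows:", s.getsockopt(socket.IPPROTO_IP, IP_MTU))
```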

I said security, so what about firewalls

Any device doing filtering at the upper layers has a problem: only the first fragment carries the L4-L7 headers, while the following fragments carry nothing but the remaining payload data.

Therefore, an inspection device cannot inspect the data inside the packets unless it first reassembles the fragments into the original packet. This is one of the factors that takes a firewall's nice datasheet throughput down to abysmal values in a real-life scenario with a mix of traffic.
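
To make that concrete, here's a small self-contained sketch (plain Python, no capture library) that splits a fake TCP segment the way a router would split the IP payload: only the offset-0 fragment contains the port numbers a filter could match on.

```python
import struct

# A fake 20-byte TCP header (sport, dport, seq, ack, data offset, flags,
# window, checksum, urgent pointer) followed by 3000 bytes of application
# data -- this is what sits inside the IP payload before fragmentation.
tcp_header = struct.pack("!HHIIBBHHH", 49152, 443, 0, 0, 5 << 4, 0x02, 65535, 0, 0)
ip_payload = tcp_header + b"x" * 3000

FRAG_PAYLOAD = 1480  # max payload per fragment at a 1500B MTU (1500 - 20B IP header)
fragments = [(offset, ip_payload[offset:offset + FRAG_PAYLOAD])
             for offset in range(0, len(ip_payload), FRAG_PAYLOAD)]

for offset, data in fragments:
    if offset == 0:
        sport, dport = struct.unpack("!HH", data[:4])
        print(f"offset {offset:>5}: L4 ports visible ({sport} -> {dport})")
    else:
        print(f"offset {offset:>5}: no L4 header, just {len(data)} bytes of data")
```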

TCP MSS

TCP has an option called Maximum Segment Size that defines how much data a host is willing to accept in a single TCP/IP segment. It is typically chosen as the minimum of the receive buffer size and the interface MTU minus 40 bytes (the TCP and IP headers), which definitely helps, but is not perfect: it only accounts for the MTUs of the end-points, not of the links in transit.
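
For example, here's a tiny sketch that opens a TCP connection and asks the kernel which MSS it ended up using for it (TCP_MAXSEG, available on Linux and the BSDs); the host and port are only placeholders:

```python
import socket

# Open a TCP connection and ask the kernel what MSS it is using for it.
# The destination is only an example; any reachable TCP service will do.
with socket.create_connection(("example.com", 80), timeout=5) as s:
    mss = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
    print(f"MSS in use: {mss} bytes -> implies an MTU of about {mss + 40}B")
```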

A lot of network devices can adjust the MSS in transit (MSS clamping) by rewriting the option in SYN packets as they pass through, so you can manually override it and avoid some fragmentation.

The ideal scenario is when PMTUD is enabled, as the TCP/IP stack will automatically adjust the MSS based on information discovered directly from the network.

They knew it back in 1987

And I'm not joking: check out this paper published in December 1987 called Fragmentation Considered Harmful. Of course, back then computing resources were so scarce and limited that it must have been an even bigger problem, even with the minuscule (by today's standards) amounts of traffic they had.

Here's a quote of the main ideas behind their statement:

  • Fragmentation causes inefficient use of resources: Poor choice of fragment sizes can greatly increase the cost of delivering a datagram. Additional bandwidth is used for the additional header information, intermediate gateways must expend computational resources to make additional routing decisions, and the receiving host must reassemble the fragments.
  • Loss of fragments leads to degraded performance: Reassembly of IP fragments is not very robust. Loss of a single fragment requires the higher level protocol to retransmit all of the data in the original datagram, even if most of the fragments were received correctly.
  • Efficient reassembly is hard: Given the likelihood of lost fragments and the information present in the IP header, there are many situations in which the reassembly process, though straightforward, yields lower than desired performance.

That's it for now. There's a second part to this post which contains a story of troubleshooting fragmentation and MTUs in one of the environments I worked in.

And, as always, thanks for reading.


Any comments? Contact me via Mastodon or e-mail.


Share & Subscribe!