I actually wanted to write this article the first time around as an answer to the question in the title (posted by one of my readers), but I ended up with a long part 1 which talks about why fragmentation is undesirable and some of the problems you might run into while it is present.

There's a real story coming below, which I've had to tweak a bit to keep it simple.

The story

This whole thing took place in a transport network that was carrying two different streams of traffic (going through the same interfaces): signaling and user data.

Both streams (starting out at an MTU of 1500) had to go through a tunnel (additional encapsulation) carried over a network whose own MTU was also 1500. The biggest data packets would therefore end up being fragmented, something that was known at design time and catered for; the device performing said fragmentation was sized to cope with the additional load (most of it in the downlink direction, towards the user). The arithmetic is sketched just after the diagram.

+------+        +------+                    +--------------+              +--------+
| User |        |Router|    Tunnel (1460)   | Fragmenting  |              | Server |
|      +--------+      +--------------------+ Device       +--------------+        |
| 1500 |  1500  | 1500 |      MTU 1500      | 1500         |   MTU 1500   | 1500   |
+------+        +------+                    +--------------+              +--------+
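For concreteness, here's the arithmetic the diagram implies. The story doesn't name the tunnel type, so assume roughly 40 bytes of encapsulation overhead, which is what turns a 1500-byte underlying MTU into a 1460-byte tunnel payload limit. A minimal Python sketch:

    # Hedged sketch: the exact overhead depends on the (unnamed) tunnel type.
    UNDERLAY_MTU = 1500                 # what the transport network accepts
    TUNNEL_OVERHEAD = 40                # assumed outer headers added by the tunnel
    TUNNEL_PAYLOAD_MAX = UNDERLAY_MTU - TUNNEL_OVERHEAD    # = 1460

    def needs_fragmentation(inner_packet_size):
        """True if the inner packet can't ride the tunnel in one piece."""
        return inner_packet_size > TUNNEL_PAYLOAD_MAX

    print(needs_fragmentation(500))     # False: typical signaling packet
    print(needs_fragmentation(1500))    # True: full-size user data packet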

All well and good, but for the fact that sometimes user data transfers would be slow or would simply hang. I mentioned two different streams there... the average signaling packet was under 500 bytes, but user data could be anything from tiny packets to maximum-size 1500-byte behemoths.

Ultimately, the catch was that the server was doing something it thought reasonable, but which was very damaging: if the packet it was due to send out its interface had a total size under the interface MTU, it would set the DF bit (Don't Fragment).
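To make that concrete: on a Linux sender this is exactly what enabling Path MTU Discovery on a socket does. The story doesn't say what OS or stack the Server actually ran, so the sketch below is purely illustrative; 10 and 2 are the Linux option values, used as fallbacks because Python's socket module doesn't always export the names.

    # Illustrative sketch of how a Linux sender ends up setting DF:
    # Path MTU Discovery enabled on the socket (Linux-specific options).
    import socket

    IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
    IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    # From here on, datagrams that fit the local 1500-byte MTU leave with
    # DF=1, which is the behaviour the Server exhibited.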

All packets of 1460 bytes or less would make it to the Fragmenting Device, fit within the Tunnel interface MTU and be sent on their merry way. But a packet of 1461 bytes or bigger would need to be fragmented to fit inside the Tunnel... only the DF bit being set wouldn't allow it.

What happened is that the Fragmenting Device had no alternative but to drop the packet, maaaaybe increment a counter somewhere and generate a Fragmentation Needed message. The Server, thinking it was well within its boundaries, would continue trying until the application gave up on the connection due to timeouts.
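That drop-and-signal behaviour isn't the device being awkward; it's what RFC 791 and RFC 1191 require once DF is set. A hedged sketch of the decision it's forced into (not the real device's code):

    # Not the real device's code, just the decision RFC 791/1191 forces on it.
    TUNNEL_PAYLOAD_MAX = 1460

    def on_packet(size, df_set):
        if size <= TUNNEL_PAYLOAD_MAX:
            return "encapsulate and forward"
        if df_set:
            # Can't fragment: drop, and (hopefully) send ICMP type 3 code 4
            # "Fragmentation Needed", carrying next-hop MTU 1460, back to the sender.
            return "drop + ICMP Fragmentation Needed (MTU 1460)"
        return "fragment into tunnel-sized pieces, then forward"

    print(on_packet(1461, df_set=True))   # the failure case in this story

If that ICMP message never makes it back to the Server, or is ignored when it does, the Server keeps retransmitting at the same size: the classic Path MTU Discovery black hole, which would match the hangs we saw.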

In the end, because it was nigh impossible to bump the MTU on the tunnel (which is the most reasonable solution), we had to come up with other options:

  • ignore the DF bit on the Fragmenting Device
  • force the Server to stop setting the DF bit (sketched just below)
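The second option can look like this on a Linux server; again purely illustrative, since the real server's OS isn't named, and a system-wide knob such as the net.ipv4.ip_no_pmtu_disc sysctl achieves much the same thing.

    # Counterpart to the earlier DF sketch: telling the Linux stack NOT to
    # set DF on a given socket (Linux-specific options, illustrative only).
    import socket

    IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)   # Linux value
    IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)  # never set DF

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)
    # Packets from this socket leave with DF=0, leaving the Fragmenting
    # Device free to split them for the tunnel.

The first option, ignoring the DF bit on the Fragmenting Device, depends entirely on what that particular vendor exposes, so there's no generic recipe to show.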

Neither option was really good, and neither solved the underlying issue, only the currently observable symptoms, which were rare enough anyway that end-users couldn't even tell most of the time... you see where I'm going with this. Any bigger change would have required a much higher investment of time and money than the minor impact was worth, so onto the pile of technical debt it went.

And, as always, thanks for reading.


Any comments? Contact me via Mastodon or e-mail.

