The dark side of OSPF transit capability (3)

Here we are, the long-awaited (or so I like to believe) finale of the series! You are now a veteran when it comes to OSPF transit areas, but there's still one tiny detail nagging you: ABRs are not allowed to summarize backbone prefixes, or filter them in any way, when advertising them into the transit area.

Before we dive in though, in case you're not familiar with the series, take a look at part 1 and part 2.

Case number 4

While all of the previous cases could be proven in a lab (no capability transit on Cisco routers), this one is purely theoretical, because we can't disable this very important loop-prevention mechanism.

So let's assume this limitation is not in place and ABRs are free to summarize as they please.

The setup is pretty straightforward and clean, as there's no need for any special effects in this example. Both R3 and R2 are configured to summarize to 100.100.0.0/16 - which they do, as Area 0 contains 100.100.100.0/24.

It all works fine, apart from the fact that Area 600 prefixes are not advertised anywhere due to R6 not being an ABR. We fix this with not one, but two virtual-links (so everything is nice and symmetric) from R6 to R2 and R3 respectively.

Once the VLs are up, R6 becomes an ABR and starts chugging out Type 3 LSAs, including one for 100.100.100.0/24. This prefix is internal to the backbone, so R6 dutifully sends a Type 3 LSA for it into Area 100 and Area 600.

How does the network look through R4's eyes? Well, a bit confusing. If it wants to get to a host in 100.100.100.0/24, it has three options:

1. 100.100.0.0/16 via R2, cost 30
2. 100.100.0.0/16 via R3, cost 50
3. 100.100.100.0/24 via R6, cost 50

R4's routing table is going to have both option 1 and option 3 installed (option 2 loses to option 1 on cost for the same /16), but, for routing, it will use the longest match: 100.100.100.0/24 via R6.
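To make R4's decision concrete, here is a minimal Python sketch of that selection. It is only an illustration of "lowest cost per prefix, then longest match", not how any router actually implements its RIB. The prefixes, next hops and costs are the three options listed above; the host 100.100.100.10 and the helper names build_rib and lookup are made up for the example, and equal-cost multipath is ignored.

```python
import ipaddress

# R4's candidate routes as listed above: (prefix, next hop, cost).
R4_CANDIDATES = [
    ("100.100.0.0/16", "R2", 30),
    ("100.100.0.0/16", "R3", 50),
    ("100.100.100.0/24", "R6", 50),
]

def build_rib(candidates):
    """Keep only the lowest-cost candidate for each prefix (no ECMP)."""
    rib = {}
    for prefix, next_hop, cost in candidates:
        net = ipaddress.ip_network(prefix)
        if net not in rib or cost < rib[net][1]:
            rib[net] = (next_hop, cost)
    return rib

def lookup(rib, destination):
    """Return (prefix, next_hop, cost) for the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in rib if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    next_hop, cost = rib[best]
    return str(best), next_hop, cost

rib = build_rib(R4_CANDIDATES)
# Hypothetical host inside the backbone prefix.
print(lookup(rib, "100.100.100.10"))
```

The cost only breaks the tie between the two /16 entries. Once the /24 is in the table, its prefix length wins regardless of cost, which is exactly why the cheaper path via R2 never gets used for this destination.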

R5 is going to be in the same (mirrored) situation and will choose to route via R6's more specific /24.

How does R6 route to 100.100.100.0/24? It is a backbone prefix, so it calculates two equal-cost (40) paths through R2 and R3 (its only Area 0 neighbors). But to get to them, it needs to route via either R4 or R5 because they are transit routers towards the backbone.

Predictably, a routing loop occurs. R6 sends packets to R4 or R5 and they, deeply confused but totally innocent, send the packets back to R6 on the most specific route they have.

The fix is simply to stop R2 and R3 from summarizing prefixes into the transit area (100). When R2 and R3 summarize to 100.100.0.0/16, they stop sending the component routes, which leaves R6's /24 as the most specific route in Area 100 and causes this whole mess.
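The loop and the fix can be sketched as a toy hop-by-hop forwarding simulation in Python. This is a rough model under several assumptions, not a faithful OSPF implementation: each router is reduced to a hand-written prefix-to-next-hop table, R6 is assumed to pick R4 (it could just as well pick R5), R2 is assumed to deliver the packet into the backbone directly, and in the "fixed" table R4 is assumed to prefer the /24 it now learns from R2 because it is cheaper than the one from R6. The host address and the names next_hop and trace are hypothetical.

```python
import ipaddress

DEST = "100.100.100.10"  # hypothetical host in the backbone prefix

def next_hop(table, destination):
    """Longest-prefix match over a {prefix: next_hop} table."""
    addr = ipaddress.ip_address(destination)
    matches = [p for p in table if addr in ipaddress.ip_network(p)]
    if not matches:
        raise LookupError(f"no route to {destination}")
    best = max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)
    return table[best]

def trace(tables, start, destination, max_hops=8):
    """Follow next hops until delivery, a revisited router, or max_hops."""
    path = [start]
    while len(path) <= max_hops:
        hop = next_hop(tables[path[-1]], destination)
        if hop == "deliver":
            return path, "delivered"
        if hop in path:
            return path + [hop], "loop"
        path.append(hop)
    return path, "gave up"

# R2 and R3 summarize: R4 only sees their /16, plus the /24 from R6.
SUMMARIZED = {
    "R4": {"100.100.0.0/16": "R2", "100.100.100.0/24": "R6"},
    "R6": {"100.100.100.0/24": "R4"},       # assumed choice of R4 over R5
    "R2": {"100.100.100.0/24": "deliver"},  # assumed direct backbone access
}

# No summarization: R4 learns the /24 from R2 and (by assumption) prefers it.
FIXED = {
    "R4": {"100.100.100.0/24": "R2"},
    "R6": {"100.100.100.0/24": "R4"},
    "R2": {"100.100.100.0/24": "deliver"},
}

print(trace(SUMMARIZED, "R4", DEST))
print(trace(FIXED, "R6", DEST))
```

In the summarized case the trace bounces between R4 and R6 and never reaches R2. With the component route back in Area 100, the same packet from R6 goes to R4, then R2, and gets delivered.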

The bottom line

The world of virtual-links and transit areas is not a pretty one. I wrote these examples to help you gain a deeper understanding of how they function and the problems they can create, so you avoid them like the plague.

The biggest issue I have with virtual-links is that they are a feature introduced to temporarily fix an anomaly in an OSPF topology. Then, when this fix didn't work properly, capability transit was introduced as a "feature" to help path selection and avoid routing loops. But that was not enough, so another band-aid was needed, in the form of banning summarization/filtering of prefixes into the transit area. In the end, you have to ask yourself what corner case you will bump into when you decide to use this in your network.

So I'll nag one last time and ask you to keep virtual-links in your labs only and out of production networks. Never ever ever use them as a design feature and do not let a temporary fix become permanent.

And, as always, thanks for reading.


Any comments? Contact me via Mastodon or e-mail.
