The dark side of OSPF transit capability (1)

There are lots of write-ups about this feature and the ones I looked at are fairly good at explaining what it does (with examples). That's not a very promising way to start your own article, you might say. Hear me out though: I'm here to show you what it prevents from happening with the aid of a few pretty pictures.

It took me a couple of years (during my CCIE studies) before I properly understood, and could remember for more than two hours, what this feature does and why it is so important to leave it enabled.

All cases in this series assume that transit area capability is not enabled!

Case number 1

Area 400 becomes isolated from the backbone, and because R4 is not an ABR, no LSAs from Area 400 make it into the backbone (and vice versa).

With a virtual link (VL) configured between R3 and R4, connectivity is restored between Area 400 and the backbone. Or so you'd think... until you traceroute from R4 to Vlan100.

Because R4 is now part of the backbone area, it has the Type1 LSA for R1. Therefore, it knows it can reach Vlan100 without leaving the area (which is preferred over any inter-area option). So it tries to route via R3->R1 (its only Area 0 neighbor) and ends up with R4->R5->R3->R1.

But R5, sweet and innocent R5, has two options to get to Vlan100: a Type3 from R2 (cost 10+10+10+10) and one from R3 (cost 100+10+10). So it chooses R2 as its exit point: R5->R4->R2->R1.
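R5's decision is plain arithmetic on the costs quoted above. A minimal sketch, using only the per-hop costs from the text; the function names are made up for illustration:

```python
# Per-hop costs as quoted in the text for R5's two Type3 options to Vlan100.
type3_options = {
    "R2": [10, 10, 10, 10],
    "R3": [100, 10, 10],
}


def total_costs(options):
    """Sum the per-hop costs for each candidate exit ABR."""
    return {abr: sum(costs) for abr, costs in options.items()}


def best_exit(options):
    """Return the ABR whose path has the lowest total cost."""
    totals = total_costs(options)
    return min(totals, key=totals.get)


print(total_costs(type3_options))
print(best_exit(type3_options))
```

The path via R2 adds up to 40 and the one via R3 to 120, so R5 picks R2.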

Wait... that doesn't look right. So R5 goes through R4 and R4 goes through R5? Routing loop.

Remember, R4 cannot choose R2 as its exit because both of them are now ABRs, and one of the loop-prevention rules makes an ABR ignore any Type3 LSAs it did not learn through the backbone when it calculates inter-area routes. Oh, the irony.

Our hero that saves the day is called transit-area-capability-....-man. It swoops in and says: if there's a lower-cost LSA which allows you to get to Vlan100 (and R2 is sending one) then you are allowed to use it instead of your current best-path.

That means R4 is allowed to choose a better path: R4->R2->R1. The day is saved and the evil routing loop is defeated.
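To make the before and after concrete, here is a small sketch that walks the next hops described in this case. The next-hop tables are taken from the paths in the text (with R1 standing in for the router that owns Vlan100); the `trace` helper is hypothetical and is not how a router actually forwards packets:

```python
def trace(next_hop, start, destination, max_hops=10):
    """Follow next hops from start; return (path, looped)."""
    path = [start]
    current = start
    while current != destination and len(path) <= max_hops:
        current = next_hop[current]
        if current in path:
            # We came back to a router we already visited: a loop.
            return path + [current], True
        path.append(current)
    return path, False


# Without transit capability: R4 heads for R3 via R5, R5 heads for R2 via R4.
without_transit = {"R4": "R5", "R5": "R4", "R2": "R1", "R3": "R1"}

# With transit capability: R4 is allowed to use R2's cheaper path.
with_transit = {"R4": "R2", "R5": "R4", "R2": "R1", "R3": "R1"}

print(trace(without_transit, "R4", "R1"))
print(trace(with_transit, "R4", "R1"))
print(trace(with_transit, "R5", "R1"))
```

In the first table the walk from R4 bounces straight back from R5, which is the loop. In the second, both R4 and R5 reach R1 through R2.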

Case number 2,3,4...

No, not yet. First you get to read a bit of theory!

The theoretical bits

Search for TransitCapability and read Section 16.3 of RFC 2328 (OSPF Version 2). You should read it again after finishing this article, but this time take it slowly; powering through RFCs does not work.

The main ideas are:

1. TransitCapability is the flag that tells you an area carries traffic that neither originates nor terminates in the area itself.
   - This only happens when you are using Virtual-Links to connect isolated areas to the backbone or to reconnect a partitioned backbone.
2. The additional checks are done after Inter- and Intra-Area routes have been calculated, and they look only at backbone prefixes that are:
   - native to Area 0
   - inter-area summaries (come from other areas via Area 0)
3. If there is a better path to reach such prefixes than the one through the Virtual-Link ABR, then use it.
4. If there's any summarization configured on the ABR, ignore it when originating summary-LSAs into the transit area to prevent loops. This sounds a bit weird, I know, but don't worry, it's covered in part 3!

Number 3 there might seem to only prevent sub-optimal routing, but as you've seen in the example above, it also prevents routing-loops.
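The "use the better path" check can be sketched roughly as follows. This is a simplified reading of RFC 2328 Section 16.3, not a real implementation: the data structures and names are invented for illustration, ties (equal cost) are ignored although the RFC adds the extra next hops, and the numbers are assumptions loosely derived from the costs quoted in Case 1.

```python
from dataclasses import dataclass

LS_INFINITY = 0xFFFFFF  # LSInfinity from RFC 2328


@dataclass
class Route:
    cost: int
    next_hop: str
    area: str   # area the route is associated with
    kind: str   # "intra-area" or "inter-area"


@dataclass
class SummaryLsa:
    prefix: str
    advertising_router: str
    cost: int


def examine_transit_summaries(routes, abr_paths, summaries, my_router_id):
    """Improve backbone routes using summary-LSAs seen in the transit area.

    abr_paths maps an ABR's router ID to (distance, next_hop) inside the
    transit area. Routes are updated in place.
    """
    for lsa in summaries:
        # Skip unreachable and self-originated LSAs.
        if lsa.cost >= LS_INFINITY or lsa.advertising_router == my_router_id:
            continue
        route = routes.get(lsa.prefix)
        # Only existing backbone intra-area or inter-area routes qualify.
        if route is None or route.area != "0":
            continue
        if route.kind not in ("intra-area", "inter-area"):
            continue
        # The advertising ABR must be reachable through the transit area.
        if lsa.advertising_router not in abr_paths:
            continue
        distance, next_hop = abr_paths[lsa.advertising_router]
        candidate = distance + lsa.cost
        if candidate < route.cost:
            # Only cost and next hop change; the route stays a backbone route.
            route.cost = candidate
            route.next_hop = next_hop


# Assumed numbers: R4 reaches Vlan100 over the virtual link at cost 130,
# while R2 is 10 away and advertises Vlan100 with cost 20.
routes = {"Vlan100": Route(cost=130, next_hop="R5", area="0", kind="intra-area")}
abr_paths = {"R2": (10, "R2")}
summaries = [SummaryLsa(prefix="Vlan100", advertising_router="R2", cost=20)]

examine_transit_summaries(routes, abr_paths, summaries, my_router_id="R4")
print(routes["Vlan100"])
```

Note that the route keeps its type and its association with Area 0. Only the cost and next hop are replaced, which is why this step is done after the normal intra- and inter-area calculation.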

We're done for now

As a parting thought, I'd like to ask you to keep Virtual-links in your labs only and outside of production networks. Never ever ever use them as a design feature and do not let a temporary fix become permanent.

For more mischief, go on and take a look at part 2 and part 3.

And, as always, thanks for reading.

