V7.20.8, unexpected OSPF behavior, maybe BFD related, maybe just my fault somewhere

Hi fellow Network wizards,

I'm having serious “fun” after upgrading the whole fleet of routers all over Europe from 7.19.6 to 7.20.8.

I tested thoroughly in the lab and on other networks without finding any issues, but apparently only because I didn't look closely enough.

I’m not sure whether this is a version-specific change of behavior, a bug… or whether I’m just too tired to see the mistake(s) in my analysis.

Here's my problem:

I have highly redundant setups like the one I'm describing here.

  • two routers
  • two firewalls
  • switch stacks
  • routers connected to each other (bonding_cluster) => ospf cost 1, preferred
  • routers connected to the local firewall (bonding_fw) => ospf cost 10, the firewall loopback should take that one
  • routers connected to the remote firewall (other room) via vlan42 on trunk bonding_downlink => ospf cost 100, use that one only as a last resort
  • the firewalls don't support BFD, otherwise I would use it on those connections, too.
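
To make the expectation concrete, here is a minimal Python sketch of a shortest-path calculation over the links listed above. It is not how RouterOS implements SPF. The node names and the `LINKS` table are illustrative, the firewall-side costs are assumed to be symmetric (the firewall config is not shown), and r2 only has the vlan42 path to the active firewall.

```python
import heapq

# Link costs taken from the list above. Symmetric costs on the firewall
# side are an assumption, since the firewall configuration is not shown.
LINKS = {
    ("r1", "r2"): 1,    # bonding_cluster
    ("r1", "fw"): 10,   # bonding_fw
    ("r2", "fw"): 100,  # vlan42 (r2's local firewall is inactive)
}

def build_graph(links):
    """Turn the link table into an undirected adjacency map."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph

def shortest_path(graph, src, dst):
    """Plain Dijkstra; returns (total_cost, path) or None if unreachable."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None

GRAPH = build_graph(LINKS)
```

With these costs, `shortest_path(GRAPH, "r1", "r2")` uses the direct bonding_cluster link, and r2 would reach the firewall through r1 rather than over vlan42.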

OSPF is only used to transport the loopbacks (and transfer nets), BGP is used for the real routing information, but out of scope of my problem here.

As you can see above, the two routers have the loopback IPs 10.255.56.1 and 10.255.56.2 and here's the result of the config.

r1)


[r1] > /routing/ospf/interface print detail
Flags: D - dynamic
[...]
 1 D address=10.255.1.177%bonding_cluster area=backbonev4 state=bdr network-type=broadcast dr=10.255.1.178 cost=1 priority=128
     use-bfd=yes retransmit-interval=5s transmit-delay=1s hello-interval=10s dead-interval=40s

 2 D address=10.255.1.181%bonding_fw area=backbonev4 state=dr network-type=broadcast bdr=10.255.1.182 cost=10 priority=128
     use-bfd=no retransmit-interval=5s transmit-delay=1s hello-interval=10s dead-interval=40s

 3 D address=10.255.1.189%vlan42 area=backbonev4 state=bdr network-type=broadcast dr=10.255.1.190 cost=100 priority=128
     use-bfd=no retransmit-interval=5s transmit-delay=1s hello-interval=10s dead-interval=40s

[r1] > /routing/ospf/neighbor/print
Flags: V - virtual; D - dynamic
 0  D instance=ospfv4 area=backbonev4 address=10.255.1.178 priority=128 router-id=10.255.56.2 dr=10.255.1.178 bdr=10.255.1.177
      state="Full" state-changes=6 ls-retransmits=1 adjacency=2d21h17m30s timeout=36s

 1  D instance=ospfv4 area=backbonev4 address=10.255.1.190 priority=1 router-id=10.255.56.5 dr=10.255.1.190 bdr=10.255.1.189
      state="Full" state-changes=6 adjacency=2d21h17m35s timeout=40s

 2  D instance=ospfv4 area=backbonev4 address=10.255.1.182 priority=1 router-id=10.255.56.5 dr=10.255.1.181 bdr=10.255.1.182
      state="Full" state-changes=6 adjacency=2d21h17m23s timeout=35s
[...]


[r1] > /routing/bfd/session/print
Flags: U - up, I - inactive
 0 U multihop=no vrf=main remote-address=10.255.1.197%40-gw02 local-address=10.255.1.198%40-gw02 state=up state-changes=2
     uptime=1d8h52m28s desired-tx-interval=10s actual-tx-interval=10s required-min-rx=10s remote-min-rx=200ms
     remote-min-tx=200ms multiplier=5 hold-time=50s packets-rx=13459 packets-tx=13448

 1 U multihop=no vrf=main remote-address=10.255.1.178%bonding_cluster local-address=10.255.1.177%bonding_cluster state=up
     state-changes=2 uptime=2d21h18m9s desired-tx-interval=200ms actual-tx-interval=200ms required-min-rx=200ms
     remote-min-rx=200ms remote-min-tx=200ms multiplier=5 hold-time=1s packets-rx=1448966 packets-tx=1449207


r2)

[r2] > /routing/ospf/interface print detail
Flags: D - dynamic
[...]
 1 D address=10.255.1.178%bonding_cluster area=backbonev4 state=dr network-type=broadcast bdr=10.255.1.177 cost=1 priority=128 use-bfd=yes retransmit-interval=5s transmit-delay=1s hello-interval=10s dead-interval=40s

 2 D address=10.255.1.185%bonding_fw area=backbonev4 state=dr network-type=broadcast cost=10 priority=128 use-bfd=no retransmit-interval=5s transmit-delay=1s hello-interval=10s dead-interval=40s

 3 D address=10.255.1.193%vlan42 area=backbonev4 state=dr network-type=broadcast bdr=10.255.1.194 cost=100 priority=128 use-bfd=no retransmit-interval=5s transmit-delay=1s hello-interval=10s dead-interval=40s

[r2] > /routing/ospf/neighbor/print
Flags: V - virtual; D - dynamic
 0  D instance=ospfv4 area=backbonev4 address=10.255.1.194 priority=1 router-id=10.255.56.5 dr=10.255.1.193 bdr=10.255.1.194 state="Full" state-changes=15 ls-retransmits=1 adjacency=3d21h29m49s timeout=39s

[...]

 2  D instance=ospfv4 area=backbonev4 address=10.255.1.177 priority=128 router-id=10.255.56.1 dr=10.255.1.178 bdr=10.255.1.177 state="Full" state-changes=6 ls-retransmits=1 adjacency=2d21h15m11s timeout=35s

[r2] > /routing/bfd/session/print
Flags: U - up, I - inactive
[...]
 1 U multihop=no vrf=main remote-address=10.255.1.177%bonding_cluster local-address=10.255.1.178%bonding_cluster state=up state-changes=1 uptime=2d21h15m49s desired-tx-interval=200ms actual-tx-interval=200ms required-min-rx=200ms remote-min-rx=200ms
     remote-min-tx=200ms multiplier=5 hold-time=1s packets-rx=1448380 packets-tx=1448151


As you can see, all sessions are up, BFD is working, OSPF is up, and LSAs are being exchanged.

So much for the config. Now to my problem and the unexpected behavior.
Because of the direct connection and the lowest cost, I would expect r1 and r2 to choose bonding_cluster to talk to each other. Instead:

r1)

[r1] > /routing/route/print where ospf and dst-address~"10.255.56.*/32"
Flags: A - ACTIVE; o - OSPF
Columns: DST-ADDRESS, GATEWAY, AFI, ROUTING-TABLE, DISTANCE, SCOPE, TARGET-SCOPE, IMMEDIATE-GW
   DST-ADDRESS     GATEWAY                  AFI  ROUTING-TABLE  DISTANCE  SCOPE  TARGET-SCOPE  IMMEDIATE-GW
Ao 10.255.56.2/32  10.255.1.182%bonding_fw  ip   main                110     20            10  10.255.1.182%bonding_fw
Ao 10.255.56.5/32  10.255.1.182%bonding_fw  ip   main                110     20            10  10.255.1.182%bonding_fw

r2)

[r2] > /routing/route/print where ospf and dst-address~"10.255.56.*/32"
Flags: A - ACTIVE; o - OSPF
Columns: DST-ADDRESS, GATEWAY, AFI, ROUTING-TABLE, DISTANCE, SCOPE, TARGET-SCOPE, IMMEDIATE-GW
   DST-ADDRESS     GATEWAY              AFI  ROUTING-TABLE  DISTANCE  SCOPE  TARGET-SCOPE  IMMEDIATE-GW
Ao 10.255.56.1/32  10.255.1.194%vlan42  ip   main                110     20            10  10.255.1.194%vlan42
Ao 10.255.56.5/32  10.255.1.194%vlan42  ip   main                110     20            10  10.255.1.194%vlan42


==> r1 picks the bonding_fw connection (cost 10) rather than the cheaper bonding_cluster link (cost 1) or the most expensive vlan42 connection (cost 100).

==> r2 chooses the most expensive vlan connection (because fw2 is inactive → bonding_fw not an option) over the direct and cheapest bonding_cluster link.

On one of the sites with the same setup, it helped to disable BFD in the OSPF config for bonding_cluster, so I thought that was the cause. But when I tried it on another site, it didn't change the routing decision. So… I am even more confused.

Do you see something like that on 7.20.x yourself? Any obvious errors in my config?

ANY help will be appreciated :) Thanks a lot,

Best regards,
Irrwitzer

p.s.: SUP-212079

Update 2026-03-04:

I think I found the problem: the LSA types, and the strategy I use to announce the loopback IPs. Instead of using redistribute connected (which produces an external LSA), I need to announce the loopback as internal.

/routing/ospf/interface-template/add place-before=0 area=backbonev4 interfaces=lo
/routing/ospf/instance/unset value-name=redistribute 0                           

This seems to work. The loopback IP is then announced as an internal stub.
It takes a long time for the old LSAs to expire, though, as I can't find a way to flush the OSPF process without disabling and re-enabling it.

So it’s apparently not a version-specific change but a problem with my config, as I feared. Why this didn’t come up earlier… who knows.

How do you guys announce your loopback IPs? Like this? I used to do it the redistribute-connected way with Cisco and Juniper, which is why I implemented it this way. Adding the loopback as an “interface” seems unintuitive to me. So if there's a cleaner way to do it, please let me know.

Thanks,
Irrwitzer

You’re showing us the results of your config, but not the OSPF configuration itself. So it’s hard to “guess” why it’s choosing those paths without seeing if anything else in the config stands out.

I use redistribute connected to announce my loopbacks and router-to-router links with about 25 routers and generally it works just fine. The only issue I run into (not necessarily related to this) is a router randomly “hijacking” another router’s announcements, likely due to frequent bounces corrupting the LSA table somewhere on a router.

Point taken…

The pressure to get this solved clouded my view. I thought the output of /routing ospf interface print would show the effective state (rather than the theoretical one from the interface-templates), and likewise for the other outputs.

What I learned in the process is that I should have provided the LSA states as well, but I can't provide them now that I have fixed all of the routers.

For the sake of completeness, here's the OSPF config of both routers. It's identical on both of them; only /routing/id/rID is set to a different loopback on each.

/routing ospf instance
add disabled=no name=ospfv4 out-filter-chain=ospf-just-loopbacks redistribute=connected router-id=rID
/routing ospf area
add disabled=no instance=ospfv4 name=backbonev4
/routing ospf interface-template
add area=backbonev4 disabled=no interfaces=bonding_cluster use-bfd=yes
add area=backbonev4 cost=10 disabled=no interfaces=bonding_fw use-bfd=no
add area=backbonev4 cost=100 disabled=no interfaces=vlan42 use-bfd=no

/routing filter rule
add chain=ospf-just-loopbacks rule="if ( afi ipv4 && dst in 10.255.0.0/16 && dst-len == 32 ) { accept }"
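
As a rough illustration of what that filter rule is meant to let through, here is a small Python sketch of the same three conditions. It assumes `dst in 10.255.0.0/16` means "the prefix lies inside that network"; the function name `filter_accepts` is made up for this example and is not part of RouterOS.

```python
import ipaddress

# Range and prefix length are taken from the rule above.
LOOPBACK_RANGE = ipaddress.ip_network("10.255.0.0/16")

def filter_accepts(prefix: str) -> bool:
    """Return True if the rule's three conditions all hold for this prefix."""
    net = ipaddress.ip_network(prefix, strict=True)
    return (
        net.version == 4                   # afi ipv4
        and net.subnet_of(LOOPBACK_RANGE)  # dst in 10.255.0.0/16
        and net.prefixlen == 32            # dst-len == 32
    )
```

So `filter_accepts("10.255.56.1/32")` is true for the loopback, while a transfer net such as `10.255.1.176/30` does not match the rule.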

ip address configs:

r1

[r1] > ip address print where address~"10.255.*"
Columns: ADDRESS, NETWORK, INTERFACE                        
# ADDRESS          NETWORK       INTERFACE                  
0 10.255.56.1/32   10.255.56.1   lo                         
1 10.255.1.181/30  10.255.1.180  bonding_fw                 
2 10.255.1.189/30  10.255.1.188  vlan42                     
4 10.255.1.177/30  10.255.1.176  bonding_cluster            

r2

[r2] > ip address print where address~"10.255.*"
Columns: ADDRESS, NETWORK, INTERFACE
# ADDRESS          NETWORK       INTERFACE
0 10.255.56.2/32   10.255.56.2   lo
1 10.255.1.178/30  10.255.1.176  bonding_cluster
2 10.255.1.185/30  10.255.1.184  bonding_fw
3 10.255.1.193/30  10.255.1.192  vlan42

The firewalls are Palo Alto.

I guess this is rather straightforward, and I thought it was the right way to configure it:

  • Loopback IP address on “lo”
  • redistribute=connected in the ospf instance config
  • route-filter that limits announced routes to just the loopback one to unclutter the OSPF table

What this really seems to do is the following: because the connected (loopback) prefix is merely redistributed, the router becomes an ASBR

“ASBR - Autonomous System Boundary Router, router connected to an external network (in a different AS). If you import other protocol routes into OSPF from the router it is now considered ASBR.”

and the loopback is announced as an external LSA, type 1 (or 2):

“LSA type 5 - (External LSA) Announces the Routes learned through the ASBR, are flooded to all areas except Stub areas. This LSA divides into two sub-types: external type 1 and external type 2.”

Here are the relevant LSA entries for 10.255.56.2:

[r1] > /routing/ospf/lsa/print where id=10.255.56.2
Flags: S - self-originated, F - flushing, W - wraparound; D - dynamic
 1  D instance=ospfv4 type="external" originator=10.255.56.2 id=10.255.56.2 sequence=0x80006E29 age=620 checksum=0xDC56 body=
        options=E
        netmask=255.255.255.255
        forwarding-address=0.0.0.0
        metric=1 type-1
        route-tag=0
 4  D instance=ospfv4 area=backbonev4 type="router" originator=10.255.56.2 id=10.255.56.2 sequence=0x80006F79 age=344 checksum=0xC016 body=
        options=E bits=E
            type=network id=10.255.1.178 data=10.255.1.178 metric=1
            type=network id=10.255.1.193 data=10.255.1.193 metric=100
            type=network id=10.255.1.202 data=10.255.1.202 metric=1000
            type=stub id=10.255.1.184 data=255.255.255.252 metric=10


[r1] > /routing/ospf/lsa/print where originator=10.255.56.2
Flags: S - self-originated, F - flushing, W - wraparound; D - dynamic
 1  D instance=ospfv4 type="external" originator=10.255.56.2 id=10.255.56.2 sequence=0x80006E29 age=633 checksum=0xDC56 body=
        options=E
        netmask=255.255.255.255
        forwarding-address=0.0.0.0
        metric=1 type-1
        route-tag=0
 4  D instance=ospfv4 area=backbonev4 type="router" originator=10.255.56.2 id=10.255.56.2 sequence=0x80006F79 age=357 checksum=0xC016 body=
        options=E bits=E
            type=network id=10.255.1.178 data=10.255.1.178 metric=1
            type=network id=10.255.1.193 data=10.255.1.193 metric=100
            type=network id=10.255.1.202 data=10.255.1.202 metric=1000
            type=stub id=10.255.1.184 data=255.255.255.252 metric=10
 6  D instance=ospfv4 area=backbonev4 type="network" originator=10.255.56.2 id=0.0.0.0 sequence=0x8000011C age=123 checksum=0xE612 body=
        netmask=192.0.0.0
            router-id=10.255.56.2
 7  D instance=ospfv4 area=backbonev4 type="network" originator=10.255.56.2 id=10.255.1.193 sequence=0x8000010C age=876 checksum=0x2B89 body=
        netmask=255.255.255.252
            router-id=10.255.56.2
            router-id=10.255.56.5
 8  D instance=ospfv4 area=backbonev4 type="network" originator=10.255.56.2 id=10.255.1.202 sequence=0x80000117 age=1228 checksum=0x6444 body=
        netmask=255.255.255.252
            router-id=10.255.50.3
            router-id=10.255.56.2

So the loopback LSAs are external type 1, and the originator is the gateway. But because the originator and the loopback address are identical and there is no stub entry for the loopback, the stub entry announced by the firewall wins, and the router chooses the costly link over the cheap one. In the end it is not a question of cost but of OSPF's internal route selection logic.
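
To illustrate why cost alone does not decide this, here is a simplified Python sketch of OSPF's path-type preference (intra-area before inter-area before external type 1 before external type 2). It assumes, as suspected here, that an intra-area entry for the same /32 is learned via the firewall. The candidate list and its costs are hypothetical, and the real comparison for type-2 externals is more involved than this.

```python
# Lower rank wins; the order follows the OSPF path-type preference
# (intra-area, inter-area, external type 1, external type 2).
PATH_TYPE_RANK = {"intra-area": 0, "inter-area": 1, "external-1": 2, "external-2": 3}

def best_route(candidates):
    """Pick by path type first and only then by cost (simplified)."""
    return min(candidates, key=lambda r: (PATH_TYPE_RANK[r["type"]], r["cost"]))

# Hypothetical candidates for 10.255.56.2/32 as seen from r1; costs are made up.
candidates = [
    {"via": "bonding_cluster", "type": "external-1", "cost": 2},
    {"via": "bonding_fw", "type": "intra-area", "cost": 20},
]
```

Here `best_route(candidates)` returns the bonding_fw entry even though its cost is higher, because an intra-area path is preferred over any external one.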

So the question is: since you, @sirbryan, are using a similar or identical config for loopback redistribution (and I guess others are, too) and it works for you, maybe the culprit is the PA firewall announcing a stub where it shouldn't.

Nevertheless, since I couldn't risk another huge outage, I moved the config from redistribution to making the lo interface take part in the OSPF network itself. The loopback IP is now announced in the router LSA (internal, LSA type 1) instead of as an external type-1 LSA, which creates a stub entry with the correct cost/metric, and that entry is the one that gets chosen:

[r1] > /routing/ospf/lsa/print where originator=10.255.56.2                                                 
Flags: S - self-originated, F - flushing, W - wraparound; D - dynamic                                                   
 0  D instance=ospfv4 type="external" originator=10.255.56.2 id=10.255.56.2 sequence=0x80000018 age=1831 checksum=0x4A68
      body=                                                                                                             
        options=E                                                                                                       
        netmask=255.255.255.255                                                                                         
        forwarding-address=0.0.0.0                                                                                      
        metric=1 type-1                                                                                                 
        route-tag=0                                                                                                     
                                                                                                                        
 1  D instance=ospfv4 area=backbonev4 type="router" originator=10.255.56.2 id=10.255.56.2 sequence=0x8000001D age=1626  
      checksum=0x133A body=                                                                                             
        options=E bits=E                                                                                                
            type=network id=10.255.1.177 data=10.255.1.178 metric=1                                                     
            type=network id=10.255.1.194 data=10.255.1.193 metric=100                                                   
            type=network id=10.255.1.202 data=10.255.1.202 metric=1000                                                  
            type=stub id=10.255.1.184 data=255.255.255.252 metric=10                                                    
            type=stub id=10.255.56.2 data=255.255.255.255 metric=1                                                      
                                                                                                                        
 2  D instance=ospfv4 area=backbonev4 type="network" originator=10.255.56.2 id=10.255.1.202 sequence=0x80000019 age=1609
      checksum=0x6344 body=                                                                                             
        netmask=255.255.255.252                                                                                         
            router-id=10.255.50.3                                                                                       
            router-id=10.255.56.2                                                                                       
[r1] > ip route print where dst-address=10.255.56.2          
Flags: D - DYNAMIC; A - ACTIVE; o - OSPF                                 
Columns: DST-ADDRESS, GATEWAY, ROUTING-TABLE, DISTANCE                   
    DST-ADDRESS     GATEWAY                       ROUTING-TABLE  DISTANCE
DAo 10.255.56.2/32  10.255.1.178%bonding_cluster  main                110

Mission accomplished, disaster avoided.

Nevertheless, I'd love to know what the real best practice for announcing loopback IPs via OSPF on RouterOS is…

And if somebody here has enough Palo Alto skills to tell me whether this is expected behavior, I'd love to hear that as well.

Thanks again to everyone who took an interest in this,

Irrwitzer

I hadn’t thought of having my lo’s participate in OSPF instead of redistributing connected interfaces. I’ll have to give that a try.

All my equipment is in a single backbone area. There are no stubs, so yes, the way the PA’s are working could be causing the issue you were experiencing.