I’m seeing an issue where an OSPF external type 1 redistributed static route is not being installed in the network’s routing tables. It started after some point-to-point backhauls were upgraded to Ubnt hardware. With broadcast OSPF the adjacencies were dropping periodically (once or twice a day), and each drop left the static route unrouted, so I’m in the process of migrating the network’s point-to-point links to NBMA OSPF. The OSPF session would die due to a lost packet or similar link-level behavior and reinitialize within a few seconds.
When the issue occurs, this particular redistributed static route is the only route that doesn’t come up. The native OSPF networks and the other redistributed connected/static routes (about a half dozen from this router) do come up properly. The only difference between this route and the others is that it’s announced elsewhere on the network as well.
When OSPF drops and renegotiates the Mik1-Mik2 connection, all devices on the network list the affected redistributed network (10.0.0.0/26) as an external LSA with the correct originator (Mik1 in the diagram - 192.168.100.1). However, the network does not show up in any device’s routing table.
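For reference, this is my understanding of the checks RFC 2328 (section 16.4) applies before an external LSA is even considered for the routing table. It is only a rough Python sketch of those rules, not how RouterOS implements them. The class, function, and parameter names are made up for illustration, and the router ID used for the calculating router and the metric are placeholder values.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional

LS_INFINITY = 0xFFFFFF  # 24-bit "unreachable" metric defined by RFC 2328
MAX_AGE = 3600          # seconds; an LSA at MaxAge is being flushed


@dataclass(frozen=True)
class ExternalLsa:
    """Hypothetical, simplified view of an AS-external LSA."""
    prefix: str
    advertising_router: str  # router ID of the originating ASBR
    metric: int
    age: int = 0
    forwarding_address: str = "0.0.0.0"


def skip_reason(lsa: ExternalLsa,
                own_router_id: str,
                reachable_asbrs: Iterable[str],
                internal_networks: Iterable[str]) -> Optional[str]:
    """Return why the LSA would be ignored, or None if it is usable."""
    # Step 1: unreachable metric or an LSA that is aging out.
    if lsa.metric >= LS_INFINITY or lsa.age >= MAX_AGE:
        return "metric is LSInfinity or LSA is at MaxAge"
    # Step 2: the calculating router ignores its own external LSAs.
    if lsa.advertising_router == own_router_id:
        return "self-originated LSA"
    # Step 3: there must be a routing table entry for the ASBR.
    if lsa.advertising_router not in set(reachable_asbrs):
        return "no route to the originating ASBR"
    # Step 3 (continued): a non-zero forwarding address must be
    # covered by an intra-area or inter-area route.
    if lsa.forwarding_address != "0.0.0.0":
        fwd = ipaddress.ip_address(lsa.forwarding_address)
        if not any(fwd in ipaddress.ip_network(n) for n in internal_networks):
            return "forwarding address has no intra-area or inter-area route"
    return None


# Placeholder values: router ID 192.168.102.1 for the calculating
# router and metric 20 are assumptions, not taken from my config.
lsa = ExternalLsa("10.0.0.0/26", "192.168.100.1", metric=20)
print(skip_reason(lsa, "192.168.102.1", ["192.168.100.1"], ["192.168.100.0/24"]))
print(skip_reason(lsa, "192.168.102.1", [], ["192.168.100.0/24"]))
```

If any of these checks fails on a router, the LSA stays in the database but produces no route, which at least matches the symptom I’m seeing.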
The issue can be fixed by disabling and reenabling the affected static route on the originating Mik1 router. This triggers a reannouncement and reflood, after which the rest of the network has a route to 10.0.0.0/26 again.
I’ve got a simplified network map below:
Mik1 - OSPF area 1
-----------------------------------------------------
| ether1 - 192.168.100.1 |
| 192.168.100.0/24 - Network |
| 10.0.0.0/26 via 192.168.100.1 - External type 1 |
| |
| ether2 - 172.16.0.1 |
| 172.16.0.0/29 - Network |
-----------------------------------------------------
| Mik1-ether2 <-> Mik2-ether2
Mik2 - OSPF area 1
-----------------------------------------------------
| ether1 - 192.168.101.1 |
| 192.168.101.0/24 - Network |
| |
| ether2 - 172.16.0.2 |
| 172.16.0.0/29 - Network |
| |
| ether3 - 172.16.0.9 |
| 172.16.0.8/29 - Network |
-----------------------------------------------------
| Mik2-ether3 <-> Mik3-ether2 - this is the link that was updated
Mik3 - OSPF area 1
-----------------------------------------------------
| ether1 - 192.168.102.1 |
| 192.168.102.0/24 - Network |
| |
| ether2 - 172.16.0.10 |
| 172.16.0.8/29 - Network |
-----------------------------------------------------
| Mik3-ether1 <-> Cisco1-FE0/0
Cisco1 - ABR - FE0/0 area 1, FE0/1 area 0
-----------------------------------------------------
| FE0/0 - 192.168.102.2 |
| 192.168.102.0/24 - Network |
| |
| FE0/1 - 172.16.0.17 |
| 172.16.0.16/29 - Network |
-----------------------------------------------------
| Cisco1-FE0/1 <-> Cisco2-FE0/0
Cisco2 - OSPF area 0
-----------------------------------------------------
| FE0/0 - 172.16.0.18 |
| 172.16.0.16/29 - Network |
| |
| Serial0/0 - 172.16.0.25 |
| 172.16.0.24/29 - Network |
| 10.0.0.0/26 via 172.16.0.26 - External type 2 |
-----------------------------------------------------
The intended behavior is to route 10.0.0.0/26 over one of two paths out of the network, preferring the path through Mik1 when it is available. The Type-1 external LSA announced by Mik1 should take priority over the Type-2 external LSA announced by Cisco2 in all situations, and Mik1 announces the route as long as it sees 192.168.100.1 up. Cisco2 fails over and announces the route through itself in case of an outage at 192.168.100.1 or if there are broken links between it and Mik1.
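To spell out the preference I’m relying on, here is a small Python sketch of the external path comparison as I read it in RFC 2328 section 16.4: type 1 always beats type 2, type 1 paths compare on cost to the ASBR plus the advertised metric, and type 2 paths compare on the advertised metric first and the cost to the ASBR second. It is simplified (it ignores the RFC1583Compatibility rules and equal-cost multipath), the names are made up, and every metric and cost in the example is a placeholder rather than a value from my routers.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ExternalPath:
    """Hypothetical candidate path to an external prefix."""
    origin: str        # label for the ASBR that announced the LSA
    metric_type: int   # 1 for external type 1, 2 for external type 2
    lsa_metric: int    # metric carried in the LSA
    cost_to_asbr: int  # internal OSPF cost from this router to the ASBR


def preference_key(path: ExternalPath) -> Tuple[int, int, int]:
    """Smaller tuple means more preferred."""
    if path.metric_type == 1:
        # Type 1: internal cost and advertised metric are summed.
        return (1, path.cost_to_asbr + path.lsa_metric, 0)
    # Type 2: advertised metric first, internal cost only as a tie-break.
    return (2, path.lsa_metric, path.cost_to_asbr)


def best_external_path(paths: List[ExternalPath]) -> Optional[ExternalPath]:
    """Pick the preferred path, or None when there are no candidates."""
    return min(paths, key=preference_key) if paths else None


# Placeholder numbers: the type 2 path is deliberately given a much
# lower metric and cost to show that type 1 still wins.
candidates = [
    ExternalPath("Mik1", metric_type=1, lsa_metric=20, cost_to_asbr=30),
    ExternalPath("Cisco2", metric_type=2, lsa_metric=1, cost_to_asbr=10),
]
print(best_external_path(candidates).origin)
```

By this logic the Cisco2 path should only be used when the Mik1 LSA is gone or unusable, which is why having neither route installed is so confusing.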
I’ve tried dumping the OSPF debug output to a text file while this happens so I can audit what’s going on, but the capture is incomplete: it keeps getting filled with “x messages dropped” lines. It appears, however, that the route is being flooded from both Mik1 and Cisco2 simultaneously, and the result is that nothing gets routed. I see the networks from Mik1 get installed in the main routing table, then an announcement from Cisco2 is seen, the network gets retransmitted with Cisco2 as the originator, and then that LSA is deleted.
However, I’m not convinced that going to unicast OSPF will resolve the issue, as I did see it crop up while the links were cycling during the NBMA implementation. It may help with the random losses, since I expect the OSPF session to drop less now, but it potentially doesn’t address the root cause. Has anyone seen behavior like this before? Do Mikrotiks have any diagnostic tools available to show why a network is not being routed even though it’s showing up as an LSA?