Hi, I am running into issues with OSPF on one of my routers. Roughly every 30 minutes all OSPF routes drop out of the IP->Routes table, although they still appear in the LSA list. I have to manually disable all networks in Routing->OSPF->Networks and then enable them again to pick the routes back up. I have the same configuration on all of my other routers, using a loopback interface whose IP is also the router-id. This does not happen on my other routers, and I have tried resetting the entire router and starting from scratch, with no luck.
PS. I have tried using NBMA neighbours and the same thing happens.
The router having issues is backhauled via an nv2 link with -50dBm signal and 100% CCQ, and both neighbouring routers are running RouterOS v6.42.9 (bugfix).
I see a few topics on this and none of them have been solved!
OSPF Debug:
/export hide-sensitive output:
# oct/11/2018 09:59:36 by RouterOS 6.42.9
# software id = XXXXXXX
#
# model = 750UP
# serial number = XXXXXXXXXXXX
/interface bridge
add name=local
add name=loopback
/interface ethernet
set [ find default-name=ether2 ] comment=Backhaul
set [ find default-name=ether3 ] comment="Access Point"
/routing ospf instance
set [ find default=yes ] redistribute-connected=as-type-1 router-id=10.1.10.1
/interface bridge port
add bridge=local interface=ether3
add bridge=local interface=ether4
add bridge=local interface=ether5
/interface pppoe-server server
add disabled=no interface=local service-name=XXXXXXX-10-1
/ip address
add address=192.168.88.2/24 interface=ether2 network=192.168.88.0
add address=10.1.10.1/24 interface=local network=10.1.10.0
add address=10.1.10.1 interface=loopback network=10.1.10.1
/ip dns
set servers=10.100.1.1
/ppp secret
add local-address=196.23.1.1 name=admin remote-address=196.23.1.2
/routing ospf network
add area=backbone network=192.168.88.0/24
add area=backbone network=10.1.10.0/24
/system identity
set name=XXXXXXX-10-1
/system logging
add topics=ospf,debug,!raw
When it happens, do you lose connectivity for a short time on the link between the 2 routers? Is the neighbor state still at ‘Full’, or does it go to ‘2-way’ or something else?
I’ve experienced faults where there’s enough of a time-out to drop the session, but the link returns before the 40s (default) timer runs out. It then gets ‘stuck’ and I must restart the OSPF instance to get it to go from 2-Way back to Full and import the routes. I believe this is a bug in RouterOS, although I haven’t labbed it up yet to get a supout and log a ticket.
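A minimal sketch of how the neighbor state and the installed routes could be checked from the terminal the next time it happens, assuming RouterOS v6 and the default OSPF instance. The last two lines are the disable/enable workaround from the first post, not a fix. If the `ospf` filter is not accepted on your version, a plain `/ip route print` shows OSPF routes with the `o` flag.

```
# Show OSPF neighbours and their adjacency state (Full, 2-Way, ...)
/routing ospf neighbor print
# List the OSPF routes currently installed in the routing table
/ip route print where ospf
# Workaround described in the first post: toggle every OSPF network entry
/routing ospf network disable [find]
/routing ospf network enable [find]
```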
The link stays up; it’s an ethernet connection between the two routers. I do lose connection to the other router though, as it drops the routes. The other router, which is working fine, still shows the route to the one that has the issue.
I think it stays “FULL”, but I’ll double check when it happens again. This is happening on two of my routers.
But if you check the debug log I posted, state has not changed in the log and usually if it does change it logs it. The screenshot is when the routes get dropped.
Haven’t seen something like this before tbh. I’ve deffo seen the bug joe has mentioned before in prod, though.
When this happens, are you seeing any DR/BDR churn, by the way?
Just an aside: by default OSPF requires that every link-state advertisement (LSA) be refreshed every 30 minutes, which might explain why this happens on such a timeframe.
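To see whether the 30-minute pattern lines up with LSA refreshes, one option is to watch the age of the entries in the link-state database. A rough sketch, assuming RouterOS v6; per RFC 2328 an LSA is re-originated at 1800 s (LSRefreshTime) and flushed at 3600 s (MaxAge). The `originator` filter and the router-id taken from the export above are assumptions about what you want to look at.

```
# Print the link-state database; the AGE column is in seconds
/routing ospf lsa print
# Only LSAs originated by the affected router (router-id from the export above)
/routing ospf lsa print where originator=10.1.10.1
```

If the routes vanish just as the router’s own LSAs approach 1800 s, that points at the refresh cycle rather than at the link.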
I converted a network to static routing for similar reasons…
Everything worked fine from minutes to days (rarely, but yes), and then it just stopped.
Disabling and re-enabling the OSPF instance resurrects the whole thing for some time.
One thing I noticed was the following:
One of my subnets was fragmented, so there was a network segment which actually had the same subnet with a different netmask. It triggered an error about non-matching netmasks (it was some time ago, so I cannot remember the exact circumstances). But from that point on, it didn’t accept other OSPF info (even info which was matching).
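A quick way to rule this out might look like the sketch below, assuming RouterOS v6. On broadcast segments the OSPF hello carries the network mask, and neighbors with different masks will not form an adjacency. The interface name `ether2` is taken from the export earlier in the thread and is only an example.

```
# Run on both routers that share the segment and compare the prefix lengths
/ip address print where interface=ether2
# Look through the log for OSPF messages around the time the routes dropped
/log print where topics~"ospf"
```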
That’s quite interesting, as we’ve contacted MikroTik regarding a very similar issue with OSPF.
It’s the same behavior someone mentioned earlier in this thread: OSPF works fine for hours, days, weeks, and suddenly the “core” router (hub and spoke VPN setup) stops propagating the route of its own subnet.
As Murmaider wrote:
In our case, we’ve got some SSTP road warrior connections on the aforementioned core router. There is no clear causal link between the establishment of an SSTP connection and the OSPF malfunction, but as the SSTP interfaces represent a subnet of our OSPF network, they become dynamic OSPF interfaces as Murmaider described. I have now created static OSPF interfaces for the VPN spoke nets and set the “all” interface to passive.
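As a rough illustration of that change on RouterOS v6 (the interface name `l2tp-spoke1` is hypothetical, and this assumes a specific interface entry takes precedence over the `all` entry):

```
/routing ospf interface
# Catch-all entry: interfaces not listed explicitly neither send nor accept hellos
add interface=all passive=yes
# Static entry for a VPN spoke that should actually form an adjacency
add interface=l2tp-spoke1 network-type=point-to-point
```

With this in place, road warrior interfaces that come and go no longer show up as active dynamic OSPF interfaces.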
To me it looks like OSPF runs into problems if one of your routers has the same IP address on different interfaces. For example, on my main router I have 3 IPIP tunnels, and my IP config is:
On the routers connecting to the main router:
Router 1: IP: 10.255.0.1, Network: 10.100.1.1
Router 2: IP: 10.255.0.2, Network: 10.100.1.1
Router 3: IP: 10.255.0.3, Network: 10.100.1.1
IP: 10.100.1.1 is the loopback of the main router
IPs: 10.255.0.x are the IPs for the remote routers.
This configuration is the same way as routers apply IP configuration over PPPoE, L2TP and so on.
Everything works and communication between the routers is fine, but OSPF seems to get confused by this particular configuration and starts dropping routes on routers.
I have now changed all IP addresses to unique addresses, both local and network addresses. For example, I now use 10.255.0.x as locals and 10.254.0.x as networks for each IPIP tunnel, and I haven’t had a single OSPF route drop since.
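A sketch of what that addressing could look like on the main router, assuming RouterOS v6. The tunnel interface names are hypothetical; the point is only that every local and every network (remote) address is unique.

```
/ip address
# One unique local/remote pair per tunnel, no address reused across interfaces
add address=10.255.0.1 network=10.254.0.1 interface=ipip-tunnel1
add address=10.255.0.2 network=10.254.0.2 interface=ipip-tunnel2
add address=10.255.0.3 network=10.254.0.3 interface=ipip-tunnel3
```

Each remote router would then carry the mirror image, for example address=10.254.0.1 network=10.255.0.1 on the far end of the first tunnel.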
As we have the same tunnel IP setup just with L2TP/IPsec tunnels, I was very interested reading that you changed the tunnel IP addresses.
I didn’t do that but I set all “non OSPF” interfaces to “passive”.
Since I did that, the issue with dropped routes disappeared.
Unfortunately, MikroTik didn’t follow up with an explanation of the problem with this kind of setup.