jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Fri Jun 03, 2022 1:29 pm

Hi All,

Previously I found viewtopic.php?t=78327. It made zero sense to me, but we decided that the resolution of disabling MD5 authentication was worth a shot, and after double-checking that our mitigations for "directly attached attackers" were in place and operational, we disabled OSPF MD5 authentication. This did not resolve the issue in any way, so we suspect authentication was never the real issue to begin with.

The same page indicates that the issue is packet loss. We don't see loss of adjacency according to any of the routers involved. The specific network segment consists of 2 x (48x10G + 4x40G) switches, paired and running vPC (LACP) back to the routers where possible, or 10G active + (2 x 1G LACP) standby where not (i.e. only 1 x 10G port available). We also don't see any loss with echo request/response frames (I've just flooded just under 10m such requests at 1ms intervals and got 0% loss, to a router that, by happy accident, dropped out from OSPF in the same period).

We get between 20 to 25 of these drop-outs per day, per affected router. Detailed symptoms:

* Other routers on the same L2 segment drop routes originated from the MT in question for (sometimes extended, e.g. half an hour+) periods of time.
* Routes originated elsewhere and advertised to the MT are not affected (i.e. the MT retains its routes to the rest of the network).

The routers involved on this segment (interface addresses and DR selection priorities):

172.31.255.1 - FRR 8.2.2 - priority 200 - dropped below at 08:15:33
172.31.255.2 - FRR 8.2.2 - priority 200 (current DR) - dropped below at 08:15:42
172.31.255.3 - RouterOS 6.48.4 (to be upgraded in the next 48h) - priority 1
172.31.255.5 - RouterOS 6.49.6 (current BDR, upgraded this morning) - priority 1

What I do spot in the logs leading up to the deletion of routes is quite a bit of "Skipping flooding: from DR or BDR". It makes sense that we only want to flood updates back into the network if the router is the DR or the BDR. But I suspect it also means the router isn't refreshing its routes back to the DR and BDR regularly enough. I'm not even sure if this is a consideration, but I'm guessing that if the MT doesn't let the DR and BDR know from time to time that its routes are still valid, they will get dropped every so often.

What bugs me is that since the upgrade this morning, the rebooted router has been stable, and in the same time (just over 4 hours) we've had 4 outages from the node that was not rebooted (logs above for the first of these). This can, from what I can tell, have one of two causes:

1. A reboot fixes whatever the underlying problem is for a while, and it will return; or
2. If a router is a DR or BDR it floods whenever LSAs are made, resulting in its own routes also effectively being refreshed on peers.

I'm not familiar enough with the OSPF protocol to be able to confirm or reject either hypothesis.

Resulting problems:

* iBGP should normally connect to loopback addresses - if OSPF fails, loopbacks fail, iBGP fails, network as a whole fails. We've had to update iBGP to use interface addresses, which turns link-layer failures into routing failures, but these are far less frequent than OSPF failures currently (a handful of failures per year compared to near-hourly OSPF drop-outs).
* Sub-optimal internal routing (e.g. traffic will follow a BGP-announced /21 to a route reflector instead of the more specific /28 from OSPF to a different router). This just adds extra latency, not a train smash, as the next hop (which is not an MT) will redirect.
* Non-functional routing for directly attached (connected) routes originated from Mikrotik routers (similar problem to loopbacks; fortunately these destination networks are generally intended for routing EGP, so 99% of the time it's not a blocking issue, since mostly only the router itself needs to be able to access these destinations).

The first of these is a network killer. Yes, we can work around by routing to interface addresses, but this to a large degree negates the point of having redundancy in your network.

Happy to create a pcap of all traffic on the FRR side (much lower performance impact), and I can provide the Mikrotik raw OSPF logs for the above directly to Mikrotik Support (cannot post this publicly, but happy to discuss and test).
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Sat Jun 04, 2022 12:52 am

So how does one go about reporting a bug?

max-age is 3600 by default for OSPF. Not sure if it's possible to change this.

Mikrotik sends the "refreshed" LSAs only after the max age of the previous advertisement has expired. I've now captured a lot of OSPF traffic, and then looked only at the traffic sent by the source router.

So the LSA for the loopback address gets sent at variable times, but this much is plain as day:

1. The route drops from other routing tables 1 hour after every time it gets loaded.
2. Mikrotik sends the next LSA a while after that; in the measured cases the intervals were (in seconds) 4174, 4139 and 4233 respectively.
3. In each case the end of the drop-out matches the transmission time of the LSA exactly, and the route removal happens, in all cases to within a few seconds, (the above interval less 3600) seconds prior to recovery.

There are similar "Transit" announcements coming from the same router, which indicate the same LS ID; however, these are designed to indicate links. What's funny about these is that the router advertises, back into the same network, transit to another router that is only "visible" to it on the same interface it's advertising this Transit LS on. These confused me for a bit, because if I take them into consideration everything would be just fine in terms of the timing of the LSAs.

Specifically for the above, I'm only considering LS Updates sent *from* 172.31.255.3, and ONLY if they contain an LS ID of 192.168.32.3 (its related loopback address). Each offset below is relative to the previous line:

At time 0: an LS Update advertising the ROUTE to its loopback.
+3548: an LS Update indicating it has transit to 172.31.255.2 (the DR, and ONLY to this router).
+685: a Route LS Update again, +4233s from the previous Route LS Update (the 4233 from above); 4233-3600=633 calculated, which roughly correlates with the measured drop-out time of 615 seconds.
+3480: the same transit LS Update.
+659: a Route LS Update again, +4139s from the previous Route LS Update; 4139-3600=539, measured with ip route monitor logs at exactly 9 minutes, or 540s.
+3491: transit LS Update.
+683: Route LS Update, +4174s from the previous Route LS Update; 4174-3600=574, measured at 575s.
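
For anyone wanting to redo the arithmetic above, here is a minimal bash sketch. It assumes the drop-out is simply the gap between two Route LS Updates minus the 3600s max age; the helper name expected_outage is made up for illustration, and the intervals and measured drop-outs are the ones listed above.

```shell
#!/bin/bash
# Max age from the post: an LSA not refreshed within this many seconds
# is dropped by the other routers.
MAX_AGE=3600

# expected_outage INTERVAL (hypothetical helper): seconds the route is
# missing if the LSA is only re-sent INTERVAL seconds after the last one.
expected_outage() {
	local interval=$1
	if (( interval > MAX_AGE )); then
		echo $(( interval - MAX_AGE ))
	else
		echo 0
	fi
}

# interval:measured pairs taken from the capture described above.
for pair in 4233:615 4139:540 4174:575; do
	interval=${pair%%:*}
	measured=${pair##*:}
	echo "interval=${interval}s expected=$(expected_outage "$interval")s measured=${measured}s"
done
```

The expected and measured values agree to within about 20 seconds in all three cases, which is what points at a late refresh rather than packet loss.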

Suggestions on how to get a fix for this implemented? Should be a relatively simple timer issue. If I had access to the code I'd go digging ...
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Jun 09, 2022 12:22 pm

It looks like it's the reboot that solves it, so this is a time-based issue requiring either a reboot or (we'll test this going forward) an instance restart (as per my post on OSPF state issues).

How long it takes is still up for debate.

Currently after also rebooting the other MT router it seems both are stable. And neither one is a DR or BDR currently (that falls upon the two frr instances in the subnet currently).

Not going to mark this as solved since, frankly, we're waiting for this to start up again in a few weeks at most. Even just rebooting production routers every week doesn't seem like a sensible answer either.
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Aug 04, 2022 12:03 am

Just an update for everyone else bumping into this - an instance restart is not good enough, you have to reboot the entire router. As a result the only sensible course of action is to eliminate Mikrotik from our network setup for any but the simplest of use cases.
 
freshtechs
newbie
Posts: 29
Joined: Thu Nov 10, 2016 3:42 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Aug 04, 2022 7:08 am

I think OSPF is quite buggy in MikroTik. When you are doing pretty simple things it works, but I think it is not fully implemented. I have an open ticket with support because I noticed that ABRs leak all routes from the backbone to stub/NSSA areas if there is a reboot/instance restart involved. Essentially, routers in stub/NSSA areas all act like backbone/area 0 routers, and all networks are known on routers that shouldn't know them.
 
AlexDF
Trainer
Posts: 16
Joined: Tue Jan 09, 2018 11:39 pm

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Mon Aug 08, 2022 8:46 am

We implemented many MK in mixed vendor environments (Aruba, Cisco, Fortigate, PaloAlto) and we haven't noticed any loopback route flapping on ospf (RouterOS version 6.48.6). Can you provide a configuration extract about ospf? Alex.
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Tue Aug 23, 2022 4:38 pm

AlexDF wrote:
> We implemented many MK in mixed vendor environments (Aruba, Cisco, Fortigate, PaloAlto) and we haven't noticed any loopback route flapping on ospf (RouterOS version 6.48.6). Can you provide a configuration extract about ospf? Alex.

/routing ospf instance
set [ find default=yes ] distribute-default=if-installed-as-type-1 redistribute-connected=\
    as-type-1 redistribute-static=as-type-1 router-id=a.b.c.d
/routing ospf interface
add interface=bonding-cisco.2-routing network-type=broadcast priority=0
/routing ospf network
add area=backbone network=172.31.0.0/16

bonding-cisco.2 has address 172.31.255.d/24 and is a tagged VLAN with id=2. "cisco" is named after the switch, which has no interface on vlan=2. Both MKs have identical config, with only a.b.c.d differing.

frr config (x2):
interface bond0.2
 ip ospf cost 10
 ip ospf priority 200
exit

router ospf
 ospf router-id a.b.c.d
 redistribute kernel metric-type 1 route-map iewc-permit-kernel
 redistribute connected metric-type 1 route-map iewc-filter-dynamic
 network 172.31.255.0/24 area 0
 area 0 authentication message-digest
exit

Not posting the route maps. Between the two FRR instances, the routes remain intact. Also, routes MK => FRR remain intact; it's only the routes back to the MKs that get dropped.

Hmm, could the network being /16 vs /24 affect things? I don't see HOW as this should only be used locally to determine which interfaces belong to which areas.
 
AlexDF
Trainer
Posts: 16
Joined: Tue Jan 09, 2018 11:39 pm

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Tue Aug 23, 2022 7:34 pm

Hi jkroon, the network mask /16 isn't a problem, it is only involved in selecting the interfaces for ospf.
I'm not a fan of redistribute connected; I prefer to include network IPs and declare interfaces as passive.
In previous post you reported route loss from mk to others, but now from others to mk. Can you clarify?
What about ospf debug log when route loss happens?
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Fri Sep 02, 2022 6:58 pm

AlexDF wrote:
> Hi jkroon, the network mask /16 isn't a problem, it is only involved in selecting the interfaces for ospf.
> I'm not a fan of redistribute connected; I prefer to include network IPs and declare interfaces as passive.
> In previous post you reported route loss from mk to others, but now from others to mk. Can you clarify?
> What about ospf debug log when route loss happens?

Other way around. MK maintains routes to other hosts.

Other hosts lose routes to MK because they are not "refreshed" within the hour, as per the details in my previous post.

We specifically do not want to include other networks and mark as passive since these are USUALLY facing clients, and even a passive interface CAN join onto OSPF if another host on the segment starts advertising OSPF, which we have seen some of our clients do, so that's highly risky. redistribute-connected does what we need in that it distributes the directly attached, and allows us to proceed.

We are at this stage seriously contemplating static routing the loopbacks, and using iBGP to redistribute-connected. This will get tricky as we will need to implement a fair amount of additional BGP filtering which isn't currently required.

I'll re-setup syslog to another host again ... we've moved the physical host where this was sent to but I think this warrants it.
 
AlexDF
Trainer
Posts: 16
Joined: Tue Jan 09, 2018 11:39 pm

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Fri Sep 02, 2022 7:21 pm

Hi Jkroon

I'm quite sure that a passive interface cannot join with active ospf neighbors in the same lan segment nor processes any received Hellos, so it is safe to implement this way instead of redistribute connected.

Anyway, can you make a test including loopback ip numbering as ospf network? Obviously there aren't any neighbors on loopback segment.

Are you sure there aren't any loopback ip or router id overlapping on the network?

Alex
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Sun Sep 04, 2022 6:29 pm

AlexDF wrote:
> I'm quite sure that a passive interface cannot join with active ospf neighbors in the same lan segment nor processes any received Hellos, so it is safe to implement this way instead of redistribute connected.

To the best of my knowledge, you're wrong on this. Passive just says it won't actively send Hellos, but it will most definitely join the network if others send Hellos. We use this "feature" elsewhere in our network.

AlexDF wrote:
> Anyway, can you make a test including loopback ip numbering as ospf network? Obviously there aren't any neighbors on loopback segment.

The loopback interface is safe, since no one can broadcast into it. Yes, sure, we can test that, although I'm very certain this won't make a difference. We will still need redistribute connected to re-advertise a bunch of other /29s, but if it turns out that the loopback stays up, then sure, we can implement that for other segments after taking security into consideration.

We don't even need to explicitly make the loopback passive; it is so by default because no interfaces are associated with the bridge.

AlexDF wrote:
> Are you sure there aren't any loopback ip or router id overlapping on the network?

If you mean am I sure there aren't duplicate router-ids on the network, then yes, I'm very sure. Ditto for duplicate IP networks, so all routes terminate in a distinct L2 domain, whether that's a loopback domain or an ethernet segment (VLANs in many/most cases).
 
AlexDF
Trainer
Posts: 16
Joined: Tue Jan 09, 2018 11:39 pm

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Tue Sep 06, 2022 5:25 pm

Sorry but I'm quite sure that a passive interface cannot join any adjacency nor can receive any route from other ospf routers.

As per Mikrotik wiki:

"passive (yes | no; Default: no) if enabled, do not send or receive OSPF traffic on this interface"

To be sure, I recreated in lab two RouterOS devices, connected via ethernet interface; each one has one loopback interface having one ip, the same as router id; both have an ospf adjacency via ethernet interface.

When both routers have the ethernet interface "active" in OSPF, the loopback IPs are exchanged with each other, but when you mark the ethernet interface as passive on just one of the two routers, both lose adjacency and both lose the loopback IP route of the other.

It is confirmed that passive means that inbound ospf traffic is discarded, as confirmed even in other brands manuals.

Alex
Last edited by AlexDF on Mon Sep 12, 2022 10:34 pm, edited 1 time in total.
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Mon Sep 12, 2022 10:12 pm

AlexDF wrote:
> Sorry but I'm quite sure that a passive interface cannot join any adjacency nor can receive any route from other ospf routers.
>
> As per Mikrotik wiki:
>
> "passive (yes | no; Default: no) if enabled, do not send or receive OSPF traffic on this interface"

OK, I'll retest and verify. Thanks.
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Tue Sep 13, 2022 2:46 am

I just wanted to say that I agree completely with @AlexDF. I find OSPFv2 pretty bulletproof on MikroTik when advertising everything through OSPF->Networks and having redistribution turned off. And, if you previously had an interface set as passive and you had a neighbor relationship formed anyway, then something was really wrong as this should not happen.

Something I would suggest is adding interface "all" set to passive, this causes the default to be passive except for interfaces that are specifically defined as not passive.
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Sep 15, 2022 1:11 am

AlexDF wrote:
> Sorry but I'm quite sure that a passive interface cannot join any adjacency nor can receive any route from other ospf routers.
>
> As per Mikrotik wiki:
>
> "passive (yes | no; Default: no) if enabled, do not send or receive OSPF traffic on this interface"

jkroon wrote:
> OK, I'll retest and verify. Thanks.

So I can confirm your understanding. The same holds for other router brands; it seems most operate this way. We dug into the passive thing, and we can't reproduce what we previously saw with the passive interface. We speculate NBMA could have been involved, to transit properly over a wireless bridge which did not deal well with multicast traffic.

We've now proceeded to rather redistribute connected and static via iBGP (which we would have preferred to avoid), and redistribute connected and static is now disabled on the OSPF side. The connected routes we can handle using the passive interface strategy, but this won't work for static (which unfortunately we need since some peers refuse to speak eBGP and insist on static routing).

mducharme wrote:
> I just wanted to say that I agree completely with @AlexDF. I find OSPFv2 pretty bulletproof on MikroTik when advertising everything through OSPF->Networks and having redistribution turned off. And, if you previously had an interface set as passive and you had a neighbor relationship formed anyway, then something was really wrong as this should not happen.
>
> Something I would suggest is adding interface "all" set to passive, this causes the default to be passive except for interfaces that are specifically defined as not passive.

Thanks for the latter tip. This is useful.

This is not our first rodeo with highly unstable OSPFv2 on Mikrotik. Even with a p2p "ethernet" link (i.e. OSPFv2 on a /30 routing-only network) we've actually cooked up scripts that detected the bad OSPF state and rectified it automatically. Unfortunately those checks simply do not work in this case. That was a completely different network, and one of the companies we consult with settled on a scheduled "reboot once a week" at 2am for all routerboards in their network to mitigate the same issue. Unfortunately, when you've got routers in core positions, doing things like that isn't an option. Even version upgrades are carefully checked to see if they're really required, as scheduling even 5 minutes of downtime for a reboot on these routers is problematic, no matter the time of night. And to re-fill the BGP tables takes quite a while longer than that ... wish mikrotik would just incorporate frr into routeros and move on.
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Sep 15, 2022 5:58 am

jkroon wrote:
> And to re-fill the BGP tables takes quite a while longer than that ... wish mikrotik would just incorporate frr into routeros and move on.

You're comparing RouterOS v6 with other platforms, rather than RouterOS v7, which has the entire routing stack, including all protocols, rewritten from the ground up. You'll find most of these issues are gone in v7, although there are of course new bugs they are working through since the stack is new.
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Sep 15, 2022 8:20 am

mducharme wrote:
> jkroon wrote:
> > And to re-fill the BGP tables takes quite a while longer than that ... wish mikrotik would just incorporate frr into routeros and move on.
>
> You're comparing RouterOS v6 with other platforms, rather than RouterOS v7, which has the entire routing stack, including all protocols, rewritten from the ground up. You'll find most of these issues are gone in v7, although there are of course new bugs they are working through since the stack is new.

This is true. And we are procuring two new RouterOS 7 routers now (existing ones are to be moved to a new POP). We've just not had the courage to move to RouterOS v7 en masse yet. We will over the next two to four weeks start with procurement and start the process of migrating services and peerings one by one over to the two new CCR routers. We considered using something else instead, but decided to stick with Mikrotik for the time being.

In terms of stability with the new configuration:
> /routing ospf export 
# sep/15/2022 07:14:23 by RouterOS 6.49.6
# software id = 22X0-517R
#
# model = CCR1016-12S-1S+
/routing ospf instance
set [ find default=yes ] distribute-default=if-installed-as-type-1 router-id=loop_ip
/routing ospf interface
add interface=bonding-cisco.2-routing network-type=broadcast priority=0
/routing ospf network
add area=backbone network=172.31.0.0/16
add area=backbone network=loop_ip/32
We've not seen any improvement:
2022-09-15 00:20:16: ospf
2022-09-15 01:51:01: bgp
2022-09-15 02:01:01: ospf
2022-09-15 03:37:01: bgp
2022-09-15 04:31:01: ospf
2022-09-15 05:31:01: bgp
2022-09-15 06:01:01: ospf
2022-09-15 06:17:01: bgp
2022-09-15 06:18:02: ospf
Code used to track this:
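#! /bin/bash
# Append a timestamped line whenever the protocol that installed the route to
# a loopback changes (e.g. ospf -> bgp). One hidden state file per loopback IP.

loop_ips=(... ... ...)

for ip in "${loop_ips[@]}"; do
	# Protocol currently providing the /32 route (empty if there is no route).
	cproto=$(ip ro sh "$ip/32" | sed -nre 's/^.* proto ([^ ]+)( .*)?$/\1/p')
	# Protocol recorded on the last line of the state file (empty on first run).
	lproto=$(tail -n1 ".$ip.proto" 2>/dev/null | sed -re 's/.* //')
	[[ "${cproto}" != "${lproto}" ]] && date "+%F %T: ${cproto}" >> ".$ip.proto"
done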
For now at least the loopbacks don't become unreachable, and whilst the solution as it stands CANNOT scale (which is a problem), at least we're operational.
 
AlexDF
Trainer
Posts: 16
Joined: Tue Jan 09, 2018 11:39 pm

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Sep 15, 2022 9:41 am

Anyway I can only confirm that RouterOS 6.48.6 in a mixed brand environment is ok. In our core network we have Mikrotik CCR devices speaking with Cisco, Aruba, Fortigate and others, and neighborship has been up for many days, so there should be other aspects to evaluate.

Are the ospf timers all equal between devices? For example we experienced an ospf flapping problem with FortiOS v7.0.6 using default (implicit) ospf timers, we needed to explicitly configure them.
Are there any intermediate switch devices (or switch logic inside ospf devices) which could limit multicast packets, such as rate limiting, igmp snooping, etc.?

Alex
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Sep 15, 2022 10:41 am

AlexDF wrote:
> Anyway I can only confirm that RouterOS 6.48.6 in a mixed brand environment is ok. In our core network we have Mikrotik CCR devices speaking with Cisco, Aruba, Fortigate and others, and neighborship has been up for many days, so there should be other aspects to evaluate.

Neighborship never seems to reset ... the problem is only with the LSA refreshes that maintain the routes.

AlexDF wrote:
> Are the ospf timers all equal between devices? For example we experienced an ospf flapping problem with FortiOS v7.0.6 using default (implicit) ospf timers, we needed to explicitly configure them.
> Are there any intermediate switch devices (or switch logic inside ospf devices) which could limit multicast packets, such as rate limiting, igmp snooping, etc.?

No explicit timers are configured; all rely on the default (standard-specified) timers. I've investigated this before, and the only timers you can really mess with are the hello timers, and these would cause a failure to establish adjacency if mismatched, like MTU. Just to clarify, both FRR instances:
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
And both Mikrotiks:
  retransmit-interval=5s transmit-delay=1s hello-interval=10s dead-interval=40s
The transmit-delay and wait timers seem to be different things based on the documentation, and don't influence the protocol. Wait time causes FRR to wait for this time before recalculating (https://docs.frrouting.org/en/latest/ospfd.html), whereas Mikrotik merely states "link state transmit delay is the estimated time it takes to transmit a link state update packet on the interface", which really doesn't say anything. The standard dictates that LSAs are valid for 3600s and must be refreshed every 1800s, and the retransmit-interval must be used to obtain ACKs upon transmission. In other words, Mikrotik must send (and receive ACKs) for LSAs every half an hour. It doesn't do this. Hoping RoS7 will fix this, as the BGP setup is now beyond ugly and we'd really like to clean that up again.
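
To make the 1800s/3600s rule concrete, here is a small bash sketch that labels the gap between two originations of the same LSA. The two constants are the LSRefreshTime and MaxAge values from RFC 2328; the helper name classify_refresh and the three labels are my own, purely for illustration, and a real check would want a few seconds of tolerance around 1800s.

```shell
#!/bin/bash
# Architectural constants from RFC 2328 (seconds).
LS_REFRESH_TIME=1800   # the originator should re-originate its LSA this often
MAX_AGE=3600           # an LSA that reaches this age is flushed by everyone

# classify_refresh INTERVAL (hypothetical helper): label the gap between two
# originations of the same LSA.
classify_refresh() {
	local interval=$1
	if (( interval <= LS_REFRESH_TIME )); then
		echo "on-time"
	elif (( interval < MAX_AGE )); then
		echo "late"
	else
		echo "expired"
	fi
}

# 1800 is the expected refresh; the other three are the captured intervals.
for interval in 1800 4174 4139 4233; do
	echo "${interval}s: $(classify_refresh "$interval")"
done
```

All three captured intervals land in the "expired" bucket, i.e. the other routers had already aged the LSA out before the refresh arrived.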

In terms of rate limiting of broadcast and multicast, this has now been set (last night) to rate limit at 1% of interface capacity in order to mitigate problems with an external peer (not under our direct control) that was messing around and causing problems.

1% doesn't sound like a lot until I mention all ports involved in this setup are 10G, so that's 100Mbps of broadcast and another 100Mbps of multicast traffic allowed, per port. Most of these are LACP into vPCs on a CISCO Nexus pair, so up to 20G potentially, but I believe broadcast traffic will be hashed into one or the other links based on the LACP egress hash policy.

I believe I did post the MT logs previously as well showing that it doesn't send the LSA refresh. Adjacency is properly maintained at all times.
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Sep 15, 2022 3:00 pm

jkroon wrote:
> In other words, Mikrotik must send (and receive ACKs) for LSAs every half an hour. It doesn't do this.

Hi,

I don't understand this. The reason this doesn't make sense to me is the only issue that we have with OSPFv2 on RouterOS is that, in certain circumstances for us, it goes into "full" state prematurely, before it has received all LSA's. Sometimes it receives very few LSAs and the routing table is mostly empty, with many things unreachable. In this circumstance, it takes exactly half an hour before the routing table becomes complete and all the missing routes appear, with none missing. If it was true that MikroTik did not send LSA's every half an hour, I do not understand why half an hour is the "magic" period for us where suddenly the router has all routes, and is fine thereafter.
 
AlexDF
Trainer
Posts: 16
Joined: Tue Jan 09, 2018 11:39 pm

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Fri Sep 16, 2022 7:20 pm

I recreated again a small lab connecting two RouterOS devices (v.6.48.6) as before to be sure, connecting the devices via ethernet interface; each one has one loopback interface having one ip, the same as router id; both have an ospf adjacency via ethernet interface. Then I captured ethernet traffic between them for 9 hours.

I confirm that:

- they don't lose adjacency, due to hello packets sent regularly every 10s;
- they don't lose remote routes to loopback, due to LSA Updates sent regularly every 30 minutes & confirmed from opposite device via LSA ACKs

So it is confirmed that there is no buggy implementation of the ospf protocol itself, but there is something specific in your scenario (devices or configurations).

If available, can you try to interconnect a spare CCR to a spare FRR to reproduce a minimal scenario?
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Tue Oct 25, 2022 2:16 pm

AlexDF wrote:
> I recreated again a small lab connecting two RouterOS devices (v.6.48.6) as before to be sure, connecting the devices via ethernet interface; each one has one loopback interface having one ip, the same as router id; both have an ospf adjacency via ethernet interface. Then I captured ethernet traffic between them for 9 hours.
>
> I confirm that:
>
> - they don't lose adjacency, due to hello packets sent regularly every 10s;
> - they don't lose remote routes to loopback, due to LSA Updates sent regularly every 30 minutes & confirmed from opposite device via LSA ACKs
>
> So it is confirmed that there is no buggy implementation of the ospf protocol itself, but there is something specific in your scenario (devices or configurations).
>
> If available, can you try to interconnect a spare CCR to a spare FRR to reproduce a minimal scenario?

I'll see if I can set something up. Sorry for the radio silence, had to deal with some other issues.
 
jkroon
Frequent Visitor
Topic Author
Posts: 58
Joined: Thu Apr 03, 2008 2:18 am

Re: OSPF "drop-outs" (routes originated from MT gets dropped from the rest of the network)

Thu Aug 03, 2023 10:21 pm

This went away with RouterOS 7.

However, in a VLAN with a combination of v6 and v7, we still periodically see the routes towards v6 go away for short periods. Can't get a full tcpdump in that VLAN.

We don't want to upgrade those v6 routers to v7 until a few other critical issues in v7 have been resolved (as per other reports I've made in these forums, and have tried to address directly with support@mikrotik.com).
