Database description packet has different master status flag

Hi guys,
we have started to deploy OSPF across our whole network (around 40 routers). We use only MikroTik routers, all running ROS 3.10 (except 3 routers on 3.24). On some routers we have problems with OSPF routing: the routers can’t form an adjacency. The situation is always the same: one router thinks it has full adjacency (call it A), while the second router is stuck in the Exchange state (call it B). After a few seconds A writes to the log (Database description packet has different master status flag, new master flag = false) and B writes to the log (OSPF neighbour XX.XX.XX.XX: state from exchange to 2-way). The logs on both routers are full of these messages. Sometimes the problem disappears after a while and both routers form an adjacency. We had to use static routes, otherwise this link would be unusable: when it happens there is no adjacency, which means no routes.
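
For context on the log message: in OSPF (RFC 2328) two neighbours negotiate a master/slave relationship in ExStart before they exchange Database Description packets, and the router with the higher Router ID becomes master. A Database Description packet whose MS (master) bit contradicts the already negotiated roles is rejected, which is the kind of inconsistency the message on router A appears to report. Below is a minimal Python sketch of that rule only. The function names are made up for illustration, the Router IDs are the ones given later in this thread, and this is not MikroTik's code.

```python
import ipaddress


def elect_master(router_id_a: str, router_id_b: str) -> str:
    """Return the Router ID that becomes master: the numerically higher one."""
    a = int(ipaddress.IPv4Address(router_id_a))
    b = int(ipaddress.IPv4Address(router_id_b))
    return router_id_a if a > b else router_id_b


def ms_bit_consistent(local_id: str, neighbor_id: str, received_ms_bit: bool) -> bool:
    """Check the MS bit of a received Database Description packet.

    The neighbour may set the MS bit only if it is the master of the exchange.
    """
    neighbor_is_master = elect_master(local_id, neighbor_id) == neighbor_id
    return received_ms_bit == neighbor_is_master


# Router IDs taken from the thread (hypothetical usage).
print(elect_master("10.11.0.48", "10.11.0.56"))             # 10.11.0.56
# 10.11.0.56 is master, so a packet from 10.11.0.48 claiming MS=1 is inconsistent.
print(ms_bit_consistent("10.11.0.56", "10.11.0.48", True))  # False
```

If one side restarts the negotiation while the other still believes the previous exchange is in progress, the two routers can disagree about this bit, which would fit the symptom of one router showing Full and the other bouncing out of Exchange.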

We have the same problem on several routers. These routers have several links, and it usually affects only one of them. There is a simple workaround when it happens: I disable the affected network in the OSPF routing menu on both routers and then enable it again. One can call it a solution, but the network is then no longer autonomous. I can even reproduce the error on the affected routers: disable the given network in the OSPF menu on one router and then re-enable it. If you disable it on both routers simultaneously and then enable both networks, the error doesn’t occur.

I did some investigation with Wireshark. Attached are screenshots from the good situation (network 10.250.0.48/29 was enabled in the OSPF menu on both routers at almost the same time) and from the bad situation. I have also enclosed traffic dumps from both situations (pcap format; sorry for the .rsc file extensions, but I couldn’t upload them to the forum otherwise).

According to the screenshots, after a few database exchange packets router 10.250.0.49 sends a multicast packet to the DR multicast address (224.0.0.6) in the good situation, but this doesn’t happen in the bad situation. I suppose it has to be a bug in the OSPF implementation.
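
To put the 224.0.0.6 observation in context: on a broadcast network, RFC 2328 has routers that are neither DR nor BDR flood LS Updates and send delayed LS Acks to AllDRouters (224.0.0.6), while Database Description and LS Request packets are unicast to the neighbour. A packet to 224.0.0.6 therefore normally shows up only once the database exchange has progressed, so its absence in the bad capture would be consistent with the exchange never completing. A minimal Python sketch of the destination rules, with a made-up function name and simplified packet type labels:

```python
ALL_SPF_ROUTERS = "224.0.0.5"  # every OSPF router listens here
ALL_D_ROUTERS = "224.0.0.6"    # only the DR and BDR listen here


def destination(packet_type: str, sender_is_dr_or_bdr: bool, neighbor_ip: str) -> str:
    """Destination address of an OSPF packet on a broadcast network.

    Simplified: retransmitted LS Updates and direct LS Acks, which are
    unicast, are not modelled.
    """
    if packet_type == "hello":
        return ALL_SPF_ROUTERS
    if packet_type in ("dbd", "ls_request"):
        return neighbor_ip  # exchanged one-to-one with the neighbour
    if packet_type in ("ls_update", "ls_ack"):
        return ALL_SPF_ROUTERS if sender_is_dr_or_bdr else ALL_D_ROUTERS
    raise ValueError(f"unknown packet type: {packet_type}")


# A non-DR router flooding an LS Update after the exchange:
print(destination("ls_update", False, "10.250.0.50"))  # 224.0.0.6
```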

Do you have any suggestions for a solution?
ok_dump.rsc (8.93 KB)
bad.png
good.png

There was a limit of 3 attachments, so I had to add a reply with the traffic dump from the bad situation.
bad_dump.rsc (6.03 KB)

Upgrade to the latest version, then check. Maybe you should use the routing-test package instead; it has some more fixes.

I have just tried MikroTik 3.28 on both ends of the link and the situation is exactly the same. When I enable the network (10.250.0.48/29) on both ends of the link (10.250.0.49 and 10.250.0.50; it is a PtP wireless link with only these two IP addresses) at almost the same time, it works and the routers form an adjacency. If I then disable the network (in the OSPF menu) and enable it again on one of the routers, they don’t form a full adjacency.

We use very simple OSPF settings. It’s almost the default configuration: I only set the router-id (10.11.0.48 and 10.11.0.56) and added the network in the OSPF network menu.

I am experiencing exactly the same problem with RouterOS 5.5.
The routers are an RB1200 and an x86.

Is there any usable solution for this issue?

I have the same problem with MikroTik version 5.6 and an RB750G on each end of the link. Is there any solution to this?

Or any explanation for that debug output, since the MTU sizes on both routers match and multicast is allowed?
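
On the MTU point: as far as RFC 2328 describes it, the check is not strict equality. A Database Description packet is rejected only when the Interface MTU it advertises is larger than what the receiving interface can accept without fragmentation. A one-function Python sketch of that check, with a hypothetical function name and example values that are not taken from this thread:

```python
def dbd_mtu_acceptable(received_interface_mtu: int, local_interface_mtu: int) -> bool:
    """True if a Database Description packet passes the Interface MTU check."""
    return received_interface_mtu <= local_interface_mtu


print(dbd_mtu_acceptable(1500, 1500))  # True
print(dbd_mtu_acceptable(1600, 1500))  # False
```

With matching MTUs on both ends the check passes in both directions, so an MTU mismatch would not explain the master flag message.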

I am having this same problem on RouterOS v5.5 with x86, mipsbe (RB750) and power-pc (RB1100) platforms.

Same error messages in the log files as the entries at the top. If I disable the instance and re-enable, the adjacencies are re-formed and it trudges on for about a week and then fails again.

Annoying and customer-un-friendly.

I am having this same problem on RouterOS v5.7 with x86, mipsbe (RB750) platforms.

Guys, do you use any bonding (EtherChannels) on these links by chance?

Hi!

same in 5.5 with ppc and 5.7 with mipsbe.
We have no bonding-Interfaces.
only a simple config

regards

It can be a problem with multicast. I had a similar problem with an unreliable radio link and also with a switch with incorrectly configured multicast traffic control.

nbma mode should be used on wireless links.

usually default mode works well, unless the radio link is not reliable.

Has anyone found a solution for this issue? We’re experiencing it more and more often; it seems to me that a flapping link is likely to trigger the problem… that’s a real no-go issue for us.
I tried using nbma too, but it doesn’t seem to work well with many routers on the same segment (10+).


Regards,
Simone.

Post your config. This thread is 2+ years old and is probably not related to your problem.

This is the AP:

/routing ospf instance
set default disabled=no distribute-default=never in-filter=ospf-in metric-bgp=auto metric-connected=20 metric-default=1 metric-other-ospf=\
    auto metric-rip=20 metric-static=20 name=default out-filter=ospf-out redistribute-bgp=no redistribute-connected=no \
    redistribute-other-ospf=no redistribute-rip=no redistribute-static=no router-id=172.31.0.1
/routing ospf area
set backbone area-id=0.0.0.0 disabled=no instance=default name=backbone type=default
add area-id=0.0.0.25 disabled=no instance=default name=area25 type=default
/routing ospf interface
add authentication=none authentication-key="" authentication-key-id=1 cost=10 dead-interval=40s disabled=no hello-interval=10s instance-id=0 \
    interface=all network-type=broadcast passive=yes priority=1 retransmit-interval=5s transmit-delay=1s use-bfd=no
add authentication=none authentication-key="" authentication-key-id=1 cost=10 dead-interval=40s disabled=no hello-interval=10s instance-id=0 \
    interface=wlan1 network-type=broadcast passive=no priority=10 retransmit-interval=5s transmit-delay=1s use-bfd=no
add authentication=none authentication-key="" authentication-key-id=1 cost=1 dead-interval=40s disabled=no hello-interval=10s instance-id=0 \
    interface=ether1 network-type=default passive=no priority=1 retransmit-interval=5s transmit-delay=1s use-bfd=no
/routing ospf network
add area=backbone disabled=no network=172.31.0.1/32
add area=area25 disabled=no network=172.17.0.0/24
add area=backbone disabled=no network=192.168.3.0/24

And a client

/routing ospf instance
set default disabled=no distribute-default=never in-filter=ospf-in metric-bgp=auto metric-connected=20 metric-default=1 metric-other-ospf=\
    auto metric-rip=20 metric-static=20 name=default out-filter=ospf-out redistribute-bgp=no redistribute-connected=no \
    redistribute-other-ospf=no redistribute-rip=no redistribute-static=no router-id=172.31.1.1
/routing ospf area
set backbone area-id=0.0.0.0 disabled=no instance=default name=backbone type=default
add area-id=0.0.0.25 disabled=no instance=default name=area25 type=default
/routing ospf interface
add authentication=none authentication-key="" authentication-key-id=1 cost=10 dead-interval=40s disabled=no hello-interval=10s instance-id=0 \
    interface=all network-type=broadcast passive=yes priority=1 retransmit-interval=5s transmit-delay=1s use-bfd=no
add authentication=none authentication-key="" authentication-key-id=1 cost=10 dead-interval=40s disabled=no hello-interval=10s instance-id=0 \
    interface=wlan1 network-type=broadcast passive=no priority=0 retransmit-interval=5s transmit-delay=1s use-bfd=no
add authentication=none authentication-key="" authentication-key-id=1 cost=10 dead-interval=40s disabled=no hello-interval=10s instance-id=0 \
    interface=ether1 network-type=default passive=no priority=1 retransmit-interval=5s transmit-delay=1s use-bfd=no
/routing ospf network
add area=area25 disabled=no network=172.17.0.0/24
add area=area25 disabled=no network=172.16.254.0/24
add area=area25 disabled=no network=172.31.1.1/32

Pretty much a default config, but not very stable. I tried nbma, but it got even worse: when rebooting the AP, for example, adjacencies often fail to form, which is really weird.
My feeling is that OSPF was somehow broken a few releases ago (we’re currently trying 5.9, same problems); another tower running 5.4 doesn’t exhibit as many problems (but nbma fails to work correctly there, too).

BR,
Simone
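
A note on the priorities in the config above: wlan1 has priority=10 on the AP and priority=0 on the client, and in OSPF a router with priority 0 is never eligible to become DR on that segment. The Python sketch below shows a simplified first election under that rule (highest priority wins, ties broken by the higher Router ID). It is only an illustration with a made-up function name: the real election in RFC 2328 also picks a BDR and does not preempt an already elected DR.

```python
import ipaddress
from typing import Dict, Optional


def elect_dr(candidates: Dict[str, int]) -> Optional[str]:
    """Pick a DR from a mapping of Router ID -> interface priority.

    Routers with priority 0 are ineligible. Returns None if nobody is eligible.
    """
    eligible = {rid: prio for rid, prio in candidates.items() if prio > 0}
    if not eligible:
        return None
    return max(
        eligible,
        key=lambda rid: (eligible[rid], int(ipaddress.IPv4Address(rid))),
    )


# Router IDs and wlan1 priorities from the two configs above.
print(elect_dr({"172.31.0.1": 10, "172.31.1.1": 0}))  # 172.31.0.1
```

With this setup the AP is the only possible DR on wlan1, so every client depends on a clean adjacency with the AP after each link flap or AP reboot.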

Same problem, tested with versions 5.6, 5.7, 5.8 and now 5.9 on all routers that participate in the OSPF exchange.
We are using nbma everywhere, with priority and cost set up correctly.
We have two networks on a physical local switch and one wireless link between them, a really stable link: up to 130 Mbps, 2 ms latency, no packet loss, no drops, no problems.
Only one router doesn’t show the “state change from Exchange to 2-Way” or the “Database Description packet has different master status flag new master flag false” message; all the others exhibit that error in the log.

The only solution we have found so far that lets us keep OSPF for our backbone redundancy is to run RIP on top of it with a limited set of network subnets.

Is there any chance a MikroTik guru will find a solution? Or is this a bug?

Thank you
Simon

Hi, unfortunately we found the mikrotik ospf implementation to be quite buggy…and unable to find any workaround so far.

BR,
Simone.

“Hi, unfortunately we found the mikrotik ospf implementation to be quite buggy…and unable to find any workaround so far.”

Such a statement needs more explanation.

Normis,

sadly the above statement is very true.
Just read all the related postings all over the forum.

I’ve been a MikroTik user for at least 10 years now, having started way before PPPoE was implemented, when RouterOS was still called Mikrotik, but lately I’m getting really tired of this kind of attitude.

Why post at all, if all you add is just another snippy comment?

And while I’m at it, how is it that I feel like a beta tester every time a new 5.x “stable” is released?
I have truly seen it done way better some years ago.

/rant over

saludos
Bernardo