MNDP problem - one-sided neighbor relationship, no MNDP packets received

Hello,

I recently found that two of our MikroTiks are not forming a proper neighbor relationship. The first one (NEAR) is sending and receiving MNDP packets, but the second one (FAR) is only sending them: it is either not receiving the packets sent from NEAR, or it is not acknowledging them.

To illustrate, here is a 300-second packet sniff on NEAR:

[me@NEAR] > tool sniffer quick interface=wlan3 duration=300 mac-address=FF:FF:FF:FF:FF:FF
INTERFACE                                                                         TIME    NUM DIR SRC-MAC           DST-MAC           VLAN   SRC-ADDRESS                         DST-ADDRESS                         PROTOCOL   SIZE CPU FP
wlan3                                                                           36.952      1 <-  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan3                                                                           40.705      2 ->  00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF        10.10.10.100:35675                  255.255.255.255:5678 (discovery)    ip:udp      143   0 no
wlan3                                                                           96.953      3 <-  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan3                                                                          100.705      4 ->  00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF        10.10.10.100:35675                  255.255.255.255:5678 (discovery)    ip:udp      143   0 no
wlan3                                                                          156.954      5 <-  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan3                                                                          160.706      6 ->  00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF        10.10.10.100:35675                  255.255.255.255:5678 (discovery)    ip:udp      143   0 no
wlan3                                                                          216.949      7 <-  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan3                                                                          220.703      8 ->  00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF        10.10.10.100:35675                  255.255.255.255:5678 (discovery)    ip:udp      143   0 no
wlan3                                                                           276.93      9 <-  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan3                                                                          280.705     10 ->  00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF        10.10.10.100:35675                  255.255.255.255:5678 (discovery)    ip:udp      143   0 no

And here is a 300-second packet sniff on FAR:

[me@FAR] > tool sniffer quick interface=wlan1 duration=300 mac-address=FF:FF:FF:FF:FF:FF
INTERFACE                                                                         TIME    NUM DIR SRC-MAC           DST-MAC           VLAN   SRC-ADDRESS                         DST-ADDRESS                         PROTOCOL   SIZE CPU FP
wlan1                                                                           34.429      1 ->  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan1                                                                           94.429      2 ->  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan1                                                                          154.426      3 ->  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan1                                                                           214.43      4 ->  00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF        10.10.10.200:35640                  255.255.255.255:5678 (discovery)    ip:udp      145   0 no
wlan1

NEAR can see FAR’s packets, but FAR cannot see NEAR’s packets.
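
The same comparison can be made mechanically by tallying the DIR column per source MAC. This is only a minimal sketch: it assumes "<-" marks a received frame and "->" a transmitted one, the function name tally_directions is made up for illustration, and the sample rows are the first two from each capture above with the column padding collapsed.

```python
from collections import Counter


def tally_directions(sniffer_output):
    """Count sniffer rows per (interface, direction, source MAC)."""
    counts = Counter()
    for line in sniffer_output.splitlines():
        fields = line.split()
        # Skip the header row, blank lines and truncated rows.
        if len(fields) < 5 or fields[3] not in ("<-", "->"):
            continue
        # Assumption: "<-" is a received frame, "->" a transmitted one.
        direction = "rx" if fields[3] == "<-" else "tx"
        counts[(fields[0], direction, fields[4])] += 1
    return counts


# First two rows of each capture, whitespace collapsed.
near_capture = """\
wlan3 36.952 1 <- 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan3 40.705 2 -> 00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF 10.10.10.100:35675 255.255.255.255:5678 (discovery) ip:udp 143 0 no
"""
far_capture = """\
wlan1 34.429 1 -> 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan1 94.429 2 -> 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
"""

for name, capture in (("NEAR", near_capture), ("FAR", far_capture)):
    for key, count in sorted(tally_directions(capture).items()):
        print(name, *key, count)
```

On the full captures, NEAR should show both rx and tx rows on wlan3, while FAR should show only tx rows on wlan1.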

The connection between the two devices has been uninterrupted. There are no connectivity issues between them, and no bandwidth issues.

I think the problem must be at FAR’s end, since NEAR has other neighbors with which it has no problem communicating. Here are the neighbor tables:

NEAR:

[me@NEAR] > ip neighbor print
 # INTERFACE ADDRESS                                                                                              MAC-ADDRESS       IDENTITY   VERSION    BOARD
 0 ether1    10.2.3.4                                                                                             00:15:6D:DD:AA:AA pb5.obe... XM.v5.6.5
 1 ether1    10.4.5.6                                                                                             68:72:51:DD:AA:AA pb5.obe... XM.v5.6.5
 2 ether1    10.7.8.9                                                                                             00:0C:42:DD:AA:AA CORE       6.27       RB1100AH
 3 wlan2     10.10.11.12                                                                                          00:15:6D:AA:AA:AA WiDSLRo... XM.v5.6.9
 4 wlan2     10.13.14.15                                                                                          00:15:6D:EE:EE:EE WiDSLRo... XM.v5.6.9
 5 wlan2     10.17.18.19                                                                                          00:15:6D:BB:BB:BB WiDSLRo... XM.v5.6.1
 6 wlan3     10.10.20.200                                                                                         00:0C:42:31:5A:A1 FAR        6.37.1 ... RB433AH
 7 wds       10.20.21.22                                                                                          00:27:22:CC:CC:CC WiDSLRo... XM.v5.6.9

FAR:

[me@FAR] > ip neighbor print
 # INTERFACE ADDRESS                                                                                              MAC-ADDRESS       IDENTITY   VERSION    BOARD
[me@FAR] >

And here is the neighbor discovery configuration:

NEAR:

[me@NEAR] > ip neighbor export verbose
# nov/08/2016 12:01:41 by RouterOS 6.37.1
#
/ip neighbor discovery settings
set default=yes default-for-dynamic=no

FAR:

[me@FAR] > ip neighbor export verbose
# nov/08/2016 12:02:03 by RouterOS 6.37.1
#
/ip neighbor discovery settings
set default=yes default-for-dynamic=no

The non-dynamic interfaces on both devices should all be participating in neighbor discovery.

Both devices are running version: 6.37.1 (stable)
NEAR is a firmware-type: mpc8323, running on firmware version 2.18
FAR is a firmware-type: ar7100, running on firmware version 3.24
These are the most up-to-date versions of their respective firmwares.

The most significant result of this issue is that these two devices will not form a persistent OSPF adjacency. Every minute they form a Full adjacency, and after 40 seconds it is removed. This is because, just as with MNDP, NEAR sends and acknowledges OSPF Hello packets, but FAR only sends OSPF Hellos and never admits to receiving any:

NEAR:

12:21:02 echo: route,ospf,debug SEND: Hello 10.10.10.100 -> 224.0.0.5 on wlan3
12:21:02 echo: route,ospf,debug RECV: Hello <- 10.10.10.200 on wlan3 (10.10.10.100)
12:21:02 echo: route,ospf,debug   received options: E
12:21:12 echo: route,ospf,debug SEND: Hello 10.10.10.100 -> 224.0.0.5 on wlan3
12:21:12 echo: route,ospf,debug RECV: Hello <- 10.10.10.200 on wlan3 (10.10.10.100)
12:21:12 echo: route,ospf,debug   received options: E
12:21:22 echo: route,ospf,debug SEND: Hello 10.10.10.100 -> 224.0.0.5 on wlan3
12:21:22 echo: route,ospf,debug RECV: Hello <- 10.10.10.200 on wlan3 (10.10.10.100)
12:21:22 echo: route,ospf,debug   received options: E
12:21:32 echo: route,ospf,debug SEND: Hello 10.10.10.100 -> 224.0.0.5 on wlan3
12:21:32 echo: route,ospf,debug RECV: Hello <- 10.10.10.200 on wlan3 (10.10.10.100)
12:21:32 echo: route,ospf,debug   received options: E

FAR:

12:21:02 echo: route,ospf,debug SEND: Hello 10.10.10.200 -> 224.0.0.5 on wlan1
12:21:12 echo: route,ospf,debug SEND: Hello 10.10.10.200 -> 224.0.0.5 on wlan1
12:21:22 echo: route,ospf,debug SEND: Hello 10.10.10.200 -> 224.0.0.5 on wlan1
12:21:32 echo: route,ospf,debug SEND: Hello 10.10.10.200 -> 224.0.0.5 on wlan1
  (2 messages discarded)
12:21:34 echo: route,ospf,debug     area=RemoteNet
12:21:34 echo: route,ospf,debug Installing an LSA
12:21:34 echo: route,ospf,debug     lsa=Router LSA id=10.10.11.1 originator=10.10.11.1 seqnum=0x800036cd
12:21:34 echo: route,ospf,debug     old=Router LSA id=10.10.11.1 originator=10.10.11.1 seqnum=0x800036cc
12:21:34 echo: route,ospf,debug Flooding an LSA
12:21:34 echo: route,ospf,debug     lsa=Router LSA id=10.10.11.1 originator=10.10.11.1 seqnum=0x800036cd
12:21:34 echo: route,ospf,debug     area=RemoteNet
12:21:34 echo: route,ospf,debug Deleting an LSA
12:21:34 echo: route,ospf,debug     lsa=Router LSA id=10.10.11.1 originator=10.10.11.1 seqnum=0x800036cc
12:21:34 echo: route,ospf,debug wlan1 (10.10.10.200): interface event
12:21:34 echo: route,ospf,debug     event=OSPF_IFE_NEIGH_CHANGE
12:21:34 echo: route,ospf,debug     state=Point-to-Point
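
The 40-second lifetime would be consistent with an OSPF dead timer expiring on a router that hears no Hellos. Below is a rough sketch of that arithmetic. It assumes the hello interval is 10 seconds (which matches the Hello spacing in the logs) and the dead interval is 40 seconds (the usual default of four hello intervals, not confirmed from either router's configuration). The function names and the timestamps in the usage lines are hypothetical.

```python
HELLO_INTERVAL = 10  # seconds; matches the Hello spacing in the logs
DEAD_INTERVAL = 40   # seconds; assumed default of 4 x hello interval


def seconds(hms):
    """Convert an HH:MM:SS log timestamp to seconds since midnight."""
    hours, minutes, secs = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + secs


def neighbor_alive(now, last_hello_received, dead_interval=DEAD_INTERVAL):
    """True while fewer than dead_interval seconds have passed since
    the last Hello was received from the neighbor."""
    return seconds(now) - seconds(last_hello_received) < dead_interval


# Hypothetical timestamps: neighbor state last refreshed at 12:21:02.
print(neighbor_alive("12:21:32", "12:21:02"))  # 30 s elapsed
print(neighbor_alive("12:21:42", "12:21:02"))  # 40 s elapsed
```

With these assumed values, a neighbor that is never refreshed by a received Hello expires after four missed Hellos.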

Just to reiterate: there are no issues with user traffic, with my own SSH sessions, or with any other TCP or UDP protocols (that we’ve noticed). I am persistently connected to FAR through the wlan interfaces listed above, and we are currently using static routes as a workaround.

Both sides have had several reboots, and I’ve fiddled with interface settings and such to no avail.

Has anyone else come across something like this before? Does anyone have any ideas?

Addendum: I should also add that there are no firewall rules or NAT rules configured on either device.

Second Addendum: BFD seems to work fine between the two devices, until the OSPF neighbor adjacency goes down. NEAR and FAR swap BFD packets in both directions happily until OSPF fails, then the BFD neighbor relationship is removed.

So MNDP and OSPF packets only get through one way (from FAR to NEAR), but BFD is working fine in both directions.
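
One way to line the three protocols up is by the type of destination address they use. The sketch below classifies the destinations that appear in the captures and logs above. The MNDP and OSPF Hello destinations are taken from the output. The BFD destination is an assumption: it is not shown anywhere in this post, and the sketch simply supposes that BFD control packets go to the peer's own address. The function name destination_kind is made up for illustration.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")


def destination_kind(address):
    """Classify an IPv4 destination as broadcast, multicast or unicast."""
    ip = ipaddress.ip_address(address)
    if ip == LIMITED_BROADCAST:
        return "broadcast"
    if ip.is_multicast:
        return "multicast"
    return "unicast"


destinations = {
    "MNDP": "255.255.255.255",   # from the sniffer output
    "OSPF Hello": "224.0.0.5",   # from the OSPF debug log
    "BFD": "10.10.10.200",       # assumption: sent to the peer's address
}

for protocol, address in destinations.items():
    print(protocol, address, destination_kind(address))
```

If that assumption holds, the two protocols that fail toward FAR are the ones addressed to a broadcast or multicast destination, which may be a useful place to start looking.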