Hello,
I recently found that two of our MikroTiks are not forming a proper neighbor relationship. The first one (NEAR) is sending and receiving MNDP packets, but the second one (FAR) is only sending them: it is either not receiving the packets sent from NEAR or not acknowledging them.
To illustrate, here is a 300-second packet sniff on NEAR:
[me@NEAR] > tool sniffer quick interface=wlan3 duration=300 mac-address=FF:FF:FF:FF:FF:FF
INTERFACE TIME NUM DIR SRC-MAC DST-MAC VLAN SRC-ADDRESS DST-ADDRESS PROTOCOL SIZE CPU FP
wlan3 36.952 1 <- 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan3 40.705 2 -> 00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF 10.10.10.100:35675 255.255.255.255:5678 (discovery) ip:udp 143 0 no
wlan3 96.953 3 <- 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan3 100.705 4 -> 00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF 10.10.10.100:35675 255.255.255.255:5678 (discovery) ip:udp 143 0 no
wlan3 156.954 5 <- 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan3 160.706 6 -> 00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF 10.10.10.100:35675 255.255.255.255:5678 (discovery) ip:udp 143 0 no
wlan3 216.949 7 <- 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan3 220.703 8 -> 00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF 10.10.10.100:35675 255.255.255.255:5678 (discovery) ip:udp 143 0 no
wlan3 276.93 9 <- 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan3 280.705 10 -> 00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF 10.10.10.100:35675 255.255.255.255:5678 (discovery) ip:udp 143 0 no
And here is a 300-second packet sniff on FAR:
[me@FAR] > tool sniffer quick interface=wlan1 duration=300 mac-address=FF:FF:FF:FF:FF:FF
INTERFACE TIME NUM DIR SRC-MAC DST-MAC VLAN SRC-ADDRESS DST-ADDRESS PROTOCOL SIZE CPU FP
wlan1 34.429 1 -> 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan1 94.429 2 -> 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan1 154.426 3 -> 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan1 214.43 4 -> 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan1
NEAR can see FAR’s packets, but FAR cannot see NEAR’s packets.
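For anyone who wants to check the same thing on a longer capture, here is a minimal Python sketch that tallies sniffer rows by direction and source MAC. The function name `tally_directions` is made up for illustration, and it assumes the whitespace-separated column order shown in the output above (INTERFACE TIME NUM DIR SRC-MAC ...); the sample rows are copied from the two sniffs.

```python
from collections import Counter


def tally_directions(sniff_text):
    """Count sniffer rows per (direction, source MAC).

    Assumes the column order INTERFACE TIME NUM DIR SRC-MAC DST-MAC ...
    Header rows and truncated rows are skipped.
    """
    counts = Counter()
    for line in sniff_text.splitlines():
        fields = line.split()
        # A data row has an arrow in the fourth column.
        if len(fields) < 6 or fields[3] not in ("<-", "->"):
            continue
        counts[(fields[3], fields[4])] += 1
    return counts


# Two rows from the NEAR sniff: one received, one sent.
near_sniff = """\
INTERFACE TIME NUM DIR SRC-MAC DST-MAC VLAN SRC-ADDRESS DST-ADDRESS PROTOCOL SIZE CPU FP
wlan3 36.952 1 <- 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan3 40.705 2 -> 00:0C:42:23:1D:86 FF:FF:FF:FF:FF:FF 10.10.10.100:35675 255.255.255.255:5678 (discovery) ip:udp 143 0 no
"""

# Two rows from the FAR sniff: both sent, nothing received.
far_sniff = """\
wlan1 34.429 1 -> 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
wlan1 94.429 2 -> 00:0C:42:31:5A:A1 FF:FF:FF:FF:FF:FF 10.10.10.200:35640 255.255.255.255:5678 (discovery) ip:udp 145 0 no
"""

for name, text in (("NEAR", near_sniff), ("FAR", far_sniff)):
    for (direction, mac), n in sorted(tally_directions(text).items()):
        print(name, direction, mac, n)
```

On the FAR sample there is no `<-` entry at all, which is the asymmetry described above.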
The connection between the two devices has been uninterrupted. There are no connectivity issues between them, and no bandwidth issues.
I think the problem must be at FAR’s end, since NEAR has other neighbors with which it has no problem communicating. Here are the neighbor tables:
NEAR:
[me@NEAR] > ip neighbor print
# INTERFACE ADDRESS MAC-ADDRESS IDENTITY VERSION BOARD
0 ether1 10.2.3.4 00:15:6D:DD:AA:AA pb5.obe... XM.v5.6.5
1 ether1 10.4.5.6 68:72:51:DD:AA:AA pb5.obe... XM.v5.6.5
2 ether1 10.7.8.9 00:0C:42:DD:AA:AA CORE 6.27 RB1100AH
3 wlan2 10.10.11.12 00:15:6D:AA:AA:AA WiDSLRo... XM.v5.6.9
4 wlan2 10.13.14.15 00:15:6D:EE:EE:EE WiDSLRo... XM.v5.6.9
5 wlan2 10.17.18.19 00:15:6D:BB:BB:BB WiDSLRo... XM.v5.6.1
6 wlan3 10.10.20.200 00:0C:42:31:5A:A1 FAR 6.37.1 ... RB433AH
7 wds 10.20.21.22 00:27:22:CC:CC:CC WiDSLRo... XM.v5.6.9
FAR:
[me@FAR] > ip neighbor print
# INTERFACE ADDRESS MAC-ADDRESS IDENTITY VERSION BOARD
[me@FAR] >
And here is the neighbor discovery configuration:
NEAR:
[me@NEAR] > ip neighbor export verbose
# nov/08/2016 12:01:41 by RouterOS 6.37.1
#
/ip neighbor discovery settings
set default=yes default-for-dynamic=no
FAR:
[me@FAR] > ip neighbor export verbose
# nov/08/2016 12:02:03 by RouterOS 6.37.1
#
/ip neighbor discovery settings
set default=yes default-for-dynamic=no
The non-dynamic interfaces on both devices should all be participating in neighbor discovery.
Both devices are running RouterOS 6.37.1 (stable).
NEAR has firmware-type mpc8323 and is running firmware version 2.18.
FAR has firmware-type ar7100 and is running firmware version 3.24.
These are the most up-to-date versions of their respective firmwares.
The most significant result of this issue is that the two devices will not form a persistent OSPF adjacency. Every minute they form a Full adjacency, and 40 seconds later it is removed. This is because, just as with MNDP, NEAR sends and acknowledges OSPF Hello packets, but FAR only sends OSPF Hellos and never admits to receiving any:
NEAR:
12:21:02 echo: route,ospf,debug SEND: Hello 10.10.10.100 -> 224.0.0.5 on wlan3
12:21:02 echo: route,ospf,debug RECV: Hello <- 10.10.10.200 on wlan3 (10.10.10.100)
12:21:02 echo: route,ospf,debug received options: E
12:21:12 echo: route,ospf,debug SEND: Hello 10.10.10.100 -> 224.0.0.5 on wlan3
12:21:12 echo: route,ospf,debug RECV: Hello <- 10.10.10.200 on wlan3 (10.10.10.100)
12:21:12 echo: route,ospf,debug received options: E
12:21:22 echo: route,ospf,debug SEND: Hello 10.10.10.100 -> 224.0.0.5 on wlan3
12:21:22 echo: route,ospf,debug RECV: Hello <- 10.10.10.200 on wlan3 (10.10.10.100)
12:21:22 echo: route,ospf,debug received options: E
12:21:32 echo: route,ospf,debug SEND: Hello 10.10.10.100 -> 224.0.0.5 on wlan3
12:21:32 echo: route,ospf,debug RECV: Hello <- 10.10.10.200 on wlan3 (10.10.10.100)
12:21:32 echo: route,ospf,debug received options: E
FAR:
12:21:02 echo: route,ospf,debug SEND: Hello 10.10.10.200 -> 224.0.0.5 on wlan1
12:21:12 echo: route,ospf,debug SEND: Hello 10.10.10.200 -> 224.0.0.5 on wlan1
12:21:22 echo: route,ospf,debug SEND: Hello 10.10.10.200 -> 224.0.0.5 on wlan1
12:21:32 echo: route,ospf,debug SEND: Hello 10.10.10.200 -> 224.0.0.5 on wlan1
(2 messages discarded)
12:21:34 echo: route,ospf,debug area=RemoteNet
12:21:34 echo: route,ospf,debug Installing an LSA
12:21:34 echo: route,ospf,debug lsa=Router LSA id=10.10.11.1 originator=10.10.11.1 seqnum=0x800036cd
12:21:34 echo: route,ospf,debug old=Router LSA id=10.10.11.1 originator=10.10.11.1 seqnum=0x800036cc
12:21:34 echo: route,ospf,debug Flooding an LSA
12:21:34 echo: route,ospf,debug lsa=Router LSA id=10.10.11.1 originator=10.10.11.1 seqnum=0x800036cd
12:21:34 echo: route,ospf,debug area=RemoteNet
12:21:34 echo: route,ospf,debug Deleting an LSA
12:21:34 echo: route,ospf,debug lsa=Router LSA id=10.10.11.1 originator=10.10.11.1 seqnum=0x800036cc
12:21:34 echo: route,ospf,debug wlan1 (10.10.10.200): interface event
12:21:34 echo: route,ospf,debug event=OSPF_IFE_NEIGH_CHANGE
12:21:34 echo: route,ospf,debug state=Point-to-Point
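The 40-second lifetime is consistent with an OSPF dead timer expiring on a router that hears no Hellos. Here is a small Python sketch of that rule. It assumes a dead interval of 40 seconds, which is only a guess based on the 10-second Hello spacing visible in the logs above and the common convention of four Hello intervals; the configured values are not shown in this post, and `neighbor_alive` is a made-up helper name.

```python
def neighbor_alive(hello_rx_times, now, dead_interval=40.0):
    """Return True if any Hello arrived within the last dead_interval seconds.

    hello_rx_times: times (in seconds) at which Hellos were received.
    dead_interval:  assumed value, not taken from the routers' configuration.
    """
    return any(0 <= now - t < dead_interval for t in hello_rx_times)


# A router that keeps receiving a Hello every 10 s stays up.
print(neighbor_alive([0, 10, 20, 30], now=45))

# A router that hears one Hello at t=0 and then nothing drops the
# neighbor once 40 s have passed.
print(neighbor_alive([0], now=39))
print(neighbor_alive([0], now=40))
```

If FAR only ever hears NEAR during the brief adjacency setup and then nothing more, this would give the pattern of a Full adjacency followed by removal 40 seconds later.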
Just to reiterate: there are no issues with user traffic, with my own ssh sessions (I am persistently connected to FAR through the wlan interfaces listed above, and we are currently using static routes as a workaround), or with any other TCP or UDP protocols that we’ve noticed.
Both sides have had several reboots, and I’ve fiddled with interface settings and such to no avail.
Has anyone else come across something like this before? Does anyone have any ideas?