I have various RB2011s, Metal 5SHPns, NetMetals, and CRS125s. Of all my devices with nearly identical configurations (deployed using Ansible, so the chance of user error is at least lower), I have one Metal and two RB2011s that won't respond to SNMP: I receive timeouts. Torch on those devices shows the request packets arriving, but no reply packets going out. The remote machine can, however, SSH to those routers, and SNMP queries to devices adjacent to the three broken ones work fine.
Here’s an export from the Metal:
[ryan_turner@sec3.hil] > /snmp export
# dec/07/2015 22:36:58 by RouterOS 6.33.1
# software id = LTVD-TT50
#
/snmp community
set [ find default=yes ] addresses=44.34.128.0/21 name=hamwan
/snmp
set contact="#HamWAN on irc.freenode.org" enabled=yes
[ryan_turner@sec3.hil] > /ip firewall export
# dec/07/2015 22:37:44 by RouterOS 6.33.1
# software id = LTVD-TT50
#
/ip firewall mangle
add action=change-mss chain=output new-mss=1378 protocol=tcp tcp-flags=syn \
tcp-mss=!0-1378
add action=change-mss chain=forward new-mss=1378 protocol=tcp tcp-flags=syn \
tcp-mss=!0-1378
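One thing worth ruling out early: as far as I know, a RouterOS SNMP community with an addresses= restriction silently ignores requests from any source outside that range, so the router would receive the packet but never reply, which is exactly the symptom in the sniff below. It's worth double-checking the range on the router and confirming the monitor's source address falls inside 44.34.128.0/21:

[ryan_turner@sec3.hil] > /snmp community print detail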
And here’s the timeout from the remote:
root@monitor:/var/log/prometheus# snmpwalk -v1 -chamwan sec3.hil.memhamwan.net 1.3.6.1.4.1.14988.1.1
Timeout: No Response from sec3.hil.memhamwan.net
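Before digging deeper on the router, it may also be worth confirming from the monitor side which source address is actually used toward sec3.hil, and retrying with a longer timeout. Something like this (hostname taken from the post above; the flags are standard net-snmp/iproute2):

root@monitor:~# ip route get $(dig +short sec3.hil.memhamwan.net)
root@monitor:~# snmpwalk -v1 -c hamwan -t 5 -r 2 sec3.hil.memhamwan.net 1.3.6.1.4.1.14988.1.1

If the src address that ip route get reports is not inside the community's 44.34.128.0/21 range, the router will never answer.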
Corresponding sniff:
[ryan_turner@sec3.hil] > /tool sniffer quick interface=ether1-local port=snmp
INTERFACE TIME NUM DI SRC-MAC DST-MAC VLAN
ether1-local 1.897 1 <- 4C:5E:0C:89:7E:BF 00:0C:42:6E:6C:1E
ether1-local 2.912 2 <- 4C:5E:0C:89:7E:BF 00:0C:42:6E:6C:1E
ether1-local 3.901 3 <- 4C:5E:0C:89:7E:BF 00:0C:42:6E:6C:1E
ether1-local 4.904 4 <- 4C:5E:0C:89:7E:BF 00:0C:42:6E:6C:1E
ether1-local 5.911 5 <- 4C:5E:0C:89:7E:BF 00:0C:42:6E:6C:1E
ether1-local 6.908 6 <- 4C:5E:0C:89:7E:BF 00:0C:42:6E:6C:1E
[ryan_turner@sec3.hil] > /interface ethernet print
Flags: X - disabled, R - running, S - slave
# NAME MTU MAC-ADDRESS ARP MASTER-PORT SWITCH
0 R ether1... 1500 00:0C:42:6E:6C:1E enabled none switch1
So… what gives? I’m stumped as to why this isn’t working.
In my previous post I showed the firewall settings, as well as sniffer output showing that traffic was in fact being received. Even if there were a firewall in between filtering traffic going from sec3.hil to monitor, the local sniffer on sec3.hil should still have seen reply packets being sent out.
Still stumped by this… anywhere else to check? Any other services I should try?
When you did the sniff, did you also examine one of the packets to see what the IP source address was?
Your posting only shows src/dst MAC addresses.
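If quick mode only shows MACs, you can capture into the packet buffer and print the IP headers instead. Roughly (property names are from 6.x and may differ slightly between versions; on 7.x the interface property is filter-interface):

[ryan_turner@sec3.hil] > /tool sniffer set interface=ether1-local filter-port=snmp
[ryan_turner@sec3.hil] > /tool sniffer start
(let a few polls arrive)
[ryan_turner@sec3.hil] > /tool sniffer stop
[ryan_turner@sec3.hil] > /tool sniffer packet print

The packet print output includes src-address and dst-address for each captured packet.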
Basically, one of three things is happening:
1. The MikroTik is discarding the SNMP requests on their way up the IP stack during ingress.
2. The MikroTik's SNMP service is ignoring the requests or failing to process them.
3. The SNMP replies are being discarded on their way down the IP stack during egress (or are being mis-routed due to an IP routing issue).
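One cheap way to split those three cases apart, sketched from the config posted above: add temporary passthrough log rules for SNMP on the input and output chains and watch the log while the monitor polls:

/ip firewall filter add chain=input protocol=udp dst-port=161 action=log log-prefix="snmp-in"
/ip firewall filter add chain=output protocol=udp src-port=161 action=log log-prefix="snmp-out"
/log print where message~"snmp-"

If snmp-in entries appear but snmp-out never does, the requests are reaching the stack and dying in or before the SNMP service (cases 1 and 2); if snmp-out appears, replies are being generated and then lost on egress or in routing (case 3). Remember to remove the two rules afterwards.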
Are you trying to do control-plane isolation with VRFs on these devices? I've had problems trying that when the control plane is on anything other than the default VRF.
Please note that when packets are filtered by a firewall rule, they still appear in the packet sniff! So their appearance in that output does not prove your firewall is OK.
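One way to actually verify whether a firewall rule is eating the packets: print the rule counters while the monitor is polling, e.g.:

[ryan_turner@sec3.hil] > /ip firewall filter print stats

A drop rule that is matching the SNMP traffic will show its packet counter climbing between prints.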
In other threads about mysterious problems I have sometimes seen the advice to export the configuration, then reset the router to defaults with the initial config taken from that export file (stored on the router's local flash).
Not that I would dare to do that on a remote router without experience…
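For the record, that procedure is roughly the following (run-after-reset exists on 6.x; the flash/ prefix matters on boards where files outside flash/ don't survive a reset):

[admin@router] > /export file=flash/current-config
[admin@router] > /system reset-configuration no-defaults=yes run-after-reset=flash/current-config.rsc

But as said, doing this on a router you cannot physically reach is risky.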
Yeah, one is easy to get to, but the other is only accessible every few months. I'll just have to buy spares and swap them out. Very disappointed that this has happened.
Maybe you can try that method on the accessible one, and once it succeeds and fixes the problem, try it on the other one.
In principle it can be done remotely. The only problem, of course, is that the router is down when something goes wrong, whereas right now it probably works but just cannot be monitored.
In the meantime I have done some of those configuration resets on remote routers, and while they proceeded without problem, they never solved my (different) issues. That is, after the reset the situation was exactly the same as before. Which is what you would expect, of course.