Bug: SNMP over VRRP interface problem

Mikrotik VRRP problem.png
Hi! I would like to report a strange bug. SNMP-based monitoring of a MikroTik router stopped working right after I deployed a new configuration involving VRRP. Router R1 does not respond to SNMP queries on the IP address assigned to the VRRP interface when the queries are sent from the monitoring server S1. Here is an example Linux command that I am using for verification:

root@S1:~$ snmpwalk -v1 -c public 10.28.0.10
Timeout: No Response from 10.28.0.10

But everything works just fine if the queries are sent to the IP address assigned to the physical interface. Example command:

root@S1:~$ snmpwalk -v1 -c public 10.28.0.8
iso.3.6.1.2.1.1.1.0 = STRING: "RouterOS CCR1036-8G-2S+"
iso.3.6.1.2.1.1.2.0 = OID: iso.3.6.1.4.1.14988.1
iso.3.6.1.2.1.1.3.0 = Timeticks: (7105900) 19:44:19.00
. . .

The problem first showed up with SNMP traffic, but other UDP traffic (e.g. DNS) does not work either. TCP traffic, e.g. SSH, works without any problem.

I have tested this against RouterOS 6.37rc11 as well as 6.34.6 (Bugfix only) and 6.36 (Current). The behavior is the same every time.

The strange part is that if I connect S1 directly to R1, replacing R2 with a cable and moving R2’s IP address 10.28.0.1 to S1, then it works even with the IP assigned to the VRRP interface (10.28.0.10).

R1 (VRRP router) configuration:

# aug/02/2016 11:18:38 by RouterOS 6.37rc11
#
#
/interface vrrp
add interface=ether3 name=vrrp priority=150 vrid=10
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip address
add address=192.168.88.1/24 comment="default configuration" interface=ether1 network=192.168.88.0
add address=10.28.0.8/24 interface=ether3 network=10.28.0.0
add address=10.28.0.10 interface=vrrp network=10.28.0.10
/ip route
add distance=1 dst-address=10.30.0.128/30 gateway=10.28.0.1
/ip service
set api disabled=yes
/snmp
set enabled=yes
/system clock
set time-zone-name=Europe/Berlin
/system package update
set channel=release-candidate
/system routerboard settings
set cpu-frequency=1200MHz memory-frequency=1066DDR protected-routerboot=disabled

R2 (subnet router) configuration:

# jan/07/1970 19:46:59 by RouterOS 6.35.4
#
#
/ip address
add address=10.28.0.1/24 interface=ether3 network=10.28.0.0
add address=10.30.0.129/30 interface=ether1 network=10.30.0.128
/system routerboard settings
set cpu-frequency=650MHz protected-routerboot=disabled

Monitoring server networking configuration:

ip address add 10.30.0.130/30 dev eth0
ip route add default via 10.30.0.129

The problem may be “source address selection”.
When the router has to send a packet, it selects a source address for it, from its own available source addresses.
You can try adding a preferred source address to the route used to reach your SNMP monitoring system.
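
To make the source-address-selection idea concrete, here is a rough Python model of it, using the addresses and the static route from R1's posted configuration. The function name pick_source and the selection rule (use the first local address whose subnet contains the next hop) are assumptions made for illustration; this is a simplified sketch, not RouterOS's actual implementation.

```python
import ipaddress

# Addresses and the static route copied from R1's posted configuration.
R1_ADDRESSES = ["10.28.0.8/24", "10.28.0.10/32"]
R1_ROUTES = [("10.30.0.128/30", "10.28.0.1")]


def pick_source(destination, addresses, routes):
    """Hypothetical, simplified route-based source address selection."""
    dst = ipaddress.ip_address(destination)
    local = [ipaddress.ip_interface(a) for a in addresses]

    # Resolve the next hop: the gateway of the most specific matching
    # static route, or the destination itself if no static route matches.
    next_hop, best_len = dst, -1
    for network, gateway in routes:
        net = ipaddress.ip_network(network)
        if dst in net and net.prefixlen > best_len:
            next_hop, best_len = ipaddress.ip_address(gateway), net.prefixlen

    # Use the first local address whose subnet contains the next hop.
    for iface in local:
        if next_hop in iface.network:
            return str(iface.ip)
    return None


# Reply towards the monitoring server S1 (10.30.0.130).
print(pick_source("10.30.0.130", R1_ADDRESSES, R1_ROUTES))  # prints 10.28.0.8
```

Under this model, a reply to S1 would leave with source 10.28.0.8 even though the query was sent to 10.28.0.10, so a client that matches replies by source address would not treat it as the answer.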

You should avoid using the VRRP address for monitoring purposes.

To pe1chl: It could be caused by source address selection, thank you for pointing this out, but that would affect TCP traffic as well as UDP. I have tested this with SSH and it is working. I even followed your suggestion and set pref-src on the route, but it did not help.

Why? It is useful to use the VRRP IP for monitoring in some clustering setups. But I tested your suggestion by creating a loopback-like bridge interface. SNMP is not even accessible on the IP address assigned to it.

You should monitor both nodes, not just the master.

Let’s not focus on how my monitoring solution works, because it is not really important right now.

I can give you another example to think about instead of SNMP. Let’s say a DNS service is running on the VRRP IP. The problem is that I cannot reach DNS using the VRRP IP address. It seems to be a RouterOS bug, because SSH, telnet … (TCP-based services in general) are reachable, but SNMP, DNS … (UDP-based services in general) are not.

Hey!

Got the same issue here. DNS on VRRP interface is not working.
The strange thing is that it worked for months. Since this morning, DNS requests are no longer answered. There was no change in configuration.

In short:
The client creates a request and sends it to the VRRP address of the router. The packets arrive at the VRRP address, the DNS server on the active router gets the DNS information from the internet and places it in its own DNS cache, but it does not answer the client’s request. :open_mouth:

Have you found a solution?

Ok. I made a little test setup and contacted MikroTik support. Maybe someone here wants to examine the issue :wink:

Here is the network setup:
MT VRRP DNS Problem.png
You can find complete .rsc config files for RB750GL as attachment.


All routerboards (RB750GL for testing purposes) have the latest ROS version 6.39.2 installed.

Brief description
RB_001 gets no answer to DNS requests if it asks RB_003 on the VRRP address (10.0.0.3). RB_003 does receive the DNS request, fetches the information from the internet and puts it in its DNS cache, but it does not send a reply to RB_001.
If RB_001 asks RB_003 on the ether interface address (10.0.0.2), it gets the reply instantly, no matter whether the name is in the cache of RB_003 or not.
RB_002 gets a reply every time, no matter whether it uses the ether (10.0.0.2) or the VRRP (10.0.0.3) address of RB_003.

Very strange behavior.
RB_001.rsc (1.47 KB)
RB_002.rsc (1.15 KB)

And the RB_003.rsc :wink:
RB_003.rsc (1.54 KB)

Try it with a /32 assigned to a bridge. You can assign the same /32 to multiple devices as a loopback (a bridge with no ports). It is learned through a routing protocol like OSPF, so the queries will naturally go to the nearest /32 and fail over to more distant ones in the event of a failure.

Post your configs as plain text please. Side note, VRRP should source traffic from an IP on the underlying interface not the shared address. If you have any security policies or NAT around the shared IP this could be compounding the issue.

I can think of at least one reason why you’d want DNS queries to the virtual IP to work - high availability. If you give out one of the physical router IPs as the DNS server in DHCP options, what happens when that router fails over to the other one?

Read my post. Use a /32, aka anycast. The same address can be present at multiple places in your network. Perfect for DNS resolvers and NTP servers. Try it out. Grab 3 routers, put the same /32 that isn’t part of your network scheme elsewhere on them, and advertise the /32 via a routing protocol like OSPF.

It’s literally what Google does for our beloved 8.8.8.8.

Hmm. Never thought about the anycast solution. Sounds interesting.

Nevertheless Mikrotik answered my support request. They have the following solution and it works.

/ip firewall mangle add action=mark-connection chain=input dst-address=10.0.0.3 new-connection-mark=to_vrrp passthrough=yes

/ip firewall mangle add action=mark-routing chain=output connection-mark=to_vrrp new-routing-mark=from_vrrp passthrough=yes

I’m not sure whether this behavior is a bug or a feature, but it works with the mangle rules :smiley:

Nice, glad you found a work-around to use the VRRP interface. Let me know if you have any questions using a loopback.

Hi all,
Please let me know if this is the same problem as the one below:

I have 2 CCR routers, each with its own public IP.
On the interface of each router (the one that carries the public IP), I have a VRRP interface with another public IP.

When VRRP is disabled on both routers, we can query each CCR via SNMP.
When VRRP is enabled on both, we can only query the VRRP IP and the native IP of the “VRRP BACKUP” CCR via SNMP. The native IP of the “MASTER VRRP” CCR does not respond to SNMP requests…

SSH is OK on both CCRs even if VRRP is enabled.

Do you think it’s the same issue? It’s strange because it seems to be the inverse…
I would like to monitor my 2 CCRs via their native IPs and not via the VRRP IP…

Thx

Yes it seems like the inverse. Try setting the source IP for SNMP.

Additionally, I don’t typically manage my device by public IP. I typically deploy a private management network. This might take the shape of a VLAN pushed to the CPE that is isolated. I don’t know your specific topology but it is generally unwise to perform management functions across open networks.

We have discovered exactly the same problem today. The VRRP master is not accessible on its unique address when using SNMP.

Did you find any solution to this error?

Look further up the page:

http://forum.mikrotik.com/t/bug-snmp-over-vrrp-interface-problem/100372/18

I think it is not really specific to VRRP. I have other MikroTik devices with multiple IP addresses, and when routing becomes asymmetric the SNMP server sometimes does not respond.
It looks like the SNMP server does not reply from the address on which it receives the query, but rather it sends the reply using an unnumbered socket and the source address is determined by the outgoing route :frowning:
I hope it will be fixed sometime as it is cumbersome to have specific handling of such errors in the mangle table at every device.
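
For comparison, here is a minimal Python sketch of the behavior one would want from a UDP service instead: bind one socket per local address, so that each reply leaves from the address that was queried. This is generic socket code, not RouterOS's SNMP server, and it only illustrates the hypothesis above. The helper names open_per_address_sockets, answer_one and demo are made up for this example, and the demo runs over loopback (127.0.0.1) only.

```python
import socket


def open_per_address_sockets(addresses, port=0):
    """Bind one UDP socket per local address instead of a single wildcard socket."""
    socks = []
    for addr in addresses:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((addr, port))  # port 0 lets the OS pick a free port
        socks.append(s)
    return socks


def answer_one(sock, payload=b"reply"):
    """Answer a single query on the same socket that received it."""
    _query, peer = sock.recvfrom(2048)
    # Because the socket is bound to one specific address, the reply
    # carries that address as its source.
    sock.sendto(payload, peer)


def demo():
    server = open_per_address_sockets(["127.0.0.1"])[0]
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.settimeout(2)
    client.settimeout(2)
    try:
        queried = server.getsockname()
        client.sendto(b"query", queried)
        answer_one(server)
        data, source = client.recvfrom(2048)
        return queried, source, data
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    queried, source, data = demo()
    print("queried", queried, "reply came from", source)
```

A server that uses a single socket bound to the wildcard address and replies with a plain sendto leaves the source address to the routing lookup, which is the situation described above.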