SNMP timeouts and polling issue on ROS v7.6.

I have several routers (1036, 1072 and x86) on ROS v7.6, and since updating to v7.6 I have noticed that SNMP polling from LibreNMS and Cacti no longer works, or works only intermittently, on routers with high CPU load. Interface graphing is affected the most: nothing gets graphed anymore. Has anyone experienced the same issue, and is there a workaround?

I can confirm that this issue still exists up to 7.8; I haven't tested with 7.9 yet.

In my scenario I tested with several 1072s using LibreNMS. Some of these 1072s have lots of routing and VLANs along with BGP, some have lots of NAT rules, and others have lots of queues. I never had this issue before upgrading to ROS 7.x. None of the routers mentioned above goes over 20% overall CPU at peak times.

Example below: this router has 2 BGP sessions up and running with default route only (one for IPv4, the other for IPv6), ~2000 static routes, 30 firewall rules and 250 VLAN entries, and can burst up to 4 Gbps of traffic at peak times.

Daily graph

Weekly graph

Mikrotik graph (enabled for testing purposes)

The issue still persists, which made me resort to setting up x86 boxes using Epyc CPUs. I can confirm the SNMP graphing gaps are gone on those. So it must be a CPU time prioritisation issue in ROS, with SNMP being given a low priority.

The problem exists in ROS 7.10.1 on a CCR1036 with 3-4 full BGP tables, OSPF, and some static routes (no NAT or connection tracking enabled).

Same issue with a CCR2216 on 7.12beta3. About 40 BGP sessions and very low CPU load.
CPU/memory graphs are fine; only the interface graphs are no longer working.

The workaround for me was to increase the polling timeout and the parallel requests. At least that worked for LibreNMS.
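To check from the poller host whether a longer timeout alone makes the interface counters readable again, something along these lines can help. This is only a sketch: it assumes the net-snmp `snmpbulkwalk` tool is installed, and the host name, community string and the specific timeout, retry and max-repetitions values are placeholders, not settings taken from LibreNMS.

```python
import subprocess
import time
from typing import List


def build_bulkwalk_cmd(host: str, community: str, oid: str,
                       timeout_s: int = 10, retries: int = 1,
                       max_repetitions: int = 10) -> List[str]:
    """Build a net-snmp snmpbulkwalk command line.

    -t is the per-request timeout in seconds, -r the number of retries,
    -Cr<N> the max-repetitions value used for each GETBULK request.
    """
    return [
        "snmpbulkwalk", "-v2c", "-c", community,
        "-t", str(timeout_s),
        "-r", str(retries),
        f"-Cr{max_repetitions}",
        host, oid,
    ]


def timed_walk(cmd: List[str]) -> float:
    """Run the walk and return how long it took in seconds.

    Raises subprocess.CalledProcessError if the walk fails or times out.
    """
    start = time.monotonic()
    subprocess.run(cmd, check=True, capture_output=True, text=True)
    return time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical router name and community; replace with your own.
    cmd = build_bulkwalk_cmd("router.example.net", "public",
                             "IF-MIB::ifHCInOctets", timeout_s=10)
    print(" ".join(cmd))
    print(f"walk took {timed_walk(cmd):.1f}s")
```

If the walk only succeeds with a large timeout, the poller's own timeout probably needs to be at least that high for the interface graphs to fill in.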

I’m on 7.11.2 and have another issue, also related to SNMP.
At irregular intervals the values read via SNMP are just bogus. These two screenshots show what I mean:
Screenshot_2023-09-18_13-13-53.png
Screenshot_2023-09-18_13-17-46.png
This is on a gigabit link, later shrinking to 100 Mbit… 14.19 Gbit down… I wish xD

So something clearly isn’t right with the SNMP readings.
Funnily enough, with a lot of devices running 7.11.2, I only get this on my CCR2004-16G-2S+ ¯\_(ツ)_/¯
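Until the readings themselves are fixed, spikes like the 14.19 Gbit one on a gigabit link can at least be filtered out on the monitoring side. A rough sketch of the arithmetic, assuming 64-bit octet counters (ifHCInOctets/ifHCOutOctets) and a known link speed. The function name and the 5% tolerance are made up for illustration:

```python
from typing import Optional

COUNTER64_MOD = 2 ** 64  # ifHCInOctets / ifHCOutOctets are 64-bit counters


def octets_to_bps(prev: int, curr: int, interval_s: float,
                  link_speed_bps: float,
                  modulus: int = COUNTER64_MOD) -> Optional[float]:
    """Average bit rate between two counter samples.

    Returns None when the result is implausible for the link, e.g. a bogus
    reading or a counter reset, so the caller can drop that sample.
    """
    if interval_s <= 0:
        return None
    # Modular difference handles a single counter wrap between samples.
    delta_octets = (curr - prev) % modulus
    bps = delta_octets * 8 / interval_s
    # Hypothetical tolerance: allow 5% above nominal link speed.
    if bps > link_speed_bps * 1.05:
        return None
    return bps


if __name__ == "__main__":
    gigabit = 1e9
    # 300 s polling interval, link fully loaded: plausible, kept.
    print(octets_to_bps(0, 37_500_000_000, 300, gigabit))
    # Delta that would mean 14.19 Gbit/s on a gigabit link: dropped.
    print(octets_to_bps(0, 532_125_000_000, 300, gigabit))
```

This only hides the symptom in the graphs; it says nothing about why the CCR2004 returns bad counters in the first place.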

The workaround for me was to disable polling for arp/ip/cache routes.

FYI, this is MikroTik support's answer:

Look at the /routing/stats/process list
If most of the CPU time is from "configuration and reporting" process (supout file shows that it is the case) then just stop monitoring routing table via SNMP.

We need to be able to disable the OIDs for routes (currently we can’t). Even a simple snmpwalk kills the SNMP process. We have approx. 300,000 routes.