I have four RouterOS devices:
CCR2116-12G-4S+, CRS326-24S+2Q+, and two CRS310-8G+2S+
All devices are running RouterOS 7.18
I have SNMPv3 working and enabled on all four devices.
I have SNMP access limited to the LibreNMS box only.
I have given my SNMPv3 user read/write privileges.
I have installed the agent script for LibreNMS via this link: https://docs.librenms.org/Support/Device-Notes/Routeros/
This works for the CCR2116-12G-4S+ and both CRS310-8G+2S+…
However, I cannot get it to work with my CRS326-24S+2Q+.
I have compared the SNMP settings, the user, the script ownership, and the script itself. All are the same.
When I look at the run count for the script on the CRS326-24S+2Q+, it is going up much faster than on the other boxes.
Installing the script creates the script, the job, and the “txtContent” environment variable.
If I run the script manually, the “vlansu” and “vlanst” environments get created.
If I let LibreNMS run the script, the “vlansu” and “vlanst” environments do NOT get created.
It seems like a permissions issue, but I haven’t been able to figure it out yet.
I have disabled write on the SNMPv3 user, saved, re-enabled write, and saved again to see if that helps.
I know the SNMPv3 user works because everything else for the CRS326-24S+2Q+ populates in LibreNMS correctly.
Observations (Edited to add info):
The script owner is a member of the “full” group (I removed “admin”), and both the user and the group are exactly the same across all four devices.
I removed the script, the job, and all three created environment variables and re-added the script. Then I ran a rediscovery in LibreNMS.
I re-applied the SNMPv3 secrets. No change.
Just an idea - It looks like there was a change to the script merged in October: https://github.com/librenms/librenms-agent/pull/524
I didn’t look too deeply into it, but it looks like the LibreNMS server-side change to handle the change in VLAN name generation is still waiting to be merged.
Is it possible that you have the older version on the working configs?
Line #4 of the new version declares a variable that wasn’t used in the earlier version of the script:
:local vname
Ok. I was thinking that if the working configs predated the broken one it could be related.
Perhaps you could also try running the script remotely via snmpset from the LibreNMS server. That would tell you if the connection and auth are working.
If that’s good then you could try a manual discovery with discovery.php:
./discovery.php -h hostname -m vlans -d
Compare with output from a working switch.
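A rough sketch of that comparison, assuming it is run from the LibreNMS install directory as the librenms user. The function names, hostnames, and file paths are placeholders I made up for illustration; only the discovery.php invocation comes from the command above.

```shell
#!/bin/sh
# Capture the vlans discovery debug output for one host into a file.
# $1 = hostname as known to LibreNMS (placeholder), $2 = output file.
# Assumption: the current directory is the LibreNMS install directory.
capture_vlans_debug() {
    ./discovery.php -h "$1" -m vlans -d > "$2" 2>&1
}

# Show a unified diff of two captured logs.
# Exit status is diff's own: 0 when identical, 1 when they differ.
compare_logs() {
    diff -u "$1" "$2"
}
```

For example, run capture_vlans_debug once for a working CRS310 and once for the CRS326, then compare_logs on the two files. Expect some noise from timings and host-specific values; the interesting part is whether the script lookup and its returned output differ.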
Edit: One more thought - too many script executions could be caused by SNMP retries. The default in LibreNMS is something like a 5 sec timeout and 5 retries, and the failing switch also has the lowest processor capability. You might try extending the SNMP timeout for that one device to 15 sec to see if that changes the result. Or, if you want to know for sure, log SNMP activity on the MikroTik and add a log output to the end of the script so that you can correlate events.
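To put rough numbers on the retry idea, here is a small sketch of the arithmetic. It assumes every SNMP request that reaches the switch triggers the script once, and that the poller gives up after the original request plus all retries have timed out. The 5 sec / 5 retries figures are the approximate defaults mentioned above, not verified values, and the function names are mine.

```shell
#!/bin/sh
# Worst-case number of script runs for a single poll:
# one original request plus each retry. $1 = retries.
max_runs_per_poll() {
    echo $(( $1 + 1 ))
}

# Worst-case time the poller waits before giving up:
# every attempt waits the full timeout. $1 = timeout (s), $2 = retries.
max_wait_seconds() {
    echo $(( $1 * ($2 + 1) ))
}

echo "runs per poll with 5 retries: $(max_runs_per_poll 5)"       # 6
echo "max wait at 5 s timeout:      $(max_wait_seconds 5 5) s"    # 30 s
echo "max wait at 15 s timeout:     $(max_wait_seconds 15 5) s"   # 90 s
```

So if the script on the slower switch regularly takes longer than the timeout to answer, one poll could start it up to six times, which would fit a run count that climbs much faster than on the other boxes. A longer timeout gives each single run more time to finish before a retry goes out.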