The Dude "Forgets" SNMP Probes

First some stats on my setup:

  • Version 3.6
  • Windows XP machine with all the latest Windows updates
  • 109 Submaps
  • 1060 Devices
  • Approx 300 RouterOS devices
  • 8 Admins

Now the problem:

We’ve had a recurring problem where The Dude seems to just “forget” how to do SNMP probes. We have approximately 20-30 devices with an SNMP probe on them. Every couple of days (the timing is seemingly random), we get alerts that every single one of the SNMP probes is down at the same time. Restarting The Dude fixes the problem. This has happened with different devices and with different SNMP probes checking different OIDs. The only common factors are The Dude and Windows.

I’ve looked into memory issues and that doesn’t seem to be the problem. Neither does CPU. I also wondered if Windows was running out of sockets, so I installed a little program called TCPZ. Some people have mentioned this in the past so I tried it even though I know SNMP uses UDP, not TCP. It didn’t really help anyway.
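One check that might help separate a Dude problem from a device or network problem is to query an affected device from outside The Dude while the alerts are firing. Below is a minimal sketch in Python that sends a single SNMP v1 GET over UDP using only the standard library. The community string `public`, the sysUpTime OID `1.3.6.1.2.1.1.3.0`, and the function names are all assumptions for illustration, and the reply is not parsed; the sketch only reports whether something SNMP-shaped came back.

```python
import socket


def ber_length(n):
    """Encode a BER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def tlv(tag, value):
    """Wrap a value in a BER tag-length-value triple."""
    return bytes([tag]) + ber_length(len(value)) + value


def ber_int(n):
    """Encode a non-negative INTEGER (a leading zero byte keeps it positive)."""
    return tlv(0x02, n.to_bytes(n.bit_length() // 8 + 1, "big"))


def ber_oid(dotted):
    """Encode a dotted OID string such as '1.3.6.1.2.1.1.3.0'."""
    parts = [int(p) for p in dotted.strip(".").split(".")]
    body = bytearray([parts[0] * 40 + parts[1]])
    for sub in parts[2:]:
        chunk = [sub & 0x7F]
        sub >>= 7
        while sub:
            chunk.append(0x80 | (sub & 0x7F))
            sub >>= 7
        body.extend(reversed(chunk))
    return tlv(0x06, bytes(body))


def build_get_request(community, oid, request_id):
    """Build an SNMP v1 GetRequest message for a single OID."""
    varbind = tlv(0x30, ber_oid(oid) + tlv(0x05, b""))  # OID + NULL value
    pdu = tlv(0xA0, ber_int(request_id) + ber_int(0) + ber_int(0)
              + tlv(0x30, varbind))
    return tlv(0x30, ber_int(0) + tlv(0x04, community.encode("ascii")) + pdu)


def snmp_responds(host, community="public", oid="1.3.6.1.2.1.1.3.0",
                  timeout=2.0, port=161):
    """Return True if the host sends back any SNMP-looking datagram."""
    packet = build_get_request(community, oid, request_id=1)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (host, port))
            data, _ = sock.recvfrom(4096)
        except OSError:  # includes timeouts and ICMP "port unreachable"
            return False
    return len(data) > 0 and data[0] == 0x30  # 0x30 = BER SEQUENCE
```

If `snmp_responds("192.0.2.10")` (a placeholder address) returns True from the same Windows machine while The Dude still shows every SNMP probe as down, that would point at The Dude itself rather than at the devices or the network.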

At this point, I’m at a loss as to where else to look. Do other people have this problem as well? Any ideas where I could look?

Thanks.

We see this problem from time to time on our platform.

Sometimes our RouterOS link labels stop refreshing as well.


We are waiting for the final release of version 4, as it seems that version 3.6 is no longer being updated.

We have the same problem, though for us it happens every couple of weeks rather than every couple of days. The difference may be the device count: we have about 200 devices, not 1060.
The problem was present in The Dude v3.6, and it is still present in The Dude v4.0 b2. The fix is to restart The Dude service.

The Dude opens a single UDP server socket, which it uses for all SNMP communication. You will not be able to solve the problem yourself; only The Dude developers can.

The only workaround is to create a scheduled task that restarts The Dude every night.
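A nightly restart like this could be scripted roughly as follows. This is only a sketch, assuming The Dude is installed as a Windows service; the service name `The Dude` is a guess and should be checked against the name shown in the Services console. It relies on the standard Windows `net stop` and `net start` commands, and the function names are made up for illustration.

```python
import subprocess

# Assumption: replace with the actual service name on your machine.
SERVICE_NAME = "The Dude"


def restart_commands(service_name):
    """Return the stop and start commands for a Windows service."""
    return [["net", "stop", service_name], ["net", "start", service_name]]


def restart_service(service_name, run=subprocess.run):
    """Stop then start the service; return True if the start succeeded.

    A failed stop is ignored on purpose: the service may already be
    stopped, and the goal is simply to end up with it running.
    """
    stop_cmd, start_cmd = restart_commands(service_name)
    run(stop_cmd)
    return run(start_cmd).returncode == 0
```

A small script that calls `restart_service(SERVICE_NAME)` can then be registered in Windows Scheduled Tasks to run once a night, at a time when a short monitoring gap is acceptable.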