First some stats on my setup:
- The Dude version 3.6
- Windows XP machine with all the latest Windows updates
- 109 Submaps
- 1060 Devices
- Approx 300 RouterOS devices
- 8 Admins
Now the problem:
We’ve had a recurring problem where The Dude seems to just “forget” how to do SNMP probes. We have approximately 20-30 devices with an SNMP probe on them. Every couple of days (the timing is seemingly random), we get alerts that every single one of the SNMP probes is down at the same time. Restarting The Dude fixes the problem. This has happened with different devices and different SNMP probes checking different OIDs. The only common factors are The Dude and Windows.
I’ve looked into memory usage and that doesn’t seem to be the problem. Neither does CPU. I also wondered if Windows was running out of sockets, so I installed a little program called TCPZ. Some people have mentioned it in the past, so I tried it even though I know SNMP uses UDP, not TCP. It didn’t help.
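One way the socket theory could be checked more directly is sketched below. It assumes Python is installed on the server (it is not part of The Dude), and the function name and the count of 50 are only illustrative. The script tries to open a batch of UDP sockets on OS-assigned ports and reports how many succeeded.

```python
import socket


def count_openable_udp_sockets(limit=50):
    """Try to open up to `limit` UDP sockets; return how many succeeded.

    Hypothetical helper for illustration only, not part of The Dude.
    """
    opened = []
    try:
        for _ in range(limit):
            try:
                s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            except OSError:
                break  # the OS refused to create another socket
            try:
                # Port 0 asks the OS to pick any free ephemeral port.
                s.bind(("0.0.0.0", 0))
            except OSError:
                s.close()
                break  # no free port (or another bind failure)
            opened.append(s)
        return len(opened)
    finally:
        # Always release whatever was opened, even on unexpected errors.
        for s in opened:
            s.close()


if __name__ == "__main__":
    wanted = 50
    got = count_openable_udp_sockets(wanted)
    print("Opened %d of %d UDP sockets" % (got, wanted))
```

If this is run while the probes are down and it opens fewer sockets than requested, that would point at Windows rather than The Dude. If it opens all of them, socket exhaustion is probably not the cause.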
At this point, I’m at a loss as to where else to look. Do other people have this problem as well? Any ideas where I could look?
Thanks.