Dude high memory usage

Tue Mar 18, 2008 5:00 pm

I was having a problem with The Dude gradually increasing its memory usage to very large numbers (over 700M). At the same time, the network interface monitor in Task Manager showed approximately 1.5Mb/s of traffic from the Dude server. My server runs The Dude on a 2.8GHz P4 with Hyperthreading under Windows XP SP2. I have half a dozen clients, mostly Windows XP SP2 plus a couple of web clients, and approximately 475 devices being monitored. Dude version 2.20.

I had been reading here about the Dude service stopping after a period of time with Dude 3b7, so I had not upgraded to it. I decided to create a new installation with Dude 3b7 to see whether it would cure the memory runaway I was having with Dude 2.20. I installed Dude 3b7 on a separate computer and started building the network database from scratch rather than importing the old one, stopping periodically to check server performance. After completing about 15% of the database, I noticed memory usage increasing dramatically and network usage near 1Mb/s. I stopped building the database and installed WireShark to analyze what was happening on the network. It showed that the system was constantly doing SNMP queries for three of my devices. I turned off SNMP for these three devices, and memory and network usage returned to normal.

I then installed WireShark on the primary production Dude server (version 2.20) and found the same three devices in constant SNMP query mode. After I disabled SNMP for these three devices, memory and network usage returned to normal (a service restart was required).
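For anyone who wants to repeat the WireShark step without scrolling through the capture by eye, here is a minimal Python sketch that counts SNMP packets per destination. It is not part of The Dude or WireShark. It assumes you have exported the SNMP packets as "source,destination" lines, for example with something like `tshark -r capture.pcap -Y snmp -T fields -e ip.src -e ip.dst -E separator=,`. The function name, the file layout, and all addresses in the sample are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_snmp_targets(csv_text, server_ip):
    """Count packets sent by server_ip, grouped by destination address.

    csv_text is assumed to hold one "source,destination" pair per line,
    already filtered down to SNMP traffic.
    """
    counts = Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        src, dst = row[0].strip(), row[1].strip()
        if src == server_ip:
            counts[dst] += 1
    return counts


# Hypothetical sample data; these addresses are not from my network.
sample = """10.0.0.5,10.0.0.21
10.0.0.5,10.0.0.21
10.0.0.21,10.0.0.5
10.0.0.5,10.0.0.22
10.0.0.5,10.0.0.21
"""

if __name__ == "__main__":
    # List destinations from most to least queried.
    for dst, n in count_snmp_targets(sample, "10.0.0.5").most_common():
        print(dst, n)
```

A device stuck in constant query mode should show up at the top of the list with a count far above the rest.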

The three problem devices were all Linux-based with SNMP enabled. Two of them are my NMIS servers running CentOS, and the third is a VPN gateway from SnapGear. The only thing they have in common is that I was monitoring 'ping' and 'http' on each, and all three had SNMP enabled. I have other Linux servers that I am also monitoring, but none of them have SNMP enabled together with an HTTP service on the same server.

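Hope this helps others isolate their problems. Hint: to make troubleshooting network traffic simpler with WireShark, change the polling interval from the default 30 seconds to 2 minutes. Routine polls are then sparser in the capture, so a device that is being queried constantly stands out.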
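To give a rough feel for why the longer interval helps, here is a small Python calculation of the expected background probe rate. It is only a back-of-the-envelope sketch: it assumes every device is probed once per interval for each monitored service, and the figure of two probes per device (ping and http) is an assumption for illustration, not a measurement.

```python
def probes_per_second(devices, probes_per_device, interval_s):
    """Average number of routine probes per second across all devices."""
    return devices * probes_per_device / interval_s


if __name__ == "__main__":
    devices = 475          # device count from my setup
    probes_per_device = 2  # assumed: ping and http on each device
    for interval_s in (30, 120):
        rate = probes_per_second(devices, probes_per_device, interval_s)
        print(f"{interval_s:>3} s interval: {rate:.1f} probes/s")
```

Going from 30 seconds to 2 minutes cuts the routine probe rate to a quarter, which leaves much less noise around the traffic you are trying to find.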

Also, if you have a significant number of Cisco routers/switches in your network and you are using MRTG, you really need to look at NMIS. If you have questions about NMIS, send me an email.

