I have used The Dude for years, have helped many people use the software, and think it is still the best monitoring system available. You guys do awesome work and I know you will continue, so here are some long-standing issues/requests…
RouterBoard users are unable to export a backup.
A database larger than 2 GB will cause issues and probably won't load.
Negative Cache should not take effect until the configured number of retries is reached. Currently it takes effect on a single probe timeout and causes the probe to stay down for 300 seconds. Manual re-probes do not override Negative Cache either. (This causes false positives.)
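To make the Negative Cache request above concrete, here is a rough sketch of the behavior I am asking for. It is not how The Dude is implemented; `ProbeState`, its fields, and its methods are all hypothetical names, and the only number taken from the real product is the 300 second hold time.

```python
from dataclasses import dataclass


@dataclass
class ProbeState:
    """Hypothetical probe bookkeeping, not The Dude's internals."""
    retries: int = 3                 # assumed: consecutive timeouts needed before "down"
    negative_cache_secs: int = 300   # hold time observed in the post
    failures: int = 0                # consecutive timeouts seen so far
    cached_down_until: float = 0.0   # time until which the "down" result is cached

    def is_negative_cached(self, now: float) -> bool:
        # True only while a confirmed "down" result is being held.
        return now < self.cached_down_until

    def record_result(self, ok: bool, now: float) -> str:
        if ok:
            # Any success clears both the failure count and the cache.
            self.failures = 0
            self.cached_down_until = 0.0
            return "up"
        self.failures += 1
        if self.failures >= self.retries:
            # Only now does the negative cache start.
            self.cached_down_until = now + self.negative_cache_secs
            return "down"
        return "retrying"

    def manual_reprobe(self, ok: bool, now: float) -> str:
        # A manual re-probe bypasses the cache and is evaluated fresh.
        self.cached_down_until = 0.0
        return self.record_result(ok, now)
```

With this logic a single timeout only reports "retrying", and a successful manual re-probe brings the probe back up immediately instead of waiting out the 300 seconds.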
If you set a notification with a delay of 15 minutes and a device goes down and comes back up before the delay expires, the user will still receive a notification.
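The delayed-notification behavior I would expect can be sketched like this. Again, this is only an illustration under my own assumptions: `DelayedNotifier` and its methods are made-up names, and times are plain seconds passed in by the caller.

```python
from typing import Optional


class DelayedNotifier:
    """Hypothetical delayed "down" notification, not The Dude's code."""

    def __init__(self, delay_secs: float = 15 * 60):
        self.delay_secs = delay_secs
        self.down_since: Optional[float] = None  # start of the current outage
        self.notified = False                    # already sent for this outage?

    def on_status(self, is_up: bool, now: float) -> None:
        if is_up:
            # Recovery cancels any pending notification.
            self.down_since = None
            self.notified = False
        elif self.down_since is None:
            # First failure of a new outage starts the delay timer.
            self.down_since = now

    def should_notify(self, now: float) -> bool:
        if self.down_since is None or self.notified:
            return False
        if now - self.down_since >= self.delay_secs:
            self.notified = True
            return True
        return False
```

A device that goes down at minute 0 and recovers at minute 10 never triggers a notification, because the recovery resets `down_since` before the 15 minutes have elapsed.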
Clicking “new map” might display “loading”, but the map will not load. Clicking “new map” again is a workaround. (Not important.)
The clipboard hook breaks other applications that are currently hooked into the Windows clipboard. Reloading the other application restores its hook into the clipboard.
I honestly wish I had a better way to describe this, but something changed or is wrong with the way polling works, somehow related to internal IO routines. This seems to have been introduced between 3.x and 3.6. Labels show TX/RX instead of values far more often than before. I see a lot of false positives with 30-second polling. Since I don’t see other people complain about false positives, I believe this might just be my setup; however, when I changed to 1-minute polling all my false positives went away, so it still seems like there is a fundamental problem. (In the past I have verified with Wireshark that a probe for a failing service was issued and was received, yet the probe stayed down.)
Ping has a wildly varying RTT. For example, ping will show a 100 ms response time until a manual re-probe is issued. This resets ping to a more realistic 1 ms for a long time. Does a successful ping somehow use a cached previous value to graph?
Other observations: in Windows, polling is halted while doing an export, and all probes that time out while exporting appear down as soon as the export finishes. (Either the disk IO affects probe IO in a negative way, or it is designed this way to keep from writing to the data that is being exported; either way, it could be improved.)
No need to fix this, but if you right-click on a device and leave the label up for a long time, all probes will time out (maybe related to the IO troubles).
Clicking on copy clears the history for the original object; it should only clear the history for the object on the clipboard.
Feature requests…
Embedded sub-maps don’t reflect outages to the top map.
Database editor/export/cleanup tool.
Security
Security groups for tools. Currently, read-only users can use any tool (including web admin).
Security group for maps, restricting access to specific maps.
Security group for additional SNMP OIDs. If you are logged in as a read-only user, snmpwalk doesn’t work and custom labels don’t display.
Thanks for creating The Dude. Even with the issues it is still the best; none of them are show stoppers…
Thanks again,
Lebowski