Dude 4.0 beta4??

Duduhandelman · March 5, 2012, 7:17pm

Thank you for that.
I would be happy to see the following:

Adding and removing probes from a few devices at once.
I think the database should be readable somehow. For example, my Dude started to show ping probes as down on 400 of my 1000+ devices even though the ping graphs are alive, and I have no idea where to look.
Option to change text color with conditions.
Add more custom fields that can include an OID. This would allow better inventory, for example number of CPUs or amount of memory.
Linux native.
Ability to add a latency monitor on a circuit.

I would like to say THANKS for this great piece of software.

You should consider one of the following.

Paid support (I would be happy to pay for support).
Going open source.

I would be happy to support the project…

Many thanks.

lebowski · March 5, 2012, 9:40pm

I have used the dude for years and have helped many people use the software and think it is still the best monitoring system available. You guys do awesome work and I know you will continue so here are some long standing issues/requests…

RouterBoard users are unable to export a backup.

A database above 2 GB will cause issues and probably won’t load.

Negative Cache time should not become true until the number of configured retries is reached. Currently it becomes true on a single probe timeout and causes the probe to stay down for 300 seconds. Manual re-probes do not override Negative Cache either. (causes false positives)
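To make the requested behaviour concrete, here is a minimal sketch in Python. It is not how The Dude is implemented; the class name `ProbeState`, its fields and the retry count of 3 are all hypothetical. The negative cache only starts once the configured number of consecutive timeouts is reached, and a manual re-probe bypasses it.

```python
from dataclasses import dataclass


@dataclass
class ProbeState:
    """Hypothetical probe state; names and defaults are assumptions."""
    retries: int = 3                 # configured retry count (assumed value)
    negative_cache_s: float = 300.0  # cache duration mentioned in the post
    failures: int = 0                # consecutive timeouts so far
    down_until: float = 0.0          # negative cache is active until this time

    def record(self, ok: bool, now: float, manual: bool = False) -> str:
        if manual:
            # A manual re-probe overrides the negative cache.
            self.down_until = 0.0
        if now < self.down_until:
            # Still inside the cache window: the result is ignored.
            return "down (cached)"
        if ok:
            self.failures = 0
            return "up"
        self.failures += 1
        if self.failures >= self.retries:
            # Only now does the negative cache start.
            self.down_until = now + self.negative_cache_s
            return "down"
        return "retrying"
```

With this logic a single timeout yields "retrying" rather than a 300 second outage, which is the false positive described above.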

Set a notification with a delay of 15 minutes. If a device goes down and comes back up before the delay expires, the user still receives a notification.
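The expected behaviour can be written down as a small check, sketched here in Python. The function name and parameters are hypothetical and say nothing about The Dude's internals: a notification is only warranted if the outage lasted at least as long as the configured delay.

```python
from typing import Optional


def outage_warrants_notification(
    down_at: float,
    now: float,
    recovered_at: Optional[float] = None,
    delay_s: float = 15 * 60,
) -> bool:
    """Return True only if the device stayed down for at least delay_s seconds.

    recovered_at is None while the device is still down.
    All times are in seconds on the same clock.
    """
    end = recovered_at if recovered_at is not None else now
    return end - down_at >= delay_s
```

A device that goes down at t=0 and recovers at t=300 would give `False` when the 15 minute timer fires, so the pending notification would be dropped.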

Clicking “new map” might display “loading” but the map will not load. Clicking “new map” again is a workaround. (not important)

The clipboard hook breaks other applications that are currently hooked to the Windows clipboard. Reloading the other application fixes its hook into the clipboard.

Honestly, I wish I had a better way to describe this, but something changed or is wrong with the way polling works, somehow related to internal IO routines. This seems to have been introduced between 3.x and 3.6. Labels are far more often seen with TX/RX instead of values. I see a lot of false positives with 30 second polling. Since I don’t see other people complain about false positives, I believe this might just be my setup. However, when I changed to 1 minute polling all my false positives went away, so it still seems like there is a fundamental problem. (In the past I have verified with Wireshark that a probe for a failing service was issued and was received, yet the probe stayed down.)

Ping has a wildly varying RTT. For example, ping will show a 100ms response time until a manual re-probe is issued. This resets ping to a more realistic 1ms for a long time. Does a successful ping somehow use a cached previous value to graph?

Other observations: in Windows, polling is halted while doing an export, and all probes that time out while exporting appear down as soon as the export finishes. (The disk IO seems to affect probe IO in a negative way, or it is designed this way to keep from writing to the data that is being exported. Either way, it could be improved.)

No need to fix this, but if you right-click on a device and leave the label up for a long time, all probes will time out (maybe related to IO troubles).

Clicking on copy clears the history for the original object; it should only clear the history for the object on the clipboard.

Feature requests…
Embedded sub-maps don’t reflect outages to top map.

Database editor/export/cleanup tool.

Security
Security groups for tools, currently read only users can use any tool (including web admin).

Security group for maps restricting access to specific maps.

Security group for additional SNMP OIDs. If you are logged in as a read only user snmpwalk doesn’t work, custom labels don’t display.

Thanks for creating the dude, even with the issues it is still the best, none of them are show stoppers…

Thanks again,
Lebowski

EMOziko · March 6, 2012, 7:33am

ok ok

http://forum.mikrotik.com/t/routeros-dude-web-interface-problems/25965/1
http://forum.mikrotik.com/t/dude-for-linux/36344/1
http://forum.mikrotik.com/t/feature-requests/41563/1

If you read this forum, there are plenty of feature requests, enough for the next 3 versions.

ste · March 6, 2012, 8:20am

We have used Dude for a long time, too. It works great. Of course there are wishes:

More possibilities to manage MT-Wireless Network

CPE-Management
Keep a record of each CPE with customer name, location, serial number, connection/traffic statistics … So when a problem arises I have a search box, enter the customer number and find the device, and see which AP it logged in to last time, what the signal level was, …
Upgrade firmware of MT boxes from within Dude.
Scheduled upgrade (e.g. when a CPE connects, give it an upgrade). It’s annoying if you want to upgrade all CPEs and you find some which are only connected from time to time.
RADIUS integration: when I make a record for a CPE within Dude, access is granted for this CPE.
Get CPE traffic statistics to show within Dude.
Google Earth Integration
If you have coordinates of APs and CPEs with their signals, it would be easy to predict whether a new customer would get a usable signal by simply looking at their location within Dude.
QOS Monitoring
Give warnings when some defined levels of quality are not met: latency, bandwidth, customers per AP.
Android App.
We also notice that IO has some problems. Sometimes SNMP queries seem to fail and values are not shown.

Keep up your good work,
Stefan

odie · March 6, 2012, 12:45pm

SNMP problems get worse the more clients you monitor. It seems to be an I/O problem?
We are running the Dude server on an Intel dual 4-core server with 32 GB RAM on W2K8R2 and are getting more and more trouble with alarms because of SNMP polling errors.
A feature to move some items to a new sub-map would also be nice.

pelish · March 6, 2012, 3:36pm

I have big problems with SNMP too:
The Dude as x86 routeros package - mikrotik version 5.7 with about 250+ clients on map.

After 7 days without any problems, SNMP disconnects from all Mikrotiks with 5.x versions and from all APC UPSs, and also from some switches (some Netgear etc.). The server needs to be rebooted because it never gets SNMP info from those clients again.

Also, when the server is rebooted, it has to be connected manually to RouterOS in Device-RouterOS. If it tries to connect automatically, it never connects to the client.

All of these errors were also tested with Mikrotik 3.30 as the server, with the same result.

I would be really glad of any progress in fixing that.

lebowski · March 6, 2012, 10:32pm

Would be nice if 10GB interfaces were collected using the 64bit counters.
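A quick back-of-the-envelope calculation shows why this matters, assuming “10GB” means a 10 Gbit/s link and that the counters in question are the standard IF-MIB octet counters (32-bit `ifInOctets` versus 64-bit `ifHCInOctets`). Whether The Dude polls exactly those OIDs is an assumption, and the function name is just for illustration.

```python
def seconds_until_wrap(counter_bits: int, link_bps: float) -> float:
    """Time for an octet counter to wrap at a sustained line rate.

    counter_bits: width of the counter (32 or 64)
    link_bps:     link speed in bits per second
    """
    max_octets = 2 ** counter_bits
    return max_octets * 8 / link_bps


# A 32-bit octet counter on a saturated 10 Gbit/s link wraps in about 3.4 s,
# far shorter than a typical 30 or 60 second polling interval.
wrap_32 = seconds_until_wrap(32, 10e9)

# A 64-bit counter at the same rate takes hundreds of years to wrap.
wrap_64 = seconds_until_wrap(64, 10e9)
```

With a polling interval longer than the wrap time, the 32-bit counter can wrap several times between two polls and the measured rate becomes meaningless.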

Minollie · March 7, 2012, 11:54am

Ai… Now I understand why The Dude seemed to be dead, you were waiting for us to post bugs/requests…

Here are some bugs/request:

B1: every now and then The Dude server gives up polling SNMP values, a service restart or server reboot solves this, but that doesn’t seem to be the way to solve it if you ask me..
B2: every now and then it becomes impossible to log in to The Dude using the web interface
B3: I miss a whole lot of outages in the device page/webpage and overview. The information used to be there, but due to the redesign it went somewhere else and is now unusable most of the time.
B4: SNMP walk crashes more than once; the places where it crashes vary, no particular place or device (type)

R1: an Android app would be great
R2: (sub)map separate reporting/exporting settings (I now run into the problem Dude exports the Main Map even though you configure Sub Maps to export), and then make sure the exported data/graph is nice to look at and not messed up
R3: a way to select a part of a graph and the ability to export that selection (if possible every x days/weeks/months for management information purposes)
R4: a way to schedule maintenance per device/map or entirely
R5: a way to align text in the labels not just centered and possibly messing things up, and labels themselves, maybe enabling/disabling the self-adjusting feature of the labels
R6: implementation of multiple thresholds per device
(eg: Temperature: <15 low critical, 15-18 low major, 18-19 low minor, 20-25 OK, 26-27 high minor, 28-30 high major, >30 high critical and one stating Not Available/Error)
R7: more syslog possibilities/functionality, event forwarding would be nice
R8: a way for The Dude to work around the 32bit traffic counter problem on interfaces, which continuously go from 0 to 4GB of data, back to 0 and up again. It would be nice if this always showed increasing values, which is desirable
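For R8, one common way to get an always-increasing value out of a wrapping counter is to accumulate the difference between polls modulo the counter size. The Python sketch below uses a hypothetical class name and assumes at most one wrap between two polls; a device reboot that resets the counter would be misread as a wrap.

```python
class MonotonicCounter:
    """Accumulate a wrapping counter into an ever-increasing total.

    Assumes the raw counter wraps at most once between two updates.
    """

    def __init__(self, bits: int = 32):
        self.modulus = 2 ** bits
        self.last = None   # previous raw reading, None before the first poll
        self.total = 0     # octets seen since the first poll

    def update(self, raw: int) -> int:
        if self.last is not None:
            # Modulo arithmetic turns a wrap (raw < last) into a positive delta.
            self.total += (raw - self.last) % self.modulus
        self.last = raw
        return self.total


c = MonotonicCounter()
c.update(4294967000)   # first reading, just below the 32-bit limit
c.update(200)          # counter wrapped: delta is 296 + 200 = 496
```

Shorter polling intervals or 64-bit counters are still needed on fast links, since this trick cannot detect more than one wrap per interval.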

Well, this is it for now.. I think I might have another couple of bugs/feature requests but none come to my mind right now.

Thanks in advance for your work guys!

Regards,
Minollie

Minollie · March 7, 2012, 11:56am

One more little thing…
How about a new certificate?? The old one is getting grey hairs and looks like a dinosaur..

Thanks!

normis · March 7, 2012, 12:50pm

What do you mean by “new certificate” ?

pelish · March 7, 2012, 1:24pm

The Dude saves all passwords in plain text. When you set the device type to Mikrotik and then use a web command, it shows the password in the URL (GET method), so everyone can read it. Also, when someone uses “copy” on a device with a saved password and pastes it into Notepad, it shows the unencrypted password.

So my other request is: use hashed passwords everywhere in Dude.

Minollie · March 7, 2012, 1:25pm

Hi Normis,

With the initial setup of The Dude v3.x and, as far as I know, 4.betax, there was a security certificate included.
This one was issued by Mikrotik to be used for https and other secure links.

I found the file in [drive:]\dude\data\files directory.

But.. since you seem to have forgotten about its existence in the first place, I now know why it has expired

Is this enough information?

Regards,
Minollie

pelish · March 7, 2012, 5:27pm

Backing up Mikrotik devices from The Dude would be nice. Exports would also be useful…

soosp · March 9, 2012, 10:40am

Lack of ppc support is a barrier to migrating to The Dude v4, because some of our Dude agents are ppc devices.
Native Linux version (at least the server side) would be fine.

Duduhandelman · March 9, 2012, 2:34pm

When looking at the map list, double-clicking a map opens both the map settings and the map itself.
I assume it should only open the map.

lebowski · March 9, 2012, 2:38pm

@Duduhandelman

Check settings, Misc, contents pane behavior…

calman · March 10, 2012, 5:05pm

Dude v4.0beta3 on mikrotik powerpc devices!

Beccara · March 11, 2012, 9:39pm

Stability first and foremost. The current b3 has major stability problems: after handling a major outage when monitoring 1000+ devices, the Dude will start maxing out the CPU and memory use will spiral out of control.

Also any device with a global route table and SNMP turned on within the dude will cause stability issues. Multithreading may help

These issues occur on a very fast box, but when the Dude goes nuts it’ll consume any and all resources on the box. Compare that to PRTG monitoring the same number of devices on a box with 1/4 the spec of the Dude box, which chugs along just fine

fbsdmon · March 17, 2012, 9:31pm

Features:

Simple text fields on maps to be used as notes, would do magic for me
Ability to use functions in tools
Threshold and Baseline monitoring. It would probably be a wrapper probe(s) that can watch over another probe and would be enabled and configured on the service.
Ability to reconnect links! I would kill for this. I just hate it when I need to reconnect a cable/port to another device and lose all history when I do the same in The Dude.
Ability to disable SNMP probing of devices. Sometimes I just need to ping a device, and I cannot tell the Dude not to try to poll SNMP data off it.
RegEx in functions
Some functions are not aware of dude agents, like ping for example

Can we have a road-map ?
Can we have some commitment ?

If open sourcing the dude is not an option, maybe you would be willing to accept help from programmers that are willing to work under an NDA? I know I wouldn’t mind.

I would be more than happy to put my experience with fault management and performance monitoring software to good use and articulate some features/solutions I’ve seen in greater detail for you, if you wish.

geoffsmith31 · March 17, 2012, 11:43pm

Capability to specify a “Context” for an SNMP configuration
Security groups (AD integrated?) to give none, read only or read/write access to different maps and tools
Outages on sub-maps and sub-sub-maps to flow up to the top level
Automatic unAck for devices/probes once they are up again (option for auto or manual unAck)
SNMP trap receiver - with notifications per different types of traps received
MSSQL or MySQL database backend so we can write queries to extract data directly for reports etc.
Notification escalation. ie. A probe is down 3 times and one type of notification is sent, down 15 times and a different notification is sent, down 50 times and another type of notification is sent. Also option to send notifications repeatedly while the probe remains down.
Ability to specify which OIDs The Dude will probe by default per device type. To help overcome the high CPU utilisation some devices suffer when The Dude has snmp probe turned on.
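
The notification escalation idea in the list above could look roughly like the following Python sketch. The thresholds 3, 15 and 50 come from the example in the post; the action names and the function itself are hypothetical and not an existing Dude feature.

```python
from typing import Optional, Sequence, Tuple

# (consecutive failures, notification type); action names are made up.
ESCALATION: Sequence[Tuple[int, str]] = (
    (3, "email"),
    (15, "sms"),
    (50, "page on-call"),
)


def notification_for(
    fail_count: int,
    levels: Sequence[Tuple[int, str]] = ESCALATION,
) -> Optional[str]:
    """Return the notification to send when fail_count hits a level, else None."""
    for threshold, action in levels:
        if fail_count == threshold:
            return action
    return None
```

Calling this once per failed probe fires each notification type exactly once; repeating a notification while the probe stays down would need an extra interval check on top.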

Apart from the fact that there are things I would like to see added to The Dude, there is no denying that it is one of the best, most intuitive and easily maintainable monitoring tools that I have ever used. Keep up the good work MikroTik.