Dude 3.0 beta7 service hangs/stops monitoring

Hi there,

Is anybody else experiencing Dude 3.0 beta 7 hanging (it stops monitoring) after around 20-24 hours of running?

The maps still “work”, I mean you can edit devices, move them around, create links, etc., but all the data and service monitoring is dead: it stops and shows exactly the same values forever until the service “The Dude Server” is restarted.

Any ideas, or a way to restart the service periodically in order to keep my network monitored?

Any help would be really appreciated.

Jorge Boardman

I am getting this as well. Actually, these are exactly the same symptoms as with 3.0 Beta 6, which is very disappointing. I was really hoping this would get fixed, as there was a lot of discussion about the problem with Beta 6 and we have waited a long time for a fix.

BUMP !!!

Any comments from the developers?

Hi, is anybody else experiencing this issue? MikroTik guys, any recommendations?

This Dude server runs only for that purpose on a Dell PowerEdge 350 server (850 MHz P3 CPU, 512 MB RAM) with Windows XP.

Most unnecessary services are turned off, so all resources are dedicated to The Dude.

Best Regards

Yep same problem for me.

Please make the export files at the moment it happens and send them to support@mikrotik.com so we can check the problem.

Yes, I’ve seen this problem too for a long time and have implemented the following workaround (which has done quite well for me for several months now).

On the Dude server, I made a little batch file (restartdude.cmd):

@echo off
net stop "The Dude Server"
net start "The Dude Server"

Then I downloaded the freeware nncron lite from nnsoft (http://www.nncron.ru/). I installed it as a service and set the cron.tab file to run restartdude.cmd once an hour.

It has worked well for me for over 6 months. I occasionally get a dead spot, but the brief stop/start doesn’t even show, and the service has been running quite well this way.

I too have had problems with The Dude running out of memory and then hanging, and I was unable to stop the service or end the task from Task Manager, so I wrote a small script and set it to run every hour as a scheduled task. It uses the pskill command from Sysinternals (http://tinyurl.com/2ouz4o) to kill the dude process in case it doesn’t respond to the net stop command.

REM stops the dude
net stop "The Dude Server"

REM kills the process if it is still running
C:\pstools\pskill.exe -t dude.exe

REM starts the dude
net start "The Dude Server"
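
For anyone who would rather script this outside of a batch file, the same stop, kill, start sequence can be sketched in Python. This is only a sketch, assuming Python is installed on the Dude server; the service name, the pskill path and the dude.exe image name are taken from the script above, while restart_dude and its run parameter are made-up names for illustration. Unlike the batch file, it only calls pskill when net stop reports a failure.

```python
import subprocess

SERVICE = "The Dude Server"          # service name as used in the script above
PSKILL = r"C:\pstools\pskill.exe"    # pskill location as used in the script above
IMAGE = "dude.exe"                   # process image name as used in the script above


def restart_dude(run=subprocess.call):
    """Stop the service, force-kill the process if the stop fails, start it again.

    `run` (hypothetical parameter) takes an argument list and returns the
    exit code. It defaults to subprocess.call and is injectable only so the
    sequence can be exercised without touching a real server.
    Returns the list of commands that were issued, in order.
    """
    issued = []

    def step(cmd):
        issued.append(cmd)
        return run(cmd)

    # net stop exits non-zero when the stop fails (it also does so when the
    # service was not running, in which case pskill simply finds nothing).
    if step(["net", "stop", SERVICE]) != 0:
        # -t kills the process together with its descendants
        step([PSKILL, "-t", IMAGE])

    step(["net", "start", SERVICE])
    return issued
```

Calling restart_dude() with no arguments issues the real commands through subprocess.call; passing a fake run function makes it possible to check the sequence on a machine without The Dude.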

Anyone know why the memory holes are going unplugged??

Our dude server is running non-stop for weeks, so those export files would help to find your problem.

I wish I could, however, my company believes there is information which should not be released. What typically cures the problem with other people who have submitted the xml export?

I emailed my log file last week.
Did you find anything?
Thanks,
Aaron

We just experienced the same problem. It happens quite often, actually. I sent along the export file, but it is rather large, so I am not sure if it will make it through.

Yep, same symptoms here…

Scenario:
»The only server roles on the machine Dude runs on are Dude and VMware.
»I run Dude as a service.
»I constantly have 2 windows open separately, with 2 network diagrams that I use as a dashboard.

Problem:
When I leave work, I usually lock the computer with the 2 windows open.
What happens is that the next day, when I unlock the computer, everything is green… that is… stalled!
The problem is that if I don’t check it right away (by enabling/disabling a map), I am fooled into believing that everything is fine when it isn’t: although the icons are green, if a node goes down its icon will never change, because monitoring is stalled!

Another problem that I noticed on Dude 3.0Beta7 is…

Scenario:
»I have several maps created, but normally I only need to monitor some of them, so I disable all the maps that I don’t need to monitor and save network bandwidth.

Problem:
When I restart the service to correct the problem I mentioned previously, the maps that I disabled are monitored again. The icons get coloured, but the maps still show the disabled status.
To correct this I have to select all the maps that I want disabled, enable them and then disable them again! Only then do the icons turn gray.

I hope to see these problems fixed in the next beta release.
Despite these problems, congratulations on this tool… the best monitor I’ve found after a long search and comparison.

Best Regards.

New developments on this issue…

I tested something different… instead of running Dude as a service, I’m running it in “All the Time” mode, since I keep the same login on that computer and never log off (I just lock the computer).

24 hours have passed and everything is working fine now…
I will keep running in this mode and will post future results.

Has anything been discovered by the MT crew on this? I am experiencing the same problem. The Dude (3.0b7) is running on a Win2K3 server as a service, and after 24-48 hours, many of my devices start showing as down. If I restart the service in windows, all comes back green. My network has about 100 devices being monitored on multiple sub-maps.

Based on Synclops’ suggestion above, I’ve just changed from running as a service to running All The Time and just keeping the Windows user logged on.

Synclops, has this resolved your issue?

Sorry to say Analog...but it didn't work out! :frowning:

...so I went back to running as a service and made a batch file:

@echo off

REM Stopping Dude Service....
REM .
net stop "The Dude Server"

REM Starting Dude Service....
REM .
net start "The Dude Server"

This was suggested by a user above.

...Copy and paste this into a file and then use the Windows Task Scheduler (or other task scheduling software) to run it every 12 hours.
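
If the built-in scheduler is used, the task can also be registered from the command line with schtasks instead of going through the wizard. Here is a rough sketch in Python that only builds the command. It assumes the batch file was saved as C:\scripts\restartdude.cmd and the task is called RestartDude; both names, and the helper function itself, are made up for the example.

```python
def schtasks_create_args(task_name, script_path, every_hours):
    """Build a `schtasks /create` command that runs script_path every N hours.

    task_name and script_path are whatever you chose on your own server;
    nothing here is specific to The Dude.
    """
    # For an hourly schedule, schtasks accepts a modifier of 1 to 23 hours.
    if not 1 <= every_hours <= 23:
        raise ValueError("every_hours must be between 1 and 23")
    return [
        "schtasks", "/create",
        "/tn", task_name,          # name of the scheduled task
        "/tr", script_path,        # program or script to run
        "/sc", "hourly",           # schedule type
        "/mo", str(every_hours),   # repeat interval, in hours
        "/ru", "SYSTEM",           # run under the system account, no login needed
    ]


# Hypothetical task name and path, matching the 12-hour interval above.
args = schtasks_create_args("RestartDude", r"C:\scripts\restartdude.cmd", 12)
print(" ".join(args))
```

The printed line can be pasted into a command prompt on the server, or the list can be handed to subprocess.check_call(args) to register the task directly.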

To work around the other problem... the maps that were disabled and start being monitored despite being disabled... :slight_smile:

...I'm trying to find macro software that records actions (mouse and keyboard) to enable all the maps, disable them all again and then enable just the ones that I want to monitor constantly.

Unfortunately this seems to be the only solution... since MT doesn't really care about fixing these issues. I understand that it is freeware, but it's a shame really... because it still gives the brand a bad customer support image.
If money is the problem... I believe that people wouldn't mind paying up to $50 for this tool. It is not much, but put all together it would pay a development team to fix these bugs.

Best Regards to all the participants of this topic.
Synclops.

That is unfortunate. I’ll give that script a try.

uldis and normis, is there anything we can do to help troubleshoot this issue? The Dude is such great software and yes, I too would be willing to pay for it if that would help justify more development time at MT. It has such great potential.

I had also noticed this behavior for a long time, but attributed it to server issues. I have mentioned this to MT with regard to corporate acceptance: having gaps in graphs can never be good for reporting.
However, contrary to what others mention, a new installation doesn’t seem to have this annoying problem. I installed Dude 3.0 beta? on a fresh new 2K3 server and it had the same problem, but since I upgraded to beta7 I haven’t had any gaps in over a month.
Of course, when you change two variables (as invariably happens in development), you wonder which one solved the problem. The new installation is monitoring a much smaller network (but initially showed gaps), and the installation is essentially on a dedicated Dude server. I now have a website on the server, but still haven’t seen gaps.
Formerly, MRTG and WUG were also running on the same box. That server was more than sufficiently powered (2x3GHz Xeon, 2GB RAM), so I never saw any significant utilization. Possibly multiple SNMP applications probing hundreds of devices caused some sort of contention. I never did see any gaps in MRTG though. I just checked all the links, and so far so good.

Just thought I would comment on this, since this was hard to explain to even basic users when they asked ‘why are there gaps?’. I had trained tech support to capture bandwidth exports to show customers everything was OK with their circuit. Longer period graphs were not usable for that handy purpose.

Also, maybe for another thread, but beta7 doesn’t seem to have anomalous memory usage. Previously, I would see memory rise to a maximum of around 250MB, then cyclically drop back to normal and start its sine-wave rise again. The memory usage is now much lower and stays pretty consistent. Possibly the two problems were related.
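
To check whether memory is creeping up again, the memory usage of the process could be logged with the Windows tasklist command. Below is a minimal sketch in Python, assuming the process image is called dude.exe (as in the pskill script earlier in the thread) and that tasklist is available on the server; the function names are made up.

```python
import csv
import subprocess


def parse_tasklist_mem_kb(csv_text):
    """Return {pid: memory usage in KB} from `tasklist /FO CSV /NH` output."""
    usage = {}
    for row in csv.reader(csv_text.splitlines()):
        if len(row) < 5:
            continue  # blank lines or the "INFO: No tasks..." message
        # Columns: image name, PID, session name, session number, memory usage.
        # Keep digits only, so "23,456 K" and "23.456 K" both parse.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            usage[int(row[1])] = int(digits)
    return usage


def dude_mem_kb(image="dude.exe"):
    """Ask tasklist for the given image name and return {pid: KB}."""
    out = subprocess.check_output(
        ["tasklist", "/FI", "IMAGENAME eq " + image, "/FO", "CSV", "/NH"],
        universal_newlines=True,
    )
    return parse_tasklist_mem_kb(out)
```

Logging dude_mem_kb() every few minutes would show whether the rise-and-fall pattern comes back; any threshold for triggering a restart would have to be picked per server.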

I don’t know if others are still experiencing the gaps, but I hope this helps…

znet, thank you for your testimony.

I forgot to mention that I have Dude v3.0Beta7, but the maps were created on Dude v2.2.
When I decided to use Dude v3.0Beta7 instead of v2.2, I used the Import/Export feature, and it seemed to work fine because all the information was properly displayed in the newer version.

Here are some other details about the network topology that I’m monitoring:
I am in a LAN environment, monitoring branch office connections to the main office. The connections are dedicated access links (frame relay) and the network bandwidth at each branch is 64kbps. The main office has 2Mbps of network bandwidth to all the branch offices.

We are monitoring about 70 branch offices, just by using PING and TELNET.
Although I have several maps where nodes are monitored using SNMP, those maps are disabled for monitoring.
Only 2 maps are enabled: Branch Office Gateways (PING and TELNET) and Branch Office Servers (PING).

So… I don’t really know if the problem is related to network size, network bandwidth or even network services. I leave that to the MT crew to test and analyse.

Once again, I think this tool is very useful, even for MT RouterOS customers monitoring their networks, since Dude has features for RouterOS. So I don’t understand why this is not being investigated by MT.
Three months have passed and comments from the forum administrators have been very scarce! :frowning:

Hope this new information will help find a solution.
Greetings to all.

I just loaded 3.0 B7 after having an extremely stable 2.2 running. The B7 version started hanging in service mode almost immediately (it takes less than 5 minutes). The Dude program consumes about 1GB of RAM and the CPU runs at 70% or higher, where it was previously about 23MB of RAM and 1% CPU usage. Everything on the computer slows to a stop until I can stop the service; then all is normal. The only way to get the service to run normally again is to reset the configuration when starting the service and then reimport the settings from disk.

Backed down to B6 and the problem is there also. Same time frame and same problems.

Reinstalled 2.2 and everything is back to being a happy camper.