Ping Timeout (Database related)

Posted: Mon Feb 27, 2012 10:43 am
by Duduhandelman
After running The Dude for a few weeks with 1000+ devices, something very strange happened: about 800 devices show as down or partially down because of ping probe timeouts.
The server uses 10 agents, all running on WINE.
The ping graphs are working correctly.

While trying to solve it, I did a backup and restore on a different machine, even a Windows one, and it did not change a thing.

I have no idea how to proceed.

Any idea will do.

I noticed that the outages on all devices happened at exactly the same time.
I have restored from a week-old backup. The ping probe is working now, but a lot of other services show as down, and when I hit reprobe they are reported as down immediately. The affected services are memory, CPU, disk and a few others.
The graphs of the services that are down contain new data.
Also, when I try to rediscover new servers, they show up on the map during discovery and disappear a few seconds later.
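
One way to check the observation that the outages on all devices began at the same moment is to bucket the outage start times and count distinct devices per bucket. This is only a minimal sketch: the `(device, start_time)` input format and the function name `simultaneous_outages` are assumptions made for illustration, not something The Dude is known to export in this form, and the sample values are invented.

```python
from collections import defaultdict
from datetime import datetime


def simultaneous_outages(events, min_devices=2):
    """Group outage start times by minute.

    events: iterable of (device_name, outage_start) pairs (hypothetical format).
    Returns {minute: device_count} for minutes in which at least
    min_devices distinct devices went down.
    """
    buckets = defaultdict(set)
    for device, start in events:
        # Truncate to the minute so near-simultaneous outages share a bucket.
        minute = start.replace(second=0, microsecond=0)
        buckets[minute].add(device)
    return {m: len(d) for m, d in buckets.items() if len(d) >= min_devices}


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        ("router-1", datetime(2012, 2, 20, 3, 15, 4)),
        ("switch-7", datetime(2012, 2, 20, 3, 15, 40)),
        ("server-2", datetime(2012, 2, 21, 9, 0, 0)),
    ]
    print(simultaneous_outages(sample))
```

If one bucket holds most of the monitored devices, that points at the monitoring side (server, agent or database) rather than at the devices themselves.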

I also noticed that after the restore the HTTP, switch and router probes are working well.
Is there any way to turn on debugging to know what's going on?

Posted: Wed Mar 07, 2012 10:16 am
by metlinux
We have the same problem.
We are running The Dude with ~3500 devices.
At some point the ping probe fails and most of the devices appear as down.
It looks like a bug in the ping probe.

Posted: Wed Mar 07, 2012 4:29 pm
by Duduhandelman
This is very problematic.
I have no idea what the cause is. Since I restored the DB everything is up, but with 300 fewer devices.
I'm not adding devices.
This issue is critical.
I hope we will find it fast.
Many thanks

Posted: Wed Mar 21, 2012 11:24 am
by blue
I have the problem that Dude can't ping a device, but traceroute is working OK. Dude is running on a Win2003 server, and from Windows I CAN ping the device normally. Rebooting the Windows server didn't help. Can someone from MikroTik examine this problem?

Posted: Wed Mar 21, 2012 11:26 am
by blue
Is traceroute also using the ICMP protocol?

Posted: Sat Sep 06, 2014 5:10 pm
by blazej44800
Just run the server as admin, i.e. with administrator rights :)
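
A quick way to see whether missing privileges could be the issue is to test whether a process may open a raw ICMP socket at all, since many operating systems restrict that to administrator/root. This is a hedged sketch: it assumes the ping probe relies on raw ICMP sockets, which nothing in this thread confirms, and `can_open_raw_icmp` is a made-up helper name. Run it under the same account that runs the server.

```python
import socket


def can_open_raw_icmp():
    """Return True if this process is allowed to create a raw ICMP socket."""
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError:
        # Covers PermissionError (insufficient rights) and platform refusals.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    if can_open_raw_icmp():
        print("Raw ICMP sockets are available to this account.")
    else:
        print("Raw ICMP sockets are NOT available; try an elevated account.")
```

The check only opens and closes a socket; it sends no packets, so it is safe to run on a production monitoring host.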

Posted: Wed Jan 08, 2020 11:06 am
by miladta
I have the same problem.
I have over 300 devices and can't ping them, but traceroute completes and all the devices work properly.

Posted: Thu Jan 09, 2020 3:24 am
by macsrwe
May or may not be related…

I ran into a situation where, as my network slowly grew, the Dude suddenly became sluggish and started dropping tasks (like pings and traffic graphs) on the floor. I discovered that all the time was being spent in an encryption routine.

When you add new devices to Dude, the default is “secure mode.” If you turn “secure mode” off for your devices (which unfortunately has to be done by hand one at a time, but at least it’s just two mouse clicks per device) suddenly the Dude becomes functional again.

I was running Dude on a hEX, so I saw this effect at about 250 devices. If you’re running it on a CCR or whatever, I could see it happening at 800.