After running The Dude for a few weeks with 1000+ devices, something very strange happened: about 800 devices are down or partially down because of ping probe timeouts.
The server uses 10 agents, all running on Wine.
The ping graphs are working correctly.
To try to solve it, I did a backup and restore on a different machine, even a Windows one, but that did not change a thing.
I have no idea how to proceed.
Any idea will do.
I noticed that the outages on all devices happened at exactly the same time.
I have restored from a week-old backup. The ping probe is working now, but a lot of other services show as down. When I hit reprobe, it says down immediately. The affected services are memory, cpu, disk and a few others.
The graphs of the services that are down contain new data.
Also, when I try to discover new servers, they show on the map during discovery and disappear a few seconds later.
I also noticed that after the restore, the HTTP, switch and router probes are working well.
Is there any way to turn on debugging to know what's going on?