Receiving mail

Posted: Fri Nov 20, 2009 3:21 pm
by Cybertiti01

I have a problem with the mail notification. When a router goes down, sometimes I get a notification and sometimes I don't.

I've checked all the different status changes (up => down, unstable => down, etc.).

I think the problem comes from the probe interval and probe timeout.

Can you help me and tell me which values you entered for the probe and delay intervals (in the Advanced tab of Notification/Mail)?

nb: Sorry for my poor English

Re: Receiving mail

Posted: Tue Nov 24, 2009 3:10 pm
by chrisd13
If you need the device to notify quicker, I would reduce the probe count from 5 to 3 and potentially drop your probe interval from 30 to 10 or 20. This way, if your device reboots quickly, The Dude should at least notice that it has actually gone down. I used to get this with a few Windows servers that rebooted within the time my probe was set to count and notify. I altered the probe count to 3 and it now flags each time one is rebooted.

Sometimes you will need to test different intervals, delays, and counts for each individual device, but your global probe settings will normally suffice for most devices.

Hope that helps.

Re: Receiving mail

Posted: Fri Nov 27, 2009 5:11 pm
by pcpolo
Can you explain exactly how the Probe Interval, Timeout, and Count work?

Re: Receiving mail

Posted: Sun Nov 29, 2009 8:03 pm
by matthew12345
Hi... Can you explain what count work is?
I posted a link for whoever is interested:

Re: Receiving mail

Posted: Mon Nov 30, 2009 9:38 pm
by lebowski
Matthew is a spammer, now back to the subject.

Probe interval: how often the probe is executed against the device... default is every 30 seconds.
Probe timeout: how long to wait for a device to respond before the request counts as timed out... default? Mine is set to 10; others have it set to 30.
Probe down count: a counter that triggers a notification (or any configured event) when its value is reached. The counter increments each time a timeout occurs... default is 3.

So in my setup I don't notify on up to unstable, which covers the first and second probe timeouts. At the third probe timeout the device goes from unstable to down, at which point the probe down count is reached and I get an email. In my configuration it takes The Dude 1.5 minutes (3 timeouts at a 30-second interval) to decide something broke and send an email.
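
The 1.5 minute figure is roughly the probe interval multiplied by the down count. Here is a minimal Python sketch of that arithmetic, assuming one probe fires per interval and ignoring the extra wait for the last probe to time out. The function name is made up for illustration and is not part of The Dude.

```python
def seconds_until_down(probe_interval_s: int, down_count: int) -> int:
    """Rough time before a dead device is flagged as down.

    Assumption: one probe per interval, and the device is marked down
    once `down_count` consecutive probes have timed out.
    """
    return probe_interval_s * down_count


# Settings from this post: 30 s interval, down count 3
print(seconds_until_down(30, 3))  # 90 seconds, i.e. 1.5 minutes
# Settings suggested earlier in the thread: 10 s interval, down count 3
print(seconds_until_down(10, 3))  # 30 seconds
```

A device that reboots faster than this figure can be back up before the count is reached, which would fit the missed notifications described at the start of the thread.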

Re: Receiving mail

Posted: Tue Dec 01, 2009 6:09 pm
by sady
Sweetdude: Does your timeout interval work correctly? All my probes return the result "Timeout" after 10 seconds of probing, even when I set the timeout to 24 hours in the server, map, and device settings. :(

Re: Receiving mail

Posted: Wed Dec 02, 2009 2:02 am
by lebowski
Oh, you shouldn't have the timeout longer than the probe interval... so if your probe interval is 30 seconds, your timeout should be 29 seconds max, since you're going to probe again one second later. Also keep in mind a device should be responding in seconds at most. BUT I don't have any probes timing out, and I have not broken a device on purpose and timed how long it takes before they actually fail. What is causing you so many timeouts?

My probes seem to work fairly well, but I will say they have about a 5 minute cool-down. Meaning once a probe fails, it stays failed for about 5 minutes, and then the next probe after that will work if the device is back up. I don't know if they did this on purpose; I suppose it is a feature bug :)

If a probe that is based on a function fails, you can't get it to "reprobe" until at least about 4.5 minutes have passed, even if you are clicking the reprobe button. Other probes don't suffer from this; a TCP probe, for example, will reprobe instantly.

I did find that there is some other weirdness on Windows 2k3 servers. For me, if I force the service to real time, it doesn't have nearly as many false positives.
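
The rule of thumb about timeout and interval can be written as a quick sanity check. This is only a sketch of the advice in this post, not something The Dude itself exposes; the function name is hypothetical.

```python
def timeout_fits_interval(probe_interval_s: float, probe_timeout_s: float) -> bool:
    """True if a probe's timeout expires before the next probe is due."""
    return 0 < probe_timeout_s < probe_interval_s


print(timeout_fits_interval(30, 29))  # True: timeout ends 1 s before the next probe
print(timeout_fits_interval(30, 30))  # False: timeout runs into the next probe
```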

Re: Receiving mail

Posted: Wed Dec 02, 2009 5:58 pm
by lebowski
Oh yeah, there is an SNMP timeout that is separate from the polling timeout. That timeout is set on each community name you create, and it has a maximum of 10 seconds. Maybe that is the timeout you are seeing?

Re: Receiving mail

Posted: Wed Dec 02, 2009 7:47 pm
by sady
When I said "All my probes"
I meant "All my probes return in 10 sec"
not "All my probes return Timeout" :)

Just finished a test:
-Created an empty server (cleared the Dude "data" folder)
-Added a new device with telnet and ping
-Added a "dns" service (there is actually no DNS running there)
-Pressed "reprobe" and started my hand timer. 10 sec later: timeout.
-Set my Server, Map, Device and Service probe interval to 24:00:00 (24 hours) and timeout interval to 12:00:00 (12 hours)
-Pressed "reprobe". After 10 sec, "timeout" in the problem field against the dns service :(