Can The Dude send a mail containing the value of an oid?
I have to mail the description of a specific port on a Cisco router when the interface goes down.
The description contains the TGU of the line.
Can somebody help me?
Thanks
Anything you can get to show on a device's label (its appearance) can just as easily be placed directly into a notification.
For example, putting this on the label of an APC UPS will show the temperature on the label:
Temp: [oid("1.3.6.1.4.1.318.1.1.10.2.3.2.1.4.1")]
If I place the above oid in a notification, I would see that value in either the body or the subject, depending on how I crafted the notification, IF the oid exists on the device…
I would create a new notification almost identical to “Notification”, add the extra oid(s) you need, and enable that new notification only on that device.
HTH
SD
Thank you for the answer. I tried putting the oid in the description and I can see the TGU of the line in the device description, but when I put the same oid in a new notification, the mail that arrives just repeats the text of the oid.
E.g.:
Service [Probe.Name] on [Device.Name] is now [Service.Status] ([Service.ProblemDescription])
[Service.NotesColumn]
TGU: [oid("1.3.6.1.2.1.31.1.1.1.18.2")]
The mail repeats the text of the last row exactly as written: TGU: [oid("1.3.6.1.2.1.31.1.1.1.18.2")]
Oh hmm, I have not done exactly this, so I must be mistaken, but I think there is a solution.
The only way I was able to track the status of an interface was to snmpwalk the device and create a probe that tracked the operational status.
This is not the greatest solution, but it works. Just give the probe the same name as the description on the interface.
I tried to create a function-based probe that would give an error, but that didn’t work.
Hi sweetdude, thanks for the reply!
I tried your solution but it doesn’t work: the notification mail returns the name of the probe but not the description of the interface.
Now I’m trying to insert the oid description in the notes field of the service, but I don’t understand the right syntax to pass it.
Can someone help me with this problem?
Bye
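The solution is simple using functions:
<?xml version="1.0" ?>
<Function>
<sys-type>57</sys-type>
<sys-id>1345886</sys-id>
<sys-name>If_1_Status</sys-name>
<code>oid_raw("1.3.6.1.2.1.2.2.1.8.1")</code>
<descr>Check the status of interface number 1</descr>
</Function>
<?xml version="1.0" ?>
<Function>
<sys-type>57</sys-type>
<sys-id>1345962</sys-id>
<sys-name>If_1_Descr</sys-name>
<code>oid("1.3.6.1.2.1.2.2.1.2.1")</code>
<descr>Description of an interface</descr>
</Function>
<?xml version="1.0" ?>
<Probe>
<sys-type>13</sys-type>
<sys-id>1345894</sys-id>
<sys-name>If_1_Status</sys-name>
<typeID>8</typeID>
<functionAvailable>If_1_Status()</functionAvailable>
<functionError>if (If_1_Status() = 1, "", concatenate(" Warning Interface ", If_1_Descr(), " is down"))</functionError>
<functionValue>If_1_Status()</functionValue>
</Probe>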
Note: you need to create 2 functions and 1 probe for each monitored interface index.
It looks tedious if you have to monitor many oids, but you can change the oids and function names directly in the XML and then copy-paste it into functions or probes.
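If editing the XML by hand gets tedious, the per-interface blocks could also be generated. The following is only a rough sketch: the element names and the sys-type/typeID values are copied from the XML exports posted in this thread, the helper names (`function_xml`, `probe_xml`, `interface_xml`) are made up for illustration, and leaving `sys-id` empty assumes The Dude assigns an id on import (as in the export with empty `sys-id` posted later in the thread).

```python
from xml.sax.saxutils import escape

STATUS_OID = "1.3.6.1.2.1.2.2.1.8"  # ifOperStatus column, as used in this thread
DESCR_OID = "1.3.6.1.2.1.2.2.1.2"   # ifDescr column, as used in this thread


def function_xml(name: str, code: str, descr: str) -> str:
    """Build one <Function> block (sys-type 57 taken from the thread)."""
    return "\n".join([
        "<Function>",
        "<sys-type>57</sys-type>",
        "<sys-id></sys-id>",  # assumption: left empty so an id is assigned on import
        f"<sys-name>{escape(name)}</sys-name>",
        f"<code>{escape(code)}</code>",
        f"<descr>{escape(descr)}</descr>",
        "</Function>",
    ])


def probe_xml(name: str, available: str, error: str, value: str) -> str:
    """Build one function-type <Probe> block (sys-type 13, typeID 8 from the thread)."""
    return "\n".join([
        "<Probe>",
        "<sys-type>13</sys-type>",
        "<sys-id></sys-id>",
        f"<sys-name>{escape(name)}</sys-name>",
        "<typeID>8</typeID>",
        f"<functionAvailable>{escape(available)}</functionAvailable>",
        f"<functionError>{escape(error)}</functionError>",
        f"<functionValue>{escape(value)}</functionValue>",
        "</Probe>",
    ])


def interface_xml(if_index: int) -> str:
    """Return the two functions and one probe for a single interface index."""
    status = f"If_{if_index}_Status"
    descr = f"If_{if_index}_Descr"
    error = (
        f'if ({status}() = 1, "", '
        f'concatenate(" Warning Interface ", {descr}(), " is down"))'
    )
    return "\n".join([
        function_xml(status, f'oid_raw("{STATUS_OID}.{if_index}")',
                     f"Check the status of interface number {if_index}"),
        function_xml(descr, f'oid("{DESCR_OID}.{if_index}")',
                     f"Description of interface number {if_index}"),
        probe_xml(status, f"{status}()", error, f"{status}()"),
    ])


if __name__ == "__main__":
    for index in (1, 2, 3):
        print(interface_xml(index))
```

Each printed block would then be pasted into a function or probe the same way as the hand-edited XML.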
You can also do without the second function; then the probe will look like this:
<?xml version="1.0" ?>
<Probe>
<sys-type>13</sys-type>
<sys-id>1345894</sys-id>
<sys-name>If_1_Status</sys-name>
<typeID>8</typeID>
<functionAvailable>If_1_Status()</functionAvailable>
<functionError>if (If_1_Status() = 1, "", concatenate(" Warning Interface ", oid("1.3.6.1.2.1.2.2.1.2.1"), " is down"))</functionError>
<functionValue>If_1_Status()</functionValue>
</Probe>
To sweetdude: your function probe probably did not work because you used oid("1.3.6.1.2.1.2.2.1.8.1"), which returns the string "up (1)", whereas oid_raw("1.3.6.1.2.1.2.2.1.8.1") returns 1.
Ahh yes, thanks gsandul, that was the reason my probe didn’t work. I won’t forget oid_raw. Someone else tried to solve tracking interfaces in the past and never got a satisfactory answer. Nice work on that.
Right, I was just saying to name the probe after the description of the interface you are tracking…
When tracking specific interfaces you’re going to be building lots of probes, but I would use gsandul’s method since it will work on all devices of the same type and give the correct description.
IT WORKS GREAT!!! thanks
Now I can repeat that solution for all the router interfaces!!
Thanks to gsandul and sweetdude for the replies.
I can confirm this works! Thank you all!!!
one more thing: Lebowski, in http://forum.mikrotik.com/t/receiving-mail/33221/1 you wrote:
I wonder: does this happen to all your probes or only to the “custom” probes that you created?
I am having this “issue” now as well.
Ping, telnet, etc. all come back quickly if a device goes down completely, but the “custom” probes that contain OIDs seem to need 2-3 times longer to report back.
However, they all do come back, and that’s what is important.
I have solved this for my custom function-based probes.
The description of the “oid” function is “returns value of given snmp OID. Only first parameter mandatory. First parameter - oid string, second - cache time - default 5 seconds (5.0), third - negative cache time - default 5 minutes (300.0), forth - ip address (overrides context device), fifth - snmp profile (overrides context device)”
So for my custom probes I have increased the cache time and reduced the negative cache time to 29 seconds, one second less than the probe interval.
For example, for a probe that finds how many inactive phones there are, the error line is:
if(oid("1.3.6.1.4.1.9.9.156.1.5.6.0", 10, 29) >= 0, "", "No Inactive Phones")
All the oid-based functions support manually setting the negative cache time:
array_size(oid_column("1.3.6.1.4.1.9.2.1.57", 10, 29))
I have completely eliminated false positives by supplying a negative cache time that is lower than the retry interval in all my probes. In most cases I applied it directly to the function that is called inside the probe.
I was hoping that increasing cache time would cause the graphs to be more stable. Sometimes a value is graphed into the next time slice and the graph will have a divot followed by a spike. Increasing cache time doesn’t seem to solve this but at least I have very few false positives. I have not tried to fix the built in functions like virtual memory. I don’t monitor many servers so fixing or creating my own version has not been a priority.
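To illustrate why a negative cache time shorter than the probe interval can remove false positives, here is a toy model in Python. It is not The Dude's actual implementation, just an assumed reading of the function description quoted above: a successful value is reused for the cache time, a failed lookup is reused for the negative cache time. The `CachedOid` class, the `failed_polls` helper and the device that drops exactly one request at t=0 are all hypothetical.

```python
from typing import Callable, Optional


class CachedOid:
    """Toy cache: successes live for cache_time, failures for negative_cache_time."""

    def __init__(self, fetch: Callable[[float], Optional[str]],
                 cache_time: float = 5.0, negative_cache_time: float = 300.0):
        self.fetch = fetch
        self.cache_time = cache_time
        self.negative_cache_time = negative_cache_time
        self._value: Optional[str] = None
        self._expires = float("-inf")

    def get(self, now: float) -> Optional[str]:
        # Only ask the device again once the cached entry has expired.
        if now >= self._expires:
            self._value = self.fetch(now)
            failed = self._value is None
            ttl = self.negative_cache_time if failed else self.cache_time
            self._expires = now + ttl
        return self._value


def failed_polls(negative_cache_time: float,
                 interval: float = 30.0, polls: int = 10) -> int:
    """Count polls that see a failure when only the request at t=0 is lost."""

    def fetch(now: float) -> Optional[str]:
        # Hypothetical device: drops the very first request, answers afterwards.
        return None if now == 0 else "1"

    cached = CachedOid(fetch, cache_time=10.0,
                       negative_cache_time=negative_cache_time)
    return sum(cached.get(i * interval) is None for i in range(polls))


if __name__ == "__main__":
    print("default negative cache (300 s):", failed_polls(300.0))
    print("negative cache of 29 s:", failed_polls(29.0))
```

In this model a single lost request keeps failing every poll for the full five minutes with the default, while with 29 seconds the next poll at 30 seconds asks the device again.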
I’m trying to log when a user logs in to a Windows XP machine.
For this I’ve written an SNMP extension which returns the username via an oid.
It works well and I’ve changed the label of the WinXP devices (so I can see who is logged on to each device).
Now I’m trying to log that oid in a separate log, but I’m not able to build the probe.
In the examples above the oid is logged when the probe becomes inactive, but is it possible to do this when the probe becomes active?
well.. you could create a probe that goes down when the interface comes up…
Create a Function (like: “interface_status”) that reads the status of the interface (up=1, down=2) and then create a probe (type function) with the “available” condition interface_status() = 2
Now if it comes up, it will report as “down” - get what I mean?
I have not yet figured out how to report a status other than “OK” when something is up. I would love to show an OID there for our helpdesk, but I suspect this is hard-coded into The Dude…
Andreas
Yes, for example, the value
I’ll try your solution (so simple that I don’t know why I didn’t think of it myself before posting!),
but I don’t like it very much, because devices will show as “unstable” when they are not.
I think I’ll use a separate map and change the colors for my Windows devices.
Would you request this as a new Dude feature? I’m sure your English is better than mine.
Gsandul, Lebowski,
I have had bad experiences with OIDs being called in probes… however, I might have overdone it by polling four OIDs per probe and creating 300 probes in total…
The Dude somehow stopped discovering services that I had written probes for. (and they were up and online)
Are you aware of some SNMP polling limit that the Dude has? SNMP uses UDP so sessions would have to be maintained on the Application level (meaning the Dude) - or do you know where to enable debugging in the Dude?
On the bright side: the [Service.ProblemDescription] field now contains interface name and description so I can use it in the email notifications
Thanks,
Andreas
P.S. current workaround is to create 3 functions per ifindex and tie them to one probe: Example for ifindex 4:
<Function>
<sys-type>57</sys-type>
<sys-id></sys-id>
<sys-name>If_4_Status</sys-name>
<code>oid_raw("1.3.6.1.2.1.2.2.1.8.4")</code>
<descr>polls the ifindex 4 Status</descr>
</Function>
<Function>
<sys-type>57</sys-type>
<sys-id></sys-id>
<sys-name>If_4_Name</sys-name>
<code>oid("1.3.6.1.2.1.2.2.1.2.4")</code>
<descr>polls the ifindex 4 interface name</descr>
</Function>
<Function>
<sys-type>57</sys-type>
<sys-id></sys-id>
<sys-name>If_4_Desc</sys-name>
<code>oid_raw("1.3.6.1.4.1.9.2.2.1.1.28.4")</code>
<descr>polls the ifindex 4 interface description</descr>
</Function>
<Probe>
<sys-type>13</sys-type>
<sys-id></sys-id>
<sys-name>IFindex_4</sys-name>
<typeID>8</typeID>
<functionAvailable>If_4_Status() = 1</functionAvailable>
<functionError>if (If_4_Status() = 1, "", concatenate( If_4_Desc()," on ", If_4_Name(), " is down!" ) )</functionError>
<functionValue>If_4_Status()</functionValue>
</Probe>
<Notification>
<sys-type>24</sys-type>
<sys-id></sys-id>
<sys-name>Email alert (TESTING)</sys-name>
<typeID>1</typeID>
<textTemplate>\0d\0aA network alert was received from [Device.Type] [Device.Name] at [TimeAndDate]\0d\0a\0d\0aStatus: [Probe.Name] on [Device.Type] [Device.Name] is now [Service.Status]\0d\0aDescription: [Service.ProblemDescription]\0d\0aPlease report this alarm to be investigated.\0d\0a\0d\0a\0d\0a\0d\0aAlarm details:\0d\0aService [Probe.Name] on [Device.Name] is now [Service.Status] ([Service.ProblemDescription])\0d\0a\0d\0aDevice details:\0d\0aType: [Device.Type]\0d\0aModel: [Device.CustomField1]\0d\0aLocation: [Device.CustomField2]\0d\0aNote: [Device.CustomField3]\0d\0a\0d\0a\0d\0aHave a nice day\0d\0a\0d\0a\0d\0a\0d\0aThe Dude\0d\0a\0d\0a\0d\0a</textTemplate>
<mailTo>name@invalid.com</mailTo>
<mailSubject>NETWORK ALERT: Service [Probe.Name] on [Device.Type] [Device.Name] is now [Service.Status]</mailSubject>
<statusList></statusList>
</Notification>
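When editing a long textTemplate like the one above directly in the XML, it can help to preview the result before importing it. A small Python sketch follows. It assumes that `\0d` and `\0a` in the export are hex escapes for CR and LF, and it simply substitutes sample values for the `[Variable]` placeholders. The `render_preview` helper and the sample values are made up for illustration and have nothing to do with how The Dude itself expands the variables.

```python
import re
from typing import Dict


def render_preview(template: str, values: Dict[str, str]) -> str:
    """Decode \\0d\\0a style escapes and fill in sample placeholder values."""
    # Assumption: a backslash followed by two hex digits encodes one character.
    text = re.sub(r"\\([0-9a-fA-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), template)
    text = text.replace("\r\n", "\n")
    # Replace [Name] placeholders that have a sample value; leave the rest visible.
    return re.sub(r"\[([A-Za-z0-9.]+)\]",
                  lambda m: values.get(m.group(1), m.group(0)), text)


if __name__ == "__main__":
    template = (r"Status: [Probe.Name] on [Device.Name] is now [Service.Status]"
                r"\0d\0aDescription: [Service.ProblemDescription]")
    sample = {
        "Probe.Name": "IFindex_4",
        "Device.Name": "router1",  # hypothetical device name
        "Service.Status": "down",
    }
    print(render_preview(template, sample))
```

Placeholders without a sample value stay in the output in square brackets, which makes it easy to spot a mistyped variable name.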
Have you tried doubling your polling times to see if things stabilize (can it auto-discover again)? I doubt that 600 probes is the problem, but then what is it?
I have about 300 probes on a quad core and CPU is hovering around 4%. server rx and tx are 3kbps to 300kbps with it mostly on the low side ~10kbps and spikes to 200kbps for few seconds here and there. I have heard tale of an install with many more probes…
I run the dude process at “real time” priority at all times. Forcing it to real time, I see two benefits: one (maybe just in my head), graphing seems more stable, and two, when I add a new device and hit the 4.2g snmp bug, I can usually just wait a few seconds to a minute before I get a response back in the client. Then I can delete the device and undo the delete. This has been much less frustrating than when I ran it at normal priority and it took forever to finish…
Not that this helps but here is a screen shot of my server utilization on a Gigabit Ethernet…
Maybe we can find out if a certain amount of traffic is above the time sensitive processing requirements?