mueller
just joined
Topic Author
Posts: 17
Joined: Mon Nov 20, 2006 7:01 pm

Probe Thread

Tue Nov 28, 2006 5:09 pm

I would like to start a thread for custom probe examples. My hope is not only to further my own knowledge but also to help others with this great software.


Here are a few I have made.

Cisco CPU
Type: Function
Available: if(oid("1.3.6.1.4.1.9.2.1.58.0")>0, 1, -1)
Error: ""
Value: oid("1.3.6.1.4.1.9.2.1.58.0")
Unit: % of cpu load

APC PDU LOAD
Type:Function
Available:if(oid("1.3.6.1.4.1.318.1.1.12.2.3.1.1.2.1")>0, 1, -1)
Error: if(oid("1.3.6.1.4.1.318.1.1.12.2.3.1.1.2.1")>0, "", "No Load")
Value: oid("1.3.6.1.4.1.318.1.1.12.2.3.1.1.2.1")
Unit: Load amps in decimal
 
winkelman
Member Candidate
Posts: 235
Joined: Wed Aug 16, 2006 5:00 pm
Location: Amsterdam, The Netherlands

Mon Dec 04, 2006 3:18 pm

Check if a certain program is running on a Windows system ('OUTLOOK.EXE' in this example):

Type: function
Available: if(array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"),"OUTLOOK.EXE")>0, 1, -1)
Error: if(array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"),"OUTLOOK.EXE")>0, "", "OUTLOOK.EXE not detected by SNMP probe")
Value: 1 (or anything else; it is purely for charting purposes, and I return 1 if the service is running)
Unit: running (or whatever you want to call the above values)
Rate: none

This of course requires that the SNMP agent is running and properly configured on the Windows system.
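
For readers who want to see the logic of this probe outside The Dude, here is a minimal Python sketch. It assumes the process-name column has already been fetched into a plain list; `process_probe` is a hypothetical helper, and the lookup is modelled as a simple exact-match membership test, since the thread does not say how `array_find` numbers its results.

```python
def process_probe(run_names, wanted):
    """Model the probe's Available and Error fields for one process name.

    run_names stands in for the values of OID column 1.3.6.1.2.1.25.4.2.1.2.
    Returns (available, error) using the same 1 / -1 and "" conventions
    as the probe above.
    """
    found = wanted in run_names  # exact, case-sensitive match (an assumption)
    available = 1 if found else -1
    error = "" if found else f"{wanted} not detected by SNMP probe"
    return available, error


# Illustrative data only, not taken from a real device.
names = ["svchost.exe", "explorer.exe", "OUTLOOK.EXE"]
print(process_probe(names, "OUTLOOK.EXE"))
print(process_probe(names, "sqlservr.exe"))
```

In this sketch, checking for a different program only means passing a different name; the column being searched stays the same.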
 
winkelman
Member Candidate
Posts: 235
Joined: Wed Aug 16, 2006 5:00 pm
Location: Amsterdam, The Netherlands

Tue Dec 05, 2006 5:24 pm

Add this to a device's notes (right-click the device, Notes):
[oid("1.3.6.1.2.1.1.1.0")]
Then the device's popup will list a description of the system.

For example, for Windows systems it will show the hardware and software platform, for Cisco devices the hardware and firmware revisions, etc.
 
winkelman
Member Candidate
Posts: 235
Joined: Wed Aug 16, 2006 5:00 pm
Location: Amsterdam, The Netherlands

Tue Dec 05, 2006 5:35 pm

The standard CPU load figure ("34%") is the average over all available CPUs. If you add this to a device's 'Appearance' label:
Load on [array_size(oid_column("iso.org.dod.internet.mgmt.mib-2.host.hrDevice.hrProcessorTable.hrProcessorEntry.hrProcessorLoad"))] CPU('s): [oid_column("iso.org.dod.internet.mgmt.mib-2.host.hrDevice.hrProcessorTable.hrProcessorEntry.hrProcessorLoad")]
then the device label will show the number of CPUs in the system and the load on each individual CPU (for example: 'Load on 4 CPU('s): 12, 15, 46, 2').

(Only tested for Windows target systems...)
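
As a rough illustration of what this label computes, here is a small Python sketch. `cpu_label` is a hypothetical helper, and the comma-separated join simply mirrors the example output quoted in the post; it is not documented Dude behaviour.

```python
def cpu_label(loads):
    """Build the label text from a list of per-CPU load percentages."""
    joined = ", ".join(str(load) for load in loads)
    return f"Load on {len(loads)} CPU('s): {joined}"


# Same numbers as the example in the post above.
print(cpu_label([12, 15, 46, 2]))
```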
 
mueller
just joined
Topic Author
Posts: 17
Joined: Mon Nov 20, 2006 7:01 pm

Mon Dec 18, 2006 12:58 pm

This is my default label.
Thanks for the additions, guys.


Device [Device.Name] ([Device.Type])
IP: [Device.AddressesCommaList]
Services ([Device.ServicesCount]):
Up: [Device.ServicesUp]
Unstable: [Device.ServicesUnstable]
Down: [Device.ServicesDown]
Acked: [Device.ServicesAcked]
Unknown: [Device.ServicesUnknown]
Dell Tag [oid("iso.3.6.1.4.1.674.10892.1.300.10.1.11.1")]
[oid("1.3.6.1.2.1.1.1.0")]
[snmp_name][snmp_description][snmp_uptime][snmp_contact][snmp_location]
Load on [array_size(oid_column("iso.org.dod.internet.mgmt.mib-2.host.hrDevice.hrProcessorTable.hrProcessorEntry.hrProcessorLoad"))] CPU('s): [oid_column("iso.org.dod.internet.mgmt.mib-2.host.hrDevice.hrProcessorTable.hrProcessorEntry.hrProcessorLoad")]
Notes:
[Device.NotesColumn]
 
mueller
just joined
Topic Author
Posts: 17
Joined: Mon Nov 20, 2006 7:01 pm

Mon Dec 18, 2006 1:00 pm

Dell temperature alert if it gets over 95 °F (the threshold of 350 suggests the OID reports tenths of a degree Celsius: 35.0 °C = 95 °F)

Available: if(oid("iso.3.6.1.4.1.674.10892.1.700.20.1.6.1.3")>0, 1, -1)
Error: if(oid("iso.3.6.1.4.1.674.10892.1.700.20.1.6.1.3")<350, "", "Over Temp 95")
Value: oid("iso.3.6.1.4.1.674.10892.1.700.20.1.6.1.3")
Unit: C
Rate: none
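
The relationship between the "95" in the description and the 350 in the Error line can be checked with a little arithmetic. The Python sketch below assumes the reading is in tenths of a degree Celsius, which is consistent with the numbers in the post but is not stated there; both helper names are hypothetical.

```python
def tenths_c_to_fahrenheit(reading):
    """Convert a reading in tenths of a degree Celsius to Fahrenheit."""
    celsius = reading / 10
    return celsius * 9 / 5 + 32


def temp_error(reading, limit=350):
    """Mirror the probe's Error field: empty string while below the limit."""
    return "" if reading < limit else "Over Temp 95"


print(tenths_c_to_fahrenheit(350))  # 35.0 C expressed in Fahrenheit
print(repr(temp_error(349)), repr(temp_error(350)))
```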
 
winkelman
Member Candidate
Posts: 235
Joined: Wed Aug 16, 2006 5:00 pm
Location: Amsterdam, The Netherlands

Fri Feb 02, 2007 8:30 pm

Warning when disk usage goes over 89%

Type: function
Available: if(hdd_usage()>0, 1, -1)
Error: if(hdd_usage()<90, "", "Disk usage > 89%")
Value: hdd_usage()
Unit: %
Rate: none

Note: this probe uses the built in hdd_usage function, so for devices with multiple hard disks it looks at the average disk space usage. Of course, the above example is easily adapted for a specific hard drive. Just replace 'hdd_usage' with the appropriate oid("xxxxx") call.
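
For a single disk, the percentage would typically be derived from two values: the used and total allocation units of that storage entry. The Python sketch below shows only that arithmetic; it assumes the two numbers have already been read (for example from the hrStorageUsed and hrStorageSize columns of the Host Resources MIB), and the helper names are hypothetical.

```python
def disk_usage_percent(used_units, size_units):
    """Percentage of a storage entry in use, or None if the size is unknown."""
    if size_units <= 0:
        return None
    return 100 * used_units / size_units


def disk_error(percent, limit=90):
    """Mirror the probe's Error field: empty string while below the limit."""
    return "" if percent < limit else "Disk usage > 89%"


usage = disk_usage_percent(45, 50)
print(usage, repr(disk_error(usage)))
```

Because both values are in the same allocation units, the unit size cancels out and does not need to be read for a percentage.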
 
Pablete
just joined
Posts: 22
Joined: Wed May 23, 2007 4:11 pm

Reachability

Thu May 24, 2007 2:20 pm

First of all, I like this tool very much.

I managed to plot a reachability graph. 100% reachability means no packet loss: "reachability (%)" = 100 - "packet loss (%)"

First, I created a function:
Name: packet_loss_test
Desc: number of replied pings from 10 ping requests (0-10)
Code:
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 ) +
if( array_element(ping(device_property("FirstAddress")) , 0)<0 , 0 , 1 )

Second, I created the probe:
Name: reachability
Type: function
Available: ping(device_property("FirstAddress")) >= 0
Error: ""
Value: packet_loss_test()*10
Unit: %

If you want finer values, you can sum 20 pings instead of 10 and change the probe's Value to
packet_loss_test()*5
but the probe will be more intrusive.
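
The scaling factors (10 for ten pings, 5 for twenty) both come from the same formula: answered pings divided by pings sent, times 100. A minimal Python sketch, assuming the ping results are already available as a list of booleans (`reachability` is a hypothetical helper):

```python
def reachability(replies):
    """Percentage of answered pings; replies is a list of True/False values."""
    if not replies:
        return 0.0
    return 100 * sum(replies) / len(replies)


# 8 answered out of 10 sent.
print(reachability([True] * 8 + [False] * 2))
```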
-------------------Post edit;-------------------------------
I have noticed that the Dude only performed two of the ten pings I wrote in the function. The options are:
1) change the function to make only two pings
and change the probe Value to packet_loss_test()*50
2) execute an external ping. I'm working on this.
3) enhance the ping function with a parameter for the number of packets to send, and have it return the number of answered packets.
-------------------Post edit, even later;-------------------------------
Now testing Dude 4beta3. The Dude stores the answer of the first ping somewhere, so it only sends one ping, which makes this probe almost useless.
The only interesting result is if you understand it as a pulse-code-modulated signal.
-------------------Post edit, even later than the previous one;-------------------------------
Not so useless. I still have my graphs.
Clipboard01.gif
Here you may see three graphs.
The first shows, mostly in yellow, the reachability of four of my WiFi computers.
The second, several computers on the Internet.
The third, a site that suffered a communications problem today.
It is not perfect, especially around the time when the raw data is rolled up into the 10-minute summary, as you can see at the right side of the third chart.

Regards
Last edited by Pablete on Fri Apr 15, 2011 12:39 am, edited 3 times in total.
 
Pablete
just joined
Posts: 22
Joined: Wed May 23, 2007 4:11 pm

proxy probe

Thu May 24, 2007 2:31 pm

This is almost a copy and paste of the http probe.

Probe definition:
Name: Google through proxy
Type: TCP
Port: 8080 (this may change at your company)
Connect Only UNCHECKED
First Receive, Then Send UNCHECKED
Send: GET http://www.google.com/ HTTP/1.0\r\n\r\n
Receive: HTTP/1.1 200 OK
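
For testing the same exchange outside The Dude, here is a hedged Python sketch of what this probe does: connect, send the request, and compare the start of the reply. Treating "Receive" as a prefix match is an assumption, and `check_proxy` and `response_ok` are hypothetical helper names.

```python
import socket

REQUEST = b"GET http://www.google.com/ HTTP/1.0\r\n\r\n"
EXPECTED = b"HTTP/1.1 200 OK"


def response_ok(data, expected=EXPECTED):
    """True if the reply starts with the expected status line."""
    return data.startswith(expected)


def check_proxy(host, port=8080, timeout=5.0):
    """Send the probe's request through the proxy and check the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(REQUEST)
        data = b""
        # Read just enough bytes to compare against the expected prefix.
        while len(data) < len(EXPECTED):
            chunk = sock.recv(len(EXPECTED) - len(data))
            if not chunk:
                break
            data += chunk
    return response_ok(data)
```

Calling `check_proxy` with your own proxy's address would perform the real test; the sketch itself opens no connection until that call is made.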
 
Pablete
just joined
Posts: 22
Joined: Wed May 23, 2007 4:11 pm

More graphs

Thu May 24, 2007 4:17 pm

Some other probes.
Bluecoat is an HTTP cache proxy. I have probes to measure the CPU of a Solaris host, and the CPU and pages/sec of the Bluecoat.

Function definition ---------------------------------
Name: cpu_bluecoat_usage
Desc: cpu usage for blue coat device
Code: oid("1.3.6.1.4.1.3417.2.4.1.1.1.4.1")

Probe definition ------------------------------------
Name: cpu_bluecoat
Type: Function
Available: cpu_bluecoat_usage()
Error: if(cpu_bluecoat_usage(), "", "down")
Value: cpu_bluecoat_usage()
Unit: %




Function definition ---------------------------------
Name: cpu_solaris_idle_ticks
Desc: timeticks in idle mode for solaris host
Code: oid("1.3.6.1.4.1.42.3.13.4.0")

Function definition ---------------------------------
Name: cpu_solaris_usage_ticks
Desc: timeticks not in idle mode for solaris host
Code: oid("1.3.6.1.4.1.42.3.13.1.0") +
oid("1.3.6.1.4.1.42.3.13.2.0") +
oid("1.3.6.1.4.1.42.3.13.3.0")

Function definition ---------------------------------
Name: cpu_solaris_total_ticks
Desc: cpu timeticks for solaris host
Code: oid("1.3.6.1.4.1.42.3.13.1.0") +
oid("1.3.6.1.4.1.42.3.13.2.0") +
oid("1.3.6.1.4.1.42.3.13.3.0") +
oid("1.3.6.1.4.1.42.3.13.4.0")

Function definition ---------------------------------
Name: cpu_solaris_usage
Desc: cpu usage for solaris host
Code: 100 *
rate(cpu_solaris_usage_ticks()) /
rate(cpu_solaris_total_ticks())

Probe definition ------------------------------------
Name: cpu_solaris
Type: Function
Available: cpu_solaris_idle_ticks()
Error: if(cpu_solaris_idle_ticks(), "", "down")
Value: cpu_solaris_usage()
Unit: %
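
The Solaris probe divides the rate of the non-idle tick counters by the rate of all tick counters. Since both rates cover the same interval, this equals the ratio of the counter differences between two polls. A minimal Python sketch of that arithmetic, with made-up sample values; the tuple layout (three non-idle counters followed by the idle counter) follows the function definitions above, and `cpu_usage_percent` is a hypothetical helper.

```python
def cpu_usage_percent(prev, curr):
    """CPU usage between two polls of four tick counters.

    prev and curr are tuples of (non_idle_1, non_idle_2, non_idle_3, idle).
    Returns None when no ticks elapsed between the polls.
    """
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    if total <= 0:
        return None
    busy = total - deltas[3]  # everything except the idle counter
    return 100 * busy / total


# Illustrative samples: 50 busy ticks out of 100 elapsed.
print(cpu_usage_percent((100, 50, 10, 840), (130, 60, 20, 890)))
```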



Function definition ---------------------------------
Name: http_requests_bluecoat
Desc: http requests for a bluecoat device
Code: oid("1.3.6.1.3.25.17.3.2.1.1.0")

Function definition ---------------------------------
Name: http_rate_bluecoat
Desc: http request rate for a bluecoat device
Code: rate( oid("1.3.6.1.3.25.17.3.2.1.1.0") )

Probe definition ------------------------------------
Name: http_pages_bluecoat
Type: Function
Available: http_requests_bluecoat()
Error: if(http_requests_bluecoat(),"", "down")
Value: http_rate_bluecoat()
Unit: pages/sec

Have I said I like this tool?
Regards
 
Tsiera
just joined
Posts: 8
Joined: Tue Feb 12, 2008 9:45 am
Location: Alphen aan de Rijn

Re: Probe Thread

Wed Feb 13, 2008 12:49 pm

I have the following probes

Check CPU, warning @ 80% CPU usage
Name: CPU usage < 80%
Type: Function
Available: if(cpu_usage()>0, 1, -1)
Error: if(cpu_usage()<80, "", "CPU usage > 79%")
Value: cpu_usage()
Unit: %
Rate: none

Check Memory, warning @ 80 % Memory usage
Name: Memory usage < 80%
Type: Function
Available: if(mem_usage()>0, 1, -1)
Error: if(mem_usage()<80, "", "Memory usage > 79%")
Value: mem_usage()
Unit: %
Rate: none

Check Virtual Memory, warning @ 80% Virtual Memory usage
Name: Virtual Memory usage < 80%
Type: Function
Available: if(virtual_mem_usage()>0, 1, -1)
Error: if(virtual_mem_usage()<80, "", "Virtual Memory usage > 79%")
Value: virtual_mem_usage()
Unit: %
Rate: none

thanks to winkelman

Are there any probes to check the disk status?
 
winkelman
Member Candidate
Posts: 235
Joined: Wed Aug 16, 2006 5:00 pm
Location: Amsterdam, The Netherlands

Re: Probe Thread

Wed Feb 13, 2008 3:23 pm

Are there any probes to check the disk status?
What do you mean? Status like 'up' or 'down' (perhaps for external disks)? Or status like '80% full'?
 
Tsiera
just joined
Posts: 8
Joined: Tue Feb 12, 2008 9:45 am
Location: Alphen aan de Rijn

Re: Probe Thread

Wed Feb 13, 2008 5:28 pm

Are there any probes to check the disk status?
What do you mean? Status like 'up' or 'down' (perhaps for external disks)? Or status like '80% full'?
With a RAID 5 I want to check that all disks are OK,
so that when one of the disks goes down I get a notification.
 
luzik
just joined
Posts: 13
Joined: Mon Feb 11, 2008 1:24 pm

Re: Probe Thread

Thu Feb 14, 2008 9:27 am

Is it possible to use parameters in a custom function, as in the built-in ones?
Right now I get the error message "too many parameters for functioname".
How do I read parameters? $1, $2?
 
winkelman
Member Candidate
Posts: 235
Joined: Wed Aug 16, 2006 5:00 pm
Location: Amsterdam, The Netherlands

Re: Probe Thread

Thu Feb 14, 2008 12:58 pm

With a RAID 5 I want to check that all disks are OK,
so that when one of the disks goes down I get a notification.
By itself that is not possible. The OS (and thus the standard SNMP agent) just sees a RAID-set as a 'single disk'. It wouldn't know the status of any of its sub-parts. However, I do know that for example IBM ServeRAID adapters allow you to install the IBM ServeRAID Manager program, which optionally includes an additional SNMP agent. That makes RAID-info available through SNMP and thus to the Dude.

Perhaps your brand of RAID adapter also has such management software available.
 
Tsiera
just joined
Posts: 8
Joined: Tue Feb 12, 2008 9:45 am
Location: Alphen aan de Rijn

Re: Probe Thread

Thu Feb 14, 2008 2:26 pm

So it's not possible to check the status of any individual disk?
Because when I do an SNMP walk I see the physical disks' OIDs.
So surely it must be possible to make a probe that checks whether the disk is OK and sends a notification when the status is failed?
 
Tsiera
just joined
Posts: 8
Joined: Tue Feb 12, 2008 9:45 am
Location: Alphen aan de Rijn

Re: Re:

Tue Mar 25, 2008 10:16 am

Check if a certain program is running on a Windows system ('OUTLOOK.EXE' in this example):

Type: function
Available: if(array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"),"OUTLOOK.EXE")>0, 1, -1)
Error: if(array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"),"OUTLOOK.EXE")>0, "", "OUTLOOK.EXE not detected by SNMP probe")
Value: 1 (or anything else; it is purely for charting purposes, and I return 1 if the service is running)
Unit: running (or whatever you want to call the above values)
Rate: none

This of course requires that the SNMP agent is running and properly configured on the Windows system.
I have a question about this probe.
If you want to check something else, like sqlservr.exe,
do you only need to replace OUTLOOK.EXE with sqlservr.exe, or do you need another OID?
 
talon63
Frequent Visitor
Posts: 65
Joined: Tue Mar 25, 2008 2:31 pm
Location: Texas USA

Re: Probe Thread

Tue Mar 25, 2008 2:40 pm

I would be interested in a probe/function that would allow detection of DHCP servers on the wire. It would prove invaluable in tracking down the odd rogue that pops up from time to time when someone puts a misconfigured router on the network. Can anyone assist in this?

thanks!

[edit] OK, I am still trying to work this out and would like to know if I am on the right track. If I create a probe with the following setup, my thought is that I would get an alert if another DHCP server is detected.
name: dhcp probe..............................................//just a name
type:snmp......................................//may or may not be the right way to go
oid: 1.3.6.1.4.1.5.1.1.55.1.1.22........................//DhcpSrvDomainServer IpAddress used to match against known dhcp server address
oid type: IP Address
compare method: !=(not equal)...........................//this will provide my comparison
ip address:xxx.xxx.xxx.xxx................................//ip addy of known dhcp server
So, what I am thinking is that something like this should detect a rogue on the wire, and if I have notifications set up for it, I should get a near-immediate alert when one is detected. Does anyone have any input or suggestions? I'm open to them.

cheers
"Technology is dominated by two types of people: those who understand what they do not manage, and those who manage what they do not understand." - Putt's Law
 
talon63
Frequent Visitor
Posts: 65
Joined: Tue Mar 25, 2008 2:31 pm
Location: Texas USA

Re: Probe Thread

Wed Apr 02, 2008 4:10 am

Bump, and a request to make this thread a sticky. I'll be more than happy to share any probes I get working if the rest of the community is willing. :D
 
beerfiend
newbie
Posts: 27
Joined: Fri Jan 04, 2008 12:18 am

Re: Probe Thread

Fri Apr 04, 2008 5:34 pm

Double bump, and I'll contribute my only custom probe. =( Wish I was better at this stuff.

Name: wirlessID
Type: snmp
OID:iso.anonymous#62.anonymous#63.ieee802dot11.dot11smt.dot11StationConfigTable.dot11StationConfigEntry.dot11DesiredSSID.1
OID type: octet string
compare method: ==
String Value: (your SSID)

I use this to take auto-discovered wireless devices and auto-ID them into the Wireless APs device type.
 
keith
Frequent Visitor
Posts: 52
Joined: Thu May 24, 2007 12:30 am

Re: Probe Thread

Fri Apr 04, 2008 7:37 pm

I agree also.

What do you mean by
"I use this to take auto-discovered wireless devices and auto-ID them into the Wireless APs device type."

I really don't understand probes and everything that can be done with them. Could I make one that would tell me which port on a switch a workstation is attached to?
 
talon63
Frequent Visitor
Posts: 65
Joined: Tue Mar 25, 2008 2:31 pm
Location: Texas USA

Re: Probe Thread

Sat Apr 05, 2008 2:41 am

I think what he means is that when the devices are discovered, they will be placed on the map as APs rather than as switches or other devices, which is what the Dude did with my APs the first time around, forcing me to manually change them into APs on the map. Thanks for the script.

As for what probes are and what you can do with them: probes are the method by which things are discovered, and "things" can cover a pretty broad area: hardware, software, protocols, configuration, etc. As to what you can do with them, that is what this thread was created for. Given enough experience and knowledge about how things work, connect, and talk to each other, and with some study of the OIDs, you can build custom probes to perform a great many tasks.

As far as it goes, I am still learning about custom probes myself, which is why I suggested making this a sticky and inviting others to participate. I am sure that there are some among the 14000+ members here who would be willing to share knowledge with the rest of us.
 
keith
Frequent Visitor
Posts: 52
Joined: Thu May 24, 2007 12:30 am

Re: Probe Thread

Mon Apr 07, 2008 7:39 pm

OK, I think I have the basics of SNMP and OIDs, but how do you implement probes? Can anyone give some really basic examples to get started with?
 
beerfiend
newbie
Posts: 27
Joined: Fri Jan 04, 2008 12:18 am

Re: Probe Thread

Tue Apr 08, 2008 4:55 pm

Yeah, just read the backscroll of this thread to get an idea of what people are using these for. I like the Cisco processor one myself; I was wondering how to do that. The network is your oyster.
 
Pablete
just joined
Posts: 22
Joined: Wed May 23, 2007 4:11 pm

Re: Probe Thread

Fri Apr 11, 2008 3:42 pm

Some more of my own

Is the proxy service up?
Name: proxy08080-google.es
Type: TCP
Port: 8080
Connect only: Unchecked
First receive, then send:Unchecked
Send: GET http://www.google.com/ HTTP/1.1\r\n\r\n
Receive: HTTP/1.1 200 OK

Be careful with this probe because it will fail in the following cases:
1) You have an authenticated proxy
2) You are in another country and you get a redirection (in my case to http://www.google.es, so I don't get a 200 OK).

And the SOCKS service? This is also a good reference for the syntax of the escape sequences in probes.
Name: socks01080-ftp.drivehq.com
Type: TCP
Port: 1080
Connect only: Unchecked
First receive, then send:Unchecked
Send: \x05\x01\x02
Receive: \x05
Send: \x05\x01\0\x03\x0fftp.drivehq.com\0\x15
Receive: 220 Welcome to the

Be careful with this probe because it will fail in the following cases:
1) You have an authenticated SOCKS server
2) Your SOCKS server is not version 5 (this probe is for SOCKS V5).
3) The \x0f before ftp.drivehq.com must be the length of the string ftp.drivehq.com. If you change the target server, the length of the DNS name will change and you must change this byte accordingly.
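
To avoid counting the hostname length by hand, the second Send string can be generated. The Python sketch below builds a SOCKS5 CONNECT request for a domain name in the layout the probe uses (version, command, reserved byte, address type, length byte, name, two-byte port); `socks5_connect_request` is a hypothetical helper, and the output is raw bytes that still need to be written in the probe's escape syntax.

```python
def socks5_connect_request(hostname, port):
    """Build a SOCKS5 CONNECT request addressed to a domain name."""
    name = hostname.encode("ascii")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1 to 255 bytes long")
    header = b"\x05\x01\x00\x03"   # version 5, CONNECT, reserved, domain name
    length = bytes([len(name)])    # the byte that is \x0f in the probe above
    return header + length + name + port.to_bytes(2, "big")


request = socks5_connect_request("ftp.drivehq.com", 21)
print(request)
print(hex(len("ftp.drivehq.com")))  # length byte for this hostname
```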
Last edited by Pablete on Mon Jul 07, 2008 6:54 pm, edited 1 time in total.
 
talon63
Frequent Visitor
Posts: 65
Joined: Tue Mar 25, 2008 2:31 pm
Location: Texas USA

Re: Probe Thread

Fri Apr 11, 2008 6:22 pm

Thanks for the probes, and thanks for listing reasons why they might fail. That should keep some of us out of trouble. :lol:
 
talon63
Frequent Visitor
Posts: 65
Joined: Tue Mar 25, 2008 2:31 pm
Location: Texas USA

Re: Probe Thread

Wed Jun 25, 2008 5:18 pm

Shameless bump of topic.
 
Pablete
just joined
Posts: 22
Joined: Wed May 23, 2007 4:11 pm

TCP Connections and TCP retransmissions

Mon Jul 07, 2008 6:52 pm

Hi, two more. They need SNMP.

The TCP retransmissions probe measures the number of TCP segments that the queried computer has had to retransmit due to lost frames. This may be due to line or LAN congestion, or to non-responding computers. A flat zero line means that no frames are lost. Note that some media, like Frame Relay, explicitly allow some frames to be dropped.
Usually, if you measure retransmissions over one second, the result will be zero. You should measure them on a per-minute basis.

function
tcp_retrans_segs
tcp retransmitted segments
oid("1.3.6.1.2.1.6.12.0")

Probe

tcp_retrans
Function
AV: tcp_retrans_segs()
ER: if(tcp_retrans_segs(), "", "down")
VAL: rate( tcp_retrans_segs() *60 )
UN: Packets
Rate: Minute
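
The per-minute figure is the growth of the retransmission counter between two polls, scaled to 60 seconds. A minimal Python sketch of that calculation follows; it assumes a 32-bit SNMP counter and handles a single wrap-around. Whether The Dude's rate() does the same is not stated in this thread, and `per_minute` is a hypothetical helper.

```python
COUNTER32_MODULUS = 2 ** 32


def per_minute(prev_count, curr_count, seconds):
    """Counter growth between two polls, scaled to a per-minute rate."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # The modulo keeps the difference correct if the counter wrapped once.
    delta = (curr_count - prev_count) % COUNTER32_MODULUS
    return delta * 60 / seconds


print(per_minute(1000, 1030, 30))        # 30 new segments in 30 seconds
print(per_minute(2 ** 32 - 5, 10, 60))   # counter wrapped between polls
```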

-------------------------------------------------------------------------------

TCP Connections gives the number of established or close-wait connections on the queried device. The number depends on how the device is used. A cache, proxy, or load-balancing device may have a really high number of TCP connections.

function
tcp_currestab
number of tcp established or closewait connections
oid("1.3.6.1.2.1.6.9.0")

Probe
tcp_estab
Function
AV: tcp_currestab()
ER: if(tcp_currestab(), "", "down")
VAL: tcp_currestab()
UN: connections

Regards
------ Post Edit ----
As Mr. Winkelman pointed out, the OID for the currently established or close-wait TCP connections is 1.3.6.1.2.1.6.9.0. Thanks for the correction. Sorry for the typo.
Last edited by Pablete on Sat Oct 10, 2009 12:50 am, edited 1 time in total.
 
zhall
Frequent Visitor
Posts: 57
Joined: Fri Aug 20, 2004 6:33 pm
Location: Virginia

Re: Probe Thread

Fri Aug 15, 2008 8:20 pm

I'm trying to get a probe working that checks the SU RSSI level and then creates a graph from it. I got it working OK once; it graphed correctly, at least. The OID for signal strength seems to change from MikroTik to MikroTik, so I haven't figured out how to deal with that. Has anybody had any luck in this department who might show me an example?

Is there any documentation for custom functions and such?
 
zhall
Frequent Visitor
Posts: 57
Joined: Fri Aug 20, 2004 6:33 pm
Location: Virginia

Re: Probe Thread

Fri Aug 15, 2008 9:44 pm

This is what I ended up with so far:

Available: rssi_avail() > 0

where rssi_avail() is a function defined as:

array_size(oid_column("iso.3.6.1.4.1.14988.1.1.1.2.1.3"))

and then for Value I have:

array_element(oid_column("iso.3.6.1.4.1.14988.1.1.1.2.1.3"), 0)

It seems to work OK.
 
winkelman
Member Candidate
Posts: 235
Joined: Wed Aug 16, 2006 5:00 pm
Location: Amsterdam, The Netherlands

Re: TCP Connections and TCP retransmissions

Mon Aug 18, 2008 4:33 pm

<snip>

function
tcp_currestab
number of tcp established or closewait connections
oid("1.3.6.1.2.1.6.12.0")

<snip>
Nice, but not entirely correct (I suspect a copy-paste error :)): established TCP connections is at oid("1.3.6.1.2.1.6.9.0")
 
CGirardy
Frequent Visitor
Posts: 78
Joined: Tue Sep 25, 2007 1:09 pm
Location: Grasse / Alpes-Maritimes / France

Re: Probe Thread

Tue Aug 26, 2008 1:31 pm

Hi,
I would love to have a probe for an AS400 system (CPU / Network / HD).

I don't understand why it works with MRTG but I cannot get it to work using The Dude.

How can I translate this from MRTG to a probe in The Dude, and why can't The Dude see the network cards?

Thanks for your help.

MRTG file used to query SNMP on the AS400:

### Interface 2 >> Descr: 'ETHLINE' | Name: '' | Ip: 'xx.xx.xx.xx' | Eth: '00-09-6b-eb-d8-56' ###

Target[s44r1755_2]: 2:xxxxxxxx@s44r1755:
SetEnv[s44r1755_2]: MRTG_INT_IP="xx.xx.xx.xx" MRTG_INT_DESCR="ETHLINE"
MaxBytes[s44r1755_2]: 12500000
Title[s44r1755_2]: Traffic Analysis for 2 -- S44R1755.xxxxxxxxxxxx.xxxxxxxx
PageTop[s44r1755_2]: <h1>Traffic Analysis for 2 -- S44R1755.xxxxxxxxxxxx.xxxxxxxxx</h1>
<div id="sysdetails">
<table>
<tr><td>System:</td> <td>S44R1755.xxxxxxxxx.xxxxxxxx in </td></tr>
<tr><td>Maintainer:</td> <td></td></tr>
<tr><td>Description:</td> <td>ETHLINE </td></tr>
<tr><td>ifType:</td> <td>ethernetCsmacd (6)</td></tr>
<tr><td>ifName:</td> <td></td></tr>
<tr><td>Max Speed:</td> <td>100.0 Mbits/s</td></tr>
<tr><td>Ip:</td> <td>xx.xx.xx.xx (s44r1755.xxxxxxx.xxxxxxxx)</td></tr>
</table>
</div>

### Interface 3 >> Descr: 'ETHLINE2' | Name: '' | Ip: 'xx.xx.xx.xx' | Eth: '00-09-6b-eb-d8-57' ###

Target[s44r1755_3]: 3:xxxxxx@s44r1755:
SetEnv[s44r1755_3]: MRTG_INT_IP="xx.xx.xx.xx" MRTG_INT_DESCR="ETHLINE2"
MaxBytes[s44r1755_3]: 12500000
Title[s44r1755_3]: Traffic Analysis for 3 -- S44R1755.xxxxxx.xxxxxx
PageTop[s44r1755_3]: <h1>Traffic Analysis for 3 -- S44R1755.xxxxx.xxxxx</h1>
<div id="sysdetails">
<table>
<tr><td>System:</td> <td>S44R1755.xxxxxx.xxxxx in </td></tr>
<tr><td>Maintainer:</td> <td></td></tr>
<tr><td>Description:</td> <td>ETHLINE2 </td></tr>
<tr><td>ifType:</td> <td>ethernetCsmacd (6)</td></tr>
<tr><td>ifName:</td> <td></td></tr>
<tr><td>Max Speed:</td> <td>100.0 Mbits/s</td></tr>
<tr><td>Ip:</td> <td>xx.xx.xx.xx (vsiprd.xxxxxxx.xxxxxxx)</td></tr>
</table>
</div>


### Interface 4 >> Descr: 'TRNLINE' | Name: '' | Ip: 'xxx.xxx.xxx.xxx' | Eth: '40-00-00-00-08-20' ###

Target[s44r1755_4]: 4:xxxxx@s44r1755:
SetEnv[s44r1755_4]: MRTG_INT_IP="xxx.xxx.xxx.xxx" MRTG_INT_DESCR="TRNLINE"
MaxBytes[s44r1755_4]: 2000000
Title[s44r1755_4]: Traffic Analysis for 4 -- S44R1755.xxxxxx.xxxxxx
PageTop[s44r1755_4]: <h1>Traffic Analysis for 4 -- S44R1755.xxxxxx.xxxxxx</h1>
<div id="sysdetails">
<table>
<tr><td>System:</td> <td>S44R1755.xxxxxxx.xxxxxx in </td></tr>
<tr><td>Maintainer:</td> <td></td></tr>
<tr><td>Description:</td> <td>TRNLINE </td></tr>
<tr><td>ifType:</td> <td>iso88025TokenRing (9)</td></tr>
<tr><td>ifName:</td> <td></td></tr>
<tr><td>Max Speed:</td> <td>16.0 Mbits/s</td></tr>
<tr><td>Ip:</td> <td>xxx.xxx.xxx.xxx (ASGRASSE.xxxxxx.xxxxxx)</td></tr>
</table>
</div>

### CPU ###

YLegend[s44r1755.processorLoad]: % Utilization
Options[s44r1755.processorLoad]: growright,gauge,nopercent,nobanner
Target[s44r1755.processorLoad]: .1.3.6.1.2.1.25.3.3.1.2.1&.1.3.6.1.2.1.25.3.3.1.2.2:xxxxxxx@s44r1755
MaxBytes[s44r1755.processorLoad]: 100
Title[s44r1755.processorLoad]: S44R1755 : Utilisation du Processeur
ShortLegend[s44r1755.processorLoad]: %
Legend1[s44r1755.processorLoad]: Utilisation Processeur #1
Legend2[s44r1755.processorLoad]: Utilisation Processeur #2
Legend3[s44r1755.processorLoad]: Utilisation Processeur #1 (Maximum)
Legend4[s44r1755.processorLoad]: Utilisation Processeur #2 (Maximum)
LegendI[s44r1755.processorLoad]: &nbsp;Load:
LegendO[s44r1755.processorLoad]: &nbsp;Load:
PageTop[s44r1755.processorLoad]: <H1>S44R1755 : Utilisation du Processeur </H1>
<TABLE>
<TR><TD>System:</TD> <TD>S44R1755</TD></TR>
</TABLE>

### HD ###

YLegend[s44r1755.Disk]: Occupation Disque
Options[s44r1755.Disk]: growright,gauge
Target[s44r1755.Disk]: .1.3.6.1.2.1.25.2.3.1.6.1&.1.3.6.1.2.1.25.2.3.1.5.1:xxxxxxxx@s44r1755 * 4096
MaxBytes[s44r1755.Disk]: 281555238912
Title[s44r1755.Disk]: S44R1755 : Occupation Disque
ShortLegend[s44r1755.Disk]: o
Legend1[s44r1755.Disk]: Occupation disque
Legend2[s44r1755.Disk]: Taille du disque
Legend3[s44r1755.Disk]: Occupation disque Maximum
Legend4[s44r1755.Disk]: Taille disque Maximum
LegendI[s44r1755.Disk]: &nbsp;Occupé :
LegendO[s44r1755.Disk]: &nbsp;Taille :
PageTop[s44r1755.Disk]: <H1>S44R1755 : Occupation Disque </H1>
<TABLE>
<TR><TD>System:</TD> <TD>S44R1755</TD></TR>
</TABLE>
PLEASE... Release source code to the community and start developing The Dude again....
PLLLLLLLLLLLEEEEEEEEEAAAAAAAAAASSSSSSSSSEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE (big cry for help :))
 
lebowski
Forum Guru
Posts: 1614
Joined: Wed Aug 27, 2008 5:17 pm

Re: Probe Thread

Thu Sep 04, 2008 9:17 pm

I believe this is a correction to the Cisco CPU function in the first post. Using 0 where the -1 was keeps this probe from being added to devices that do not have a value in the Cisco CPU OID. More specifically, I think "if(x>0,1,-1)" returns a non-false value when the condition fails, whereas "if(x>0,1,0)" returns false. I noticed that a lot of other functions in this thread also return -1 on false; those can be fixed by following this example.

Test and correct me if I am wrong.

Cisco CPU
Type: Function
Available: if(oid("1.3.6.1.4.1.9.2.1.58.0")>0, 1, 0)
Error: ""
Value: oid("1.3.6.1.4.1.9.2.1.58.0")
Unit: % of cpu load

Thanks for the great thread! keep it up :)
Sweet!!!
 
lebowski
Forum Guru
Posts: 1614
Joined: Wed Aug 27, 2008 5:17 pm

Re: Probe Thread

Fri Sep 05, 2008 12:45 am

Some APC UPS probes I made based on the one above. They all seem to work, I just haven't actually tested them.
Create a new probe of type Function and paste the lines:
Name is line 1,
Available is line 2,
Error is line 3,
Value is line 4,
Unit is line 5; if you want similar values to show up on the same graph, give them the same Unit description.

Checks to see if there is voltage on the input.
APC_Vin
if(oid("1.3.6.1.4.1.318.1.1.1.3.2.1.0")>0, 1, 0)
if(oid("1.3.6.1.4.1.318.1.1.1.3.2.1.0")>0, "", "No Power")
oid("1.3.6.1.4.1.318.1.1.1.3.2.1.0")
total

Measures battery temperature; should fix it to complain if too hot...
APC Battery Temperature
if(oid("1.3.6.1.4.1.318.1.1.1.2.2.2.0")>0, 1, 0)
if(oid("1.3.6.1.4.1.318.1.1.1.2.2.2.0")>0, "", "No Temp reading")
oid("1.3.6.1.4.1.318.1.1.1.2.2.2.0")
total

Battery capacity; errors if the battery is at half charge or less. Needs testing.
APC Capacity
if(oid("1.3.6.1.4.1.318.1.1.1.2.2.1.0")>0, 1, 0)
if(oid("1.3.6.1.4.1.318.1.1.1.2.2.1.0")>50, "", "Battery less than half")
oid("1.3.6.1.4.1.318.1.1.1.2.2.1.0")
%

Total load on UPS; complains if the load reaches 80 or more. Needs testing.
APC_Load
if(oid("1.3.6.1.4.1.318.1.1.1.4.2.3.0")>0, 1, 0)
if(oid("1.3.6.1.4.1.318.1.1.1.4.2.3.0")<80, "", "Over Load")
oid("1.3.6.1.4.1.318.1.1.1.4.2.3.0")
%
 
lebowski
Forum Guru
Posts: 1614
Joined: Wed Aug 27, 2008 5:17 pm

Re: Probe Thread - improved apc temperature by adding range

Wed Oct 15, 2008 3:46 am

An improved battery temperature probe: it complains if the temperature is out of range. Now if I could only figure out how to make it keep graphing when it is out of range. I am certain I could solve that with another probe, but WHY :)


APC Battery Temperature
if(oid("1.3.6.1.4.1.318.1.1.1.2.2.2.0")>0, 1, 0)
if(or(oid("1.3.6.1.4.1.318.1.1.1.2.2.2.0")<20,oid("1.3.6.1.4.1.318.1.1.1.2.2.2.0")>40),concatenate("temperature out of range = ",oid("1.3.6.1.4.1.318.1.1.1.2.2.2.0")),"")
oid("1.3.6.1.4.1.318.1.1.1.2.2.2.0")
total
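
The Error line above is easier to read when unpacked. Here is a minimal Python sketch of the same range check, using the 20 and 40 limits from the probe; `temperature_error` is a hypothetical helper, and the message text is copied from the concatenate() call.

```python
def temperature_error(reading, low=20, high=40):
    """Return an error message when the reading is outside low..high."""
    if reading < low or reading > high:
        return f"temperature out of range = {reading}"
    return ""


for value in (15, 25, 45):
    print(value, repr(temperature_error(value)))
```

Readings of exactly 20 or 40 count as in range here, matching the strict < and > comparisons in the probe.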
 
lebowski
Forum Guru
Posts: 1614
Joined: Wed Aug 27, 2008 5:17 pm

Re: Probe Thread

Fri Oct 17, 2008 12:51 am

A quick way to make a probe. This one makes your UPS show when a battery needs replacing.

Snmpwalk the device you want to create a probe for, right-click the OID once you find it, and select Create Probe.

You can modify the function, but this one worked with no changes...


iso.org.dod.internet.private.enterprises.apc.products.hardware.ups.upsBattery.upsAdvBattery.upsAdvBatteryReplaceIndicator.0
probe.JPG
Lucky or not here they are :)
failed.JPG
 
sdrenner
Member Candidate
Member Candidate
Posts: 138
Joined: Wed Mar 02, 2005 10:03 pm
Contact:

Re: Probe Thread

Sat Oct 25, 2008 12:04 am

Are these SNMP OIDs read via MT or straight from the APC UPS?
 
lebowski
Forum Guru
Forum Guru
Posts: 1614
Joined: Wed Aug 27, 2008 5:17 pm

Re: Probe Thread

Mon Nov 03, 2008 4:35 pm

For the APCs I read the device directly, reading the oid ... You might need PowerNet-MIB.mib.
 
adamd292
newbie
Posts: 49
Joined: Tue Oct 07, 2008 8:56 am

Re: Probe Thread

Wed Nov 12, 2008 3:14 am

I have some VMWare ESX Servers and NetApp filers, and wrote the following functions and probes to instrument CPU and Memory for them in The Dude (there is also a Windows service probe):

FUNCTIONS

cpu_mem_disk_enhanced
Same as the supplied cpu_mem_disk, but with added support for VMWare and NetApp CPU load
concatenate(
if(cpu_usage_available(), concatenate("cpu: ", round(cpu_usage()), "% "), ""),
if(vmcpu_usage_available(), concatenate("cpu: ", round(vmcpu_usage()), "% "), ""),
if(netappcpu_usage_available(), concatenate("cpu: ", round(netappcpu_usage()), "% "), ""),
if(mem_usage() > 0, concatenate("mem: ", round(mem_usage()), "% "), ""),
if(virtual_mem_usage() > 0, concatenate("virt: ", round(virtual_mem_usage()), "% "), ""),
if(hdd_usage() > 0, concatenate("disk: ", round(hdd_usage()), "% "), ""),
if(netapphdd_usage() > 0, concatenate("disk: ", round(netapphdd_usage()), "% "), "")
)
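
As a rough illustration of how this label is assembled, here is a small Python sketch of the same idea: each reading contributes a fragment only when it is available, and the fragments are joined in order. The function and parameter names are hypothetical, and the sketch folds the three CPU sources into one value for brevity.

```python
def cpu_mem_disk_label(cpu=None, mem=0, virt=0, disk=0):
    """Build a label such as 'cpu: 12% mem: 64% ' from the readings."""
    parts = []
    # cpu is None when no CPU source (host, VMWare, NetApp) is available.
    if cpu is not None:
        parts.append(f"cpu: {round(cpu)}% ")
    # The remaining readings are shown only when they are above zero.
    if mem > 0:
        parts.append(f"mem: {round(mem)}% ")
    if virt > 0:
        parts.append(f"virt: {round(virt)}% ")
    if disk > 0:
        parts.append(f"disk: {round(disk)}% ")
    return "".join(parts)


print(repr(cpu_mem_disk_label(cpu=12.4, mem=63.6)))
```

A device with no readings at all produces an empty string, which is what lets device_performance_enh skip the line entirely.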


device_performance_enh
Adjusted to call cpu_mem_disk_enhanced
if(
string_size(cpu_mem_disk_enhanced()) > 0,
concatenate(cpu_mem_disk_enhanced(), "
"),

""
)


failed_service_summary
Shows failed services. Used to produce summary information on maps
if (device_property("ServicesCount")<>device_property("ServicesUpCount"), concatenate("Services (", device_property("ServicesUpCount"), "/", device_property("ServicesCount"), ")
", if (device_property("ServicesUnstableCount"), concatenate("Unstable: ", device_property("ServicesUnstableCount"), "
"), ""), if (device_property("ServicesDownCount"), concatenate("Down: ", device_property("ServicesDownCount"), "
"), ""), if (device_property("ServicesAckedCount"), concatenate("Acked: ", device_property("ServicesAckedCount"), "
"), ""), if (device_property("ServicesUnknownCount"), concatenate("Unknown: ", device_property("ServicesUnknownCount"), "
"), "")), "")


netappcpu_usage
Reports the CPU usage in percent of a NetApp
oid("1.3.6.1.4.1.789.1.2.1.3.0")

netappcpu_usage_available
Detects whether NetApp CPU usage is available
if(oid("1.3.6.1.4.1.789.1.2.1.3.0"),1,0)

netapphdd_usage
Reports NetApp Total Storage usage as a percentage.
oid("1.3.6.1.4.1.789.1.5.7.3.0")

vmcpu_usage
Report the CPU usage in percent on a VMWare server
100-oid("1.3.6.1.4.1.2021.11.11.0")

vmcpu_usage_available
Detects whether VMWare CPU usage is available.
if (oid("1.3.6.1.4.1.2021.11.11.0"),1,0)

FUNCTION PROBES

VMWare – cpu
Monitor the host CPU on a VMWare server
Available: vmcpu_usage_available()
Error: if(vmcpu_usage_available(), "", "down")
Value: vmcpu_usage()
Unit: %
Rate: none


NetApp - active disks
Number of active RAID disks
Available: if(oid("1.3.6.1.4.1.789.1.6.4.2.0")>0,1,0)
Error: if(oid("1.3.6.1.4.1.789.1.6.4.2.0")>0,"","none")
Value: oid("1.3.6.1.4.1.789.1.6.4.2.0")
Unit:
Rate: none


NetApp - bad disks
Number of failed RAID disks
Available: if(oid("1.3.6.1.4.1.789.1.6.4.7.0")=0,1,0)
Error: if(oid("1.3.6.1.4.1.789.1.6.4.7.0")=0,"","bad disks")
Value: oid("1.3.6.1.4.1.789.1.6.4.7.0")
Unit:
Rate: none


NetApp - bad fans
Number of failed cooling fans.
Available: if(oid("1.3.6.1.4.1.789.1.2.4.2.0")=0,1,0)
Error: if(oid("1.3.6.1.4.1.789.1.2.4.2.0")=0,"","bad fans")
Value: oid("1.3.6.1.4.1.789.1.2.4.2.0")
Unit:
Rate: none


NetApp - bad power
Number of failed power supply units.
Available: if(oid("1.3.6.1.4.1.789.1.2.4.4.0")=0,1,0)
Error: if(oid("1.3.6.1.4.1.789.1.2.4.4.0")=0,"","bad psu")
Value: oid("1.3.6.1.4.1.789.1.2.4.4.0")
Unit:
Rate: none


NetApp – cpu
CPU Busy % time
Available: netappcpu_usage_available()
Error: if(netappcpu_usage_available(), "", "down")
Value: netappcpu_usage()
Unit: %
Rate: none


NetApp - space used
Percentage of disk space used
Available: if(oid("1.3.6.1.4.1.789.1.5.7.1.0")>0,1,0)
Error: if(oid("1.3.6.1.4.1.789.1.5.7.1.0")>1,"disk space low","")
Value: oid("1.3.6.1.4.1.789.1.5.7.3.0")
Unit: %
Rate: none


NetApp – spare disks
Number of spare disks in NetApp array
Available: if(oid("1.3.6.1.4.1.789.1.6.4.8.0")>0,1,0)
Error: if(oid("1.3.6.1.4.1.789.1.6.4.8.0")>0,"","no spares")
Value: oid("1.3.6.1.4.1.789.1.6.4.8.0")
Unit:
Rate: none


NetApp – total disks
Total number of disks in NetApp array
Available: if(oid("1.3.6.1.4.1.789.1.6.4.1.0")>0,1,0)
Error: if(oid("1.3.6.1.4.1.789.1.6.4.1.0")>0,"","no disks")
Value: oid("1.3.6.1.4.1.789.1.6.4.1.0")
Unit:
Rate: none


Windows Service MyService
Is the Windows Service MyService up
Available: if(array_find(oid_column("1.3.6.1.4.1.77.1.2.3.1.1"),"MyService")>0, 1, 0)
Error: if(array_find(oid_column("1.3.6.1.4.1.77.1.2.3.1.1"),"MyService")>0, "", "MyService not detected by SNMP probe")
Value: 1
Unit:
Rate: none


*Need to replace MyService with the name of the service that you actually want to monitor

SNMP PROBES

NetApp - over temp
Probe to detect NetApp temperature limit
Oid: iso.org.dod.internet.private.enterprises.netapp.netapp1.sysStat.environment.envOverTemperature.0
Type: integer
Compare: ==
String Value: 1


NetApp - status
Probe to detect when the NetApp is not ok
Oid: iso.org.dod.internet.private.enterprises.netapp.netapp1.sysStat.misc.miscGlobalStatus.0
Type: integer
Compare: ==
String Value: 3



Also in "Server Configuration" -> "Map" -> "Device Appearance"
I modified the global default for device label to use the enhanced device performance function from the list above

Change Device Appearance Label to:
[Device.CustomField1]
[Device.Name]
[Device.FirstAddress]
[device_performance_enh()][failed_service_summary()]
 
breazer
just joined
Posts: 8
Joined: Mon Nov 24, 2008 1:47 pm

Re: Probe Thread

Mon Nov 24, 2008 1:52 pm

Hi All,
Do any of you know of a probe that will monitor a port?

Regards
Breazer
 
rebellion
newbie
Posts: 32
Joined: Tue Oct 14, 2008 5:25 pm
Location: Russia_tlt

Re: Probe Thread

Tue Nov 25, 2008 4:44 am

What exactly do you want to monitor?
Use snmpwalk with the correct community name to see the available monitoring parameters (I think you need the ifTable section of the device).
 
breazer
just joined
Posts: 8
Joined: Mon Nov 24, 2008 1:47 pm

Re: Probe Thread

Tue Nov 25, 2008 1:04 pm

Hi
Thanks for your reply. I am trying to monitor whether an ethernet port goes down.
I can see the ifTable entry of the interface but I'm not too sure how to create a probe for it.
Do you know if there is anywhere I can get this info to create a probe? Again, thanks for your help.

Breazer
 
Toepfe
newbie
Posts: 30
Joined: Fri Oct 31, 2008 11:48 am

Re: Probe Thread

Mon Dec 01, 2008 4:13 pm

Hi,

Somewhere in this great thread, Tsiera shows how to check the memory usage, and I used the same approach. Now I noticed that for a switched-off server this special memory probe still shows ok?! The built-in "memory" probe was down, as it should be when the server is not reachable ;-)

I guess the reason is that function "mem_usage()" could not work correctly if the function "mem_size()" has no value.

I found two workarounds:

1. I changed function "mem_usage()" to:

if(mem_size() > 0,
oid(concatenate("iso.org.dod.internet.mgmt.mib-2.host.hrStorage.hrStorageTable.hrStorageEntry.hrStorageUsed.",
array_element(
oid_column("iso.org.dod.internet.mgmt.mib-2.host.hrStorage.hrStorageTable.hrStorageEntry.hrStorageIndex", 600),
array_find(
oid_column("iso.org.dod.internet.mgmt.mib-2.host.hrStorage.hrStorageTable.hrStorageEntry.hrStorageType", 600),
"iso.org.dod.internet.mgmt.mib-2.host.hrStorage.hrStorageTypes.hrStorageRam"
))))
* 100 / mem_size(), -1)

My probe looks as following:
Name: memory_95
Type: Function
Agent: default
Available: mem_size() > 0
Error: if(and(mem_usage()>=0, mem_usage() < 95), "", "down")
Value: mem_usage()
Unit: %
Rate: none

2. Workaround, instead of the error line above use:
Error: if(ping(device_property("FirstAddress")) >= 0, if(mem_usage() < 95, "", "down"), "down")


In the second workaround no changes to the function "mem_usage()" are necessary. But anywhere else you use the unmodified mem_usage(), you will still get the behavior that the probe is ok although the server is not reachable.
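
To make the first workaround easier to follow, here is a minimal Python sketch of the guard. It assumes that an unreachable server yields no memory size (modelled here as None or 0); how The Dude represents a missing value internally is not confirmed by this thread, and the function names are only for illustration.

```python
def mem_usage(used, size):
    """Usage in percent, or -1 when there is no usable memory size."""
    # Assumption: an unreachable device gives size None or 0.
    if size is None or size <= 0:
        return -1
    return used * 100 / size


def memory_95_error(usage):
    """Error line of the memory_95 probe: '' is ok, 'down' is a problem."""
    # The lower bound is what catches the -1 'no reading' marker.
    return "" if 0 <= usage < 95 else "down"


print(repr(memory_95_error(mem_usage(512, 1024))))
print(repr(memory_95_error(mem_usage(0, None))))
```

Without the lower bound in the Error line, the -1 marker would pass the "< 95" test and the probe would stay ok, which is the behavior described above.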

Bye
 
breazer
just joined
Posts: 8
Joined: Mon Nov 24, 2008 1:47 pm

Re: Probe Thread

Thu Dec 04, 2008 7:07 pm

Thanks, it's a port on a switch and a router I need to monitor

Regards
Breazer
 
ittsmith
just joined
Posts: 3
Joined: Wed Dec 10, 2008 11:56 pm

Re: Probe Thread

Thu Dec 11, 2008 12:01 am

I'm having problems trying to monitor a service using the following...

Windows Service MyService
Is the Windows Service MyService up
Available: if(array_find(oid_column("1.3.6.1.4.1.77.1.2.3.1.1"),"MyService")>0, 1, 0)
Error: if(array_find(oid_column("1.3.6.1.4.1.77.1.2.3.1.1"),"MyService")>0, "", "MyService not detected by SNMP probe")
Value: 1
Unit:
Rate: none


*Need to replace MyService with the name of the service that you actually want to monitor


I know SNMP is fine because I can monitor, let's say, "Windows Audio", and if I stop/start it I get the expected responses. I've tried the simple service name of my problem service, and what I THINK is the long name of my service, and still nothing. Does it matter what type of Windows service you are monitoring? This is a 3rd party service. What is the definitive way to get the service name? If I do an SNMP walk, I do see it has a service name and that matches up to the simple name. What am I missing?
 
Toepfe
newbie
Posts: 30
Joined: Fri Oct 31, 2008 11:48 am

Re: Probe Thread

Thu Dec 11, 2008 4:53 pm

Could it be that you are using the wrong OID? I check Windows services as follows:

Available: if(array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"), "db2sec.exe")>0, 1, -1)
Error: if(array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"), "db2sec.exe")>0, "", "DB2_db2sec.exe not detected by SNMP probe")
Value: 1
Unit: running
Rate: none

Replace "db2sec.exe" with the name of the service you want to monitor. One thing I also noticed is that the name is case-sensitive! Write the service name exactly as you see it in the Windows Task Manager.

Sometimes it also helps to use the if() function for debugging. If you are not sure what you will get from "if(array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"), "db2sec.exe")>0" and want to see it, put the same command in both result branches. For example:

Error: if(array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"), "db2sec.exe")>0, array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"), "db2sec.exe"), array_find(oid_column("1.3.6.1.2.1.25.4.2.1.2"), "db2sec.exe"))

With this error line you will see the output of the command in the "Problem" column of the "Services" table. Sometimes it helps me a lot to find bugs in my commands.
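
The case-sensitivity point can be illustrated with a small Python model of the lookup. This is only a sketch: it assumes array_find returns a 1-based position on a match and -1 otherwise (the ">0" tests in this thread suggest this, but it is not confirmed here), and the list of names is invented sample data.

```python
def array_find(values, wanted):
    """Model of the lookup: position of the first exact match, else -1."""
    # Assumption: positions start at 1, and -1 means 'not found'.
    for position, value in enumerate(values, start=1):
        if value == wanted:  # exact comparison, so case matters
            return position
    return -1


running = ["System", "svchost.exe", "db2sec.exe"]  # invented sample data
print(array_find(running, "db2sec.exe"))
print(array_find(running, "DB2SEC.EXE"))
```

Returning the raw result, as the debugging Error line does, shows immediately whether the name was matched or whether -1 came back.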
 
ittsmith
just joined
Posts: 3
Joined: Wed Dec 10, 2008 11:56 pm

Re: Probe Thread

Thu Dec 11, 2008 7:20 pm

Thanks, that worked. I was careful about capitalization and tried the service with and without the .exe, and also with its short name. I'm guessing it was the new OID number you gave me that made it work. Now I'm digging in to find out how to get the free disk space of the partitions I have on a single disk. It does me no good to know my disk is 50% free when in reality my c:\ partition is at 99% used. I see OID numbers for the partitions with differing instances, so I'm assuming I'll have to use an array function. Right now, whenever I use the OID I think is correct I get a -1 returned (thank you for showing how to troubleshoot what gets returned), so that's not it. When I do the SNMP walk, there are so many OIDs... I'm still digging in.
 
rebellion
newbie
Posts: 32
Joined: Tue Oct 14, 2008 5:25 pm
Location: Russia_tlt

Re: Probe Thread

Thu Dec 11, 2008 7:24 pm

breazer wrote: Thanks, its a port on a switch and a router I need to moniter

For example, here is what to do to monitor the oper state of the first port on a DLink DES 3526 switch:
1. Open a telnet session to your switch and use the command "show ports" to see the admin/oper state of the ports.
2. Snmpwalk this device from The Dude with the correct community name (default - CommunityRead, if I'm not wrong).
3. Go to if_table -> ifOperStatus.
4. Check that the OID indexes match what you see in the telnet session (or rtfm).
5. Once you have checked and found the correct OID for the oper state of the first port, right-click the line with the OID and choose "create probe".
6. In the probe settings, check the data type (must be integer) and the compare value =1 (state is up).
After that you will have a probe that can be added to all switches of this type.
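
The comparison that the probe from the steps above performs can be sketched in Python as follows. The OID and the status codes are those of the standard IF-MIB ifOperStatus object; the helper names are invented for illustration, and the interface index on a given switch still has to be confirmed as described in step 4.

```python
# ifOperStatus column of the standard IF-MIB interface table.
IF_OPER_STATUS_OID = "1.3.6.1.2.1.2.2.1.8"

# Status codes defined for ifOperStatus.
IF_OPER_STATUS = {
    1: "up",
    2: "down",
    3: "testing",
    4: "unknown",
    5: "dormant",
    6: "notPresent",
    7: "lowerLayerDown",
}


def oper_status_oid(if_index):
    """Full OID for the oper state of one interface."""
    return f"{IF_OPER_STATUS_OID}.{if_index}"


def port_is_up(value):
    """Same test as the probe: integer value equal to 1."""
    return value == 1


print(oper_status_oid(1))
print(port_is_up(2), IF_OPER_STATUS[2])
```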
 
Toepfe
newbie
Posts: 30
Joined: Fri Oct 31, 2008 11:48 am

Re: Probe Thread

Fri Dec 12, 2008 9:10 am

To check the free space (in Gb) of the first disk, I use the following probe:

Available: if((((((oid("1.3.6.1.2.1.25.2.3.1.5.1")-oid("1.3.6.1.2.1.25.2.3.1.6.1"))*oid("1.3.6.1.2.1.25.2.3.1.4.1"))/1024)/1024)/1024)>0, 1, -1)

Error: if((((((oid("1.3.6.1.2.1.25.2.3.1.5.1")-oid("1.3.6.1.2.1.25.2.3.1.6.1"))*oid("1.3.6.1.2.1.25.2.3.1.4.1"))/1024)/1024)/1024)>4, "", concatenate("Disk 1 (", string_substring(oid("1.3.6.1.2.1.25.2.3.1.3.1"),0,2), ") free space < 4 Gb"))

Value: (((((oid("1.3.6.1.2.1.25.2.3.1.5.1")-oid("1.3.6.1.2.1.25.2.3.1.6.1"))*oid("1.3.6.1.2.1.25.2.3.1.4.1"))/1024)/1024)/1024)

Unit: Gb
Rate: none

If you want to check disk 2, just change the last number of all OIDs from "1" to "2", and the message text from "Disk 1" to "Disk 2". The same goes for disks 3, 4, 5, and so on. This probe will warn you once the free space drops to 4 Gb or below. To change the threshold, modify ">4" in the Error line to any other value (and of course the message text "free space < 4 Gb").

The part "string_substring(oid("1.3.6.1.2.1.25.2.3.1.3.1"),0,2)" of the Error line gives you the disk name (e.g. C:) in the problem column of the device's services screen. The whole message text looks like: "Disk 1 (C:) free space < 4 Gb"
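
Here is the arithmetic of the probe worked through in Python, as a sketch only. The three inputs correspond to the size (.5), used (.6) and allocation unit (.4) columns that the probe reads; the sample numbers and the description string are invented for the example.

```python
def free_gb(size_units, used_units, allocation_unit_bytes):
    """Free space: (size - used) * allocation unit, divided by 1024 three times."""
    free_bytes = (size_units - used_units) * allocation_unit_bytes
    return free_bytes / 1024 / 1024 / 1024


def disk_error(disk_number, descr, free, threshold_gb=4):
    """Error line: '' while free space is above the threshold."""
    if free > threshold_gb:
        return ""
    # descr[0:2] plays the role of string_substring(descr, 0, 2), e.g. 'C:'.
    return f"Disk {disk_number} ({descr[0:2]}) free space < {threshold_gb} Gb"


# Invented sample: a 100 Gb volume with 4096-byte allocation units.
free = free_gb(26214400, 25690112, 4096)
print(free)
print(disk_error(1, "C:\\ Label:System", free))
```

With these sample values the free space works out to exactly 2 Gb, so the Error line returns the warning text.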

You should also pay attention to the fact, that if you have a diskette drive, this drive is typically disk 1. If you take a look at the "Snmp" screen of the device and switch to "Storage" you see all the disks and the memory. I use this order to find out the disk number.

I know it is not ideal to have, for example, 5 probes to check 5 disks, and to edit 5 probes whenever something changes, but unfortunately it is not possible to use the probe name in functions. With that feature you would be able to write a function that takes some of its values from the probe name. For example, with the probe name "Disk 1, 2Gb" the function would use "1" as the disk number and "2" as the space value.

But maybe the MikroTik people will implement this feature in future ;-)

@breazer: Sorry, I have no experience in checking ports of switches; I only use the Dude for checking server hardware and their services.
 
ittsmith
just joined
Posts: 3
Joined: Wed Dec 10, 2008 11:56 pm

Re: Probe Thread

Thu Dec 18, 2008 5:13 pm

Toepfe,
Thanks for your help. I just need percentage full, not actual threshold of a specific GB value so I combined a previous thread with some of the more advanced features you showed me regarding displaying the C: or D: in the error message as such:

NAME: Disk1UsedSpace

AVAILABLE: if((oid("1.3.6.1.2.1.25.2.3.1.6.1")/oid("1.3.6.1.2.1.25.2.3.1.5.1"))*100>0, 1, -1)

ERROR: if((oid("1.3.6.1.2.1.25.2.3.1.6.1")/oid("1.3.6.1.2.1.25.2.3.1.5.1"))*100<80, "", concatenate("Disk 1 (", string_substring(oid("1.3.6.1.2.1.25.2.3.1.3.1"),0,2), ") used space is currently at (", string_substring (oid("1.3.6.1.2.1.25.2.3.1.6.1")/oid("1.3.6.1.2.1.25.2.3.1.5.1"))*100 , ") % "))

VALUE: (oid("1.3.6.1.2.1.25.2.3.1.6.1")/oid("1.3.6.1.2.1.25.2.3.1.5.1"))*100

UNIT: %

RATE: none

Note that I used your previous concept of displaying the value in the ERROR line which is not only useful for troubleshooting but for determining how critical the percentage full really is. I do have a couple of questions.
Did I need string_substring in there? I cheated and just borrowed your example and morphed it to my own and (after tons of replacing parentheticals and quotes) massaged it so it works.
Is there a manual that shows what all the parameters for AVAILABLE and ERROR are? For example, the second set of quotes in ERROR is acting as a placeholder, but for what? Or in the AVAILABLE what is the 0,1,-1 at the end? I'm assuming these are LOGICAL operators but would love to know where to find a list of all the combinations.
In regards to my disk percentage number, is there a way to get that to display in the interface? Right now when my circle turns orange, it simply tells me that disk1usedspace is the problem, but I would think I could get it to say the % as well.

Finally, and I think you answered this in your previous post but I didn't understand the answer, is there a way to make a NAME:Alldisksusedspace check where we use arrays to look at all disk percentages, and if any of them is over 80% used it alerts? I went and made disk1, disk2, disk3, and disk4 alerts and apply them as appropriate since I needed to get this up and running, but I really don't care WHAT disk flags as an alert for me. Whichever one flags will cause me to have to investigate, so that is why I wonder if I can use some type of array concept.
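
The "all disks" idea in the paragraph above can be written down as plain logic in Python. This is only a sketch of the intended check; it does not show whether The Dude's function language can express the loop, and the function names and sample rows are invented.

```python
def used_percent(used_units, size_units):
    """Used space in percent, or None when the entry has no size."""
    if size_units <= 0:
        return None  # e.g. an empty removable drive
    return used_units * 100 / size_units


def disks_over_threshold(disks, threshold=80):
    """Return a short label for every disk at or above the threshold."""
    flagged = []
    for descr, used, size in disks:
        percent = used_percent(used, size)
        if percent is not None and percent >= threshold:
            flagged.append(f"{descr[0:2]} {round(percent)}%")
    return flagged


# Invented sample rows: (description, used units, size units).
disks = [("A:\\", 0, 0), ("C:\\", 99, 100), ("D:\\", 50, 100)]
print(disks_over_threshold(disks))
```

An empty result means no disk needs attention; any entry in the list is enough to raise a single alert, whichever disk caused it.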

Thanks for your help, it was very educational and I couldn't have created the above code snippets without your posts. Enjoy.