Lebowski and other probe-crazies ;)
I tried out the probe/function ideas that you assembled in the wiki at
http://wiki.mikrotik.com/wiki/Getting_s ... and_probes
but I ran into a problem:
[code]#sho proc cpu sort
CPU utilization for five seconds: 46%/9%; one minute: 63%; five minutes: 44%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
196 21695537441564813062 1386 31.35% 44.85% 29.82% 0 SNMP ENGINE
191 2585685402353894333 0 2.87% 2.68% 2.28% 0 IP Input
184 2824427122856171744 0 1.83% 1.78% 1.21% 0 IP SNMP
10 58310624 390475751 149 0.87% 0.74% 0.66% 0 ARP Input
185 472294481442857906 32 0.23% 0.30% 0.19% 0 PDU DISPATCHER
148 4694300 99594750 47 0.07% 0.03% 0.02% 0 HSRP IPv4
[/code]
This is from a Cisco 7600/6500 router, so that is quite some impact :)
I think it may be a bit much to run a whole SNMP walk every time the function executes and then call that same function 5 times in the probe, but that may be just me...
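To put a rough number on that, here is a small Python sketch that just counts, textually, how often the CiscoCPU probe below calls Cisco_CPU_1min() and how many walks and gets that implies. The helper names (count_calls, worst_case_operations) are made up for this illustration. It assumes every call really goes out on the wire, i.e. it ignores any caching the Dude may do and any short-circuiting inside if(), so treat the result as an upper bound per probe evaluation and not as a measurement.

```python
import re

# Probe expressions copied from the CiscoCPU probe in this post.
PROBE_FIELDS = {
    "functionAvailable": 'Cisco_CPU_1min() <> "False"',
    "functionError": (
        'if(Cisco_CPU_1min()<>"False",'
        'if(Cisco_CPU_1min() < 60, "", '
        'concatenate("Warning: high CPU: ", Cisco_CPU_1min(), "%")), '
        '"CPU polling fault")'
    ),
    "functionValue": 'oid("1.3.6.1.4.1.9.2.1.57.0",10,29)',
}

# Body of the Cisco_CPU_1min function from this post.
FUNCTION_BODY = (
    'if(array_size(oid_column("1.3.6.1.4.1.9.2.1.57", 10 ,29)), '
    'oid("1.3.6.1.4.1.9.2.1.57.0", 10, 29)+1 ,"False")'
)


def count_calls(expr, name):
    """Count textual occurrences of `name(` in an expression.

    The word boundary keeps `oid(` from matching inside `oid_column(`.
    """
    return len(re.findall(r"\b" + re.escape(name) + r"\s*\(", expr))


def worst_case_operations(fields, function_name, function_body):
    """Return (function calls, walks, single gets) for one probe evaluation.

    Assumption: no caching and no short-circuit evaluation, so every
    textual call is executed once.
    """
    calls = sum(count_calls(e, function_name) for e in fields.values())
    walks = calls * count_calls(function_body, "oid_column")
    gets = calls * count_calls(function_body, "oid")
    # Direct oid() calls written in the probe fields themselves.
    gets += sum(count_calls(e, "oid") for e in fields.values())
    return calls, walks, gets


calls, walks, gets = worst_case_operations(
    PROBE_FIELDS, "Cisco_CPU_1min", FUNCTION_BODY
)
# prints: 4 function calls, 4 walks, 5 single gets
print(f"{calls} function calls, {walks} walks, {gets} single gets")
```

These are SNMP operations, not packets: a walk is itself made of several requests, so the real request count per device and polling interval would be higher, multiplied again by the number of devices the probe is attached to.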
I set the SNMP timeout to the standard 3000 ms and the retries to 5, then later back to 3, but it kept flooding my router. Back to square one, I guess.
Maybe I really should distribute the load over several Dude servers...
Any hints would be appreciated.
Andreas
My code:
[code]
<?xml version="1.0" ?>
<dude version="4.0beta2">
<Function>
<sys-type>57</sys-type>
<sys-id>1026822</sys-id>
<sys-name>Cisco_CPU_1min</sys-name>
<code>if(array_size(oid_column("1.3.6.1.4.1.9.2.1.57", 10 ,29)), oid("1.3.6.1.4.1.9.2.1.57.0", 10, 29)+1 ,"False")</code>
<descr>Reads the 1 minute CPU of a Cisco device.</descr>
</Function>
</dude>
<?xml version="1.0" ?>
<dude version="4.0beta2">
<Probe>
<sys-type>13</sys-type>
<sys-id>1026825</sys-id>
<sys-name>CiscoCPU</sys-name>
<typeID>8</typeID>
<functionAvailable>Cisco_CPU_1min() &lt;&gt; "False"</functionAvailable>
<functionError>if(Cisco_CPU_1min()&lt;&gt;"False",if(Cisco_CPU_1min() &lt; 60, "", concatenate("Warning: high CPU: ", Cisco_CPU_1min(), "%")), "CPU polling fault")</functionError>
<functionValue>oid("1.3.6.1.4.1.9.2.1.57.0",10,29)</functionValue>
<functionUnit>%</functionUnit>
</Probe>
[snip a lot of service IDs and other stuff the Dude adds over time...]
Here are the interface probes:
<?xml version="1.0" ?>
<dude version="4.0beta2">
<Function>
<sys-type>57</sys-type>
<sys-id>1106054</sys-id>
<sys-name>if_0_status</sys-name>
<code>if(array_size(oid_column("1.3.6.1.2.1.2.2.1.8", 10 ,29)), oid_raw("1.3.6.1.2.1.2.2.1.8.0", 10, 29),"False")</code>
<descr>polls the status of ifindex 0 (1 means 'up', 2 means 'down')</descr>
</Function>
</dude>
<?xml version="1.0" ?>
<dude version="4.0beta2">
<Function>
<sys-type>57</sys-type>
<sys-id>1106327</sys-id>
<sys-name>if_0_name</sys-name>
<code>oid("1.3.6.1.2.1.2.2.1.2.0", 10, 29)</code>
<descr>polls the interface name of ifindex 0 </descr>
</Function>
</dude>
<?xml version="1.0" ?>
<dude version="4.0beta2">
<Function>
<sys-type>57</sys-type>
<sys-id>1106600</sys-id>
<sys-name>if_0_desc</sys-name>
<code>oid("1.3.6.1.4.1.9.2.2.1.1.28.0", 10, 29)</code>
<descr>polls the interface description of ifindex 0 </descr>
</Function>
</dude>
<?xml version="1.0" ?>
<dude version="4.0beta2">
<Probe>
<sys-type>13</sys-type>
<sys-id>1106873</sys-id>
<sys-name>IFindex_0</sys-name>
<typeID>8</typeID>
<functionAvailable>if_0_status() = 1</functionAvailable>
<functionError>if(if_0_status()&lt;&gt;"False",if(if_0_status() = 1, "", concatenate(if_0_desc()," connected to interface: ",if_0_name()," is down") ), "SNMP polling fault - most likely false alarm")</functionError>
</Probe>
</dude>
[/code]