Probe (if() do ros command) Problem

Hi,
I have an issue with CPU usage probes.
First I made this function:

Name: get_rb_cpu_load
Description: get router cpu load
Code: ros_command(":put [/system resource get cpu-load]")

And this is my first Probe

Name: high cpu usage
Type: Function
Agent: default
Available: get_rb_cpu_load()
Error: if(get_rb_cpu_load() < 60, "", "high cpu usage")
Value: get_rb_cpu_load()
Unit:%

This probe works correctly. I ran some btests to push the CPU load above 60%, and after a short time I got the error "high cpu usage" as expected.
I also placed it on the device as an appearance label: [get_rb_cpu_load()], and the CPU load shows as expected.

Now I've made a second probe: if the CPU load is >90%, run the ROS command /system reboot

Name: reboot on high cpu usage
Type: Function
Agent: default
Available: get_rb_cpu_load()
Error: if(get_rb_cpu_load()<90, "", ros_command("/system reboot"))
Value: get_rb_cpu_load()
Unit:%

The problem is that no matter what value I get from the get_rb_cpu_load() function (whether the CPU load is 1, 4, 20%, etc.), the Dude always executes the ros_command("/system reboot") part.
I'm not sure why. Could it be some sort of bug?
Thanks.

Does still no one have an idea how I can accomplish this?
Error: if(get_rb_cpu_load()<90, "", ros_command("/system reboot"))

Is there another way?
Any ideas will be appreciated.

There is a flaw in the logic: any time the probe can't read the CPU load, that router will reboot. Also, if the CPU is actually at 0% utilization, it will cause a reboot.
Error: if(get_rb_cpu_load()<90, "", ros_command("/system reboot"))

Make your function return "False" when the probe can't read the SNMP OID, and add 1 to the value so it is never 0 (zero is always false).
I think the RB CPU OID is 1.3.6.1.2.1.25.3.3.1.2.1, but I am not sure, so check the OID below and correct it if needed.

This function returns "False" if the OID is not available, or the OID value + 1 if it is available.
Function: get_rb_cpu_load
Code: if(string_size(oid("1.3.6.1.2.1.25.3.3.1.2.1", 10, 5)), oid("1.3.6.1.2.1.25.3.3.1.2.1", 10, 5)+1, "False")

Probe Name: high cpu reboot
Type: Function
Agent: default
Available: get_rb_cpu_load() <> "False"
Error: if(get_rb_cpu_load()<>"False", if(get_rb_cpu_load()-1 < 95, "", ros_command("/system reboot")), "Cisco Device down")
Value: get_rb_cpu_load()-1
Unit:%
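
The "False"-or-value+1 convention used by the function and probe above can be modeled outside the Dude to check the edge cases (0% load, unreadable value). This is only a rough Python sketch of the idea, not Dude syntax: `encode_cpu_load` and `probe_error` are hypothetical names, an empty string stands in for "OID not readable", and the reboot is replaced by a plain error string.

```python
def encode_cpu_load(raw):
    """Return "False" for an unreadable value, else the reading plus one.

    `raw` is the text the poll returned; an empty string means no answer
    (an assumption made for this sketch).
    """
    if len(raw) == 0:
        return "False"
    return int(raw) + 1


def probe_error(encoded, threshold=95):
    """Return an error string, or "" when the probe is healthy."""
    if encoded == "False":
        return "Device down"
    cpu_load = encoded - 1  # undo the +1 shift to get the real percentage
    if cpu_load < threshold:
        return ""
    return "high cpu usage"


print(repr(probe_error(encode_cpu_load("0"))))
print(repr(probe_error(encode_cpu_load("97"))))
print(repr(probe_error(encode_cpu_load(""))))
```

With the shift, a real 0% reading is encoded as 1, so it can no longer be confused with "no value".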

I would suggest not rebooting the router when it is busy. Certainly you want to fix the network automatically, but that can bite you and cause extra trouble. This probe reboots immediately when the value returned is above 95% CPU; it does not wait for 3 failures before rebooting. I suggest moving the reboot command to the notification and then changing it to 10 failures, so it lets the router stew at max CPU for a bit and then reboots.

HTH,
Lebowski
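
The "wait for a number of failures before rebooting" suggestion above boils down to counting consecutive bad polls. Here is a minimal Python sketch of that idea, assuming a hypothetical `RebootGuard` helper; it is not how the Dude implements its failure counting, and treating an unreadable poll (`None`) as a reset of the streak is a choice made for this sketch.

```python
class RebootGuard:
    """Signal a reboot only after several consecutive high-CPU polls."""

    def __init__(self, threshold=95, required_failures=10):
        self.threshold = threshold
        self.required_failures = required_failures
        self.failures = 0  # length of the current streak of bad polls

    def observe(self, cpu_load):
        """Feed one poll result; return True when a reboot is due."""
        # An unreadable poll (None) or a normal reading ends the streak.
        if cpu_load is None or cpu_load < self.threshold:
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.required_failures:
            self.failures = 0  # start counting again after the reboot signal
            return True
        return False


guard = RebootGuard(threshold=95, required_failures=3)
decisions = [guard.observe(load) for load in [96, 97, 50, 96, 97, 98]]
print(decisions)
```

A single spike, or a short burst interrupted by one normal reading, never reaches the required count, so only sustained load leads to a reboot.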

Thanks, Lebowski.
First, I want to use this probe on an OmniTIK U-5HnD. I have posted a lot about its high CPU usage problem, but I never found out what triggers this anomaly from time to time. From some users on this forum who have had the same experience, I read that it is a lack-of-memory problem that only happens in version 6.x. I contacted support and they told me to enable wireless_cm2 :) but the problem continued.

In my function I'm not using SNMP (SNMP is disabled), because after enabling it CPU usage rises to roughly 40% or more (there are also tons of posts about that problem).
I'm using a simple ROS command:

Name: get_rb_cpu_load
Description: get router cpu load
Code: ros_command(":put [/system resource get cpu-load]")
About zero: 0 is always < 90.

This code:

Function: get_rb_cpu_load
Code: if(string_size(oid("1.3.6.1.2.1.25.3.3.1.2.1", 10, 5)), oid("1.3.6.1.2.1.25.3.3.1.2.1", 10, 5)+1, "False")

Probe Name: high cpu reboot
Type: Function
Agent: default
Available: get_rb_cpu_load() <> "False"
Error: if(get_rb_cpu_load()<>"False", if(get_rb_cpu_load()-1 < 95, "", ros_command("/system reboot")), "Cisco Device down")
Value: get_rb_cpu_load()-1
Unit:%

is just fine, but don't get me wrong, I can't use SNMP. I will try it later with ros_command.

I have another theory: when the Dude logs in to the device, in this case an OmniTIK (400 MHz CPU), and executes ros_command(":put [/system resource get cpu-load]") to check the CPU usage, the CPU usage may at that moment go above 90%, up to 100%, just for a second.

Function: get_rb_cpu_load
Code: if(string_size(oid("1.3.6.1.2.1.25.3.3.1.2.1", 10, 5)), oid("1.3.6.1.2.1.25.3.3.1.2.1", 10, 5)+1, "False")

Probe Name: high cpu reboot
Type: Function
Agent: default
Available: get_rb_cpu_load() <> "False"
Error: if(get_rb_cpu_load()<>"False", if(get_rb_cpu_load()-1 < 95, "", ros_command("/system reboot")), "Cisco Device down")
Value: get_rb_cpu_load()-1
Unit:%

I tried this code but got the same result: the device reboots even when the CPU load is below 90%.

Sorry, I don't have an OmniTIK U-5HnD for testing. Change it to a warning…

Error: if(get_rb_cpu_load()<>"False", if(get_rb_cpu_load()-1 < 95, "", concatenate("Warning: high CPU = ", get_rb_cpu_load()-1, "%, rebooting")), "Device down")

Create a notification of type execute and put the reboot command in the notification…
New notification
Name: Reboot
Enabled
Type: execute on server
ros_command("/system reboot")
In Advanced, clear all marks except for unstable->down

Note: If you don't want to use SNMP, still move the reboot to a notification. You can attach this notification to that device only…
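
The thread never establishes why the reboot fired regardless of the CPU value. One possible explanation, which is an assumption and not something confirmed here, is that if() behaves like an ordinary function whose arguments are all evaluated before the condition is looked at. The Python sketch below only illustrates that general effect and why keeping the probe free of side effects avoids it; `eager_if`, `probe_error`, and `notify` are hypothetical names, and `ros_command` is a stand-in that just records the call.

```python
calls = []  # records every command the stand-in is asked to run


def ros_command(cmd):
    """Stand-in for a command runner; it only records the call."""
    calls.append(cmd)
    return ""


def eager_if(cond, then_value, else_value):
    """An if() written as a plain function.

    Python evaluates both branch arguments at the call site, before this
    body runs, so any side effect in either branch has already happened.
    """
    return then_value if cond else else_value


cpu_load = 4
result = eager_if(cpu_load < 90, "", ros_command("/system reboot"))
# The reboot command was recorded although cpu_load is far below 90.
print(calls)


def probe_error(load, threshold=90):
    """Probe expression without side effects: it returns only a string."""
    return eager_if(load < threshold, "", "high cpu usage")


def notify(error_text):
    """Stand-in for a notification: act only when the probe reports an error."""
    if error_text:
        ros_command("/system reboot")
```

If something like this is what happens, moving the command into a notification works because the probe then only returns text, and the command runs only on an actual state change.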

That's a good one, and it works this way :)
Thanks, Lebowski.
I will post the "solution" code later.

Cool, glad I could help.