stam
just joined
Topic Author
Posts: 21
Joined: Mon May 16, 2011 11:36 am

CCR Health Monitoring

Mon Jan 22, 2018 3:56 pm

Hello everybody,

I wrote a simple script that monitors a CCR's temperatures (I chose limits of system < 50C and CPU < 60C) and its power supply state. If something goes wrong, the router sends a warning email with some additional device information.
To avoid spamming your mailbox, only one email per day is sent.
Of course you can monitor your devices with The Dude or SNMP, but I prefer this way.
Also, don't forget to set up your email client settings under /tool e-mail.

Any corrections/suggestions/improvements are welcome!

Email Format
Subject
Warning from CCR-name IP:X.X.X.X
Message
Warning Indicators Measurements
Sys.Temp: 31C
CPU Temp: 52C
PSU1 state: ok
PSU2 state: false

Device Details
Identity: CCR-name
Model: CCR1009-7G-1C-1S+
Serial Number: XXXXXXXXXXXXX
ROS Version: 6.37.1 (stable)
Firmware: 3.33
Free Memory: 1737MiB
System Uptime: 7w4d10:31:01


# Router Health Monitor Script (for CCR ONLY)
# Script V1.0
# WARNING EMAIL RECIPIENT ADDRESS
:local toEmail "RECIPIENT@ADDRESS.COM";

# Temperatures as integers (degrees C)
:local systemTemp [:tonum [/system health get temperature]];
:local cpuTemp [:tonum [/system health get cpu-temperature]];

# PSU states: ok / false
:local PSU1state [/system health get psu1-state];
:local PSU2state [/system health get psu2-state];

#DEBUG
#:set PSU2state "false";

# SET MAXIMUM TEMPERATURE VALUES
# Alert when system >= 50C, CPU >= 60C, or either PSU is not "ok"
:if ($systemTemp>=50 || $cpuTemp>=60 || $PSU1state!="ok" || $PSU2state!="ok") do={

# Get system name, model, serial, ROS version, firmware
:local deviceName [/system identity get name];
:local model [/system routerboard get model];
:local serialNum [/system routerboard get serial-number];
:local rosVersion [/system resource get version];
:local deviceFirmware [/system routerboard get current-firmware];

# Free RAM (MiB): bytes divided by 1024 twice
:local freeRam (([:tonum [/system resource get free-memory]]/1024)/1024);

# System Uptime
:local sysUptime [/system resource get uptime];

# Public IP address
:local currentIP [:resolve myip.opendns.com server=208.67.222.222];

#DEBUG
#:log info $systemTemp;
#:log info $cpuTemp;

# Email Subject
:local mailSubject ("Warning from ".$deviceName." IP:".$currentIP);
# Email Body (the line breaks inside the quoted strings are part of the message)
:local mailBody ("Warning Indicators Measurements
Sys.Temp: ".$systemTemp."C
CPU Temp: ".$cpuTemp."C
PSU1 state: ".$PSU1state."
PSU2 state: ".$PSU2state."\n\n
Device Details
Identity: ".$deviceName."
Model: ".$model."
Serial Number: ".$serialNum."
ROS Version: ".$rosVersion."
Firmware: ".$deviceFirmware."
Free Memory: ".$freeRam."MiB
System Uptime: ".$sysUptime);

#DEBUG
#:log info $mailSubject;
#:log info $mailBody;

# mailDate is global so it survives between scheduler runs
:global mailDate;
:local currentDate [/system clock get date];

# Send at most one warning email per day
:if ($mailDate!=$currentDate) do={
    /tool e-mail send body=$mailBody start-tls=yes subject=$mailSubject to=$toEmail;
    :set mailDate $currentDate;
}
}
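
While testing, the once-per-day limit can get in the way: after the first warning, no further mail goes out until the date changes. A minimal sketch of clearing the stored date by hand, assuming the global is still named mailDate as in the script above:

```
# Clear the stored date so the next run of the script may send a mail again
:global mailDate;
:set mailDate "";
```

The current value of the global should be visible with /system script environment print.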
Scheduler (every 1 hour), script, and email client setup
# jan/22/2018 12:28:17 by RouterOS 6.33.3
# software id = XXXX-XXXX
#
/system scheduler
add interval=1h name=HealthMonitor on-event=CCR-HealthMonitor policy=\
    ftp,reboot,read,write,policy,test,password,sniff,sensitive start-time=\
    startup
/system script
add name=CCR-HealthMonitor owner=admin policy=\
    ftp,reboot,read,write,policy,test,password,sniff,sensitive source="# Route\
    r Health Monitor Script (for CCR ONLY)\r\
    \n# Script V1.0 Sp\r\
    \n# WARNING EMAIL RECIPIENT ADDRESS\r\
    \n:local toEmail \"RECIPIENT@ADDRESS.com\";\r\
    \n\r\
    \n# integer temperature\r\
    \n:local systemTemp [:tonum [/system health get temperature]];\r\
    \n:local cpuTemp [:tonum [/system health get cpu-temperature]];\r\
    \n\r\
    \n# ok / false\r\
    \n:local PSU1state [/system health get psu1-state];\r\
    \n:local PSU2state [/system health get psu2-state];\r\
    \n\r\
    \n#DEBUG\r\
    \n#:set PSU2state \"false\";\r\
    \n\r\
    \n# SET MAXIMUM TEMPERATURE VALUES\r\
    \n:if (\$systemTemp>=50 || \$cpuTemp>=60 || \$PSU1state!=\"ok\" || \$PSU2st\
    ate!=\"ok\") do={\r\
    \n\r\
    \n# Get system name,model,serial,ros version,firmware\r\
    \n:local deviceName [/system identity get name];\r\
    \n:local model [/system routerboard get model];\r\
    \n:local serialNum [/system routerboard get serial-number];\r\
    \n:local rosVersion [/system resource get version];\r\
    \n:local deviceFirmware [/system routerboard get current-firmware];\r\
    \n\r\
    \n# Free RAM (MiB)\r\
    \n:local freeRam (([:tonum [/system resource get free-memory]]/1024)/1024)\
    ;\r\
    \n\r\
    \n# System Uptime \r\
    \n:local sysUptime [/system resource get uptime];\r\
    \n\r\
    \n#Public IP address\r\
    \n:local currentIP [:resolve myip.opendns.com server=208.67.222.222];\r\
    \n\r\
    \n#DEBUG\r\
    \n#:log info \$systemTemp;\r\
    \n#:log info \$cpuTemp;\r\
    \n\r\
    \n#Email Subject\r\
    \n:local mailSubject (\"Warning from \".\$deviceName.\" IP:\".\$currentIP)\
    ;\r\
    \n#Email Body\r\
    \n:local mailBody (\"Warning Indicators Measurements\r\
    \nSys.Temp: \".\$systemTemp. \"C\r\
    \nCPU Temp: \" . \$cpuTemp.\"C\r\
    \nPSU1 state: \".\$PSU1state.\"\r\
    \nPSU2 state:  \".\$PSU2state.\"\\n\\n\r\
    \nDevice Details\r\
    \nIdentity: \".\$deviceName. \"\r\
    \nModel: \".\$model. \"\r\
    \nSerial Number: \".\$serialNum. \"\r\
    \nROS Version: \".\$rosVersion.\"\r\
    \nFirmware: \".\$deviceFirmware. \"\r\
    \nFree Memory: \".\$freeRam. \"MiB\r\
    \nSystem Uptime: \".\$sysUptime);\r\
    \n\r\
    \n#DEBUG\r\
    \n#:log info \$mailSubject;\r\
    \n#:log info \$mailBody;\r\
    \n\r\
    \n:global mailDate;\r\
    \n:local currentDate [/system clock get date];\r\
    \n\r\
    \n    :if (\$mailDate!=\$currentDate) do={\r\
    \n     #Send warning email\r\
    \n    /tool e-mail send body=\$mailBody start-tls=yes subject=\$mailSubjec\
    t to=\$toEmail\r\
    \n    :set mailDate \$currentDate;\r\
    \n    }\r\
    \n}\r\
    \n"
/tool e-mail
set address=smtp.office365.com from=your@email.com password=\
    "yourpassword" port=587 start-tls=yes user=yourUsername
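
Once the e-mail settings are in place, it may be worth sending a single test message by hand before relying on the scheduler. A minimal sketch, assuming RouterOS 6 as used in this thread; the recipient is the same placeholder as in the script, and the subject and body text are made up for illustration:

```
# Manual test of the /tool e-mail configuration (recipient is a placeholder)
/tool e-mail send to="RECIPIENT@ADDRESS.COM" subject="CCR-HealthMonitor test" body="Test message from the router" start-tls=yes
```

If the message does not arrive, the router log (/log print) should contain the reason reported by the mail server.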
 
petertosh
newbie
Posts: 28
Joined: Wed Mar 21, 2018 9:42 am

Re: CCR Health Monitoring

Mon May 07, 2018 10:00 pm

Thank you very much!
 
tomislav91
Frequent Visitor
Posts: 81
Joined: Fri May 26, 2017 12:47 pm

Re: CCR Health Monitoring

Thu Sep 05, 2019 9:18 pm

I can't get the PSU state; the field is just empty.
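
One possible explanation, not confirmed in this thread, is that the device does not report psu1-state and psu2-state at all, so the value comes back empty. The original script would then treat the empty value as a fault, because it is not equal to "ok". A minimal sketch of a guard, assuming unsupported fields read back as an empty value; the variable psuFault is introduced here only for illustration:

```
# Sketch: only treat a PSU as failed when the device actually reports a state.
# Assumption: an unsupported psuN-state field reads back as an empty value.
:local PSU1state [/system health get psu1-state];
:local PSU2state [/system health get psu2-state];
:local psuFault false;

:if (([:len $PSU1state] > 0) && ($PSU1state != "ok")) do={ :set psuFault true; }
:if (([:len $PSU2state] > 0) && ($PSU2state != "ok")) do={ :set psuFault true; }

:if ($psuFault) do={ :log warning "PSU fault detected"; }
```

In the main script, $psuFault could then replace the two PSU comparisons in the :if condition.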
 
eworm
Member
Posts: 377
Joined: Wed Oct 22, 2014 9:23 am
Location: Oberhausen, Germany

Re: CCR Health Monitoring

Thu Sep 05, 2019 11:39 pm

Sorry for hijacking this thread, but I would like to introduce an alternative. I had some extra requirements:
  • should integrate with my RouterOS scripts, to reuse basic functionality such as notifications (incl. Telegram)
  • should support every RouterOS device with health values
  • should support notifications on voltage jumping up or down (which may indicate a failed PSU after the UPS kicked in)
  • should send notifications for alert and recovery (and more than once a day after recovery)
  • should have configurable thresholds
To use this you need to install and configure the basic scripts, see RouterOS Scripts. Then install check-health and add a scheduler. Thresholds are configurable in the global configuration.
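
A minimal sketch of the scheduler step, assuming the script has been installed under the name check-health as described above. The scheduler name and the 5-minute interval are arbitrary choices for illustration, not values taken from the RouterOS Scripts documentation:

```
# Run the installed check-health script periodically (interval is an assumption)
/system scheduler add name="check-health" interval=5m start-time=startup on-event="/system script run check-health;"
```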

Feel free to contact me about issues, missing sensors, or other wish-list items.
Manage RouterOS scripts and extend your devices' functionality: RouterOS Scripts
