[Script] Healthchecks notification

I’d like to share a very simple script for monitoring any RouterOS device (and other devices, sites, etc.) with the ‘damn simple’ application https://healthchecks.io. You can use the hosted version on the author’s site (requires registration, but it’s free for small setups) or self-host it. The main advantages are the extremely easy setup and the integrations with a bunch of services (email and any messenger that works with webhooks). On the RouterOS side you need just a script and a scheduler:

/system script
add dont-require-permissions=no name=healthchecks policy=read,write,test source=":global healthchecksaddress \"healthchecks.io/ping/\";\
    \n:local routerid \"e6dcad98-12b4-416b-a0d0-dca84f3aa98e\";\
    \n:local printerid \"c4a36fdb-24ce-49e1-b038-c158fdeb8022\";\
    \n:local tvid \"a8cd4b3a-fa83-494a-bd1b-f649265af614\";\
    \n:local ruptime [/system resource get uptime];\
    \n/tool fetch duration=10 output=none http-data=\"Uptime: \$ruptime\" http-method=post url=\"https://\$healthchecksaddress\$routerid\";\
    \n:if ([/ping 192.168.0.191 count=1] = 1) do={/tool fetch duration=10 output=none http-method=post url=\"https://\$healthchecksaddress\$printerid\"};\
    \n:if ([/ping 192.168.0.201 count=1] = 1) do={/tool fetch duration=10 output=none http-method=post url=\"https://\$healthchecksaddress\$tvid\"};\
    \n"

/system scheduler
add interval=5m name=healthchecks on-event=healthchecks policy=read,write,test

In the example I send the device uptime to Healthchecks so I can see whether there were any short power outages and how long the router has really been running.

This is just an example (the IDs are random). You need to create your own checks in Healthchecks (set up the schedule and integrations) and set the proper variables in the script.
It’s just a “working concept” and I’d appreciate any ideas on how to improve it (e.g. make an array: IPs → Healthchecks URIs), or any other fancy ideas on what and how to monitor with it. If you use it with your MikroTiks, please share your configs or ideas here.
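
As a starting point for the array idea, here is a minimal, untested sketch of how the per-device checks might be driven from one key/value array instead of one `:if` line per device. It is written in plain script form (as you would type it into the script editor, not the escaped export form above). The names `hcBase` and `checks` are made up for this illustration, and the UUIDs are the same random ones from the example.

```
# Base ping URL, same as in the original script
:local hcBase "https://healthchecks.io/ping/"

# Hypothetical map: device IP -> Healthchecks check UUID
:local checks {"192.168.0.191"="c4a36fdb-24ce-49e1-b038-c158fdeb8022";"192.168.0.201"="a8cd4b3a-fa83-494a-bd1b-f649265af614"}

:foreach ip,uuid in=$checks do={
    # /ping returns the number of replies received; 1 means the device answered
    :if ([/ping [:toip $ip] count=1] = 1) do={
        /tool fetch duration=10 output=none http-method=post url=($hcBase . $uuid)
    }
}
```

With this layout, adding another device should only mean adding one more `"ip"="uuid"` pair to the array. The router's own uptime ping would stay as a separate line, as in the original script.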

If you’d like to do it all yourself and not put it on an external server, you can use Splunk (free for up to 500 MB of logs per day).

See my signature.
http://forum.mikrotik.com/t/tool-using-splunk-to-analyse-mikrotik-logs-3-3-graphing-everything/121810/1

I’m surprised at the lack of interest here. I came looking for something specific and I’m going to check this out to see if it will work.

Our CCR sits directly on a fiber connection and has a decent-sized battery backup that can run the router, CSS, 4 APs and a P2P link for at least 4 hours, as far as we’ve seen. The site has serious power issues though, and occasionally we still lose service for over 150 customers there because, until the router goes down, we don’t actually know it’s on battery.

During a planned power outage this past weekend we had a chance to look more closely at this. In WinBox under System/Health it shows psu1state-OK, psu2state-OK. During the outage, the psu-2 state showed down. So it appears that all I really need is a way to get the router to send out a notification when the psu-1 or psu-2 state changes. Hopefully your link resolves it. If so, I thank you in advance. If there’s a better solution that you know of, I’d appreciate any suggestions you might have. :slight_smile:

You could have a look at Notify about health state, which supports monitoring the PSU state, and other health values.
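
If you would rather stay with the Healthchecks approach from the first post, a rough, untested sketch could ping a dedicated check only while both PSUs report OK, so that Healthchecks raises the alert once the pings stop. Several things here are assumptions: the entry names `psu1-state` and `psu2-state`, the value string `"ok"`, and the RouterOS v7 style of reading `/system health` all depend on your model and version, and the check UUID is a placeholder you would replace with your own.

```
# Placeholder URL: create a separate check in Healthchecks and put its UUID here
:local hcUrl "https://healthchecks.io/ping/<your-psu-check-uuid>"

# Assumes RouterOS v7, where health entries have name/value fields.
# On v6 the read would likely be: [/system health get psu1-state]
:local psu1 [/system health get [find name="psu1-state"] value]
:local psu2 [/system health get [find name="psu2-state"] value]

# Ping the check only when both supplies look healthy;
# a missed ping then makes Healthchecks send the notification
:if (($psu1 = "ok") && ($psu2 = "ok")) do={
    /tool fetch duration=10 output=none http-method=post url=$hcUrl
}
```

Run it from a scheduler like the one in the first post, and check the output of `/system health print` on your CCR first to confirm the actual names and values.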