Help with running a script...

Hello ! I just bought a hex Lite recently: this is my first time with Mikrotik, I have a few routers with openwrt, but I need a heat-resistant router for controlling our wood pellet boilers, and the manufacturer of the boilers told me to buy this.
Now, I wanted to run a ping monitor, so I set up e-mail. That’s working. Then I wrote a bash script and asked chatGPT to translate it to routerOS. I tried to paste it into the ssh window, but it doesn’t seem to be accepted properly, so I am trying with webfig. I clicked “system”, then “scripts”. I left the check boxes as they were, except for checking “don’t require permissions”, and pasted the following:

:local hosts {
        "Heiz-links"="192.168.1.40";
        "Heiz-rechts"="192.168.1.41";
        "Heiz-folge"="192.168.1.42";
    }

    :foreach host in=$hosts do={
        :local ip [:pick $hosts $host];
        :local name $host;

        :local pingResult [/ping $ip count=1];
        :if ($pingResult = "") do={
            :if ([:typeof [/file find where name=("$name.fail1")]] = "nil") do={
                :log info "$name is down";
                /file print file="$name.report" message="Host $name ($ip) is down";
                /file set "$name.fail1" contents="failed";

                :local emailBody ("$name is down");
                /tool e-mail send to="xx@gmail.com" subject="$name is down" body=$emailBody;
            }
        } else={
            :if ([:typeof [/file find where name=("$name.fail1")]] != "nil") do={
                :log info "$name is online";
                /file remove "$name.fail1";
                /file print file="$name.report" message="Host $name ($ip) is online";

                :local emailBody ("$name is online");
                /tool e-mail send to="xx@gmail.com" subject="$name is online" body=$emailBody;
            }
        }
    }

Then clicked apply, and run script. Now the boilers are not connected with the router, so there is nothing with IP. So I should be getting some email. But I don’t get anything. As I said, e-mail is set up properly.

I do /tool e-mail> send to="xx@gmail.com" subject="test"

and I do get an email.

Now, obviously I was being too lazy just asking chatGPT to translate it. If the script is all wrong, then I should either try to learn, or give it up, put in an openwrt access point and run my bash script there for the ping. In fact, the latter is the easiest way, but since I am not sure my fritzbox4020 would survive the heat, I am inclined to do the script on mikrotik. Since I have no plan to use mikrotik in other places, I don’t really want to put too much effort into it… if it’s avoidable.

I would appreciate it if you could please tell me what’s wrong, or all wrong, or whether I’m wrong in relying on chatGPT. My original bash script is attached; it is in use and is working. (Currently they were on EdgeRouterX with openwrt.)
checkhost_heiz_anonym.txt (1.49 KB)

I will be the first person to tell you that I am terrible at RouterOS scripts. Most of the ones I have originated from someone else’s script that I modified a bit to meet my purposes. However, when I have needed to troubleshoot a script, a couple of things have helped. Start by adding a bunch of info log entries (i.e. line 1, line 2, line 3, etc.) scattered through the script. Run the script and see how far it gets. If it won’t run at all (often the case), remark out anything questionable and run it, even if that means the script won’t accomplish anything useful. Once you remark out enough that it will at least run, watch the log to see how far it is getting. Then you can un-remark lines to see where it crashes. Once you find the line or lines that cause the crash, look at the documentation if needed to determine the correct syntax. Repeat as needed…
Obviously, once you have it working properly, remove the extra log entries.
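
As a rough sketch of that checkpoint approach (the address and the checkpoint wording are just placeholders, and it assumes that /ping called inside [ ] returns the number of replies received, which is how it behaves as far as I know):

```
:log info "checkpoint 1: script started"
:local ip "192.168.1.40"
:log info "checkpoint 2: about to ping $ip"
# assumption: [/ping ...] returns the number of replies received
:local replies [/ping $ip count=1]
:log info "checkpoint 3: ping finished, replies=$replies"
:if ($replies = 0) do={
    :log info "checkpoint 4: $ip did not answer"
} else={
    :log info "checkpoint 4: $ip answered"
}
```

If the log stops after checkpoint 2, the ping line is the one to look at. If it reaches checkpoint 3, the logged value shows what the later :if has to compare against.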

You should look at /tool/netwatch. That can do the ping part (and other tests, like icmp, which can look at latency etc.). In Netwatch, you can add the /tool/e-mail/send line to the “down-script” of a new netwatch entry and repeat for your three hosts.

See: https://help.mikrotik.com/docs/display/ROS/Netwatch

This avoids the need for all the code that UNNECESSARILY uses files to store results, and skips the need for checking ping results: netwatch does that part for you.

Thank you to both of you Californians :) I used to be there, too (Riverside), but now I’m in Germany. Now, Netwatch looks very good. So, “On Up”, “On down”, should I type in like

 tool e-mail  send to=xxx@gmail.com subject="Boiler 1 is offline"

to get a notification ? What is the function of “comment” ?

Basically, if /tool/e-mail is working from the CLI, you should be able to use same line in the “On Down” script. The netwatch scripts allow you to use variables from the netwatch config&results in the script. See the allowed variables here:
https://help.mikrotik.com/docs/display/ROS/Netwatch#Netwatch-Probestatistics/variables

For the subject and body… you can use any of those variables in the subject= and body= of the /tool/e-mail/send. So the “comment” field on the netwatch can be used to store something like the friendly server name, which you can then access in the “On Up” and “On Down” scripts via $comment. The results of netwatch are also variables, so stuff like $status and $since can be used too. Again, assuming e-mail is set up and works, something like this should work:


/tool e-mail send to="xx@gmail.com" subject="$host ($comment) is $status" body="$host ($comment) netwatch ($type) got status $status since $since"

The “icmp” check is more complex, but you can actually get more stats on the ping. Just note that the icmp check enforces all the max-ttl, max-XXX stuff, so it can fail even if the ping “worked” but not fast enough etc. The default “simple” netwatch type is still icmp/ping, but doesn’t allow parameters like the number of pings or the % success stuff.

@Amm0 thank you for your hints! On the webfig, there are parameters with default values “Interval: 00:01:00”, and “Timeout: 1000ms”. Does that mean that it will check every minute, and wait 1000ms (=1sec) to get ICMP reply, and if it doesn’t come back, it will notify me as “down”, and when next time the response comes back, it will notify me as “up” ?

I don’t really need subtle info: sometimes a device freezes totally, then I will have to go there physically to reboot it. An occasional hiccup is not a problem.

There is also /system/watchdog (https://help.mikrotik.com/docs/display/ROS/Watchdog), which does exactly that e.g. reboot if ping fails for a certain period.


/system watchdog set auto-send-supout=yes send-email-to=user@example.com ping-start-after-boot=2m ping-timeout=30s watch-address=8.8.8.8

It only does one host, but if combined with the netwatch on your servers, you’d likely cover all the bases. It also has the option to send an email when it happens, including a “supout.rif”, which Mikrotik support needs to diagnose a problem (and which you can view using https://mikrotik.com/client/supout).

O, I meant by rebooting not the router, but a part of boilers. I have to go and do it by hand.
I would like to do something like this: ping 10 times, 1s at a time, and if the packets are 100% lost, then I will get an email. This every minute. I haven’t really understood how to enter things on CLI. I guess

tool netwatch host=192.168.1.40 type=icmp thr-loss-percent=100% packet-interval=1000ms interval=60s up-script="tool e-mail send to=xxx@gmail.com subject=“Boiler 1 is online” down-script="tool e-mail send to=xxx@gmail.com subject=“Boiler 1 is offline”

Would this work? I’m not sure when I need " and when I don’t… I put " for the scripts because in the manual it says Default:“”, so I figured that the script is supposed to be written in " ". I’m not sure if I need %, ms, etc either.

But speaking of rebooting, is it a good practice to reboot the router periodically, say, once a week?

You should not need to. The only time I reboot any of mine is when they get a firmware update. I just looked, my primary home router has an uptime of 159 days 16 hours.

It’s fine to use winbox. It’s just easy to show options in CLI format; if you remove the “-” the name is the same as in winbox/webfig ;).

What you have works; it just pings 10 times in the first 10 seconds, then waits ~50 seconds for the interval value (60 seconds in your case). To “spread” the pings across one minute, you’re correct that you set the “interval” to 60s and the “packet-interval” to 1000ms. But you also need to set “packet-count” to 59: one packet every 1 sec * 59 packets.
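
As a sketch only, assuming RouterOS v7 (where type=icmp exists): 59 packets at 1s apart take 59 seconds, which just fits inside the 60s interval. The comment value is a made-up label, the thr-* thresholds are left at their defaults, and the up/down scripts are left out here:

```
/tool netwatch add host=192.168.1.40 type=icmp interval=1m \
    packet-interval=1s packet-count=59 comment="Boiler 1"
```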

Netwatch’s ICMP tests will use the default values for EVERYTHING. So a “fail” can also be that the average round trip took longer than 100ms (thr-rtt-avg), etc. If you start getting false positives (e.g. it fails when the internet is up), then you might need to adjust the values starting with “thr-”, since those are what define failure. The default values are as follows:

packet-interval (Default: 50ms) The time between ICMP-request packet send
packet-count (Default: 10) Total count of ICMP packets to send out within a single test
packet-size (Default: 54 (IPv4) or 54 (IPv6)) The total size of the IP ICMP packet
thr-rtt-max (Default: 1s) Fail threshold for rtt-max (a value above thr-max is a probe fail)
thr-rtt-avg (Default: 100ms) Fail threshold for rtt-avg
thr-rtt-stdev (Default: 250ms) Fail threshold for rtt-stdev
thr-rtt-jitter (Default: 1s) Fail threshold for rtt-jitter
thr-loss-percent (Default: 85.0%) Fail threshold for loss-percent
thr-loss-count (Default: 4294967295(max)) Fail threshold for loss-count

Agreed, no weekly reboot should be needed. Once configured, Mikrotik do generally keep on running.

NOW… I still think /system/watchdog is worth setting up… if something ever does crash, you’d know and already have the “supout.rif” file.

Thank you very much to both for your tips! OK, I will change thr thing to avoid false alarm. Spreading pings across one minute is a good idea!
As for watchdog, the WAN side of hex Lite is not reliable (VDSL converter over the telephone line: twisted pair), that side is watched by Raspberry pi: it restarts the converter if the hex lite doesn’t react. (and that might not solve, because the VDSL converter/power adapter really dies sometimes!) If hexlite is still unreachable after restarting of vdsl, I will get an SMS. Then I can go and figure out whose fault that is.

Actually, that was the reason why I decided to have a router in the boiler room, instead of hooking the boilers to edgerouterX 120 Meter away over VDSL Converter: the boilers+controller need a very stable DHCP server to talk to each other, but internet-connection is not crucial for their functionality: it’s used for monitoring the temperature etc over the server of the manufacturer, but they will keep working without it, as long as they have local IP address. (Static config of the boilers disables the connection with the server, so I can’t use it!)

Makes sense.

If you compare the values on the netwatch status tab with the thr-* defaults in the help (or in my post above), you can see how close they are to triggering. But it should be fine if local (e.g. >100ms for a ping to a local device might actually be a problem).

Now I tried to set it up: first am going to ping-monitor Fritzbox 4020 with OpenWRT (AP Mode), which currently does the monitoring. I wanted to start the monitor on RouterOS, then reboot OpenWRT to see if I get an email. However, my commands are not accepted…

[admin@RouterOS] /tool netwatch> host=192.168.1.46 type=icmp thr-loss-percent=100% thr-rtt-avg=1000ms packet-interval=1000ms interval=60s packet-count=59 up-script="tool e-mail send to=xx@gmail.com subject="FB-Heiz is online"" down-script="tool e-mail send to=xx@gmail.com subject="FB-Heiz is offline""
syntax error (line 1 column 5)

I tried with and without " " for the scripts. I don’t know how to count the columns, can’t tell where the problem is, but I looked everything and I don’t know what’s wrong! Column 5 should be pretty close to the beginning… could you please tell me what’s wrong?

You’re missing the “add” before the command under /tool/netwatch, and the subject needs an escape character (backslash, \) before each of the inner quotes (e.g. \"FB-Heiz is offline\").


* There is also no harm in using winbox/webfig too. The escaping does get tricky sometimes at the CLI. In winbox/webfig, there is no need to escape the subject, but at the CLI the script is part of the option, so it needs to be escaped.
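
For illustration, a sketch of a plain (“simple” ping) netwatch entry entered at the CLI with the inner quotes escaped. The address, comment and subjects are placeholders, and only the basic options (host, interval, timeout, comment, up-script, down-script) are used:

```
/tool netwatch add host=192.168.1.46 interval=1m timeout=1s comment="FB-Heiz" \
    down-script="/tool e-mail send to=\"xx@gmail.com\" subject=\"FB-Heiz is offline\"" \
    up-script="/tool e-mail send to=\"xx@gmail.com\" subject=\"FB-Heiz is online\""
```

The outer quotes delimit the whole script and each \" is a literal quote inside it. If you want to use variables such as $comment inside a script entered this way, the $ would most likely need a backslash too, otherwise the console tries to expand it when you press enter.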

@Amm0 thank you for your quick reply! It still doesn’t work, though… I checked webfig and winbox, and there is no place to put things like type=icmp. Just host, interval, timeout, on up, on down, comment. Nothing else.

[admin@RouterOS] /tool netwatch> add host=192.168.1.46 type=icmp thr-loss-percent=100% thr-rtt-avg=1000ms packet-interval=1000ms interval=60s packet-count=59 up-script="tool e-mail send to=xx@gmail.com subject=\"FB-Heiz is online\"" down-script="tool e-mail send to=xx@gmail.com subject=\"FB-Heiz is offline\""
expected end of command (line 1 column 23)

The error message is different, though…

Oh… If we’re talking about v6, there is no type=icmp. That’s v7 feature. It’s always the “simple” ping in v6.

Aahh, I didn’t think that there is a newer version!! I bought hexlite less than two weeks ago! How come it came with an old version? On the other hand, I am looking at “Packages”, it says the latest version is 6.49.11. Is v7 perhaps not available for hexlite?

If you go to System > Package, select the “upgrade” channel. That’s how you’d get to V7. AFAIK it should run fine.

After the upgrade to V7, select “stable” and do another check to make sure you are at the “stable” build (which should be 7.13).

I should add… you should do a “/system routerboard upgrade” and reboot BEFORE doing the upgrade. There is BOTH an OS (in System > Packages) and a firmware (in System > Routerboard). Ideally the firmware is current BEFORE you upgrade the OS.

After upgrade, you’d likely want to do another “/system routerboard upgrade” and reboot again.
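
Put together, the order would look roughly like this at the CLI. This is a sketch, not something to paste in one go: the reboots and the install interrupt the session, so enter the lines one at a time, and expect confirmation prompts on the upgrade and reboot commands:

```
# 1. bring the RouterBOARD firmware up to date first, then reboot
/system routerboard upgrade
/system reboot
# 2. switch to the "upgrade" channel and install v7 (the router reboots itself)
/system package update set channel=upgrade
/system package update check-for-updates
/system package update install
# 3. once on v7, go back to "stable" and check again
/system package update set channel=stable
/system package update check-for-updates
# 4. update the firmware once more, then reboot
/system routerboard upgrade
/system reboot
```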