Netwatch & slightly flaky tunnel - ideas how to handle that?

kiler129 · Wed Dec 27, 2023 10:46 pm

I have a few tunnels from branch offices to HQ. In the spirit of bringing monitoring closer to the actual tunnel, as a test I set up Netwatch on the MikroTiks to watch if and when these tunnels go down.

Current setup
Currently I'm running a "tcp-conn" probe at a 10-minute interval against one of the concentrators that is reachable only over the tunnel. However, RouterOS seems a bit limited here, so I had to add slightly clever "up" and "down" scripts:

# "up" script: restore the normal 10-minute probe interval.
# The "down" script is identical, except that it sets newInterval to 00:02:00.
:local newInterval 00:10:00

# Only touch the entry when the interval actually needs to change
:if ([/tool/netwatch get [find host=$host] interval] != $newInterval) do={
  :log info "Probe to $host changed to $status - setting interval to $newInterval"
  /tool/netwatch set [find host=$host] interval=$newInterval
}

In the "test" script I report the status to the monitoring system:

:local hcUrl "https://<pingService>/......"

:if ($status != "up") do={:set hcUrl "$hcUrl/fail"}
/tool/fetch keep-result=no duration=2s http-method=post http-data="Probe $type to $host is $status since $since (failed $"failed-tests" of $"done-tests")" "$hcUrl"

Flaky tunnel issue
In some areas the local ISPs seem to do some sort of scheduled reboots, which can knock the tunnel down for 1-5 minutes during the night. While the tunnel is legitimately down, there is nothing actionable for me. I would like to set a threshold so that the tunnel is reported as down only after it has stayed down for a while, but I don't think Netwatch has any facility to script that. Unfortunately, changing the interval from my up/down scripts also resets "failed-tests" and "done-tests".
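
To illustrate the kind of threshold I have in mind, here is a rough, untested sketch of a "test" script that keeps its own count of consecutive failed probes and only reports a failure once that count reaches a limit. The global variable tunnelFailCount and the limit of 3 are names and values I made up for illustration, not Netwatch features. It assumes a single Netwatch entry per router (several entries would share the counter) and that global variables survive between Netwatch script runs:

# Hypothetical counter kept outside Netwatch, so interval changes do not reset it
:global tunnelFailCount
# Hypothetical limit: 3 probes at the 2-minute "down" interval is roughly 6 minutes
:local failThreshold 3
:local hcUrl "https://<pingService>/......"

# First run after a reboot: the global exists but has no value yet
:if ([:typeof $tunnelFailCount] = "nothing") do={ :set tunnelFailCount 0 }

# Count consecutive failures; any successful probe resets the count
:if ($status = "up") do={
  :set tunnelFailCount 0
} else={
  :set tunnelFailCount ($tunnelFailCount + 1)
}

# Switch to the failure URL only once the threshold is reached
:if ($tunnelFailCount >= $failThreshold) do={ :set hcUrl "$hcUrl/fail" }

# Stay silent during a short outage; report on success or on a long outage
:if (($status = "up") || ($tunnelFailCount >= $failThreshold)) do={
  /tool/fetch keep-result=no duration=2s http-method=post http-data="Probe to $host is $status ($tunnelFailCount consecutive failures)" "$hcUrl"
}

With this approach nothing is sent while the outage is still below the threshold, so the monitoring side would need a grace period longer than the outage I want to ignore. It also feels like working around Netwatch rather than using it.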

Is Netwatch a wrong tool here? It feels like it's missing a few options, or I am misusing it.
