RandyRiver88
just joined
Topic Author
Posts: 14
Joined: Fri May 15, 2020 7:28 pm

Netwatch on ROS7 False Down?

Wed Nov 16, 2022 4:42 am

Hi,

I am having trouble with the new Netwatch. I am trying to monitor whether my WireGuard VPN is up and passing traffic.

I created a simple static route, for example 8.8.8.8/32 through the WireGuard gateway, and a second route for 8.8.8.8 as a blackhole.

I am using an ICMP Netwatch check.

Configured as:
Host: 8.8.8.8
Interval: 00:01:00
Packet Interval: 1.00
Packet Count: 10
Thr Loss Count: 10

Netwatch keeps randomly reporting as DOWN.

When I check the Status for the last check I am seeing:

Sent Count: 10
Response Count: 10
Loss Count: 0
Loss Percent: 0%

Any idea what could be going on here? I am running 7.7 b6, but this issue has been ongoing for the last few releases; I would go as far as to say since this new Netwatch was implemented.

Am I doing something wrong here or could this actually be a bug?

I have also tested this with 1.1.1.1, 9.9.9.9, 8.8.8.8 and another public server, all with the same intermittent results across 3 different routers/sites.

Here is an example with 9.9.9.9 on a 5 min interval:


Status: down
Since: Nov/15/2022 21:44:24
Done Tests: 274
Failed Tests: 23
Sent Count: 60
Response Count: 60
Loss Count: 0
Loss Percent: 0.0 %
RTT Avg: 123.666 ms
RTT Min: 82.502 ms
RTT Max: 160.734 ms
RTT Jitter: 78.232 ms
RTT Stdev: 17.151 ms

Status: up
Since: Nov/15/2022 21:49:24
Done Tests: 275
Failed Tests: 23
Sent Count: 60
Response Count: 60
Loss Count: 0
Loss Percent: 0.0 %
RTT Avg: 99.896 ms
RTT Min: 61.099 ms
RTT Max: 152.939 ms
RTT Jitter: 91.840 ms
RTT Stdev: 21.360 ms
 
Guntis
MikroTik Support
Posts: 153
Joined: Fri Jul 20, 2018 1:40 pm

Re: Netwatch on ROS7 False Down?  [SOLVED]

Wed Nov 16, 2022 8:31 am

It's due to the thr-rtt-avg value (see https://help.mikrotik.com/docs/display/ROS/Netwatch). You can increase it above 100 to avoid this.
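
To make the effect concrete, here is a minimal sketch in Python (not RouterOS script). It assumes that a probe is treated as failed when the measured average RTT exceeds thr-rtt-avg, and that the default is the 100 ms mentioned in this thread. The function name `rtt_avg_status` is hypothetical and only used for illustration.

```python
# Assumption: the check fails when the average RTT is above thr-rtt-avg.
# 100 ms is the default value implied in this thread.
THR_RTT_AVG_MS = 100.0


def rtt_avg_status(rtt_avg_ms, threshold_ms=THR_RTT_AVG_MS):
    """Return 'down' if the average RTT exceeds the threshold, else 'up'."""
    return "down" if rtt_avg_ms > threshold_ms else "up"


# "RTT Avg" values from the two 9.9.9.9 results posted above.
samples = [123.666, 99.896]
results = [rtt_avg_status(s) for s in samples]
print(results)  # ['down', 'up']

# Raising the threshold, as suggested, lets the slower sample pass.
print(rtt_avg_status(123.666, threshold_ms=200.0))  # up
```

Under that assumption, the sketch reproduces the posted statuses: 0% loss, but the 123.666 ms average is over the limit, so the test is counted as failed.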
 
RandyRiver88
just joined
Topic Author
Posts: 14
Joined: Fri May 15, 2020 7:28 pm

Re: Netwatch on ROS7 False Down?

Wed Nov 16, 2022 3:43 pm

Thank you.

However, this is poor design.

In theory, if the field is 'not enabled' and has 'no value set' then it shouldn't be used when determining if the test result is UP/DOWN.

If this field is going to be used regardless when determining an UP/DOWN result, then it should be a mandatory field with 100ms as the default.
 
RandyRiver88
just joined
Topic Author
Posts: 14
Joined: Fri May 15, 2020 7:28 pm

Re: Netwatch on ROS7 False Down?

Wed Nov 16, 2022 4:16 pm

This still doesn't work.

After changing it to 200, it doesn't report a down when losing all packets.

Status: up
Since: Nov/16/2022 09:03:01
Done Tests: 26
Failed Tests: 0
Sent Count: 10
Response Count: 0
Loss Count: 10
Loss Percent: 100.0 %
RTT Avg: 0.000 ms
RTT Min: -0.001 ms
RTT Max: 0.000 ms
RTT Jitter: 0.000 ms
RTT Stdev: 0.000 ms

Edit: After switching from Thr Loss Percent (100%) to checking Thr Loss Count (10) instead, it seems to work now. It took 10 years to get a decent Netwatch update, but it seems a poor implementation at best.
Last edited by BartoszP on Wed Nov 16, 2022 5:20 pm, edited 1 time in total.
Reason: removed abusive wording ... please be more nice.
 
Znevna
Forum Guru
Posts: 1347
Joined: Mon Sep 23, 2019 1:04 pm

Re: Netwatch on ROS7 False Down?

Wed Nov 16, 2022 4:34 pm

It's not rocket science to understand the few settings presented with the default values also mentioned in the manual:
https://help.mikrotik.com/docs/display/ ... obeoptions
 
challado
newbie
Posts: 44
Joined: Tue Jul 01, 2008 2:53 am

Re: Netwatch on ROS7 False Down?

Fri Aug 11, 2023 4:18 pm

I disagree. Yes, it is rocket science, because every vendor has its own understanding of the world, nature, global warming, etc.
The manual simply says "thr-loss-percent". But what kind of threshold is this? Is it an average over all tests? What is it? And thr-rtt-avg? What is that?
And, obviously, this information is omitted from the status. Only basic information is displayed, and you can't do anything because you can't guess the values.
 
pe1chl
Forum Guru
Forum Guru
Posts: 10183
Joined: Mon Jun 08, 2015 12:09 pm

Re: Netwatch on ROS7 False Down?

Fri Aug 11, 2023 4:52 pm

I agree it is confusing and incomplete. Like so many other chapters in the HELP system, it immediately dives into explaining properties, without even spending a single paragraph on a global description on how things fit together.

At first when I saw the new modes in Netwatch I believed that I could have the normal way of ping checking, but with some threshold on failures.
E.g. I want to do a single ping each minute, but only after 3 of them have failed I want to go into "down" state.
But it does not seem that is what the new "icmp" type can do, except after very careful tweaking of the config. It would send 3 pings in quick succession every minute and alert me when they do not reply, but it would still be difficult to have a way of monitoring that can e.g. tolerate a reboot of a remote system and come back within 3 minutes.

And if it can do it, I will need to craft the proper settings myself.
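
The behaviour described here (one ping per interval, "down" only after 3 consecutive misses) can be sketched as a small state machine. This is Python for illustration only, not a RouterOS feature or script, and the class name `MissCounter` and its parameter `max_misses` are hypothetical.

```python
class MissCounter:
    """Track single-ping results and go 'down' only after N consecutive misses."""

    def __init__(self, max_misses=3):
        self.max_misses = max_misses
        self.misses = 0
        self.status = "up"

    def record(self, reply_received):
        """Feed one ping result (True = reply received) and return the status."""
        if reply_received:
            # Any reply resets the counter and brings the host back up.
            self.misses = 0
            self.status = "up"
        else:
            self.misses += 1
            if self.misses >= self.max_misses:
                self.status = "down"
        return self.status


watch = MissCounter(max_misses=3)
pings = [True, False, False, True, False, False, False, True]
history = [watch.record(ok) for ok in pings]
print(history)
# ['up', 'up', 'up', 'up', 'up', 'up', 'down', 'up']
```

Two missed pings followed by a reply never trigger "down", so a short reboot of the remote system is tolerated; only the third consecutive miss flips the state.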
 
Amm0
Forum Guru
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: Netwatch on ROS7 False Down?

Fri Aug 11, 2023 6:19 pm

Docs could be better – totally agree some explanatory text and examples are missing.

But the real issue is that all the ICMP params have some default value that is used if not set. I'm okay with defaults BUT the netwatch ICMP ones are too restrictive.

e.g. these aggressive defaults are what cause the "false down" – and since you may not have set the failing one, it's not obvious at all... And the defaults are NOT very visible in the UI either, so it's very hard to know what's failing.

IMO you'd rather "tighten down" setting from more forgiving default ... rather than "guess how high" something needs to be to not fail....
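
A rough Python sketch of why an unset field can still fail a test, assuming unset thresholds silently fall back to defaults and a test fails if any single metric exceeds its limit. Only the 100 ms thr-rtt-avg default comes from this thread; the other default values below are assumptions that should be checked against the Netwatch manual, and the function and key names are hypothetical.

```python
# Assumed defaults. Only rtt_avg_ms = 100 is confirmed in this thread;
# the remaining values are assumptions to verify against the manual.
DEFAULT_THRESHOLDS = {
    "rtt_avg_ms": 100.0,
    "rtt_max_ms": 1000.0,
    "rtt_jitter_ms": 1000.0,
    "rtt_stdev_ms": 250.0,
    "loss_percent": 85.0,
}


def effective_thresholds(user_set):
    """Overlay the values the user actually set on top of the defaults."""
    merged = dict(DEFAULT_THRESHOLDS)
    merged.update(user_set)
    return merged


def failed_checks(stats, user_set):
    """Return the metrics that exceed their effective threshold."""
    limits = effective_thresholds(user_set)
    return sorted(name for name, limit in limits.items() if stats[name] > limit)


# Statistics from the first 9.9.9.9 result in this thread.
stats = {
    "rtt_avg_ms": 123.666,
    "rtt_max_ms": 160.734,
    "rtt_jitter_ms": 78.232,
    "rtt_stdev_ms": 17.151,
    "loss_percent": 0.0,
}

print(failed_checks(stats, {}))                     # ['rtt_avg_ms']
print(failed_checks(stats, {"rtt_avg_ms": 200.0}))  # []
```

With nothing configured, the average RTT check still applies and fails; listing the failing metric by name is exactly the information that is hard to get from the status output.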

Like the new netwatch concept, but agree it needs some work/better docs... And using "type=simple" is also "icmp"/ping and avoids this issue if one doesn't care to monitor the ping metadata...
 
pe1chl
Forum Guru
Posts: 10183
Joined: Mon Jun 08, 2015 12:09 pm

Re: Netwatch on ROS7 False Down?

Fri Aug 11, 2023 7:07 pm

Yes, type=simple is the classic Netwatch type: a ping sent every [interval] seconds, no reply -> down event.
I would have liked a simple extension that allows "N missed pings" before it declares a down event.
What we got was more sophisticated than that, but difficult to tame.
 
challado
newbie
Posts: 44
Joined: Tue Jul 01, 2008 2:53 am

Re: Netwatch on ROS7 False Down?

Sat Aug 19, 2023 3:33 pm

My expectation is "if the value is not set, I WON'T use this metric", but MikroTik puts a DEFAULT value on it, and all the metrics are combined only with an AND operator, not with OR. It is simply inefficient and badly designed.
 
anav
Forum Guru
Posts: 18958
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: Netwatch on ROS7 False Down?

Sat Aug 19, 2023 5:45 pm

Is this Netwatch ping meant to replace doing so in IP Routes?
In other words, to ensure the internet is actually reachable through the ISP?
 
challado
newbie
Posts: 44
Joined: Tue Jul 01, 2008 2:53 am

Re: Netwatch on ROS7 False Down?

Tue Aug 22, 2023 1:34 am

Anav, I didn't understand your question very well, but my English is poor.
In practice, check-gateway=ping is a good choice for detecting problems on the way to your gateway router, but if the problem is BEYOND the gateway, it never detects that the link is down. We have had several problems with that here, with two links for redundancy. The internet goes down but the gateway is still active, because the problem is AFTER the gateway. Redundancy is then useless in this case. I use Netwatch to ping some destinations (via static routes on link 1 or link 2) to detect whether LINK1 is DOWN or LINK2 (the redundant one) is down too. But some links are STARLINK, and ping shows abrupt latency variations but NO packet loss (the link is active, with poor performance, but NOT OFFLINE).
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: Netwatch on ROS7 False Down?

Tue Aug 22, 2023 1:47 am

anav wrote:
"Is this netwatch ping, to replace doing so in IP Routes?
In other words to ensure internet is actually reachable through ISP?"

I put in a feature request for the /ip/route's check-gateway= to support "linking" to one (or more) netwatch entries (and any allowed things like http and icmp with jitter/etc stats). viewtopic.php?t=192844&hilit=feature

But today check-gateway=ping only checks the next-hop router, and no separate IP to check is allowed (other than using recursive routes, which you're familiar with).
