demars
just joined
Topic Author
Posts: 2
Joined: Mon Nov 30, 2020 9:07 pm

Netwatch wrong status

Mon Mar 11, 2024 1:34 am

I have 2 routers:
a hAP ac2 with a public IP and an L2TP server, and an RB-m11g with an LTE modem.
An L2TP tunnel without IPsec is set up between them.
Netwatch runs on the hAP ac2 to check the availability of the LTE router and send messages to Telegram.
Netwatch settings:
netwatch settings.png
Netwatch is constantly triggered, although there is no packet loss. The screenshot shows that all packets were received, but the status is still "down".
netwatch status.png
I did a traffic capture, and Wireshark shows that no packets were lost.
RouterOS is the latest version, 7.14, and CPU load is below 5% on both devices. The uplinks are almost idle.
Is it a bug? Or are my Netwatch settings incorrect?
Last edited by holvoetn on Mon Mar 11, 2024 8:33 am, edited 1 time in total.
Reason: Clarified title
 
Guntis
MikroTik Support
Posts: 169
Joined: Fri Jul 20, 2018 1:40 pm

Re: Netwatch wrong status

Mon Mar 11, 2024 11:04 am

The measured ICMP values exceed the thresholds that define a failure, so you need to adjust those thresholds: https://help.mikrotik.com/docs/display/ ... obeoptions
 
holvoetn
Forum Guru
Posts: 5500
Joined: Tue Apr 13, 2021 2:14 am
Location: Belgium

Re: Netwatch wrong status

Mon Mar 11, 2024 11:09 am

Just so I understand:
even if those values have not been specified, the defaults are still applied?

That's a tricky one to take into account.
 
Guntis
MikroTik Support
Posts: 169
Joined: Fri Jul 20, 2018 1:40 pm

Re: Netwatch wrong status

Mon Mar 11, 2024 11:18 am

Yes, the default values are always taken into account. The "new" Netwatch implementation works in this manner.
 
holvoetn
Forum Guru
Posts: 5500
Joined: Tue Apr 13, 2021 2:14 am
Location: Belgium

Re: Netwatch wrong status

Mon Mar 11, 2024 11:21 am

It might be worthwhile to clarify this on the help page?

Unless specified, default values are always evaluated.
 
Guntis
MikroTik Support
Posts: 169
Joined: Fri Jul 20, 2018 1:40 pm

Re: Netwatch wrong status

Mon Mar 11, 2024 12:05 pm

Thank you for the suggestion; a note addressing this has been added.
 
jaclaz
Long time Member
Posts: 667
Joined: Tue Oct 03, 2023 4:21 pm

Re: Netwatch wrong status

Mon Mar 11, 2024 1:01 pm

In the OP's case the issue is probably:
thr-rtt-avg (Default: 100ms)

I may well be wrong, but at first sight the 100 ms is not "proportional" to:
thr-rtt-max (Default: 1s)

I mean, in theory you could have a "perfectly stable" connection that is slowish and always has 101 ms pings (1/10 of the maximum allowed time), and it would trigger Netwatch. A value of 1/3, 1/4 or 1/5 of the rtt-max would probably be more sensible here. :roll:
I.e. it seems to me that rtt-avg is likely to (falsely) trigger Netwatch much more often than rtt-max.
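
To make the "steady 101 ms link" case concrete, here is a minimal Python sketch (not RouterOS code). It uses only the two defaults quoted above and assumes a threshold fails when the measured value is strictly greater than it; the thread does not state the exact comparison, and the function name is made up for illustration.

```python
# Defaults as quoted in this thread, in milliseconds.
THR_RTT_AVG_MS = 100.0
THR_RTT_MAX_MS = 1000.0


def exceeded(rtts_ms):
    """Return the names of the thresholds that a list of RTT samples exceeds.

    Assumption: a threshold fails when measured > threshold.
    """
    avg = sum(rtts_ms) / len(rtts_ms)
    failed = []
    if avg > THR_RTT_AVG_MS:
        failed.append("thr-rtt-avg")
    if max(rtts_ms) > THR_RTT_MAX_MS:
        failed.append("thr-rtt-max")
    return failed


# Ten identical replies at 101 ms: no loss, no jitter, yet the average is over 100 ms.
steady_link = [101.0] * 10
print(exceeded(steady_link))  # ['thr-rtt-avg']
```

Under that assumption the probe fails on the average alone, while the maximum is nowhere near its limit.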

Also (again, only a consideration) the default:
packet-count (Default: 10)
does not seem "smart" to me when coupled with:
thr-loss-percent (Default: 85.0%)

If I get this right :? , you send 10 pings; if only 2 go through and return, Netwatch is triggered, and if 3 go through and return, it is fine.
So 85% behaves just like 80%?


On second thought, it should be:
If I get this right :? , you send 10 pings; if only 1 goes through and returns, Netwatch is triggered, and if 2 go through and return, it is fine.
So 85% behaves just like 90%?
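
The arithmetic behind this can be checked with a short Python sketch. With 10 packets the loss can only be a multiple of 10%, so the sketch looks for the smallest number of lost packets that crosses a given threshold. Whether Netwatch compares with "greater than" or "greater than or equal" is not stated in this thread, so both readings are shown; the function name is hypothetical.

```python
def min_lost_to_fail(packet_count, thr_loss_percent, inclusive=False):
    """Smallest number of lost packets that fails the probe, or None if none does.

    inclusive=False assumes failure when loss% >  threshold.
    inclusive=True  assumes failure when loss% >= threshold.
    """
    for lost in range(packet_count + 1):
        loss_percent = 100.0 * lost / packet_count
        if inclusive:
            hit = loss_percent >= thr_loss_percent
        else:
            hit = loss_percent > thr_loss_percent
        if hit:
            return lost
    return None


# Columns: threshold, lost packets needed (strict reading), lost packets needed (inclusive reading)
for thr in (80.0, 85.0, 90.0):
    print(thr, min_lost_to_fail(10, thr), min_lost_to_fail(10, thr, inclusive=True))
```

Under either reading, 85% with 10 packets means 9 lost packets are needed, i.e. one reply fails the probe and two replies pass it. Which neighbour 85% is equivalent to depends on the comparison: with a strict comparison it matches 80%, with an inclusive one it matches 90%.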
Last edited by jaclaz on Mon Mar 11, 2024 6:29 pm, edited 1 time in total.
 
Guntis
MikroTik Support
Posts: 169
Joined: Fri Jul 20, 2018 1:40 pm

Re: Netwatch wrong status

Mon Mar 11, 2024 1:36 pm

I do see the point raised here about the default values, but they are unlikely to change at this time unless there is a very good reason, because:
1) Netwatch and its default values are already widely used, and any change would directly impact devices in the field.
2) There are no perfect values. You can find cases where the defaults could have been better (80% vs 85% for 10 packets), but in the end it is still up to the client to decide what the ICMP probe's use case is. The current values are oriented more towards reachability than connection quality, with the exception of rtt-avg-threshold, which could have been higher by default.
If the default values don't match your use case, you can always change them. For connection quality measurements, we would recommend increasing the packet count anyway.
 
Amm0
Forum Guru
Posts: 3505
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: Netwatch wrong status

Mon Mar 11, 2024 2:40 pm

IDK about changing the defaults, that could break someone using it... and there is "simple" if you don't want the advanced controls of ICMP.

But the ICMP defaults are confusing. One option, for WinBox at least, is showing the default values "greyed out" (until set). Then someone would know what was being used without checking the docs. (I put in a feature request for this a while back, SUP-115036.)
 
jaclaz
Long time Member
Posts: 667
Joined: Tue Oct 03, 2023 4:21 pm

Re: Netwatch wrong status

Mon Mar 11, 2024 4:59 pm

Yes, I understand that changing the default values could trigger all kinds of issues on existing working devices.

The suggestion/proposal by Amm0 makes a lot of sense, however: these default values being "hidden" is most likely the root cause of the confusion.

Still, even leaving the default values as they are, there could be a (reasoned) guideline on how the threshold values are interconnected. Even if the settings depend on the specific connection, when you start changing one value, the other values should be adjusted proportionally, i.e. tailored to the observed behaviour of the connection.

Taking the OP's values as a reference:
RTT Avg = 124.479 -> thr-rtt-avg should be raised to (say) 120%-150% of that, 150-180 ms
RTT Min = 60.451 -> no corresponding setting, useful only for the above, as a first approximation (228+60)/2 = 288/2 = 144 ms
RTT Max = 228.463 -> thr-rtt-max should be lowered to (say) 120%-150% of that, 275-350 ms

RTT jitter = 168.012 -> thr-rtt-jitter should be lowered from 1 second? To what? 2x the value? 340 ms?

RTT Stdev = 74.182 -> thr-rtt-stdev should be lowered from 250 ms? To what? 2x the value? 150 ms?

packet-count -> is it better to increase it to, say, 20?
thr-loss-percent -> should go hand in hand with packet-count; 85% for 10, should it be increased or decreased if packet-count=20?

thr-loss-count is set to 4294967295 (I would say a rather high value), but sooner or later it may be reached :shock: . What would Netwatch's behaviour be then:
1) trigger the Netwatch script AND reset the value to 0 (sort of "run once")
or
2) trigger the Netwatch script AND do nothing (in this case manual intervention would be needed, as Netwatch would be triggered continuously once the threshold is reached)
? :?:
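
For reference, the comparison and the 120%-150% headroom idea above can be worked through in a few lines of Python. The observed values and the defaults are the ones quoted in this thread; the "measured greater than threshold" rule and the helper names are assumptions made for illustration only.

```python
# Observed values from the OP's status (ms) and the defaults named in this thread (ms).
observed = {"avg": 124.479, "max": 228.463, "jitter": 168.012, "stdev": 74.182}
defaults = {"avg": 100.0, "max": 1000.0, "jitter": 1000.0, "stdev": 250.0}


def over_default(observed_ms, default_ms):
    """Names of the thresholds whose default is exceeded (assumed rule: measured > threshold)."""
    return [f"thr-rtt-{key}" for key, value in observed_ms.items() if value > default_ms[key]]


def headroom_range(value_ms, low=1.2, high=1.5):
    """Candidate threshold range: the observed value plus 20% to 50% headroom."""
    return value_ms * low, value_ms * high


print(over_default(observed, defaults))  # ['thr-rtt-avg']

for key in ("avg", "max"):
    low, high = headroom_range(observed[key])
    print(f"thr-rtt-{key}: {low:.0f} to {high:.0f} ms")
```

Of the four observed values only the average is above its default, and the computed ranges (about 149 to 187 ms for the average, about 274 to 343 ms for the maximum) are close to the rounded figures given above.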
 
Guntis
MikroTik Support
Posts: 169
Joined: Fri Jul 20, 2018 1:40 pm

Re: Netwatch wrong status

Mon Mar 11, 2024 5:35 pm

We understand the benefit of being able to see the "default" values within RouterOS, but they will not be exposed, at least not in the near future, due to the way Netwatch is implemented. The same goes for Wifi default values.

Thank you for the suggestion of adding some "templates" to the documentation on how to use Netwatch; we will consider expanding the page.

Some default values are set so that they do not unnecessarily impact most scenarios and can be left alone.

"thr-loss-count is set to 4294967295 (I would say a rather high value)" - if you want the Netwatch probe to fail after "n" lost packets, you can change the value, but in most cases that is not needed, as "thr-loss-percent" performs this function. Thresholds only count within a single test. If you send out 10 packets and lose 5, those losses do not carry over towards "thr-loss-count" in the next test: once Netwatch starts sending the 11th packet, or rather the 1st packet of a new test (if packet-count was set to 10), the loss count shows "0" again, as a new test has started.
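
The per-test counting described above can be pictured with a small Python simulation. This is only a model of the explanation in this post, not of Netwatch's actual implementation; the function name and the strict "lost greater than thr-loss-count" comparison are assumptions.

```python
def run_tests(results, packet_count, thr_loss_count):
    """Split a list of ping results into tests and evaluate each test on its own.

    results: list of booleans, True means a reply was received.
    Returns one (lost, failed) tuple per test. The loss counter starts at 0
    for every test, so losses never accumulate across tests.
    Assumption: a test fails when lost > thr_loss_count.
    """
    verdicts = []
    for start in range(0, len(results), packet_count):
        test = results[start:start + packet_count]
        lost = sum(1 for ok in test if not ok)
        verdicts.append((lost, lost > thr_loss_count))
    return verdicts


# Two consecutive tests of 10 packets, each losing 5. In total 10 packets are
# lost, but no single test ever sees more than 5.
pings = ([True] * 5 + [False] * 5) * 2
print(run_tests(pings, packet_count=10, thr_loss_count=8))  # [(5, False), (5, False)]
```

In this model a very large thr-loss-count such as the default can never be reached, because no test can lose more packets than packet-count.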
 
Amm0
Forum Guru
Posts: 3505
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: Netwatch wrong status

Mon Mar 11, 2024 5:44 pm

Guntis wrote: "We understand the benefit of being able to see the "default" values within RouterOS, but they will not be exposed, at least not in the near future, due to the way Netwatch is implemented. The same goes for Wifi default values."
Thanks Guntis for the clarity. If it's not easy, I get it. But it would be useful...
 
Amm0
Forum Guru
Posts: 3505
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: Netwatch wrong status

Mon Mar 11, 2024 5:51 pm

One note here on "tuning": the "Status" tab shows all the RTT values (even if the probe is failing due to one of the defaults). So you can use Netwatch's "Status" tab as a guide for what to set.

If you are adding a new one, you can add it, look at what the actual values are, and use those plus some percentage as a good place to start, since I suspect none of the defaults are actually "right" for any situation; they just fail less or more.

For fiber etc., that may be enough. But LTE or other wireless links, where there may be more "jitter issues" (e.g. interference, congestion, time of day, weather), require more adjustment (and trial and error over a longer period).

IMO it is likely best to set all the ICMP Netwatch values explicitly; that avoids needing to cross-reference the docs (or being caught out if the defaults change in the future ;)).
 
jaclaz
Long time Member
Posts: 667
Joined: Tue Oct 03, 2023 4:21 pm

Re: Netwatch wrong status

Mon Mar 11, 2024 6:27 pm

Thanks Guntis. :)

I am a bit tough, but:
packet-count=10
thr-loss-percent=100
thr-loss-count=1

triggers Netwatch in exactly the same manner as:
packet-count=10
thr-loss-percent=85 <- this should behave the same as 90
thr-loss-count=4294967295
