Amm0 did, see posts around #13 in the mentioned thread, but without results yet, I believe.
I’ll just add the pointer to the MikroTik documentation page that shows how to do this with only a couple of config items instead of the insanely complicated config depicted above:
The 7.21 beta will have new capabilities like setting the ping parameters, and I also suggested adding a simple log message indicating that the ping has failed, so it is easier to trigger a warning when it happens.
You mean the settings in Filo's original method?
In that method you have two wide 0.0.0.0/0 routes in the main table, with the primary active (but with a higher distance) and the backup one (with a lower distance) deactivated.
Then you have an added routing table (DSL) with the same 0.0.0.0 route through the primary gateway.
The mangle rule marks the packets going to 8.8.8.8 to be routed through this latter added routing table.
So, normally, all destinations (but not 8.8.8.8) go through the "main" table that has only the route through the primary gateway active, the packets sent to 8.8.8.8 go through the added DSL table that has an identical route as well through the primary gateway.
When Netwatch senses that the primary gateway and route are not working (since the watched address is 8.8.8.8, its probes go through the added "DSL" routing table), it enables the second (backup) route in the main table, which takes precedence over the primary one because it has a lower distance. The route in the DSL table is not touched, so Netwatch still attempts to go through it.
Everything (but not 8.8.8.8) is reachable through "main" routing table through the backup gateway.
As soon as 8.8.8.8 is again reachable by netwatch, the backup gateway route in main is disabled and everything is back to normal.
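For reference, a minimal sketch of how that layout might look on RouterOS v7. This is my reading of the description, not Filo's actual config: the gateway addresses 192.0.2.1 (primary) and 198.51.100.1 (backup), the comments used to find the routes, the 10s interval and the use of chain=output (Netwatch traffic is generated by the router itself) are all assumptions.

```
# Extra routing table that only ever uses the primary gateway
/routing table
add name=DSL fib

/ip route
# Primary default route: active, higher distance
add dst-address=0.0.0.0/0 gateway=192.0.2.1 distance=2 comment=primary
# Backup default route: lower distance, normally disabled
add dst-address=0.0.0.0/0 gateway=198.51.100.1 distance=1 disabled=yes comment=backup
# Same default route through the primary gateway, in the DSL table
add dst-address=0.0.0.0/0 gateway=192.0.2.1 routing-table=DSL

# Force the router's own packets to 8.8.8.8 into the DSL table
/ip firewall mangle
add chain=output dst-address=8.8.8.8 action=mark-routing new-routing-mark=DSL

# Enable the backup route while 8.8.8.8 is down, disable it again when it is up
/tool netwatch
add host=8.8.8.8 interval=10s down-script="/ip route enable [find comment=backup]" up-script="/ip route disable [find comment=backup]"
```

Because the DSL table never changes, the probe keeps testing the primary path even while everything else is using the backup.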
The same (forcing packets with destination 8.8.8.8 to use the "DSL" table rather than "main") could be done with routing rules instead of mangle; the help page actually has exactly this as an example:
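The help page example itself is not reproduced in this post. As a rough sketch of the routing-rule variant only, assuming RouterOS v7 and the DSL table described earlier, a single rule like this would take the place of the mangle rule:

```
# Look up anything destined to 8.8.8.8 in the DSL table instead of main
/routing rule
add dst-address=8.8.8.8/32 action=lookup table=DSL
```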
The "simplest" method instead puts the narrow route to 8.8.4.4 (changed from the 8.8.8.8 in the example) in the "main" table, getting rid of the added table and of the mangle (or routing) rule.
This needs the addition of the blackhole route, because in the "main" table, in some possible failure cases (ethernet cable disconnected), both the "wide" route through the primary gateway and the narrow one might go poof, so 8.8.4.4 would take the route through the backup gateway and Netwatch would see the connection as up. The blackhole ensures that 8.8.4.4 stays unreachable, keeping the Netwatch status down until the route through the primary gateway is re-established.
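A sketch of that "simplest" variant, again assuming RouterOS v7 (where blackhole is a flag on the route) and the same hypothetical gateways 192.0.2.1 (primary) and 198.51.100.1 (backup). The distances on the two default routes are simply carried over from the description of Filo's method, and the route comments and interval are my own choices:

```
/ip route
add dst-address=0.0.0.0/0 gateway=192.0.2.1 distance=2 comment=primary
add dst-address=0.0.0.0/0 gateway=198.51.100.1 distance=1 disabled=yes comment=backup
# Narrow route: the watched host is reached only via the primary gateway
add dst-address=8.8.4.4/32 gateway=192.0.2.1 distance=1
# Blackhole with a higher distance: takes over if the narrow route goes inactive,
# so the probe can never leak out through the backup gateway
add dst-address=8.8.4.4/32 blackhole distance=2

/tool netwatch
add host=8.8.4.4 interval=10s down-script="/ip route enable [find comment=backup]" up-script="/ip route disable [find comment=backup]"
```

The blackhole only matters in the failure case where the narrow route becomes inactive; while the primary link is physically up, the distance=1 route wins.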
Nice observation Pe1chl, found it:
route - added options in /routing/settings to adjust check-gateway=ping timers;
Also interesting usage of ECMP type recursive routes, combined with Primary/Secondary failover in the example.
However, I think they have an error and want you to comment.
EDIT: Nope, the error was mine. I am used to my setups, where I clearly delineate the recursive route with the larger target scope and the route closer to the router with the lower target scope, e.g. 12 for the route that uses the canary DNS IP as gateway and 11 for the route that uses the local gateway IP. In their case they played it using the default TS of 10 for the closer route and 11 for the recursive route.
To recap the rules:
First rule. The resolving route (DIRECT, connected route) with dst-address TO the "real world WWW IP (DNS site)" and with the local ISP gateway IP has Target-Scope=X, and the recursive route (INDIRECT, external route) with gateway VIA the "real world WWW IP (DNS site)" has Target-Scope=X+1.
In other words, the farther one gets from the router, the TS increases by one.
Second rule. Between the same two routes being compared, the direct, connected route with the local ISP gateway IP (the resolving route) has to have a SCOPE that is equal to or less than the TARGET SCOPE of the recursive route.
In other words, the scope of the route must be equal or less than the target scope of the next farthest route.
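To make the two rules concrete, here is a minimal pair of routes with X=11, assuming RouterOS v7 and a hypothetical local ISP gateway of 192.0.2.1 (the canary 8.8.8.8 is just one of the usual choices):

```
/ip route
# Resolving (direct) route: canary reached via the local ISP gateway.
# target-scope=X=11, and scope=11 is <= the recursive route's target-scope (12)
add dst-address=8.8.8.8/32 gateway=192.0.2.1 scope=11 target-scope=11
# Recursive route: default route via the canary, target-scope=X+1=12
add dst-address=0.0.0.0/0 gateway=8.8.8.8 target-scope=12 check-gateway=ping
```

If the scope of the resolving route were left larger than 12, the recursive route would not be able to resolve its gateway through it, which is exactly what the second rule guards against.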
As an example, let me do two WANs, using their technique with four canaries.
So, taking a look at normal routing for failover: you source-NAT both WANs as normal with masquerade, which clears connections quickly; for a flaky ISP one would want srcnat instead, which holds onto connections.
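As a sketch of the two NAT flavours only: the interface names ether1/ether2 and the address 203.0.113.10 are placeholders, not taken from anyone's real config.

```
/ip firewall nat
# Masquerade: related connections are cleared when the interface goes down
add chain=srcnat out-interface=ether1 action=masquerade
# Plain src-nat to a fixed address: connections are kept across short outages
add chain=srcnat out-interface=ether2 action=src-nat to-addresses=203.0.113.10
```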
Note also that in 7.21beta they are going to allow modifying the ping settings for check-gateway=ping!
IP route - (recursive routes, farthest hop from the router)
(WAN1)
add dst-address=0.0.0.0/0 gateway=8.8.8.8 target-scope=12 distance=1 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=1.1.1.1 target-scope=12 distance=1 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=9.9.9.9 target-scope=12 distance=1 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=208.67.222.222 target-scope=12 distance=1 check-gateway=ping
(WAN2)
add dst-address=0.0.0.0/0 gateway=8.8.4.4 target-scope=12 distance=2 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=1.0.0.1 target-scope=12 distance=2 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=149.112.112.112 target-scope=12 distance=2 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=208.67.220.220 target-scope=12 distance=2 check-gateway=ping
IP route - (resolving direct routes, closest hop to the router)
(WAN1)
add dst-address=8.8.8.8/32 gateway=gwyIP-ISP1 scope=11 target-scope=11
add dst-address=1.1.1.1/32 gateway=gwyIP-ISP1 scope=11 target-scope=11
add dst-address=9.9.9.9/32 gateway=gwyIP-ISP1 scope=11 target-scope=11
add dst-address=208.67.222.222/32 gateway=gwyIP-ISP1 scope=11 target-scope=11
(WAN2)
add dst-address=8.8.4.4/32 gateway=gwyIP-ISP2 scope=11 target-scope=11
add dst-address=1.0.0.1/32 gateway=gwyIP-ISP2 scope=11 target-scope=11
add dst-address=149.112.112.112/32 gateway=gwyIP-ISP2 scope=11 target-scope=11
add dst-address=208.67.220.220/32 gateway=gwyIP-ISP2 scope=11 target-scope=11
Assuming this would be the easiest method of checking four addresses.
It's not clear whether the new check-gateway=ping interval is flexible.
Can it be:
a. change ping interval ( every 10 seconds to something else min 2 max 30 or something )
b. change decision point ( after which negative response to change interface status ). As noted, some people may prefer a shorter interval for 3 assessments, or have a decision made based on 2 assessments.
I set up a route to the 4 external ping check systems via the 2 ISP default gateways with ping check, then I set up routes to 0.0.0.1 and 0.0.0.2 via two of those each, also with ping check, and finally the default route via 0.0.0.1 and 0.0.0.2 at two different distances.
This was described in a Help system article I can no longer find; it is basically the 2-ping-systems check version of the article I pointed to above.
And because we use load balancing as well, I have that in 3 routing tables to use with PCC and route marking.
My ECMP approach was wrong and it's now fixed above.
So yes, the recursive approach, although it takes many lines, is simpler than Netwatch in many ways.
If we get added flexibility with ping settings for check-gateway=ping that would be a nice bonus.
However, I am still interested in looking at using NO DNS canaries and relying instead on ICMP return messages, if that is possible.
Well, this recursive approach is load balancing across the recursive canaries. How to use recursive canaries and ALSO load balance between two or three ISPs is the realistic case.
Throw in a subnet that has to use WAN X, regardless,
Throw in WireGuard, which needs to go out a specific WAN.
Let's see what you've got LOL
Yes I briefly thought “why do I need that 0.0.0.1 and 0.0.0.2 in between, can just set the default routes to directly point to the ping-checked systems and have two of them” but then I remembered I also need routes that always are via one ISP and don’t failover to the other, like for the GRE/IPsec tunnels that I also have. Those have the failover on BGP level on top of that. So the tunnel destinations have a route via 0.0.0.1 as well.
All this approach needs is routes in the table, no separate netwatch, scripts, logic in the scripts, on-startup code to make sure it cleanly starts, etc.
Well, I am trying to avoid any scripts, except of course those needed to update dynamic ISP gateway IPs.