RouterOS v7 - WAN failover

Hello!

I have been trying to implement WAN failover in RouterOS 7, currently on version 7.1.3 on a Chateau device (D53G-5HacD2HnD).
I have used this document as a reference. I think it is the new documentation, and its history looks recent:
https://help.mikrotik.com/docs/pages/viewpage.action?pageId=26476608
https://help.mikrotik.com/docs/pages/viewpreviousversions.action?pageId=26476608

I have followed the guide and done some testing.
The failover certainly seems to work if I simply pull the cable. In my lab example, I have 2 WAN interfaces:

  • ether1
  • lte1

Both the simple single-host (Google DNS) example and the multi-host (Google + OpenDNS) example work when I just yank the ether1 cable.
The problem is that in the real world this is not always how a WAN fails. Many other issues can stop internet service on a particular WAN interface while the link itself stays up.

In this case, I also control the upstream router for the ether1 WAN, and when I drop all traffic from that network on the upstream router, the failover does not work.
At that point, with the simple single-host config, I also verified that:

  • I can’t ping 8.8.8.8
  • I can ping 8.8.4.4
  • but my internet test ping to 1.1.1.1 is still failing and there is no “failover” for my WAN route

This is the basic single-host script I used for testing.
** I used WAN interface names because the LTE IP changes and this seemed like a better way to do it. I have also tested with static gateway IPs, and the failover still does not seem to work when traffic is dropped.

{
# wan or isp reference names 
:local Wan1Reference "Primary_WAN"
:local Wan2Reference "Failover_LTE"

# actual interface names for the 2 wans
:local Wan1InterfaceName "ether1"
:local Wan2InterfaceName "lte1"

# hosts for checking internet status
## google dns
:local Wan1CheckHostA "8.8.8.8"
:local Wan2CheckHostA "8.8.4.4"

# add routing tables
/routing/table add fib name=("to-".$Wan1Reference)
/routing/table add fib name=("to-".$Wan2Reference)

# ip firewall mangle rules for marking connections and routing 
/ip/firewall/mangle add chain=output connection-state=new connection-mark=no-mark action=mark-connection new-connection-mark=($Wan1Reference."-conn") out-interface=$Wan1InterfaceName
/ip/firewall/mangle add chain=output connection-mark=($Wan1Reference."-conn") action=mark-routing new-routing-mark=("to-".$Wan1Reference) out-interface=$Wan1InterfaceName
/ip/firewall/mangle add chain=output connection-state=new connection-mark=no-mark action=mark-connection new-connection-mark=($Wan2Reference."-conn") out-interface=$Wan2InterfaceName
/ip/firewall/mangle add chain=output connection-mark=($Wan2Reference."-conn") action=mark-routing new-routing-mark=("to-".$Wan2Reference) out-interface=$Wan2InterfaceName

# route config
/ip/route add dst-address=$Wan1CheckHostA scope=10 gateway=$Wan1InterfaceName
/ip/route add dst-address=$Wan2CheckHostA scope=10 gateway=$Wan2InterfaceName

/ip/route add distance=1 gateway=$Wan1CheckHostA routing-table=("to-".$Wan1Reference) check-gateway=ping
/ip/route add distance=2 gateway=$Wan2CheckHostA routing-table=("to-".$Wan1Reference) check-gateway=ping

/ip/route add distance=1 gateway=$Wan2CheckHostA routing-table=("to-".$Wan2Reference) check-gateway=ping
/ip/route add distance=2 gateway=$Wan1CheckHostA routing-table=("to-".$Wan2Reference) check-gateway=ping

}

Hoping someone can help me get this right.
Or maybe share a better way to do WAN failover in RouterOS 7.
I did consider writing a script around the detect-internet function, but it is a lot slower to fail over, and it seems to be limited to a single host, with no way to add more host checks.

Yes, and that’s the very motivation for this whole setup, where the WAN “transparency” is being checked using pings to the Reference addresses.

An unusual thing I can spot in your setup is that the gateway parameters of the routes towards the reference addresses are set to interface names rather than to the IP addresses of the gateway devices. This is a possible setting, but then the gateway device must act as an ARP proxy, responding with its own MAC address to ARP requests for any IP address.

Other than that, the configuration seems incomplete to me. You have two routing tables, named to-Primary_WAN and to-Failover_LTE, but you didn’t configure (show?) any default routes via Reference addresses to be added to routing table main, nor have you shown any rules that would assign routing-mark values except in mangle chain output. And the latter ones are confusing to me.

So unless something is missing in what you’ve shown, add the default routes via 8.8.8.8 and via 8.8.4.4 to routing table main, disable the rules in chain output of mangle, and try pinging 1.1.1.1 again. If you use a src-nat or masquerade rule(s) on the WAN interfaces, you have to bear in mind that if you break the Primary WAN transparency to 8.8.8.8 during pinging, the connection tracking will not change the NAT handling for that ping sequence so the pings will fail even though the route via LTE will be active. To check that the Backup LTE kicked in, you have to stop pinging for 10 seconds and then try again.
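A minimal sketch of the changes suggested above, not a tested configuration. It assumes the /32 routes to 8.8.8.8 and 8.8.4.4 (scope=10) from the script already exist. The route comments are hypothetical names, and target-scope=11 is an assumption taken from the RouterOS 7 examples posted further down in this thread.

{
# default routes in table main, resolved recursively via the reference addresses
# (the comments are hypothetical names, used only to find the routes again)
/ip/route add dst-address=0.0.0.0/0 gateway=8.8.8.8 distance=1 target-scope=11 check-gateway=ping comment="main-default-wan1"
/ip/route add dst-address=0.0.0.0/0 gateway=8.8.4.4 distance=2 target-scope=11 check-gateway=ping comment="main-default-wan2"

# disable (do not remove) the output-chain mangle rules while testing
# note: this matches every mangle rule in chain=output, not only the ones from the script
/ip/firewall/mangle disable [find chain=output]

# optional, instead of pausing the ping for 10 seconds:
# remove the tracked ICMP connections to the test target so the next ping is NATed afresh
# (the dots in the pattern are regex wildcards, which is close enough for a lab test)
/ip/firewall/connection remove [find protocol=icmp dst-address~"1.1.1.1"]
}

After this, a ping to 1.1.1.1 should follow whichever default route in main is currently active.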

Can you elaborate a bit further, please?
Do you mean that, for this to work, “chosen references (here 8.8.8.8 and 8.8.4.4) SHOULD act as ARP proxy but they don’t”?

Not the references themselves (they are far away, so our ARP requests cannot reach them), but the routers to which WAN1 and WAN2 are physically connected. Giving the interface name as the gateway of a route only makes sense if the LTE WAN is an internal LTE card or a USB one, because the LTE interface is then either an L3 one, or an L2 one that acts as proxy-arp.

But this is not only important for WAN failover. Even without any failover, either the external routers must support proxy-arp, or the routes’ gateways must be set to the IP addresses of these external routers, not to interface names.
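When the WAN address comes from DHCP, the gateway IP can change, so one way to keep an IP address (rather than an interface name) in the route is a DHCP client lease script. The following is a rough sketch only: it assumes the route to the reference address carries the hypothetical comment "wan1-check-host", and it relies on the $bound and $"gateway-address" variables that RouterOS passes to DHCP client lease scripts.

{
# hypothetical comment used to find the route to the reference address (e.g. 8.8.8.8)
:local RouteComment "wan1-check-host"

# intended to run as the lease script of the DHCP client on the primary WAN;
# $bound and $"gateway-address" are set by RouterOS for lease scripts
:if ($bound = 1) do={
    /ip/route set [find comment=$RouteComment] gateway=$"gateway-address"
}
}

The script goes into the "script" field of the DHCP client entry, so the route is updated each time a lease is bound.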

Thanks for the comments, they were helpful. After a bit of testing it appears to be working, at least in the lab I have set up. It probably needs more real-world testing before I’m confident.

For anyone interested, here is the same script block with adjustments I used to test / implement

The best way to test a “true” failover situation seems to be the firewall test rule that drops ICMP to the check host on the primary WAN.
When this rule is enabled while pinging some address, after a few moments the primary recursive route becomes unreachable and traffic should switch to the secondary WAN.
You can watch the traffic switch interfaces using the torch tool - that is what I did.

{
# testing comment, for easy removal of the added config
:local TestingComment "testing-comment-remove-me"

# wan or isp reference names 
:local Wan1Reference "Primary_WAN"
:local Wan2Reference "Failover_LTE"

# actual interface names for the 2 wans
:local Wan1InterfaceName "ether1"
:local Wan2InterfaceName "lte1"

# interface ip or gateway
## lte1 gets its address from the DHCP client and no gateway is shown
## getting the address and removing the CIDR notation
# it appears that the lte1 interface name can be used as gateway and that using the IP does not work, so Lte1IP is not used below
:local Lte1IP ([[/ip/route/get number=[/ip/route/find gateway="lte1" distance=0]] as-value]->"dst-address")
:local Lte1IP [:pick $Lte1IP 0 [:find $Lte1IP "/"]]
## ether1 is a DHCP client
## and has a gateway address
:local PrimaryWanGatewayIP [/ip/dhcp-client/get ether1 gateway]

# hosts for checking internet status
## google dns
:local Wan1CheckHostA "8.8.8.8"
:local Wan2CheckHostA "8.8.4.4"

# add routing tables
/routing/table add fib name=("to-".$Wan1Reference) comment=$TestingComment
/routing/table add fib name=("to-".$Wan2Reference) comment=$TestingComment

# ip firewall mangle rules for marking connections and routing 
/ip/firewall/mangle add chain=output connection-state=new connection-mark=no-mark action=mark-connection new-connection-mark=($Wan1Reference."-conn") out-interface=$Wan1InterfaceName comment=$TestingComment
/ip/firewall/mangle add chain=output connection-mark=($Wan1Reference."-conn") action=mark-routing new-routing-mark=("to-".$Wan1Reference) out-interface=$Wan1InterfaceName comment=$TestingComment
/ip/firewall/mangle add chain=output connection-state=new connection-mark=no-mark action=mark-connection new-connection-mark=($Wan2Reference."-conn") out-interface=$Wan2InterfaceName comment=$TestingComment
/ip/firewall/mangle add chain=output connection-mark=($Wan2Reference."-conn") action=mark-routing new-routing-mark=("to-".$Wan2Reference) out-interface=$Wan2InterfaceName comment=$TestingComment

# route config
/ip/route add dst-address=$Wan1CheckHostA scope=10 gateway=$PrimaryWanGatewayIP comment=$TestingComment
/ip/route add dst-address=$Wan2CheckHostA scope=10 gateway=$Wan2InterfaceName comment=$TestingComment

/ip/route add distance=1 gateway=$Wan1CheckHostA target-scope=11 routing-table=("to-".$Wan1Reference) check-gateway=ping comment=$TestingComment
/ip/route add distance=2 gateway=$Wan2CheckHostA target-scope=11 routing-table=("to-".$Wan1Reference) check-gateway=ping comment=$TestingComment

/ip/route add distance=1 gateway=$Wan2CheckHostA target-scope=11 routing-table=("to-".$Wan2Reference) check-gateway=ping comment=$TestingComment
/ip/route add distance=2 gateway=$Wan1CheckHostA target-scope=11 routing-table=("to-".$Wan2Reference) check-gateway=ping comment=$TestingComment

# some test rules which can be enabled / disabled to test failover
/ip/firewall/filter/add chain=output action=drop out-interface=ether1 place-before=1 disabled=yes comment=$TestingComment
/ip/firewall/filter/add chain=output action=drop out-interface=lte1 place-before=1 disabled=yes comment=$TestingComment

/ip/firewall/filter/add chain=output action=drop out-interface=ether1 place-before=1 protocol=icmp dst-address=8.8.8.8 disabled=yes comment=$TestingComment
/ip/firewall/filter/add chain=output action=drop out-interface=lte1 place-before=1 protocol=icmp dst-address=8.8.4.4 disabled=yes comment=$TestingComment

}
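The test procedure described before the script can be sketched roughly as follows. This is an untested outline that assumes the rules were created by the script above with the same comment. The 30 second delay is an assumption meant to give check-gateway enough time to mark the gateway unreachable.

{
:local TestingComment "testing-comment-remove-me"

# simulate an upstream failure: enable only the ICMP drop rule towards 8.8.8.8
/ip/firewall/filter enable [find comment=$TestingComment protocol=icmp dst-address=8.8.8.8]

# wait for check-gateway to notice, then list the test routes and their flags
:delay 30s
/ip/route print where comment=$TestingComment

# restore the primary WAN
/ip/firewall/filter disable [find comment=$TestingComment protocol=icmp dst-address=8.8.8.8]
}

While the drop rule is enabled, running /tool/torch interface=lte1 in a second terminal shows whether the traffic has moved to the LTE interface.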

I also tried the second example (from the references in the original post), as I wanted to use two hosts for the failover config.
I was not able to get the virtual IP stuff to work and I am not sure exactly what is required. I think it may have something to do with having default routes on the WAN interfaces, or target-scope precedence, or something else.
If the virtual IP example is a better way to do it, I would love to hear someone’s explanation of why, and some more notes on how to implement it, as the reference is not very good IMO.

After the virtual IP stuff didn’t work, I just tried doing it the normal way (first example) with two hosts. This seemed to work, and in my testing it did ONLY fail over when BOTH hosts were unreachable.

Here is that example for anyone interested:

{
# testing comment, for easy removal of the added config
:local TestingComment "testing-comment-remove-me"

# wan or isp reference names 
:local Wan1Reference "Primary_WAN"
:local Wan2Reference "Failover_WAN"

# actual interface names for the 2 wans
:local Wan1InterfaceName "ether1"
:local Wan2InterfaceName "lte1"

# gateways
## need different things here depending on gateway type - your mileage may vary and some testing likely required
:local Wan1Gateway [/ip/dhcp-client/get ether1 gateway] 
:local Wan2Gateway "lte1"

# hosts for checking internet status
## google dns
:local Wan1CheckHostA "8.8.8.8"
:local Wan2CheckHostA "8.8.4.4"
## open dns
:local Wan1CheckHostB "208.67.222.222"
:local Wan2CheckHostB "208.67.220.220"

# add routing tables
/routing/table add fib name=("to-".$Wan1Reference) comment=$TestingComment
/routing/table add fib name=("to-".$Wan2Reference) comment=$TestingComment

# ip firewall mangle rules for marking connections and routing 
/ip/firewall/mangle add chain=output connection-state=new connection-mark=no-mark action=mark-connection new-connection-mark=($Wan1Reference."-conn") out-interface=$Wan1InterfaceName comment=$TestingComment
/ip/firewall/mangle add chain=output connection-mark=($Wan1Reference."-conn") action=mark-routing new-routing-mark=("to-".$Wan1Reference) out-interface=$Wan1InterfaceName comment=$TestingComment
/ip/firewall/mangle add chain=output connection-state=new connection-mark=no-mark action=mark-connection new-connection-mark=($Wan2Reference."-conn") out-interface=$Wan2InterfaceName comment=$TestingComment
/ip/firewall/mangle add chain=output connection-mark=($Wan2Reference."-conn") action=mark-routing new-routing-mark=("to-".$Wan2Reference) out-interface=$Wan2InterfaceName comment=$TestingComment

# route config
/ip/route add dst-address=$Wan1CheckHostA scope=10 gateway=$Wan1Gateway comment=$TestingComment
/ip/route add dst-address=$Wan2CheckHostA scope=10 gateway=$Wan2Gateway comment=$TestingComment

/ip/route add dst-address=$Wan1CheckHostB scope=10 gateway=$Wan1Gateway comment=$TestingComment
/ip/route add dst-address=$Wan2CheckHostB scope=10 gateway=$Wan2Gateway comment=$TestingComment

/ip/route add distance=1 gateway=$Wan1CheckHostA target-scope=11 routing-table=("to-".$Wan1Reference) check-gateway=ping comment=$TestingComment
/ip/route add distance=2 gateway=$Wan2CheckHostA target-scope=11 routing-table=("to-".$Wan1Reference) check-gateway=ping comment=$TestingComment

/ip/route add distance=1 gateway=$Wan1CheckHostB target-scope=11 routing-table=("to-".$Wan1Reference) check-gateway=ping comment=$TestingComment
/ip/route add distance=2 gateway=$Wan2CheckHostB target-scope=11 routing-table=("to-".$Wan1Reference) check-gateway=ping comment=$TestingComment

/ip/route add distance=1 gateway=$Wan2CheckHostA target-scope=11 routing-table=("to-".$Wan2Reference) check-gateway=ping comment=$TestingComment
/ip/route add distance=2 gateway=$Wan1CheckHostA target-scope=11 routing-table=("to-".$Wan2Reference) check-gateway=ping comment=$TestingComment

/ip/route add distance=1 gateway=$Wan2CheckHostB target-scope=11 routing-table=("to-".$Wan2Reference) check-gateway=ping comment=$TestingComment
/ip/route add distance=2 gateway=$Wan1CheckHostB target-scope=11 routing-table=("to-".$Wan2Reference) check-gateway=ping comment=$TestingComment

# some test rules which can be enabled / disabled to test failover
/ip/firewall/filter/add chain=output action=drop out-interface=ether1 place-before=1 protocol=icmp dst-address=208.67.222.222 disabled=yes comment=$TestingComment
/ip/firewall/filter/add chain=output action=drop out-interface=ether1 place-before=1 protocol=icmp dst-address=8.8.8.8 disabled=yes comment=$TestingComment

}
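Since every item added by these scripts carries the same comment, removing the test config again can be sketched as below. This is an outline only. It assumes nothing else on the router uses that comment, and it removes the routes and mangle rules before the routing tables they refer to.

{
:local TestingComment "testing-comment-remove-me"

# remove the firewall test rules and the mangle rules first
/ip/firewall/filter remove [find comment=$TestingComment]
/ip/firewall/mangle remove [find comment=$TestingComment]

# then the routes, and finally the routing tables they referenced
/ip/route remove [find comment=$TestingComment]
/routing/table remove [find comment=$TestingComment]
}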

I have exactly the same problem. I am also not able to get the virtual IP stuff working.

My setup:
CRS326 with RouterOS 7.1.3
WAN1 on ether1 with IP DHCP client with IP 10.0.10.13 and Gateway 10.0.10.1
WAN2 on ether2 with IP DHCP client with IP 10.10.11.2 and Gateway 10.10.11.1

Bridge with ether3 to ether24 with DHCP Server

As I understand it, the default route on the WAN interfaces must be deactivated. But then no connection is possible.

My current (not working) config:

{
/interface bridge
add admin-mac=CC:2D:XXXXXXXX auto-mac=no comment=defconf name=bridge protocol-mode=none
/interface list
add name=WAN
add name=LAN
/ip pool
add name=dhcp ranges=192.168.1.3-192.168.1.254
/ip dhcp-server
add address-pool=dhcp interface=bridge lease-time=2w10m name=dhcp1
/port
set 0 name=serial0
/routing bgp template
set default as=65530 disabled=no name=default output.network=bgp-networks
/routing table
add fib name=to_ISP1
add fib name=to_ISP2
/user group
set full policy=local,telnet,ssh,ftp,reboot,read,write,policy,test,winbox,password,web,sniff,sensitive,api,romon,dude,tikapp,rest-api
/caps-man manager
set ca-certificate="CAPsMAN CA" certificate=CAPsMAN
/caps-man manager interface
set [ find default=yes ] forbid=yes
add disabled=no interface=bridge
/interface bridge port
add bridge=bridge comment=defconf disabled=yes ingress-filtering=no interface=ether1
add bridge=bridge comment=defconf disabled=yes ingress-filtering=no interface=ether2
add bridge=bridge comment=defconf ingress-filtering=no interface=ether3
add bridge=bridge comment=defconf ingress-filtering=no interface=ether4
add bridge=bridge comment=defconf ingress-filtering=no interface=ether5
add bridge=bridge comment=defconf ingress-filtering=no interface=ether6
add bridge=bridge comment=defconf ingress-filtering=no interface=ether7
add bridge=bridge comment=defconf ingress-filtering=no interface=ether8
add bridge=bridge comment=defconf ingress-filtering=no interface=ether9
add bridge=bridge comment=defconf ingress-filtering=no interface=ether10
add bridge=bridge comment=defconf ingress-filtering=no interface=ether11
add bridge=bridge comment=defconf ingress-filtering=no interface=ether12
add bridge=bridge comment=defconf ingress-filtering=no interface=ether13
add bridge=bridge comment=defconf ingress-filtering=no interface=ether14
add bridge=bridge comment=defconf ingress-filtering=no interface=ether15
add bridge=bridge comment=defconf ingress-filtering=no interface=ether16
add bridge=bridge comment=defconf ingress-filtering=no interface=ether17
add bridge=bridge comment=defconf ingress-filtering=no interface=ether18
add bridge=bridge comment=defconf ingress-filtering=no interface=ether19
add bridge=bridge comment=defconf ingress-filtering=no interface=ether20
add bridge=bridge comment=defconf ingress-filtering=no interface=ether21
add bridge=bridge comment=defconf ingress-filtering=no interface=ether22
add bridge=bridge comment=defconf ingress-filtering=no interface=ether23
add bridge=bridge comment=defconf ingress-filtering=no interface=ether24
add bridge=bridge comment=defconf ingress-filtering=no interface=sfp-sfpplus1
add bridge=bridge comment=defconf ingress-filtering=no interface=sfp-sfpplus2
/ip neighbor discovery-settings
set discover-interface-list=!dynamic
/ip settings
set max-neighbor-entries=8192
/interface detect-internet
set detect-interface-list=WAN
/interface list member
add interface=ether1 list=WAN
add interface=bridge list=LAN
add interface=ether2 list=WAN
/ip address
add address=192.168.1.1/24 comment=defconf interface=bridge network=192.168.1.0
/ip dhcp-client
add add-default-route=no interface=ether1 use-peer-dns=no use-peer-ntp=no
add add-default-route=no interface=ether2 use-peer-dns=no use-peer-ntp=no
/ip dhcp-server lease
#here are some static IP addresses of LAN devices
/ip dhcp-server network
add address=192.168.1.0/24 dns-server=192.168.1.151,8.8.8.8,8.8.4.4 gateway=192.168.1.1 netmask=24 ntp-server=91.206.8.70
/ip dns
set allow-remote-requests=yes servers=8.8.8.8,8.8.4.4,1.1.1.1,1.0.0.1
/ip firewall filter
add action=accept chain=forward connection-nat-state=dstnat disabled=yes
add action=accept chain=forward dst-address=192.168.1.152 dst-port=8080 protocol=tcp
/ip firewall mangle
add action=mark-connection chain=output connection-mark=no-mark connection-state=new new-connection-mark=ISP1_conn out-interface=ether1
add action=mark-routing chain=output connection-mark=ISP1_conn new-routing-mark=to_ISP1 out-interface=ether1
add action=mark-connection chain=output connection-mark=no-mark connection-state=new new-connection-mark=ISP2_conn out-interface=ether2
add action=mark-routing chain=output connection-mark=ISP2_conn new-routing-mark=to_ISP2 out-interface=ether2
/ip firewall nat
add action=masquerade chain=srcnat out-interface-list=WAN
add action=dst-nat chain=dstnat dst-port=8080 in-interface=all-ethernet protocol=tcp to-addresses=192.168.1.152 to-ports=8080
add action=masquerade chain=srcnat out-interface=ether1
add action=masquerade chain=srcnat out-interface=ether2
/ip route
add comment=WANFailover dst-address=8.8.8.8 gateway=10.0.10.1 scope=10
add comment=WANFailover dst-address=8.8.4.4 gateway=10.10.11.1 scope=10
add comment=WANFailover dst-address=208.67.222.222 gateway=10.0.10.1 scope=10
add comment=WANFailover dst-address=208.67.220.220 gateway=10.10.11.1 scope=10
add check-gateway=ping comment=WANFailover distance=1 gateway=8.8.8.8 routing-table=to_ISP1 target-scope=11
add check-gateway=ping comment=WANFailover distance=2 gateway=8.8.4.4 routing-table=to_ISP1 target-scope=11
add check-gateway=ping comment=WANFailover distance=1 gateway=208.67.222.222 routing-table=to_ISP1 target-scope=11
add check-gateway=ping comment=WANFailover distance=2 gateway=208.67.220.220 routing-table=to_ISP1 target-scope=11
add check-gateway=ping comment=WANFailover distance=1 gateway=8.8.4.4 routing-table=to_ISP2 target-scope=11
add check-gateway=ping comment=WANFailover disabled=no distance=2 dst-address=0.0.0.0/0 gateway=8.8.8.8 pref-src="" routing-table=to_ISP2 scope=30 \
    suppress-hw-offload=no target-scope=11
add check-gateway=ping comment=WANFailover distance=1 gateway=208.67.220.220 routing-table=to_ISP2 target-scope=11
add check-gateway=ping comment=WANFailover distance=2 gateway=208.67.222.222 routing-table=to_ISP2 target-scope=11
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set ssh disabled=yes
set api disabled=yes
set api-ssl disabled=yes
/ipv6 nd
set [ find default=yes ] advertise-dns=no
/system clock
set time-zone-name=Europe/Vienna
/system identity
set name=CRS326
/system ntp client
set enabled=yes
/system ntp client servers
add address=94.199.173.123
add address=91.206.8.36
/system routerboard settings
set boot-os=router-os
}

Hi everyone, I also can’t get this config to work. In my opinion it lacks default routes (0.0.0.0/0) in the main table, or am I missing something else?
I’ve tested both B2ONX configs/scripts with no effect (nothing goes out from the router), but after adding:

/ip/route add distance=1 gateway=8.8.8.8 target-scope=11 check-gateway=ping
/ip/route add distance=2 gateway=8.8.4.4 target-scope=11 check-gateway=ping

(in the main routing table) I’m able to send some ICMP traffic from the router to a remote host (other than the Google/OpenDNS servers used in the config script). But when I simulate a WAN1 failure (ICMP firewall block on the WAN1 device), the route switches to WAN2 and seems to work, at least for a while. Then, out of nowhere, I get timeouts and packet loss (~20% on average) with “network unreachable” from the WAN1 device, while the WAN1 ICMP block is active and the ICMP packets should be travelling through WAN2.

Does someone have a working config for a simple “WAN Backup” on MikroTik RouterOS v7?

I appreciate ANY help with this, because I want to get rid of my old Linux box, where everything works like a charm: WAN backup with load balancing (not needed now) and a router-on-a-stick scenario (no problem there, as VLANs work on MikroTik too). Thanks in advance!

Good reference to multiwan here → http://forum.mikrotik.com/t/multiwan-with-routeros/163698/1

My own articles include multiwan and backup → https://forum.mikrotik.com/viewtopic.php?t=182373

Start at item: I