Advanced Routing Failover without Scripting

_A comment about PPP uplinks (like PPPoE): https://forum.mikrotik.com/viewtopic.php?p=814682#p814682_ - important in RouterOS before v7

Introduction

Suppose we have several WAN links and want to monitor whether the Internet is reachable through each of them. A failure can occur anywhere along the path.
If your VPN cannot connect, there is no problem: the default route with gateway=that-vpn-connection simply becomes inactive.
If your ADSL modem is down, check-gateway=ping takes care of it, and again there is no problem.
But what if the modem is up and the telephone line is down? Or one of your ISPs has an internal problem, so traceroute shows only a few hops and then stops…
Some people use the NetWatch tool to monitor remote hosts. Others use scripts that periodically ping remote hosts and then disable routes or change routing behaviour in some other way.
But RouterOS lets us do this kind of checking with /ip route alone: no scripting and no NetWatch at all!

Implementation

Basic Setup

Let’s suppose that we have two uplinks: GW1 and GW2. These can be addresses of ADSL modems (like 192.168.1.1 and 192.168.2.1) or PPP interfaces (like pppoe-out1 and pptp-out1). We also have policy routing rules, so all outgoing traffic is marked with either ISP1 (which goes to GW1) or ISP2 (which goes to GW2). We want to monitor Host1 via GW1 and Host2 via GW2; these may be popular Internet hosts, such as Google or Yahoo.
First, create routes to those hosts via the corresponding gateways:

/ip route
add dst-address=Host1 gateway=GW1 scope=11
add dst-address=Host2 gateway=GW2 scope=11

Now we create routes for the ISP1 routing mark (one for the main gateway and another for failover):

/ip route
add distance=1 gateway=Host1 target-scope=11 routing-mark=ISP1 check-gateway=ping
add distance=2 gateway=Host2 target-scope=11 routing-mark=ISP1 check-gateway=ping

Those routes are resolved recursively (see Manual:IP/Route#Nexthop_lookup) and are active only while HostN answers pings.
Then add the same routes for the ISP2 mark:

/ip route
add distance=1 gateway=Host2 target-scope=11 routing-mark=ISP2 check-gateway=ping
add distance=2 gateway=Host1 target-scope=11 routing-mark=ISP2 check-gateway=ping
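
For illustration, here is the same basic setup with concrete values filled in. The addresses are assumptions, not recommendations: 192.168.1.1 and 192.168.2.1 are the example modem addresses mentioned above for GW1 and GW2, while 8.8.8.8 and 1.1.1.1 merely stand in for Host1 and Host2. A sketch in RouterOS v6 syntax:

```
/ip route
# Check hosts: each one is routed through its own uplink (assumed addresses)
add dst-address=8.8.8.8/32 gateway=192.168.1.1 scope=11
add dst-address=1.1.1.1/32 gateway=192.168.2.1 scope=11
# ISP1 table: prefer uplink 1, fail over to uplink 2
add distance=1 gateway=8.8.8.8 target-scope=11 routing-mark=ISP1 check-gateway=ping
add distance=2 gateway=1.1.1.1 target-scope=11 routing-mark=ISP1 check-gateway=ping
# ISP2 table: prefer uplink 2, fail over to uplink 1
add distance=1 gateway=1.1.1.1 target-scope=11 routing-mark=ISP2 check-gateway=ping
add distance=2 gateway=8.8.8.8 target-scope=11 routing-mark=ISP2 check-gateway=ping
```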

Multiple host checking per Uplink

If Host1 or Host2 in the Basic Setup fails, the corresponding link is considered failed too. For redundancy, we may use several hosts per uplink: let’s monitor Host1A and Host1B via GW1, and Host2A and Host2B via GW2. We’ll also use a double recursive lookup, so that HostN is mentioned in fewer places.
As before, first we need routes to our check hosts:

/ip route
add dst-address=Host1A gateway=GW1 scope=11
add dst-address=Host1B gateway=GW1 scope=11
add dst-address=Host2A gateway=GW2 scope=11
add dst-address=Host2B gateway=GW2 scope=11

Then let’s create routes to “virtual” hops to use in further routes. I’m using 10.1.1.1 and 10.2.2.2 as examples:

/ip route
add dst-address=10.1.1.1 gateway=Host1A scope=12 target-scope=11 check-gateway=ping
add dst-address=10.1.1.1 gateway=Host1B scope=12 target-scope=11 check-gateway=ping
add dst-address=10.2.2.2 gateway=Host2A scope=12 target-scope=11 check-gateway=ping
add dst-address=10.2.2.2 gateway=Host2B scope=12 target-scope=11 check-gateway=ping

And now we may add default routes for clients:

/ip route
add distance=1 gateway=10.1.1.1 target-scope=12 routing-mark=ISP1
add distance=2 gateway=10.2.2.2 target-scope=12 routing-mark=ISP1
add distance=1 gateway=10.2.2.2 target-scope=12 routing-mark=ISP2
add distance=2 gateway=10.1.1.1 target-scope=12 routing-mark=ISP2

Workaround 1

In ROS versions at least up to 4.10 there’s a bug: if your Ethernet interface goes down (for example, your directly connected ADSL modem is powered off) and then comes back up, recursive routes are not recalculated (or something like that), and all traffic keeps going via the other uplink. As a workaround, add an extra route for each HostN. Once they are added, everything is recalculated correctly:

/ip route
add dst-address=Host1 type=blackhole distance=20
add dst-address=Host2 type=blackhole distance=20

Thanks to

  • Valens Riyadi: at the Poland MUM 2010 he casually mentioned that the ‘scope’ attribute can be used to check remote hosts for a failover implementation
  • Martín (Ibersystems): he asked for a solution, and I invented what you see above =)
  • Robert Urban (treborr): he ran into the problem described in Workaround 1, and together we solved it =)

As MikroTik decided to completely kill user-contributed Wiki and deleted all non-MikroTik staff accounts, I’m moving the article here to think what’s the best place for it and edit it some time later to add info about PPP connections (as recursive routing lookup doesn’t work with interface routes in RouterOS).

Good idea to have a place to speak freely about the wiki.
MikroTik is now building a new wiki/KB at: https://help.mikrotik.com/docs/
My problems with this solution are:

  • It does not work with an interface such as lte1 as the gateway; we must use an IP address as HostX. That is a reason to stop using LTE passthrough mode, because double NAT gives the main router the possibility to use recursive routing.
  • When users run a speed test, the ICMP next-hop check can (and I am sure it does) time out. That means speed tests from the LAN can break the whole recursive routing path for that ISP.

Can’t you just use gateway IP? I’m not familiar with LTE interfaces…


[marcin.przysowa@SXTR_LTE6] > ip address print detail where interface=lte1
Flags: X - disabled, I - invalid, D - dynamic 
 0 D address=37.109.59.226/32 network=37.109.59.226 interface=lte1 actual-interface=lte1

Nope, I checked all combinations. Route filters do not have any additional action that would help with this. The gateway on LTE is only an interface.
MikroTik support should add a note to the wiki page (currently not working) https://wiki.mikrotik.com/wiki/Advanced_Routing_Failover_without_Scripting that recursive routing works only via an IP address.

Just wondering whether a crutch like “add any fake address to that interface with network=100.69.69.69/32 and then use that 100.69.69.69 as gateway IP” can work for LTE…

That was the first thing I tried when I bought my own SXTR.

Earlier I tried to help with this case: “Load balancing with internal LTE modem (recursive resolution not working)”, but this is a limitation inside ROS.
All static approaches were checked; none work. When we use a dynamic address/route, we can modify the dynamic route via Route > Filters, but recursive routing still does not work on it.

Thanks for this. I also noticed the Workaround 1 bug when configuring my policy-based routing: when the second WAN went down, it would not reconnect afterwards, so I added the blackhole routes to counter it.

Q1. What is the plan when the dst-address does NOT equal a static fixed WANIP, but instead is a dynamic WANIP?

anav

You can use the dhcp-client settings; they include a script option that can do some additional work.
The key part can be done by a routing filter, which can change parameters of the dynamic route, like /routing filter … set-scope= .

Here you can find an example script (sorry for Russian, please use Google Translate, but generally there’s only a single variable in the script):
https://forum.mikrotik.by/viewtopic.php?t=323
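
A minimal sketch of the DHCP-client script idea, assuming RouterOS v6 and a DHCP uplink on ether1. The comment "ISP1-check" is a hypothetical tag used only to find the check-host route, and 192.168.1.1 is just an initial placeholder that the script overwrites. `bound` and `gateway-address` are variables RouterOS passes to DHCP-client scripts; this is not the script from the link above.

```
/ip route
# Check-host route; its gateway is rewritten on every lease change
add dst-address=Host1 gateway=192.168.1.1 scope=11 comment="ISP1-check"

/ip dhcp-client
# On a successful lease, point the tagged route at the gateway the ISP handed out
add interface=ether1 add-default-route=no disabled=no script=":if (\$bound = 1) do={ /ip route set [find comment=\"ISP1-check\"] gateway=\$\"gateway-address\" }"
```

The recursive default routes keep using gateway=Host1, so only this one route has to follow the dynamic gateway.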

@Chupaka, this is great. Would it be correct to assume that if I were to have 3 recursive failovers, it would look like this based on your code above:

/ip route
add dst-address=Host1 gateway=GW1 scope=10
add dst-address=Host2 gateway=GW2 scope=10
add dst-address=Host3 gateway=GW3 scope=10

/ip route
add distance=1 gateway=Host1 routing-mark=ISP1 check-gateway=ping
add distance=2 gateway=Host2 routing-mark=ISP1 check-gateway=ping
add distance=3 gateway=Host3 routing-mark=ISP1 check-gateway=ping

/ip route
add distance=1 gateway=Host2 routing-mark=ISP2 check-gateway=ping
add distance=2 gateway=Host3 routing-mark=ISP2 check-gateway=ping
add distance=3 gateway=Host1 routing-mark=ISP2 check-gateway=ping

/ip route
add distance=1 gateway=Host3 routing-mark=ISP3 check-gateway=ping
add distance=2 gateway=Host2 routing-mark=ISP3 check-gateway=ping
add distance=3 gateway=Host1 routing-mark=ISP3 check-gateway=ping

/ip route
add dst-address=Host1 type=blackhole distance=20
add dst-address=Host2 type=blackhole distance=20
add dst-address=Host3 type=blackhole distance=20

Is that correct? Plus, if I wanted to add load-balancing to this method, would I have to add other routes?

Looking forward to hearing your thoughts

Correct. Those routes are enough for LB setup, just mark routing on packets accordingly.
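
One possible way to mark packets for load balancing over three uplinks is per-connection-classifier (PCC) in mangle. This is only a sketch under assumptions: RouterOS v6 syntax, a LAN interface with the hypothetical name bridge-lan, and made-up connection mark names. It covers outbound marking only; NAT rules and marking of connections that arrive from the WAN side are left out.

```
/ip firewall mangle
# Split new LAN connections into three groups by source/destination address pair
add chain=prerouting in-interface=bridge-lan connection-mark=no-mark dst-address-type=!local per-connection-classifier=both-addresses:3/0 action=mark-connection new-connection-mark=ISP1_conn passthrough=yes
add chain=prerouting in-interface=bridge-lan connection-mark=no-mark dst-address-type=!local per-connection-classifier=both-addresses:3/1 action=mark-connection new-connection-mark=ISP2_conn passthrough=yes
add chain=prerouting in-interface=bridge-lan connection-mark=no-mark dst-address-type=!local per-connection-classifier=both-addresses:3/2 action=mark-connection new-connection-mark=ISP3_conn passthrough=yes
# Send each group to the routing table whose primary route uses that uplink
add chain=prerouting in-interface=bridge-lan connection-mark=ISP1_conn action=mark-routing new-routing-mark=ISP1 passthrough=no
add chain=prerouting in-interface=bridge-lan connection-mark=ISP2_conn action=mark-routing new-routing-mark=ISP2 passthrough=no
add chain=prerouting in-interface=bridge-lan connection-mark=ISP3_conn action=mark-routing new-routing-mark=ISP3 passthrough=no
```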

Thanks @Chupaka

If you use another MikroTik as your router (with the SXTR only as a modem in passthrough mode), it works fine. The WAN on the second MikroTik should get something like this:

2 D address=xxx.xxx.xxx.xxx/30 network=yyy.yyy.yyy.yyy interface=ether9 actual-interface=ether9

so you should be able to use the network address as a gateway.
It’s working fine in my configuration

WiruSSS

But I am writing about a different config: without passthrough, directly on the RB that has the lte1 interface and gets its IP from the ISP, you cannot use recursive routing on a dynamic interface. That is the case here.
Your config works, but it is a different network scenario.

Yes, yes, I know. I just wrote this as a workaround.

Thank you for moving this here.

I have a couple of questions, hope you guys could help me out.

1.- Do I have to routing mark the packets in Firewall Mangle so the…

/ip route
add distance=1 gateway=10.1.1.1 routing-mark=ISP1
add distance=2 gateway=10.2.2.2 routing-mark=ISP1
add distance=1 gateway=10.2.2.2 routing-mark=ISP2
add distance=2 gateway=10.1.1.1 routing-mark=ISP2

… routing-mark=ISP1 and routing-mark=ISP2 work? If so, how would I do that? Or are routing-mark=ISP1 and routing-mark=ISP2 just for having two routing tables?


2.- Is this still necessary with newer ROS versions? There is a mention of a bug in version 4.10, but someone said it helped the WAN reconnect, and I assume that was on a recent version.

"add dst-address=Host1 type=blackhole distance=20"

3.- What is “better”, this recursive routes failover or a script based one, and why?

Any help would be appreciated,
Thanks in advance!
Regards,
SN

Well, if you create routing tables (by setting routing-mark on routes), you need to send traffic to them: either by marking in Firewall Mangle or by IP → Route → Rules.
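
Both options can be sketched as follows, assuming RouterOS v6 syntax and two hypothetical LAN subnets, 192.168.10.0/24 and 192.168.20.0/24, that should use ISP1 and ISP2 respectively. You would normally pick one of the two methods per subnet; they are shown side by side only for comparison.

```
# Option 1: mark routing in Firewall Mangle (traffic to the router itself is skipped)
/ip firewall mangle
add chain=prerouting src-address=192.168.10.0/24 dst-address-type=!local action=mark-routing new-routing-mark=ISP1 passthrough=no

# Option 2: IP -> Route -> Rules, no mangle needed
/ip route rule
add src-address=192.168.20.0/24 action=lookup table=ISP2
```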

It won’t break anything, and it prevents traffic to Host1 from going via another route. I don’t remember the details of that bug, but I prefer to keep this rule in place :)

What is better: a car or a bike? I think it depends on your task :) Recursive routes are “automagic”, but there are limits: for example, you cannot use them with interface routes, including “gateway=1.2.3.4%ether1” in case you have the same gateway IP/subnet on two uplinks.

recursive routes are not recalculated (or something) and all traffic still goes via another uplink

About two months ago I built a lab for recursive routes and failover, and as far as I remember the recursive routes were recalculated… The version was 6.4x.y or so…