Advanced Routing Failover without Scripting

http://forum.mikrotik.com/t/advanced-routing-failover-without-scripting/136599/67

PPP>Profile, create new.

Yes, I had read that post, but I still don’t understand how to “copy” the “PPP profile” used for my PPPoE connection. I can create a new PPP profile alright, but then what?

Cheers,
Toby.

And then you set remote-address to the host you want to check. On connection establishment, a route to the remote-address will be automagically added to the ‘main’ routing table.

Hi,

O_o I suspect I still don’t understand. So you are saying that, for instance, if I use 9.9.9.9 as the canary host, then this:

/ppp profile
add local-address=127.1.1.1 name=Primary_A remote-address=9.9.9.9

should produce a route automagically? Because on my system it doesn’t. How would the router be supposed to know what interface to use?

Cheers,
Toby.

It will use your PPPoE Interface. You should select your new profile in PPPoE Client properties.
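As a sketch of that step, assuming an existing PPPoE client named pppoe-out1 (a hypothetical name; substitute your own) and the Primary_A profile from above:

```routeros
# Point the existing PPPoE client at the new profile.
# "pppoe-out1" is an assumed interface name, not taken from this thread.
/interface pppoe-client
set [find name=pppoe-out1] profile=Primary_A
```

The session may need to reconnect before the profile's remote-address takes effect, since the route is added on connection establishment.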

P.S. Not sure if setting local-address won’t break anything.

Hi there,

Ah! That’s the information I am looking for. So you are saying I should tweak the existing PPPoE config to use the new profile (so as to override the automagically-assigned remote address)?

So you’re saying just to leave the local-address setting empty?

Cheers,
Toby.

Hi again,

Alright, so this seems to work:

/ppp profile
add name=Primary_A remote-address=9.9.9.9

/interface pppoe-client
add comment="Uplink" disabled=no interface=vlan-up-fiber keepalive-timeout=disabled name=pppoe-uplink password=*removed* profile=Primary_A user=*removed*

However, the thread above suggests that multiple canary hosts should be able to be configured as well with this, but how? The posts above always seem to leave out the interesting bits of information somehow.

Cheers,
Toby.

Aaand hello again,

I have been able to piece the puzzle together.

So if I understand correctly (and at the very least, things seem to work on my router with this setup), this is what needs to be done to be able to have multiple canary hosts AND PPPoE. I’ll use 1.1.1.1 and 9.9.9.9 as canaries here, as above. The general routing is as in my MWE above, but we’ll assume that the primary uplink is done via a PPPoE link. What’s different then is that you need a fake static remote address that is stable across reconnects of the PPPoE connection; we’ll use 127.1.1.1 in this example. This can be done with a PPP profile:

/ppp profile
add name=pppoe-static-profile remote-address=127.1.1.1

And we need to tell our PPPoE client to use this profile as well:

/interface pppoe-client
add comment="Primary uplink" disabled=no interface=uplink keepalive-timeout=disabled name=pppoe-uplink password=*redacted* profile=pppoe-static-profile user=*redacted*

This then means that the IP address 127.1.1.1 is available all the time as a gateway address, so we can just use that (instead of the 192.168.1.254 we used in the MWE above):

add comment="Primary route" distance=1 dst-address=1.1.1.1/32 gateway=127.1.1.1 scope=10

Neat.
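A quick way to check that the profile took effect might look like this. It is only a sketch, assuming the interface is called pppoe-uplink as in the example above:

```routeros
# The "network" column for the PPPoE interface should show the
# fake remote address (127.1.1.1) instead of the ISP-assigned one.
/ip address print where interface=pppoe-uplink

# The direct route to the canary should be listed as active
# while the PPPoE session is up.
/ip route print where dst-address=1.1.1.1/32
```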

Thanks for your help!

Cheers,
Toby.

Just for the record again, here’s what I believe to be my complete MWE with two uplinks (primary using PPPoE via ether1, secondary using DHCP via ether2), two canaries (1.1.1.1 and 9.9.9.9), and high availability for the canary hosts for clients attached via bridge0, including a rudimentary update script for the secondary uplink. The direct route for the secondary uplink is initially set to something bogus but will be overwritten once the DHCP lease on the secondary uplink is acquired:

# Static routes
/ip route
add gateway=1.1.1.1 distance=1 check-gateway=ping routing-mark=HA comment="Primary virtual route A (HA)"
add gateway=9.9.9.9 distance=1 check-gateway=ping routing-mark=HA comment="Primary virtual route B (HA)"
add gateway=127.1.1.2 distance=2 routing-mark=HA comment="Secondary virtual route (HA)"

add gateway=1.1.1.1 distance=1 check-gateway=ping comment="Primary virtual route A"
add gateway=9.9.9.9 distance=1 check-gateway=ping comment="Primary virtual route B"
add gateway=127.1.1.2 distance=2 comment="Secondary virtual route"

add dst-address=1.1.1.1/32 gateway=127.1.1.1 scope=10 comment="Primary route A"
add dst-address=9.9.9.9/32 gateway=127.1.1.1 scope=10 comment="Primary route B"
add dst-address=127.1.1.2/32 gateway=127.1.1.2 scope=10 comment="Secondary route"

# Clients attached via bridge0 get to use the HA routing table
/ip route rule
add interface=bridge0 table=HA

# Primary uplink via PPPoE (the profile must exist before the client references it)
/ppp profile
add name=pppoe-static-profile remote-address=127.1.1.1
/interface pppoe-client
add comment="Primary uplink" disabled=no interface=ether1 keepalive-timeout=disabled name=pppoe-primary password=*password* profile=pppoe-static-profile user=*user*

# Secondary uplink via DHCP
/ip dhcp-client
add add-default-route=no disabled=no interface=ether2 script=\
    "# Update secondary route\
    \n:if (\$bound=1) do={\
    \n  /ip route set [/ip route find where gateway!=\$\"gateway-address\" and comment~\"Secondary route\"] gateway=\$\"gateway-address\"\
    \n}" use-peer-dns=no use-peer-ntp=no

So I did something like this with multiple hosts:

PPP Profile 1 for ISP1: Remote Address: 127.0.0.2
PPP Profile 2 for ISP2: Remote Address: 127.0.0.3
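In CLI form, the two profiles above would presumably look something like the following. The profile names and the client names pppoe-isp1 and pppoe-isp2 are hypothetical; only the remote addresses come from the post:

```routeros
# One profile per ISP, each with its own stable fake remote address.
/ppp profile
add name=isp1-profile remote-address=127.0.0.2
add name=isp2-profile remote-address=127.0.0.3

# Attach each profile to its PPPoE client (assumed client names).
/interface pppoe-client
set [find name=pppoe-isp1] profile=isp1-profile
set [find name=pppoe-isp2] profile=isp2-profile
```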

Finally:

add comment="Route for reaching ISP1's Recursive 1" distance=1 dst-address=8.8.8.8/32 gateway=127.0.0.2 scope=10
add comment="Route for reaching ISP1's Recursive 2" distance=1 dst-address=1.0.0.1/32 gateway=127.0.0.2 scope=10

add comment="Route for reaching ISP2's Recursive 1" distance=1 dst-address=1.1.1.1/32 gateway=127.0.0.3 scope=10
add comment="Route for reaching ISP2's Recursive 2" distance=1 dst-address=8.8.4.4/32 gateway=127.0.0.3 scope=10

add check-gateway=ping comment="Default Route for ISP1 (Recursive 1)" distance=1 gateway=8.8.8.8
add check-gateway=ping comment="Default Route for ISP1 (Recursive 2)" distance=2 gateway=1.0.0.1
add check-gateway=ping comment="Default Route for ISP2 (Recursive 1)" distance=3 gateway=1.1.1.1
add check-gateway=ping comment="Default Route for ISP2 (Recursive 2)" distance=4 gateway=8.8.4.4

The objective is to keep it simple to achieve efficiency/simplicity and reduce chances of failure/issues later on if you decided to add more complicated policy routing/load-balancing, whatever.

Hi,

Well, that’s exactly what I have done, isn’t it? Except in my MWE ISP2 does not use PPPoE and thus has a DHCP startup script.

The objective is to keep it simple to achieve efficiency/simplicity and reduce chances of failure/issues later on if you decided to add more complicated policy routing/load-balancing, whatever.

Obviously. But again, what you are doing is not equivalent to what I am doing, is it? I totally agree that this is all a delicate trade-off. In your solution, if ISP1 fails, attached clients have no way to reach 8.8.8.8 or 1.0.0.1. This is avoided in my solution at the price of slightly more complex routing. Obviously, whether the trade-off is worth it is something that everybody has to decide individually (and, in fact, I am not certain myself, but I wanted to get it to work).

Arguably, your code is more complex than really necessary (provided that ISP 2 is just a backup link anyway and not meant to do load-balancing, as I have assumed and explicitly stated in my example). In that case, I don’t see any benefit to be gained from actively checking whether the ISP 2 uplink is actually up and running because there is nothing you can do about it anyway, so you could just as well just leave out all the checking via 8.8.4.4 and 1.1.1.1. That would simplify your setup as well.
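To illustrate the simplification meant here, a sketch of the routes with the ISP2 checks dropped could look like this. It assumes ISP2 is a pure backup link and reuses the addresses from the config quoted above; the plain default route via 127.0.0.3 is an assumption about how the unchecked backup would be expressed:

```routeros
/ip route
# Direct routes to the ISP1 canaries (unchanged)
add comment="Route for reaching ISP1's Recursive 1" distance=1 dst-address=8.8.8.8/32 gateway=127.0.0.2 scope=10
add comment="Route for reaching ISP1's Recursive 2" distance=1 dst-address=1.0.0.1/32 gateway=127.0.0.2 scope=10

# Checked default routes via ISP1 (unchanged)
add check-gateway=ping comment="Default Route for ISP1 (Recursive 1)" distance=1 gateway=8.8.8.8
add check-gateway=ping comment="Default Route for ISP1 (Recursive 2)" distance=2 gateway=1.0.0.1

# Single unchecked backup default via ISP2, replacing the two checked ones
add comment="Default Route for ISP2 (backup, unchecked)" distance=3 gateway=127.0.0.3
```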

(As a side note, my MWE is somewhat longer because I actually included all relevant configuration aspects, not just parts. These missing but crucial config lines cost me the better part of two days to figure out. I think it is much better to include everything that is necessary to actually reproduce something and, also, state assumptions explicitly.)

Cheers,
Toby.

Wait, what? If ISP1 fails, all clients can still reach 8.8.8.8 & 1.0.0.1 via ISP2. The routing table will automatically drop the dead ISP1 routes, including the custom routes for 8.8.8.8 & 1.0.0.1, which are routed via ISP1’s interface. Routing then falls back to the backup recursive routes for ISP2 that are still in the “main” routing table, so traffic is redirected through ISP2, including traffic destined for 8.8.8.8 & 1.0.0.1. Don’t believe me? Use my config, ping the IP with the -n flag, kill ISP1, and see how long it takes RouterOS to re-route to ISP2; it would be well below 1ms.


You clearly asked for “recursive routing failover without load balancing”, and that is what I based my solution on. What makes you think I don’t load balance? I have a fairly complex policy routing setup for load balancing with a combination of Nth and PCC. It accounts for HTTPS traffic over TCP and QUIC so that sites which break when they see multiple source IPs (banking sites, PayPal, etc.) keep working, while all multi-threaded and SCTP traffic can still aggregate bandwidth across both ISPs simultaneously.

Hi there,

I cannot test this, to be honest, because I have no way to make my ISP drop traffic at their router, but does the pppoe/ppp interface behave differently from a plain ethernet interface in this regard? I mean, obviously, if the pppoe connection fails, then I would expect the route to disappear, but what I was talking about was what happens if the direct connection to ISP1 (i.e., the pppoe connection) is still valid (and the direct gateway of ISP1 reachable), but then packets are dropped somewhere farther upstream at ISP1. Then, of course, the canary hosts will not be pingable, so the corresponding recursive routes will be disabled. But will the direct routes also disappear? For a regular ether interface, those routes will stay and effectively blackhole the canary hosts. This is discussed in http://forum.mikrotik.com/t/advanced-routing-failover-without-scripting/136599/93 and the following posts as well.
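One way to approximate such an upstream failure locally might be to drop the check pings at the router itself. This is only a sketch: it assumes the check-gateway pings pass through the firewall output chain, that the interface is called pppoe-primary as in the MWE above, and that no earlier filter rule accepts the traffic first:

```routeros
# Temporarily drop ICMP to the canaries on the primary uplink only.
/ip firewall filter
add chain=output action=drop protocol=icmp out-interface=pppoe-primary dst-address=1.1.1.1 comment="TEST: simulate upstream failure"
add chain=output action=drop protocol=icmp out-interface=pppoe-primary dst-address=9.9.9.9 comment="TEST: simulate upstream failure"

# While the rules are in place, see whether the direct /32 routes stay active
# even though the checked default routes have gone inactive.
/ip route print where dst-address=1.1.1.1/32
/ip route print where dst-address=0.0.0.0/0

# Clean up afterwards.
/ip firewall filter remove [find comment~"TEST: simulate"]
```

If the /32 route is still active while the PPPoE session is up, the canary would remain pinned to the primary uplink, which is the blackholing case described above.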

Well, no. I disagree. Without load balancing, it makes no sense to monitor the backup ISP uplink, so your solution can be simplified because the monitoring of the backup uplink can be dropped for all practical intents and purposes. I never said that you don’t use failover in your setup, so your code might be totally appropriate (and minimal) for your use case, but for my use case, I still think it is simplifiable.

Cheers,
Toby.

The whole point of @Chupaka making this thread/guide was to bypass the ISP gateway completely. My setup and his setup rely on external hosts, which, if unreachable, mean a dead ISP gateway.

Whatever man, enjoy your setup. Thanks to @Chupaka for this guide and thanks for your suggestion of using a null-address on the “remote address” field on the PPP profile to enable N number of hosts for the recursive routing failover.

Hi,

What? No, of course the ISP gateway cannot be bypassed. Originally, this thread was about having a check for routing failures that are further upstream than the direct link without having to resort to scripting. But this method of checking also means that unless you do some routing sub-table voodoo, the canary hosts used for checking are unreachable if the corresponding uplink fails. That’s all I’m saying (and, again, that’s discussed and mitigated in the posts I have linked to).

Whatever man, enjoy your setup. Thanks to @Chupaka for this guide and thanks for your suggestion of using a null-address on the “remote address” field on the PPP profile to enable N number of hosts for the recursive routing failover.

Indeed, thanks to all the contributors to this discussion! :)

Cheers,
Toby.

Hello, I have 2 WANs with static private IPs (192.168.1.100 on WAN 1 and 192.168.0.100 on WAN 2).

I am using the routing rules from the first post, but I have a strange problem: the Cloud DDNS does not update when these rules are applied. The balancing and failover seem to work fine. I cannot figure out why my router can’t reach MikroTik’s cloud and update my public address.

Do I need to configure anything else?

These are my routes:

/ip route
add check-gateway=ping distance=1 gateway=8.8.8.8 routing-mark=to_ISP1
add check-gateway=ping distance=2 gateway=8.8.4.4 routing-mark=to_ISP1
add check-gateway=ping distance=1 gateway=8.8.4.4 routing-mark=to_ISP2
add check-gateway=ping distance=2 gateway=8.8.8.8 routing-mark=to_ISP2
add distance=1 dst-address=8.8.4.4/32 gateway=192.168.0.1 scope=10
add distance=20 dst-address=8.8.4.4/32 type=blackhole
add distance=1 dst-address=8.8.8.8/32 gateway=192.168.1.1 scope=10
add distance=20 dst-address=8.8.8.8/32 type=blackhole

and these are my mangle rules

/ip firewall mangle
add action=accept chain=prerouting dst-address=192.168.1.0/24 in-interface=\
    bridge
add action=accept chain=prerouting dst-address=192.168.0.0/24 in-interface=\
    bridge
add action=mark-connection chain=prerouting connection-mark=no-mark \
    in-interface=ether1 new-connection-mark=ISP1_conn passthrough=yes
add action=mark-connection chain=prerouting connection-mark=no-mark \
    in-interface=ether2 new-connection-mark=ISP2_conn passthrough=yes
add action=mark-connection chain=prerouting connection-mark=no-mark \
    dst-address-type=!local in-interface=bridge new-connection-mark=ISP1_conn \
    passthrough=yes per-connection-classifier=both-addresses:2/0
add action=mark-connection chain=prerouting connection-mark=no-mark \
    dst-address-type=!local in-interface=bridge new-connection-mark=ISP2_conn \
    passthrough=yes per-connection-classifier=both-addresses:2/1
add action=mark-routing chain=prerouting connection-mark=ISP1_conn \
    in-interface=bridge new-routing-mark=to_ISP1 passthrough=yes
add action=mark-routing chain=prerouting connection-mark=ISP2_conn \
    in-interface=bridge new-routing-mark=to_ISP2 passthrough=yes
add action=mark-routing chain=output connection-mark=ISP1_conn \
    new-routing-mark=to_ISP1 passthrough=yes
add action=mark-routing chain=output connection-mark=ISP2_conn \
    new-routing-mark=to_ISP2 passthrough=yes

Do I have to set anything else for proper load balancing with failover? The WAN connections are 2 ADSL lines (I don’t have the option of a PPPoE setup due to an ISP limitation at their modem/router).

Is there anything wrong with the above setup?

Also, how can I figure out the weight ratio of the above setup? How are connections distributed between the 2 WANs? How can I “push” more connections to WAN 1, for example?

You’re in a double NAT situation. Ask the ISP to bridge the CPE. Then establish PPPoE at the router level. That’s the right way to do it.

Double NAT will create all sorts of weird issues for obvious reasons.
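Separately from double NAT, one thing that may be worth checking in the routes listed above: if that listing is complete, every default route carries a routing-mark, so the “main” table has no default route at all. The output-chain mangle rules only mark traffic that already has a connection mark, so connections the router starts itself (such as the Cloud DDNS update) would fall through to “main” and find no route. A hedged sketch of unmarked defaults that reuse the same recursive gateways:

```routeros
/ip route
# Default routes for the router's own traffic (no routing-mark = "main" table).
# These rely on the existing /32 routes to 8.8.8.8 and 8.8.4.4 with scope=10.
add check-gateway=ping distance=1 gateway=8.8.8.8 comment="Main default via WAN 1"
add check-gateway=ping distance=2 gateway=8.8.4.4 comment="Main default via WAN 2"
```

This is an assumption based only on the posted listing, not something confirmed in the thread.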

As far as I, among other folks in this topic, have checked, recursive routes for failover do not work together with PCC load balancing if two or more routing marks are in use.

I use PCC+Nth mangle load balancing + recursive routes just fine. I’ve shared the config in previous posts.

I know I am in a double-NAT situation. A PPPoE connection is not possible from the ISP side. Also, the CPE handles the VoIP telephony, so my only option is to connect with just an IP.

Is there any way to manage or handle the double-NAT problems?

By the way, how does the above config distribute the connections? Is it round-robin load balancing? Is there any way to create “weight” ratios between these 2 WANs?

Send an email to your ISP’s ASN’s NOC team (that’s their L3-and-above team) and ask them to enable bridge mode for you; they will do it. The VoIP and VLAN tagging (if any) can easily be replicated on RouterOS once your ISP releases any L2.5 MAC binding and creates fresh ones for RouterOS.

Never work with double NAT shit, chances are you’re in a triple NAT, one CGNAT at ISP level, one NAT at CPE, one NAT at RouterOS level.

MikroTik’s load balancing is complex and flexible. You can use PCC to weight distribute or even Nth. Many possible combos. I use PCC for 80/443 traffic and Nth for all other ports. The end result of my method? You will get aggregated bandwidth from both ISPs for multi-threaded traffic or SCTP traffic (which works in RouterOS by default).
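As a sketch of PCC weighting for the config posted above: the two existing PCC rules (both-addresses:2/0 and 2/1) hash each source/destination address pair into one of two buckets, so the split is roughly even per address pair rather than round robin. Raising the denominator to 3 and giving two of the three buckets to WAN 1 should push roughly two thirds of address pairs there. The rules below are meant to replace the two existing PCC rules and reuse their mark names:

```routeros
/ip firewall mangle
# Buckets 0 and 1 of 3 go to WAN 1 ...
add action=mark-connection chain=prerouting connection-mark=no-mark \
    dst-address-type=!local in-interface=bridge new-connection-mark=ISP1_conn \
    passthrough=yes per-connection-classifier=both-addresses:3/0
add action=mark-connection chain=prerouting connection-mark=no-mark \
    dst-address-type=!local in-interface=bridge new-connection-mark=ISP1_conn \
    passthrough=yes per-connection-classifier=both-addresses:3/1
# ... and bucket 2 of 3 goes to WAN 2.
add action=mark-connection chain=prerouting connection-mark=no-mark \
    dst-address-type=!local in-interface=bridge new-connection-mark=ISP2_conn \
    passthrough=yes per-connection-classifier=both-addresses:3/2
```

The ratio applies to hashed address pairs, not to bandwidth, so the actual traffic split depends on what the clients are doing.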

I have looked at Cisco, Juniper, VyOS, pfSense etc. No vendor apart from MikroTik offers L3/L4 load balancing combo features.