ECMP recursive routes

mrz · August 12, 2024, 7:15pm

Hashing is not used to select active routes; it is used to select the forwarding path of the packet.

Selection process of active routes is described here:
https://help.mikrotik.com/docs/display/ROS/IP+Routing#IPRouting-RouteSelection

FIB lookup process is described here:
https://help.mikrotik.com/docs/display/ROS/IP+Routing#IPRouting-Routingtablelookup
This is where hashing policies are applied for ECMP paths.

jaclaz · August 13, 2024, 11:51am

Let’s quote it:

Route Selection
There can be multiple routes with the same destination received from various routing protocols and from static configurations but only one (best) destination can be used for packet forwarding. To determine the best path, RIB runs a Route Selection algorithm that picks the best route from all candidate routes per destination.
Only routes that meet the following criteria can participate in the route selection process:
Route is not disabled.
If the type of route is unicast it must have at least one reachable next-hop. ( if a gateway is from a connected network and there is a connected route active, the gateway is considered as reachable)
Route should not be synthetic.
The candidate route with the lowest distance becomes an active route. If there is more than one candidate route with the same distance, the selection of the active route is arbitrary.

Since we are in the case of only one source of routes (static), and presumably they are all valid/eligible, it should boil down to only:

The candidate route with the lowest distance becomes an active route. If there is more than one candidate route with the same distance, the selection of the active route is arbitrary.

So, for routes with the same distance, in a nutshell, random.

anav · August 13, 2024, 12:12pm

mrz, you must be related to sindy, both of you put my brain into a fog.

Amm0 · August 13, 2024, 5:49pm

Going back to the OP’s original config… I’m not sure what ECMP to the same gateway is trying to do. If the goal is to use recursive routes with ECMP to create an “OR” on the canary addresses (e.g. keep the WAN active if EITHER of the canary addresses is up), the config is not going to do that… So some clarity on why ether1 has multiple routes would be good here. ECMP makes way more sense if there are different WANs…

Assuming CPE-like defaults, setting the same distance= on the routes (either statically in /ip/route, or via default-route-distance in dhcp-client/LTE/etc.) is what enables ECMP. Below is why that may not be enough in all cases…

And where this gets particularly confusing is the relationship between scope/target-scope and distance. That is why I generally use /tool/netwatch, with per-LAN routing tables, to disable a bad route, instead of recursive routes. While RR can work with ECMP, netwatch disabling a route is just clearer than all the RIB/FIB/“next-hop” stuff (and the icmp check in netwatch is richer, with things like latency as a metric, which is not possible with recursive routes). I use /routing/rule and an additional routing table called “ecmp” that does load balancing, so the “main” routing table just uses failover. This allows assigning some clients to load balancing, or directing them to a specific WAN, using /routing/rule. See http://forum.mikrotik.com/t/routing-rule-use-cases/163178/1

Theoretically yes, no mangle rules are needed for ECMP with different WANs, at least for outbound connections (like a LAN with HTTP traffic to two WANs). The reason is NAT, i.e. connection tracking will keep the packets flowing to the same WAN for a connection/“flow” after the initial ECMP decision.

Now… where this is NOT true is for “new”/untracked inbound connections: there, mangle rules are needed. The inbound traffic goes to a single, specific WAN IP/port (hosting a web/other “server”, VoIP, some multiplayer WAN games), and the replies need to go out the SAME WAN they came in on. But ECMP (and PCC too) may result in a different selection for the outbound path, which is not going to work.
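To make the inbound problem concrete, here is a minimal Python sketch (not RouterOS code). The `conn_marks` dictionary stands in for a connection mark, and the names `WANS`, `on_inbound` and `reply_wan` are made up for illustration; the SHA-256 hash is only a placeholder for whatever hash the router really uses.

```python
import hashlib

WANS = ["WAN1", "WAN2"]
conn_marks = {}  # connection id -> WAN it arrived on (stand-in for a connection mark)

def ecmp_choice(conn_id):
    """Unmarked traffic: hash-based pick, which may differ from the inbound WAN."""
    digest = hashlib.sha256(conn_id.encode()).digest()
    return WANS[digest[0] % len(WANS)]

def on_inbound(conn_id, in_wan):
    """Prerouting step: remember which WAN a new inbound connection used."""
    conn_marks.setdefault(conn_id, in_wan)

def reply_wan(conn_id):
    """Routing step: marked connections go back out the way they came in."""
    return conn_marks.get(conn_id, ecmp_choice(conn_id))

on_inbound("203.0.113.7:51000->wan2:443", "WAN2")
print(reply_wan("203.0.113.7:51000->wan2:443"))  # WAN2
```

Without the mark, the reply would fall through to `ecmp_choice()`, which has no reason to agree with the WAN the request arrived on.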

So to add to @mrz’s ECMP summer reading list: the PCC docs have a section on “policy routing” that applies equally to ECMP:

https://help.mikrotik.com/docs/display/ROS/Per+connection+classifier#Perconnectionclassifier-Policyrouting

And I’d recommend those rules with any usage of ECMP, since adding connection marks has an important benefit to the admin: /ip/firewall/connection will then show the selected WAN mark, so you can “see” the ECMP decision there by looking at the “connection mark” column in winbox/CLI/webfig.

One tip to test ECMP (or even PCC) is using a BitTorrent client (like Transmission on Mac, or whatever) to download the Ubuntu ISO image using the BT magnet links (https://ubuntu.com/download/alternative-downloads). Not saying it is the best testing methodology, but BT makes it pretty quick to see how load balancing is functioning, since it gets a pretty diverse set of IPs/ports to experiment with.
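Why a diverse set of peers helps can be shown with a rough Python simulation. This is only a sketch under assumptions: flows are reduced to a (remote IP, remote port) pair, SHA-256 stands in for the router's real hash, and `wan_index` and `split` are hypothetical helper names.

```python
import hashlib
import random

def wan_index(flow, n_wans=2):
    """Map a (remote IP, remote port) pair to a WAN index by hashing it."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_wans

def split(flows, n_wans=2):
    """Count how many flows land on each WAN."""
    counts = [0] * n_wans
    for flow in flows:
        counts[wan_index(flow, n_wans)] += 1
    return counts

rng = random.Random(1)  # fixed seed so the run is repeatable
torrent_like = [(rng.getrandbits(32), rng.randrange(1024, 65536)) for _ in range(10000)]
single_site = [(0x08080808, 443)] * 10000

print("many peers :", split(torrent_like))  # close to an even split
print("one server :", split(single_site))   # everything lands on one WAN
```

With thousands of distinct peers the counts come out close to even, while repeated traffic to a single remote endpoint always hashes to the same WAN, which is why a test against one site can look like load balancing is broken.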

Apparently there is some “T2Node” thing/concept which is not described.
Each ‘Dst’ requires one or more ‘T2Node’ objects as well.
Although I’m not sure adding a description of this T2Node would help folks understand any better. But it does seem like there is a missing sentence or two in that section to fill in a few blanks.

The docs could still be clearer on ECMP (and PBR) more generally - it is spread in a few places without links to cross-reference.

Like the main “Load Balancing” page at help.mikrotik.com has a nice table of options - but no links to how one might set those up:
https://help.mikrotik.com/docs/display/ROS/Load+Balancing

sindy · August 13, 2024, 6:30pm

@anav, even if @mrz was my twin brother and we joined forces, we would be unable to create a fraction of the fog you create on your own by using vague terms or upfront mixing them up:

A route state is set to “active” (and thus the routing is allowed to use that route) if

its gateway interface (the one through which the nexthop is reachable) is up,
check-gateway, if enabled, gives a positive response,
there is no route with same dst-address and routing-table but lower distance that also meets all the other conditions.

If multiple routes match the above conditions at the same time, which implies that all of them have the same distance, all of them are “active”, and ECMP is then used to choose a particular one among these active ones, not “to choose an active route”. Precise wording is critical for understanding.

From yet another perspective, ECMP comes into play while routing a particular packet; the route state is updated whenever one of the conditions mentioned above changes, regardless of whether any packet needs to be routed at the moment.
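The two separate steps described above can be sketched in Python. This is an illustration only: the `Route` class and its fields are made up, and SHA-256 is a stand-in for the router's actual multipath hash.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    gateway: str
    distance: int
    interface_up: bool = True
    check_ok: bool = True  # result of check-gateway, if enabled

def active_routes(routes):
    """Route state: eligible routes sharing the lowest distance are all active."""
    eligible = [r for r in routes if r.interface_up and r.check_ok]
    if not eligible:
        return []
    best = min(r.distance for r in eligible)
    return [r for r in eligible if r.distance == best]

def pick_for_packet(active, src, dst):
    """Per-packet ECMP step: hash the flow and index into the active set."""
    digest = hashlib.sha256(f"{src}>{dst}".encode()).digest()
    return active[int.from_bytes(digest[:4], "big") % len(active)]

routes = [Route("1.1.1.1", 1), Route("9.9.9.9", 1), Route("8.8.8.8", 2)]
active = active_routes(routes)
print([r.gateway for r in active])  # ['1.1.1.1', '9.9.9.9']
```

`active_routes()` only needs re-running when a condition changes, while `pick_for_packet()` runs for traffic being forwarded and never alters which routes are active.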

mrz · August 13, 2024, 6:38pm

And if we return to OP, if gateways are equal ROSv7 adds it only once.
So ECMP with gateway x1,x1,x2 will be added as x1,x2 in the FIB.

anav · August 13, 2024, 6:42pm

Ray of sunshine there Sindy… ECMP is then used to choose a particular one among these active ones, not “to choose an active route”. ( but how randomly?? )
Then mrz blocked the sun…

sindy · August 13, 2024, 7:04pm

Here, do you mean the configured gateway values of the routes (i.e. the “canary addresses” like 1.1.1.1) or the actual nexthops (the “physical” gateways resolved by the recursive next-hop search)?

Not that it would matter much in OP’s context, as there, the actual nexthops are the same, so depending on which variant is correct, it is either x1, x1, x1 (for configured gateways) or just x1 (for actual gateways), just for overall understanding.

mrz · August 13, 2024, 9:03pm

In the OP’s example they are not the same. What is intended is to install the forwarding path over ether1 twice and through ether2 once, so that ether1 is chosen twice as often as ether2.
This is not going to work in v7, because, like I said previously, equal gateways are installed in the FIB only once.
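A tiny Python sketch of the effect on the traffic split. The de-duplication rule is taken from the statement above; keeping first-seen order and splitting flows evenly across the installed nexthops are simplifying assumptions, and the function names are hypothetical.

```python
from fractions import Fraction

def fib_nexthops(gateways):
    """Drop repeated gateways, keeping first-seen order (assumed model of v7)."""
    return list(dict.fromkeys(gateways))

def shares(nexthops):
    """Assume each installed nexthop gets an equal share of flows."""
    return {gw: Fraction(1, len(nexthops)) for gw in nexthops}

configured = ["ether1", "ether1", "ether2"]
installed = fib_nexthops(configured)
print(installed)          # ['ether1', 'ether2']
print(shares(installed))  # ether1 and ether2 each get 1/2, not 2/3 and 1/3
```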

sindy · August 14, 2024, 7:38am

Sorry, a chain of misunderstandings on my side. In the OP, the two routes via the same actual gateway have the same distance (1), and the third one via another actual gateway has a higher distance (2). So the first two will be shown as ECMP ones, and as long as one of the canary addresses of these routes responds, the third one will not be used. Due to this, I understood the OP as asking whether it may stay like that, or whether changing the distances of the 2nd and 3rd routes (keeping the priority of the 3rd one the lowest but removing the ambiguity between the first two) would improve anything. If the actual question was whether all three can have the same distance, and hence ECMP would work atop all of them, then I did not get it, and the answer is that the outcome will be different: it would end up as a 1:1 distribution of connections between ether1 and ether2, rather than a preference for ether1 whenever the uplink on it is transparent all the way to the internet and using ether2 only otherwise.

S8T8 · August 19, 2024, 6:14pm

@Amm0, may I ask for an export or example based on what you described in the post, for easier understanding? Thanks

S8T8 · September 12, 2024, 3:01pm

It would be interesting to read what @mrz thinks about @sindy’s last post.

Amm0 · September 12, 2024, 6:33pm

Well, learn something new.

And @mrz’s example here is pretty clear:

And if we return to OP, if gateways are equal ROSv7 adds it only once.
So ECMP with gateway x1,x1,x2 will be added as x1,x2 in the FIB.

I never do the “weird splits” in practice (e.g. generally lte1 and lte2 in ECMP)… so I never ran into that in V7. I guess I thought it was the same as V6 WRT ECMP… (& pretty sure I’ve suggested elsewhere in the forum that using the same gateway multiple times would work)

This is more philosophical. I like RouterOS’s “table-based” config scheme (interface-list, address-list, etc.). And /routing/rule follows that IMO (more declarative and clear than mucking with the firewall to do routing). At a high level, I just create one table per WAN, plus another one called “ecmp” that includes all/some of the other WANs. The idea is that “main” is largely used by router services, and generally any LAN/VLAN/etc. (e.g. forwarded things) is always assigned to some routing table.

But I’ve been meaning to write-up some ECMP-based “multiwan” example for a while using two LTE-enabled routers, with 2 ethernet WAN, and VRRP on LAN/VLAN side, to show high availability on LAN/VLAN side. The config is easy, the explaining/writing takes more time.

S8T8 · September 13, 2024, 12:44am

I’ll go over the case from the first post one more time,
based on this example (it might sound silly, but it’s for learning purposes):

add comment=WAN1 dst-address=1.1.1.1 gateway=ether1 scope=10
add comment=WAN1 dst-address=9.9.9.9 gateway=ether1 scope=10
add comment=WAN2 dst-address=8.8.8.8 gateway=ether2 scope=10

add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.1.1.1 target-scope=11
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=9.9.9.9 target-scope=11
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=8.8.8.8 target-scope=11

distance should be set 1,2,3
having distance 1,1,2 won’t change anything, except that ECMP is used incorrectly because the gateway is still ether1

If this is right, let’s make it more complicated;

add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=ether1
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=ether1
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=ether2

I’ve read some documentation/posts (related to v6), that ECMP could “assign” more traffic on a WAN (like PCC) using dst-address=0.0.0.0/0 gateway=ether1,ether1,ether2,
this is not possible anymore with v7 due to “ECMP with gateway x1,x1,x2 will be added as x1,x2 in the FIB” ?

Thanks @Amm0, your guides are always very helpful. I understand the concept of the alternative routing table called “ECMP”, but I’m still confused about the mangle rules suggested in a previous post; that’s why I requested an export or example, but I can wait for a more in-depth explanation. Thank you for all the effort you’re investing in the community!

S8T8 · September 17, 2024, 3:03pm

I would be interested in reading a comment from the expert @mrz about @sindy’s last post.

Amm0 · September 18, 2024, 10:00pm

Basically, my suggestion is to never use ECMP in the "main" routing table, and you avoid a lot of unknowns here.

I created a simplified/partial example config below to show using a routing table for ECMP, and added some comments to fill in the gaps. Assume the rest of the config is some "home ap"-like default config. If you had more LAN VLANs/etc., those would be fine, and are not needed for the example here.

Also, I don't show recursive routes... but just make sure all the distance= values are different in "main" from your original config, with distance= in order of preference (i.e. main routing ONLY uses failover). In the routing tables shown below, the canary address for each WAN would be used as gateway= in the ecmp/lte1/wan10 routing tables instead.

# Config assumes all the "Home AP" defaults are present, but this config CANNOT be applied as-is. It is an example only...
# Config shows a "dual WAN" with DHCP WAN and LTE,
#   using routing tables, without PCC, with the default being failover from DHCP WAN to LTE,
#   but with an ecmp table to enable load balancing,
#   and routing rules can assign an IP or subnet to use the "ecmp" table.
# While unneeded here... uses VRRP to aid using two routers for multiwan (my more common use case).

# Assumes VLAN-filtered bridge with WAN on a bridged port assigned pvid=10...
/interface bridge set [find comment=defconf] name=bridge vlan-filtering=yes
/interface vrrp add interface=bridge name=vrrp1
/interface vlan add interface=bridge name=wan10 vlan-id=10

# ... WAN port needs to be a bridge port with pvid=10
/interface bridge port add interface=ether1 pvid=10 frame-types=admit-only-untagged-and-priority-tagged
/interface bridge vlan {}
# ... in 7.16+ RouterOS, bridge vlans should be dynamically configured, so nothing is needed
# ... but in V6 and early V7, vlan 10 would need to be added as a bridge vlan, and tagged=bridge would be needed too.

# Set up LTE as needed (here Verizon with CGNAT)
# ... note the APN's default-route-distance defaults to 2, so it is the "backup WAN" by default.
/interface lte apn add apn=vzwinternet ipv6-interface=bridge
/interface lte set [ find default-name=lte1 ] allow-roaming=no apn-profiles=vzwinternet sms-protocol=auto sms-read=no

# Reduce the default pool to allow .254 to be used for the router IP (since .1 goes to VRRP below)
/ip pool set [find name=dhcp] ranges=192.168.88.101-192.168.88.199

# Add a routing table for ECMP, and one for each WAN used
/routing table add disabled=no fib name=ecmp
/routing table add disabled=no fib name=lte1
/routing table add disabled=no fib name=wan10

# Set rp-filter to "loose" to limit [some] spoofing
/ip settings set rp-filter=loose
# ... in most recent RouterOS, you can use ipv4-multipath-hash-policy=l4 to further randomize ECMP (may need to remove)
/ip settings set ipv4-multipath-hash-policy=l4

# Ignoring IPv6...
/ipv6 settings set disable-ipv6=yes forward=no

# For firewall, ensure the VRRP interface(s) and WAN interfaces are in the right interface list
/interface list member add interface=wan10 list=WAN
/interface list member add interface=lte1 list=WAN
/interface list member add interface=vrrp1 list=LAN

# Renumber LAN IP to use .254 (so a 2nd router with another ISP could be .253 and act as backup router)
/ip address add address=192.168.88.254/24 comment=defconf interface=bridge network=192.168.88.0
# Set the VRRP address to the old gateway
/ip address add address=192.168.88.1 interface=vrrp1 network=192.168.88.0

# DHCP WAN - this is important - needs a script to set check-gateway=ping
# ... this is needed to disable the WAN route if the nexthop is down, which will trigger failover to LTE
/ip dhcp-client add comment=defconf interface=wan10 script="/ip/route set [find gateway=\$\"gateway-address\"] check-gateway=ping\n/ip/route set [find comment=\$interface routing-table=\$interface] gateway=\$\"gateway-address\"\n/ip/route set [find comment=\$interface routing-table=ecmp] gateway=\$\"gateway-address\""

# In most cases, you'll need to disable fasttrack
/ip firewall filter set [find action=fasttrack-connection chain=forward] disabled=yes

# Mark connections - not actually needed for ECMP - but allows /ip/firewall/connection to visually show which WAN is used per connection
/ip firewall mangle add action=mark-connection chain=prerouting connection-mark=no-mark in-interface=wan10 new-connection-mark=WAN10 passthrough=yes
/ip firewall mangle add action=mark-connection chain=prerouting connection-mark=no-mark in-interface=lte1 new-connection-mark=LTE1 passthrough=yes
/ip firewall mangle add action=mark-connection chain=postrouting connection-mark=no-mark new-connection-mark=WAN10 out-interface=wan10 passthrough=yes
/ip firewall mangle add action=mark-connection chain=postrouting connection-mark=no-mark new-connection-mark=LTE1 out-interface=lte1 passthrough=yes

# For NAT, add masquerade rules specific to each WAN
# ... and BEFORE the default masquerade WAN rule
# ... note: may not be needed in recent V7, but clearer anyway
/ip firewall nat add action=masquerade ipsec-policy=out,none chain=srcnat out-interface=wan10
/ip firewall nat add action=masquerade ipsec-policy=out,none chain=srcnat out-interface=lte1

# Add ECMP routes to the routing table
# ... 0.0.0.0 will get replaced by the DHCP client script; if static IP on WAN, then use the correct gateway
# ... note: the ECMP is created by both using distance=1 (and main has LTE with distance=2, so it is a backup there)
/ip route add comment=lte1 disabled=no distance=1 dst-address=0.0.0.0/0 gateway=lte1 routing-table=ecmp scope=30 suppress-hw-offload=no target-scope=10
/ip route add check-gateway=ping comment=wan10 disabled=no distance=1 dst-address=0.0.0.0/0 gateway=0.0.0.0 routing-table=ecmp scope=30 suppress-hw-offload=no target-scope=10
# ... note 2: any route in a routing table MUST also exist in the "main" routing table,
#     here, the dhcp-client and lte will automatically add a route to "main"...
#     but, if "wan10" is using a static IP, then a static route needs to be manually created with routing-table=main along with all the other rules here

# Add LTE to the "LTE-first" routing table
/ip route add comment=lte1 disabled=no distance=1 dst-address=0.0.0.0/0 gateway=lte1 routing-table=lte1 scope=30 suppress-hw-offload=no target-scope=10

# Add DHCP route to the "DHCP WAN 'first'" routing table
# ... which would be the same as using "main", but allows being explicit about which WAN to use in /routing/rule
/ip route add check-gateway=ping comment=wan10 disabled=no dst-address=0.0.0.0/0 gateway=0.0.0.0 routing-table=wan10 suppress-hw-offload=no

# Add /routing/rule to steer WAN traffic
# ... first rule says to use "main" for ANYTHING NOT GOING TO INTERNET - which is important
/routing rule add action=lookup comment="min-prefix=0, all except 0.0.0.0/0" disabled=no min-prefix=0 table=main

# Finally, you can assign a specific IP to use "ecmp", meaning load balancing
/routing rule add action=lookup disabled=no src-address=192.168.88.199/32 table=ecmp
# ... or if you had another device on LAN that needed to always use LTE
/routing rule add action=lookup disabled=no src-address=192.168.88.91/32 table=lte1
# ... or it could be an entire VLAN/subnet like 10.88.100.0/24
/routing rule add action=lookup disabled=no src-address=10.88.100.0/24 table=ecmp

I do have a few mangle rules... but those are just a "UI helper" so you can get a "quick look" at which connections are going out which WAN from /ip/firewall/connection, which is super handy. The firewall's connection tracking should "save" the ECMP decision, so mangle shouldn't be needed. There is more you can do than the above to aid "recovery time" after a failed WAN, but if that is a rare occurrence, adding more complexity may just cause a different kind of outage.

Lastly, one tip: using BitTorrent can be very handy to test/watch ECMP working in winbox/webfig's interface view, as it does produce a wide variety of IPs for load balancing, so things should be "equal" in terms of traffic over the two WANs. Specifically, Ubuntu has a Magnet link for Transmission/whatever client, so I often use that as a quick test of ECMP/load balancing. And by using a routing table + rules, you can set just your IP to use ECMP to test it safely.
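For anyone trying to picture how the routing rules in the example are evaluated, here is a rough Python model. It is a sketch, not RouterOS behavior verbatim: the table contents are made up, `min_prefix` is modeled like Linux's suppress_prefixlength (a match whose prefix length is at or below the value is ignored), and falling back to "main" when no rule decides is an assumption of the model.

```python
import ipaddress

# Hypothetical tables: prefix -> outgoing path
TABLES = {
    "main": {"0.0.0.0/0": "wan10", "192.168.88.0/24": "bridge"},
    "ecmp": {"0.0.0.0/0": "wan10+lte1"},
    "lte1": {"0.0.0.0/0": "lte1"},
}
# (source prefix or None, table, min_prefix or None), evaluated top to bottom
RULES = [
    (None, "main", 0),
    ("192.168.88.199/32", "ecmp", None),
    ("192.168.88.91/32", "lte1", None),
]

def longest_match(table, dst):
    """Most specific prefix in the table that contains dst, or None."""
    addr = ipaddress.ip_address(dst)
    hits = [n for n in map(ipaddress.ip_network, TABLES[table]) if addr in n]
    return max(hits, key=lambda n: n.prefixlen) if hits else None

def lookup(src, dst):
    for src_prefix, table, min_prefix in RULES:
        if src_prefix and ipaddress.ip_address(src) not in ipaddress.ip_network(src_prefix):
            continue
        hit = longest_match(table, dst)
        # min_prefix=0 ignores a match on the default route (prefix length 0)
        if hit is None or (min_prefix is not None and hit.prefixlen <= min_prefix):
            continue
        return table, TABLES[table][str(hit)]
    hit = longest_match("main", dst)  # assumed fallback when no rule decides
    return "main", TABLES["main"][str(hit)]

print(lookup("192.168.88.199", "192.168.88.10"))  # ('main', 'bridge')
print(lookup("192.168.88.199", "8.8.8.8"))        # ('ecmp', 'wan10+lte1')
print(lookup("192.168.88.50", "8.8.8.8"))         # ('main', 'wan10')
```

The first rule is what keeps LAN-to-LAN traffic in "main" even for a client that is assigned to "ecmp": only lookups that would land on the default route fall through to the later, per-source rules.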

S8T8 · September 22, 2024, 2:54pm

Impressive! MikroTik should value contributions from the experts more.
Read a couple of times, well explained and comprehensive as always @Amm0!
I did some tests. The mangle rules are very useful to see which WAN is being used; I am trying to complete the marks with chain input and output from/to the LAN (but not sure if this is correct). The first rule, which says to use “main” for ANYTHING NOT GOING TO INTERNET, doesn’t seem to be working in my case (using recursive routes, but probably my fault). In a few ECMP tests with my PC it isn’t working right (again, probably my fault): it’s always using WAN2 (BitTorrent not tested). I will spend more time on this.

Back to post #34, is ECMP effective with recursive routes and can recursive routes be used for load-balancing?

Amm0 · September 22, 2024, 4:09pm

Yes, as long as you want things split “equally” between the WANs. It’s ONLY when you want a split like 66% to WAN1 and 33% to WAN2 that you CANNOT do it, per @mrz. The recursive route stuff stays in main in my scheme and just uses failover (so the distances are different on the 0.0.0.0/0 routes). And in the “ecmp” routing table, you’d use the canary address going to a WAN as gateway=, so the two routes have different gateways (gateway=1.1.1.1 and gateway=8.8.8.8) but the same distance=, which is still ECMP.

If something isn’t working, the config with the current issue would be helpful.

Amm0 · September 22, 2024, 4:39pm

FWIW…

I don’t show recursive routes in the example for the same reason I don’t use PCC: the config gets complex. And complexity is another way to get outages, which is what you’re trying to avoid by having multiple WANs.

So I prefer a “netwatch script” that just disables routes for these cases over recursive routing. To use this approach, it’s just some additional /routing/rule entries to force netwatch’s canary (8.8.8.8 in the OP’s example) out a specific WAN routing table. Netwatch has way more sophisticated options to “monitor” the WAN (like using the “icmp” check, which also looks at the latency). The basic check-gateway=ping does its “3 pings” and that’s it, while with netwatch you could disable a route if something like a web site didn’t work, etc.

Also, while Mikrotik/everyone seems to use Google/etc DNS as the “host to check”, there is a side-effect since traffic to 8.8.8.8 does actually go out only ONE WAN. So if that WAN fails, all client requests to 8.8.8.8 will NOT failover to another route. Now most folks have multiple DNS servers so likely not fatal - just something to keep in mind.

Anyway, netwatch is something to consider, as it avoids getting into the weeds of RIB/FIB/nexthop: just an on-down/on-up script and a couple more routing rules.
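The on-down/on-up idea can be sketched in a few lines of Python. This is purely a model of the logic, not netwatch itself: the `Route` class, the `make_watcher` helper and the 200 ms latency threshold are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Route:
    comment: str
    disabled: bool = False

def make_watcher(route, max_latency_ms=200.0):
    """Return a probe handler that flips the route only on a status change."""
    state = {"up": True}

    def on_probe(latency_ms):
        # latency_ms is None when the canary did not answer at all
        up = latency_ms is not None and latency_ms <= max_latency_ms
        if up != state["up"]:
            state["up"] = up
            route.disabled = not up  # stand-in for the on-down / on-up script
        return state["up"]

    return on_probe

wan10 = Route("wan10")
probe = make_watcher(wan10)
for sample in (20.0, None, 450.0, 35.0):
    probe(sample)
    print(sample, "->", "disabled" if wan10.disabled else "enabled")
```

A lost probe and a slow probe both take the route out, and it only comes back once a probe is both answered and fast enough, which is the kind of latency-aware check that plain check-gateway=ping does not offer.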