Mangle rules with unexpected behavior

Hello people!!!

On an RB1100AHx4 (RouterOS and firmware v6.47.1) with 2 WAN interfaces, I set up some mangle marks to accomplish the following:

  • Whatever comes in through WAN1 goes out through WAN1, and the same for WAN2
  • Some arbitrary local IPs go out through WAN1, and others through WAN2
  • There are a few public dst-addresses which should use a specific WAN

For testing purposes I added, in the 6th position, a rule to use WAN1 for connections from 192.168.1.54.
In “IP → Firewall → Connections”, I filter on src-address=192.168.1.54 and every connection carries the ISP2_con connection mark.
I have tried disabling the “Passthrough” option on the 6th rule with the same result, so I think the cause is something before this rule.
I have tried matching only new packets in the 4th and 5th rules, with the same result.

Here is the code:

/ip firewall mangle
add action=accept chain=prerouting dst-address=181.30.25.184/29 src-address=192.168.1.0/24
add action=accept chain=prerouting dst-address=186.23.255.112/29 src-address=192.168.1.0/24
add action=accept chain=prerouting dst-address=172.16.0.0/23 comment="No Mark dst VPN"
add action=mark-connection chain=prerouting connection-mark=no-mark in-interface=ether1 new-connection-mark=ISP1_con passthrough=yes
add action=mark-connection chain=prerouting connection-mark=no-mark in-interface=ether2 new-connection-mark=ISP2_con passthrough=yes
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address=!192.168.1.0/24 new-connection-mark=ISP1_con passthrough=yes src-address=192.168.1.54
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address=!192.168.1.0/24 new-connection-mark=ISP1_con passthrough=yes src-address=192.168.1.4
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address-list=ISP1publics new-connection-mark=ISP1_con passthrough=yes src-address=192.168.1.0/24
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address-list=ISP2publics new-connection-mark=ISP2_con passthrough=yes src-address=192.168.1.0/24
add action=mark-connection chain=prerouting dst-address=!192.168.1.0/24 new-connection-mark=ISP1_con packet-mark=no-mark passthrough=yes per-connection-classifier=src-address:2/0 src-address=192.168.1.0/24
add action=mark-connection chain=prerouting dst-address=!192.168.1.0/24 new-connection-mark=ISP2_con packet-mark=no-mark passthrough=yes per-connection-classifier=src-address:2/1 src-address=192.168.1.0/24
add action=mark-routing chain=output connection-mark=ISP1_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP1 passthrough=no
add action=mark-routing chain=output connection-mark=ISP2_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP2 passthrough=no
add action=mark-routing chain=prerouting connection-mark=ISP1_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP1 passthrough=no src-address=192.168.1.0/24
add action=mark-routing chain=prerouting connection-mark=ISP2_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP2 passthrough=no src-address=192.168.1.0/24

I just removed all the Spanish comments and replaced the ISPs' names with ISP1 and ISP2.

When I ping a public IP from 192.168.1.54, I see that the 6th rule increments its packet counter by 1.
I can't see my mistake.
Any ideas?
Thanks in advance.

Regards,
Damián

  1. does the 14th rule (action=mark-routing chain=prerouting connection-mark=ISP1_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP1 passthrough=no src-address=192.168.1.0/24) count as well?
  2. is the action=fasttrack-connection rule in chain=forward of /ip firewall filter active?

I also cannot see any mistake in the mangle rules (I use a slightly more efficient approach, but that can wait; the first step is to find out what is wrong), but there may be a mistake elsewhere. So, as usual, post the complete configuration export except for sensitive information, not just the part you assume to be relevant.

Hello Sindy, thanks a lot

  1. does the 14th rule (action=mark-routing chain=prerouting connection-mark=ISP1_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP1 passthrough=no src-address=192.168.1.0/24) count as well?

Yes, but it counts a lot of packets because the MikroTik is in production, so I cannot tell whether any of those packets belong to my ping.

  2. is the action=fasttrack-connection rule in chain=forward of /ip firewall filter active?

There is no fasttrack-connection rule.

I attach all the settings below:

# sep/21/2020 14:49:52 by RouterOS 6.47.1
# model = RB1100Dx4

/interface bridge
add name=BridgeLAN
/interface ethernet
set [ find default-name=ether1 ] name=ether1 speed=100Mbps
set [ find default-name=ether2 ] name=ether2 speed=100Mbps
set [ find default-name=ether3 ] speed=100Mbps
set [ find default-name=ether4 ] speed=100Mbps
set [ find default-name=ether5 ] speed=100Mbps
set [ find default-name=ether6 ] speed=100Mbps
set [ find default-name=ether7 ] speed=100Mbps
set [ find default-name=ether8 ] speed=100Mbps
set [ find default-name=ether9 ] speed=100Mbps
set [ find default-name=ether10 ] speed=100Mbps
set [ find default-name=ether11 ] speed=100Mbps
set [ find default-name=ether12 ] speed=100Mbps
set [ find default-name=ether13 ] name=ether13-Management speed=100Mbps
/interface list
add name=WAN
add name=LAN
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
#error exporting /ip dhcp-client option
/ip pool
add name=PoolManagement ranges=192.168.88.200-192.168.88.254
add name=dhcp_l2tp ranges=172.16.0.100-172.16.0.150
add name=dhcp_sstp ranges=172.16.1.100-172.16.1.150
/ip dhcp-server
add address-pool=PoolManagement disabled=no interface=ether13-Management lease-time=1h name=DHCPmanagement
/ppp profile
add dns-server=8.8.8.8,8.8.4.4 local-address=172.16.0.1 name=L2TP remote-address=dhcp_l2tp
add dns-server=8.8.8.8,8.8.4.4 local-address=172.16.1.1 name=SSTP remote-address=dhcp_sstp use-encryption=required
/queue type
add kind=pcq name=PCQ_Descarga_5M pcq-classifier=dst-address pcq-dst-address6-mask=64 pcq-rate=5M pcq-src-address6-mask=64
add kind=pcq name=PCQ_Subida_5M pcq-classifier=src-address pcq-rate=5M
/queue simple
add name="Limitacion 5Mbps x IP" queue=PCQ_Subida_5M/PCQ_Descarga_5M target=192.168.1.0/24
/system logging action
set 0 memory-lines=10000
/user group
set full policy="local,telnet,ssh,ftp,reboot,read,write,policy,test,winbox,password,web,sniff,sensitive,api,romon,dude,tikapp"
/interface bridge port
add bridge=BridgeLAN interface=ether3
add bridge=BridgeLAN interface=ether4
add bridge=BridgeLAN interface=ether5
add bridge=BridgeLAN interface=ether6
add bridge=BridgeLAN interface=ether7
add bridge=BridgeLAN interface=ether8
add bridge=BridgeLAN interface=ether9
add bridge=BridgeLAN interface=ether10
add bridge=BridgeLAN interface=ether11
add bridge=BridgeLAN interface=ether12
/interface l2tp-server server
set authentication=mschap2 default-profile=L2TP enabled=yes ipsec-secret=PreSharedKey keepalive-timeout=60 max-mru=1460 max-mtu=1460 use-ipsec=required
/interface list member
add interface=ether1 list=WAN
add interface=ether2 list=WAN
add interface=ether3 list=LAN
add interface=ether4 list=LAN
add interface=ether5 list=LAN
add interface=ether6 list=LAN
add interface=ether7 list=LAN
add interface=ether8 list=LAN
add interface=ether9 list=LAN
add interface=ether10 list=LAN
add interface=ether11 list=LAN
add interface=ether12 list=LAN
add interface=ether13-Management list=LAN
/interface sstp-server server
set authentication=mschap2 certificate=Server default-profile=SSTP enabled=yes force-aes=yes pfs=yes
/ip address
add address=192.168.88.1/24 interface=ether13-Management network=192.168.88.0
add address=192.168.1.1/24 interface=BridgeLAN network=192.168.1.0
add address=ISP1-IP1/29 interface=ether1 network=ISP1-Network
add address=ISP1-IP2/29 interface=ether1 network=ISP1-Network
add address=ISP1-IP3/29 interface=ether1 network=ISP1-Network
add address=ISP1-IP4/29 interface=ether1 network=ISP1-Network
add address=ISP1-IP5/29 interface=ether1 network=ISP1-Network
add address=ISP2-IP2/29 interface=ether2 network=ISP2-Network
add address=ISP2-IP3/29 interface=ether2 network=ISP2-Network
add address=ISP2-IP4/29 interface=ether2 network=ISP2-Network
add address=ISP2-IP5/29 interface=ether2 network=ISP2-Network
add address=ISP2-IP1/29 interface=ether2 network=ISP2-Network
/ip cloud
set ddns-enabled=yes
#error exporting /ip dhcp-client
#error exporting /ip dhcp-server alert
/ip dhcp-server network
add address=192.168.88.0/24 gateway=192.168.88.1
/ip dns
set servers=8.8.8.8
/ip firewall address-list
add address=190.210.6.162 list=Allowed_VOIP
add address=172.16.0.100-172.16.0.150 list="L2TP Clients"
add address=172.16.1.100-172.16.1.150 list="SSTP Clients"
add address=ISP2-IP4 list=Public_for_VOIP
add address=ISP1-IP3 list=Public_for_VOIP
add address=216.109.104.11  list=Public_for_ISP1
add address=190.210.182.161  list=Public_for_ISP1
add address=179.43.118.76  list=Public_for_ISP1
add address=190.210.81.120  list=Public_for_ISP1
add address=201.212.8.35 list=Public_for_ISP1
add address=200.11.113.53 list=Public_for_ISP1
add address=45.60.18.228 list=Public_for_ISP1
add address=201.220.24.90 list=Public_for_ISP1
add address=190.220.3.173 list=Public_for_ISP1
add address=200.26.107.180 list=Public_for_ISP1
add address=200.80.196.58 list=Public_for_ISP1
add address=190.220.137.187 list=Public_for_ISP1
add address=190.210.161.49 list=Public_for_ISP1
add address=200.70.28.17 list=Public_for_ISP1
add address=190.221.144.119 list=Public_for_ISP1
add address=190.221.150.253 list=Public_for_ISP1
add address=200.43.227.37 list=Public_for_ISP1
add address=200.89.132.227 list=Public_for_ISP1
add address=201.235.98.17 list=Public_for_ISP1
add address=13.84.156.31 list=Public_for_ISP1
add address=181.10.162.157 list=Public_for_ISP1
add address=66.96.163.138 list=Public_for_ISP1
add address=200.26.107.178 list=Public_for_ISP1
add address=239.255.255.250 list=Public_for_ISP1
add address=200.61.38.0/24 list=Public_for_ISP2
/ip firewall filter
add action=accept chain=input protocol=icmp
add action=accept chain=input connection-state=established,related
add action=accept chain=input dst-port=AnotherWinboxPort,AnotherSshPort,AnotherHttpPort protocol=tcp
add action=accept chain=input dst-port=500,1701,4500 protocol=udp
add action=accept chain=input protocol=ipsec-esp
add action=accept chain=input protocol=ipsec-ah
add action=accept chain=input dst-port=443 protocol=tcp
add action=drop chain=input in-interface-list=WAN
add action=accept chain=forward connection-state=established,related
add action=drop chain=forward connection-state=invalid
add action=drop chain=forward connection-nat-state=!dstnat connection-state=new in-interface-list=WAN
/ip firewall mangle
add action=accept chain=prerouting dst-address=ISP1-Network/29 src-address=192.168.1.0/24
add action=accept chain=prerouting dst-address=ISP2-Network/29 src-address=192.168.1.0/24
add action=accept chain=prerouting dst-address=172.16.0.0/23
add action=mark-connection chain=prerouting connection-mark=no-mark in-interface=ether1 new-connection-mark=ISP1_con passthrough=yes
add action=mark-connection chain=prerouting connection-mark=no-mark in-interface=ether2 new-connection-mark=ISP2_con passthrough=yes
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address=!192.168.1.0/24 new-connection-mark=ISP1_con passthrough=yes src-address=192.168.1.54
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address=!192.168.1.0/24 new-connection-mark=ISP1_con passthrough=yes src-address=192.168.1.4
add action=mark-connection chain=prerouting dst-address-list=Public_for_ISP1 new-connection-mark=ISP1_con passthrough=yes src-address=192.168.1.0/24
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address-list=Public_for_ISP2 new-connection-mark=ISP2_con passthrough=yes src-address=192.168.1.0/24
add action=mark-connection chain=prerouting dst-address=!192.168.1.0/24 new-connection-mark=ISP1_con packet-mark=no-mark passthrough=yes per-connection-classifier=src-address:2/0 src-address=192.168.1.0/24
add action=mark-connection chain=prerouting dst-address=!192.168.1.0/24 new-connection-mark=ISP2_con packet-mark=no-mark passthrough=yes per-connection-classifier=src-address:2/1 src-address=192.168.1.0/24
add action=mark-routing chain=output connection-mark=ISP1_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP1 passthrough=no
add action=mark-routing chain=output connection-mark=ISP2_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP2 passthrough=no
add action=mark-routing chain=prerouting connection-mark=ISP1_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP1 passthrough=no src-address=192.168.1.0/24
add action=mark-routing chain=prerouting connection-mark=ISP2_con dst-address=!192.168.1.0/24 new-routing-mark=to_ISP2 passthrough=no src-address=192.168.1.0/24
/ip firewall nat
add action=src-nat chain=srcnat out-interface=ether1 src-address=192.168.1.4 to-addresses=ISP1-IP3
add action=src-nat chain=srcnat out-interface=ether2 src-address=192.168.1.4 to-addresses=ISP2-IP4
add action=masquerade chain=srcnat out-interface=ether1
add action=masquerade chain=srcnat out-interface=ether2
add action=dst-nat chain=dstnat dst-port=5060,10000-20000 in-interface-list=WAN protocol=udp src-address-list=Allowed_VOIP to-addresses=192.168.1.4
add action=dst-nat chain=dstnat dst-address-list=Public_for_VOIP dst-port=80,443 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.4
add action=dst-nat chain=dstnat dst-address=ISP2-IP2 dst-port=21 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.16 to-ports=21
add action=dst-nat chain=dstnat dst-address=ISP1-IP1 dst-port=21 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.16 to-ports=21
add action=dst-nat chain=dstnat dst-address=ISP2-IP2 dst-port=20 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.16 to-ports=20
add action=dst-nat chain=dstnat dst-address=ISP1-IP1 dst-port=20 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.16 to-ports=20
add action=dst-nat chain=dstnat dst-port=20 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.12 to-ports=20
add action=dst-nat chain=dstnat dst-address=ISP2-IP2 dst-port=5001-5003 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.16
add action=dst-nat chain=dstnat dst-address=ISP1-IP1 dst-port=5001-5003 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.16
add action=dst-nat chain=dstnat dst-port=5001-5003 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.12
add action=dst-nat chain=dstnat dst-port=1723 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.16 to-ports=1723
add action=dst-nat chain=dstnat in-interface-list=WAN protocol=gre to-addresses=192.168.1.16
add action=dst-nat chain=dstnat dst-port=80 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.10 to-ports=21
add action=dst-nat chain=dstnat dst-port=800 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.12 to-ports=800
add action=dst-nat chain=dstnat dst-port=5005 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.240 to-ports=3389
add action=dst-nat chain=dstnat dst-port=21 in-interface-list=WAN protocol=tcp to-addresses=192.168.1.12 to-ports=21
/ip route
add check-gateway=ping distance=1 gateway=ISP1-Network routing-mark=to_ISP1
add check-gateway=ping distance=1 gateway=ISP2-Network routing-mark=to_ISP2
add check-gateway=ping distance=10 gateway=ISP1-Network
add check-gateway=ping distance=20 gateway=ISP2-Network
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www port=AnotherHttpPort
set ssh port=AnotherSshPort
set api disabled=yes
set winbox port=AnotherWinboxPort
set api-ssl disabled=yes
/ppp secret
add name=userL2TP password=Password profile=L2TP service=l2tp
add name=userSSTP password=Password profile=SSTP service=sstp
/system clock
set time-zone-name=America/Argentina/Salta
/system identity
set name="Main Router"
/system logging
add disabled=yes topics=ipsec,!packet
add disabled=yes topics=l2tp
/tool graphing interface
add interface=ether1
add interface=ether2

Regards,
Damián

I think I’ve got it - while reading the mangle rules in the OP, I have noticed that in the PCC rules you match on packet-mark=no-mark and was wondering why, but I’ve missed that the connection-mark=no-mark match condition is missing in these rules, so they rewrite any previously assigned connection-mark value. So I assume you have clicked a wrong item on the GUI when setting up these two rules?
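If that is the cause, the two PCC rules would read roughly as follows. This is a minimal sketch that assumes packet-mark=no-mark was the mis-click and connection-mark=no-mark was what you meant; everything else is copied from your export.

```
/ip firewall mangle
# PCC rules that only classify connections which do not carry a mark yet,
# so they can no longer overwrite ISP1_con/ISP2_con set by earlier rules.
# Assumption: packet-mark=no-mark is dropped in favour of connection-mark=no-mark.
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address=!192.168.1.0/24 new-connection-mark=ISP1_con passthrough=yes per-connection-classifier=src-address:2/0 src-address=192.168.1.0/24
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address=!192.168.1.0/24 new-connection-mark=ISP2_con passthrough=yes per-connection-classifier=src-address:2/1 src-address=192.168.1.0/24
```

These would replace the two existing PCC rules in the same position, i.e. after the per-host and address-list rules and before the action=mark-routing ones.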

Once you confirm that this was the issue, I’ll tell you how to optimize the order of the mangle rules to reduce CPU spending per packet.

Sindy, you are awesome!!
You are right.
I should be more careful with this, or I will need to start configuring mangle rules from the command line instead of the GUI.
So, with the “passthrough” option disabled on the 6th rule, the router was putting the connection mark “ISP1_con” on the first packet, but when the second packet came in, the connection already had a mark, so it never matched the 6th rule and matched the PCC (or almost PCC) rules instead.

Thank you a lot.
The CPU is almost idle all the time, but it is always better to improve the settings, so whenever you can, no rush.
Now it is working fine!

Regards,
Damián

Correct. And more than that, with passthrough=no on that “6th” rule, the first packet also hasn’t reached the action=mark-routing one.


The optimisation is based on the fact that action=jump should rather be called action=call - if the packet matching reaches the end of a built-in chain like forward and none of the matching rules provides a final verdict, the packet is accepted; if the packet matching reaches the end of a custom chain and none of the matching rules provides a final verdict, packet matching continues in the calling chain by the next rule after the action=jump one.

So what I do is the following:
chain=prerouting connection-mark=no-mark action=jump jump-target=mark-conn-pr
chain=prerouting connection-mark=use-main action=accept
chain=prerouting connection-mark=cm1 action=mark-routing new-routing-mark=rm1 passthrough=no
chain=prerouting connection-mark=cm2 action=mark-routing new-routing-mark=rm2 passthrough=no
chain=mark-conn-pr connection-mark=no-mark …classifying conditions… action=mark-connection new-connection-mark=cm1 passthrough=yes
chain=mark-conn-pr connection-mark=no-mark …classifying conditions… action=mark-connection new-connection-mark=cm2 passthrough=yes
chain=mark-conn-pr connection-mark=no-mark action=mark-connection new-connection-mark=use-main passthrough=yes

So mid-connection packets do not need to get matched against all the classification rules and get almost straight to the action=mark-routing ones. It is mandatory to assign some connection-mark to all connections, even to those which should use the default routing table main.
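Applied to the configuration from this topic, the layout could look roughly like the sketch below. The chain name mark-conn-pr and the mark use-main are just the illustrative names used above, the classifying conditions are taken from the exported rules, and the initial accept rules, the address-list rules and the chain=output rules are left out for brevity, so treat it as an outline rather than a drop-in replacement.

```
/ip firewall mangle
# Only packets of connections without a mark visit the classifier chain.
add chain=prerouting connection-mark=no-mark action=jump jump-target=mark-conn-pr
# Everything else (and the first packet, once it returns from the jump)
# reaches the routing-mark rules after at most one more comparison.
add chain=prerouting connection-mark=use-main action=accept
add chain=prerouting connection-mark=ISP1_con src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 action=mark-routing new-routing-mark=to_ISP1 passthrough=no
add chain=prerouting connection-mark=ISP2_con src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 action=mark-routing new-routing-mark=to_ISP2 passthrough=no
# Classifier chain: every rule requires connection-mark=no-mark, so the first
# rule that matches wins and later rules cannot overwrite its mark.
add chain=mark-conn-pr connection-mark=no-mark in-interface=ether1 action=mark-connection new-connection-mark=ISP1_con passthrough=yes
add chain=mark-conn-pr connection-mark=no-mark in-interface=ether2 action=mark-connection new-connection-mark=ISP2_con passthrough=yes
add chain=mark-conn-pr connection-mark=no-mark src-address=192.168.1.54 dst-address=!192.168.1.0/24 action=mark-connection new-connection-mark=ISP1_con passthrough=yes
add chain=mark-conn-pr connection-mark=no-mark src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 per-connection-classifier=src-address:2/0 action=mark-connection new-connection-mark=ISP1_con passthrough=yes
add chain=mark-conn-pr connection-mark=no-mark src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 per-connection-classifier=src-address:2/1 action=mark-connection new-connection-mark=ISP2_con passthrough=yes
# Catch-all: whatever is still unmarked is pinned to the main routing table.
add chain=mark-conn-pr connection-mark=no-mark action=mark-connection new-connection-mark=use-main passthrough=yes
```

Because the end of the custom chain returns to the rule after the action=jump one, the first packet of a connection is classified and then gets its routing-mark in the same pass.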

When combining load distribution over two WANs with redundancy, another useful thing is to have routes through both WANs in both routing tables, with the “home” WAN with a lower distance, and only assign connection-marks when handling packets received from WAN; the load distribution rules only assign routing-marks. So if a connection would normally be sent via WAN1 but WAN1 is currently down, it is sent through WAN2 because the backup route in the routing table “via-WAN1” is used. The response packet comes through WAN2, and the whole connection gets marked to use WAN2. So once WAN1 becomes available again, connections which should use it survive on WAN2.
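A rough sketch of that idea follows. The gateway addresses 192.0.2.1 (WAN1) and 198.51.100.1 (WAN2) are placeholders I made up, and the mangle part is only my reading of the description above, reusing the interface, mark and table names from this topic.

```
/ip route
# Table to_ISP1: WAN1 is "home" (lower distance), WAN2 is the backup.
add dst-address=0.0.0.0/0 gateway=192.0.2.1 routing-mark=to_ISP1 distance=1 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=198.51.100.1 routing-mark=to_ISP1 distance=2 check-gateway=ping
# Table to_ISP2: the other way round.
add dst-address=0.0.0.0/0 gateway=198.51.100.1 routing-mark=to_ISP2 distance=1 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=192.0.2.1 routing-mark=to_ISP2 distance=2 check-gateway=ping

/ip firewall mangle
# Connection-marks are assigned only to packets received from a WAN, so the
# mark records the WAN the connection is really using.
add chain=prerouting connection-mark=no-mark in-interface=ether1 action=mark-connection new-connection-mark=ISP1_con passthrough=yes
add chain=prerouting connection-mark=no-mark in-interface=ether2 action=mark-connection new-connection-mark=ISP2_con passthrough=yes
# Connections already pinned to a WAN keep using it.
add chain=prerouting connection-mark=ISP1_con src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 action=mark-routing new-routing-mark=to_ISP1 passthrough=no
add chain=prerouting connection-mark=ISP2_con src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 action=mark-routing new-routing-mark=to_ISP2 passthrough=no
# Not pinned yet: the load distribution assigns a routing-mark only.
add chain=prerouting connection-mark=no-mark src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 per-connection-classifier=src-address:2/0 action=mark-routing new-routing-mark=to_ISP1 passthrough=no
add chain=prerouting connection-mark=no-mark src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 per-connection-classifier=src-address:2/1 action=mark-routing new-routing-mark=to_ISP2 passthrough=no
```

Since the classifier hashes on src-address only, every packet of a not yet pinned connection gets the same routing-mark until the first response arrives and pins the connection to the WAN it came through.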

Kinda off-topic, but I’d like to see a little brainstorming that leads to, well, not the ultimate, but an “almost complete multi-WAN setup, load balancing WITH failover”, with decent explanations and what-ifs.
With the recent “online school” I had to make use of the tutorials currently available on the wiki and some topics on the forum, but none of them led to pretty good examples.
Why would we use masquerade instead of src-nat, and why wouldn’t we? (Yes, there are scenarios where one or the other is better, e.g. dynamic vs. static WAN IP, but the dynamic IP case could be scripted to update the src-nat rule. Or why would we keep some WAN’s connections alive when that would be futile in a dynamic WAN IP setup, etc.)
Or proper mangle rules: what to match where, the long-discussed passthrough=yes vs. no in connection marking, which part of prerouting to mark so as not to waste CPU cycles, etc.
Or what the proper setup would be when dealing with inside servers that have to be reachable over both WANs (port forwarding).
For example, ECMP as dual WAN with masquerade would be the easiest dual-WAN setup with failover, but at least when both of those WANs are PPPoE you’d get a shitload of problems, starting with packets going out of WAN2 with the src-address of WAN1 or vice versa, because in the route where you specify both gateways you can’t say which src-address should be used for each of them, and masquerade (I think) craps all over that. I had no issues after I switched to src-nat with a scripted assignment of the IP (in case it changes), which is why I blame masquerade; I might be wrong, though.
And... other little issues regarding ECMP vs. PCC. Also, I’ve found a wiki entry That Was Written Like This With No Real Info In It And Incomplete; I won’t search for it now, but... how did it get in there? Sheesh.
Some posts by @sindy were indeed helpful and got me going in the right direction. At least they made me add log rules in all the mangle chains to see what works and what doesn’t, which packet goes where and how. But nobody sane would do that just to debug some supposedly “working” tutorials.
Btw, @sindy should get a damn medal for the posts on this forum.
Sorry for the long post. Cheers!

There are just two reasons to use masquerade: the dynamic address of the interface, and laziness. When the interface goes down, all connections whose src-nat behaviour has been activated via masquerade are removed from the tracking table, which causes a CPU spike. This operation is useless for a static WAN address but essential for a dynamic one, so a script that updates the src-nat rule’s to-addresses must also purge the connection tracking table, which requires either remembering the old address somewhere or looking for connections that are src-nated to an address which currently does not exist in the system. Then you have those little bits like the automatically created DHCP client on some (!) LTE interfaces, for which you cannot configure the script item, so there masquerade is the only way. And remember that not every Mikrotik owner is an IT enthusiast, and not every IT enthusiast understands coding, and not every IT enthusiast who does code wants to learn the syntax of YAFSL.
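To make the difference concrete, here is a small sketch. The address 203.0.113.2 is a placeholder I made up for the WAN address, and the purge line is an assumption about how such a script could pick the stale connections (a regular-expression match on reply-dst-address), so verify it on a test device before relying on it.

```
/ip firewall nat
# Static WAN address: plain src-nat, tracked connections survive a link flap.
add chain=srcnat out-interface=ether1 action=src-nat to-addresses=203.0.113.2
# Dynamic WAN address: masquerade follows the interface address and purges
# its connections when the interface goes down.
add chain=srcnat out-interface=ether2 action=masquerade

# What a script replacing masquerade would also have to do after rewriting
# to-addresses: drop connections still src-nated to the old address.
# Assumption: the old address was stored by the script; matching on the
# "address:port" text form misses portless entries such as ICMP.
/ip firewall connection remove [find where reply-dst-address~"^203.0.113.2:"]
```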


Again, why should passthrough be a subject of any discussion? The path of the packet through the rule list should be as short as possible. So where you need that it continues even after the action is taken, use passthrough=yes; as soon as all the virtual fields you need to attach to a packet (*-mark) have been attached and “physical” fields you need to change (dscp, ttl, …?) have been changed, use passthrough=no.

Most often you assign the connection-mark and then, still for the same packet, you need to assign a routing-mark and/or a packet mark based on that just assigned connection mark. So all these rules except the last one must have passthrough=yes, and the last one should have passthrough=no. On top of saving CPU by not letting packets matching that passthrough=no rule through to the subsequent rules, it may allow you to reduce the number of match conditions in those subsequent rules intended to match other packets (so you save CPU on handling those other packets).
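As a sketch of such a sequence for one packet, using the names from this topic plus a packet mark isp1_pkt that I made up for illustration:

```
/ip firewall mangle
# 1. Classify the connection; the packet must continue, so passthrough=yes.
add chain=prerouting connection-mark=no-mark src-address=192.168.1.54 dst-address=!192.168.1.0/24 action=mark-connection new-connection-mark=ISP1_con passthrough=yes
# 2. Attach a packet mark (e.g. for queues) based on the connection mark; more work follows, so passthrough=yes.
add chain=prerouting connection-mark=ISP1_con src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 action=mark-packet new-packet-mark=isp1_pkt passthrough=yes
# 3. Last mark needed for this packet: assign the routing-mark and stop here.
add chain=prerouting connection-mark=ISP1_con src-address=192.168.1.0/24 dst-address=!192.168.1.0/24 action=mark-routing new-routing-mark=to_ISP1 passthrough=no
```

Any rule placed after the third one can then rely on never seeing LAN-to-internet packets of ISP1_con connections, and needs no match condition to exclude them.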


There are just two common scenarios that need specific care here:

  • you need to make RouterOS choose the proper return route depending on the WAN through which the connection came in from the internet (so policy routing with connection marking is necessary, and you cannot use ECMP, which is unable to respect connection marking),
  • you want to allow devices on the LAN to connect to the WAN address which is port-forwarded to the LAN again.

For the second case, the best solution is to have the server(s) in a dedicated subnet; if you decide to use hairpin NAT instead, you get another issue, which is to let the server know the actual IP address of the client. @Sob’s solution to this is to use netmap to an unused subnet instead of plain src-nat in the srcnat chain. So the server sees the client as coming from another subnet, but the last bits of the address are verbatim.
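A minimal sketch of the netmap variant, assuming 203.0.113.10 stands in for the public address, TCP port 443 is the forwarded service, 192.168.1.4 is the server and 192.168.201.0/24 is a subnet not used anywhere else (all of these are illustrative, not taken from the export):

```
/ip firewall nat
# The port forward matches on the public dst-address rather than on
# in-interface-list=WAN, so it also catches requests coming from LAN clients.
add chain=dstnat dst-address=203.0.113.10 protocol=tcp dst-port=443 action=dst-nat to-addresses=192.168.1.4
# Hairpin: LAN client to LAN server through the router. netmap keeps the host
# bits, so the server sees 192.168.1.54 as 192.168.201.54 and replies via
# its gateway (the router), which undoes both translations.
add chain=srcnat src-address=192.168.1.0/24 dst-address=192.168.1.4 protocol=tcp dst-port=443 out-interface=BridgeLAN action=netmap to-addresses=192.168.201.0/24
```

With plain src-nat or masquerade in the second rule, every LAN client would show up on the server with the router's own address.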


Masquerade kicks in after the routing decision, i.e. at the point where the actual out-interface for the initial packet of the connection has already been chosen, so I don’t think that’s the issue; the thing is that the ECMP choice is cached for some 10 minutes and then re-taken, whilst the reply-dst-address chosen by the masquerade action at the start of the connection is never changed.

And this is the same case also for src-nat, so again, unless you update the ongoing connection’s parameters (and there is no way to do that but to remove the connection from the tracking table), I cannot see how use of src-nat instead of masquerade could help resolve issues with NAT and ECMP and routing cache expiration. In my understanding, NAT and ECMP are mutually exclusive.