Persistent ECMP or basic ECMP on 3.0rc7/10 not working properly

Hi Guys,

We have just installed a new router to handle 4 gateways with ECMP + persistent connections using v3.0rc10, as per the Wiki. Tried downgrading to rc7, and still the same…

The main change from the wiki is the 4 gateways, and of course the syntax for ‘nth’ in v3

Now - the funky thing happening is that all upload traffic (to the Internet) only goes through one route - whether we use the standard ECMP or improved ECMP with persistent connections.

When using the improved ECMP, no upload traffic will pass unless a default route to any one of the gateways that has no routing marks is added.

All the connection marks & address lists are there, look about right and equal in number.

Anyone have any insight as to why this is happening?

There’s nothing else set up on this router except the mangle & routing rules as in the Wiki…

GWISA -
Search for posts on "nth" under Beta ROS - I remember that nth works differently in ROS3 - I just don’t remember what the differences are.

Thom

Thanks - I know that. ‘nth’ only has two fields in 3.0 - ‘Every’ and ‘Packet’ - so with my 4 gateways, ‘every’ is set to ‘4’, and ‘Packet’ changes for the packet number…

That part is fine - all the connection marks are appearing as expected, as are the address-list entries, and these are based on the ‘nth’ rules…

So it seems to be either the routing marks, or ECMP itself is broken in 3.0?

As I said, we tried standard ECMP with multiple gateways (dst-address=0.0.0.0/0 gateway=, , ,), and get similar broken performance…
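
For reference, a basic ECMP default route of the kind described here might look like the sketch below. The four gateway addresses are hypothetical placeholders from the documentation range, not values from this thread, and each would need to be reachable on its own WAN interface.

```
# Sketch only: hypothetical next-hop addresses, substitute the real gateways.
# A single default route listing several gateways makes the route ECMP.
/ip route
add dst-address=0.0.0.0/0 gateway=192.0.2.1,192.0.2.65,192.0.2.129,192.0.2.193
```

Note that, as far as the RouterOS documentation describes it, ECMP forwarding decisions are cached per source/destination address pair, so a test from one host to one destination would be expected to use only one gateway even when ECMP is working.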

GWISA -
Well you got nth covered…

I am using dual gateways in ROS 3 and don’t seem to have the same issue you are describing so…post your mangle rules for connection / packet / routing marks…

Have you tried routing via ‘interface’ instead of IP? Well I guess a better question would be do you have multiple interfaces or just using multiple IPs?

Hi Thom,

Thanks for your assistance - it’s a multiple interface, multiple IP setup. I’ve tried your suggestion of searching, found some options and disabled interface 3 & 4.
I also changed the config to Janisk’s suggestion of only using one rule for nth, and using only one routing mangle rule… but still the same. Only have upload on one interface…

here’s the config:

Firewall:

/ip firewall filter 
add action=accept chain=forward comment="accept established" \
    connection-state=established disabled=no 
add action=accept chain=forward comment="accept related" \
    connection-state=related disabled=no 
add action=accept chain=forward comment="accept new" connection-state=new \
    disabled=no in-interface=Local 
add action=drop chain=forward comment="drop invalid" connection-state=invalid \
    disabled=no 
add action=drop chain=forward comment="drop new - not from local" \
    connection-state=new disabled=no in-interface=!Local 
add action=drop chain=forward comment="drop broadcast + multicast" disabled=no \
    dst-address-type=broadcast,multicast 

/ip firewall mangle 
add action=jump chain=prerouting comment="bypass routers" disabled=no \
    jump-target=local src-address-list=routers 
add action=mark-connection chain=prerouting comment="dsl1-src addr" \
    disabled=no new-connection-mark=dsl1 passthrough=yes src-address-list=dsl1 
add action=mark-routing chain=prerouting comment="dsl1-src addr" disabled=no \
    new-routing-mark=dsl1 passthrough=no src-address-list=dsl1 
add action=mark-connection chain=prerouting comment="dsl2-src addr" \
    disabled=no new-connection-mark=dsl2 passthrough=yes src-address-list=dsl2 
add action=mark-routing chain=prerouting comment="dsl2-src addr" disabled=no \
    new-routing-mark=dsl2 passthrough=no src-address-list=dsl2 
add action=mark-connection chain=prerouting comment="dsl3-src addr " \
    disabled=no new-connection-mark=dsl3 passthrough=yes src-address-list=dsl3 
add action=mark-routing chain=prerouting comment="dsl3 - src addr" disabled=no \
    new-routing-mark=dsl3 passthrough=no src-address-list=dsl3 
add action=mark-connection chain=prerouting comment="dsl4 - src addr" \
    disabled=no new-connection-mark=dsl4 passthrough=yes src-address-list=dsl4 
add action=mark-routing chain=prerouting comment="dsl4 - src addr" disabled=no \
    new-routing-mark=dsl4 passthrough=no src-address-list=dsl4 
add action=mark-connection chain=prerouting comment="dsl1 - packet 1" \
    connection-state=new disabled=no in-interface=Local \
    new-connection-mark=dsl1 nth=4,1 passthrough=yes 
add action=add-src-to-address-list address-list=dsl1 address-list-timeout=1h \
    chain=prerouting comment="dsl1 - conn mark dsl1" connection-mark=dsl1 \
    disabled=no in-interface=Local 
add action=mark-routing chain=prerouting comment="dsl1 - conn mark dsl1" \
    connection-mark=dsl1 disabled=no new-routing-mark=dsl1 passthrough=no 
add action=mark-connection chain=prerouting comment="dsl2 - packet 2" \
    connection-state=new disabled=no in-interface=Local \
    new-connection-mark=dsl2 nth=4,2 passthrough=yes 
add action=add-src-to-address-list address-list=dsl2 address-list-timeout=1h \
    chain=prerouting comment="dsl2 - conn mark dsl2" connection-mark=dsl2 \
    disabled=no in-interface=Local 
add action=mark-routing chain=prerouting comment="dsl2 - conn mark dsl2" \
    connection-mark=dsl2 disabled=no new-routing-mark=dsl2 passthrough=no 
add action=mark-connection chain=prerouting comment="dsl3 - packet 3" \
    connection-state=new disabled=no in-interface=Local \
    new-connection-mark=dsl3 nth=4,3 passthrough=yes 
add action=add-src-to-address-list address-list=dsl3 address-list-timeout=1h \
    chain=prerouting comment="dsl3 - conn mark dsl3" connection-mark=dsl3 \
    disabled=no in-interface=Local 
add action=mark-routing chain=prerouting comment="dsl3 - conn mark dsl3" \
    connection-mark=dsl3 disabled=no new-routing-mark=dsl3 passthrough=no 
add action=mark-connection chain=prerouting comment="dsl4 - packet 4" \
    connection-state=new disabled=no in-interface=Local \
    new-connection-mark=dsl4 nth=4,4 passthrough=yes 
add action=add-src-to-address-list address-list=dsl4 address-list-timeout=1h \
    chain=prerouting comment="dsl4 - conn mark dsl4" connection-mark=dsl4 \
    disabled=no in-interface=Local 
add action=mark-routing chain=prerouting comment="dsl4 - conn mark dsl4" \
    connection-mark=dsl4 disabled=no new-routing-mark=dsl4 passthrough=no 
add action=accept chain=local comment="" disabled=no

NAT:

/ip firewall nat 
add action=src-nat chain=srcnat comment="" disabled=no connection-mark=dsl1 \
    to-addresses=<ip 1> to-ports=0-65535 
add action=src-nat chain=srcnat comment="" disabled=no connection-mark=dsl2 \
    to-addresses=<ip 2> to-ports=0-65535 
add action=src-nat chain=srcnat comment="" disabled=no connection-mark=dsl3 \
    to-addresses=<ip 3> to-ports=0-65535 
add action=src-nat chain=srcnat comment="" disabled=no connection-mark=dsl4 \
    to-addresses=<ip 4> to-ports=0-65535

Routing:

/ip route 
add comment="" disabled=no distance=1 dst-address=0.0.0.0/0 \
    gateway=<ip 1> routing-mark=dsl1 scope=255 target-scope=10 
add comment="" disabled=no distance=1 dst-address=0.0.0.0/0 \
    gateway=<ip 2> routing-mark=dsl2 scope=255 target-scope=10 
add comment="" disabled=no distance=1 dst-address=0.0.0.0/0 \
    gateway=<ip 3> routing-mark=dsl3 scope=255 target-scope=10 
add comment="" disabled=no distance=1 dst-address=0.0.0.0/0 \
    gateway=<ip 4> routing-mark=dsl4 scope=255 target-scope=10 
add comment="default for router" disabled=no distance=1 dst-address=0.0.0.0/0 \
    gateway=<ip 1> scope=255 target-scope=10

The upload traffic all goes through the route with no routing mark.

GWISA -
At first glance it looks like you are trying to mark the traffic coming in your dsl interfaces instead of the traffic coming in your local interface. Also you are marking the incoming traffic by the dsl IP addresses - this just won’t work as the source address is not the dsl address but that of the user coming in on the dsl interface…and really what you are trying to do is mark your local user traffic going out to the Internet via your dsl interfaces.

So…what you need to do is mark your users traffic coming in your local interface going out to the internet… Connection mark / packet mark and finally routing mark all based on nth.

Not quite - all interfaces are defined as ‘in-interface=Local’

You may be confusing my comments for the actual marking?

I did try removing some of the ‘in-interface’ definitions to see if it made any difference, and only left those where traffic was not previously marked with either a connection mark or address list entry.

This screen shot may help to visualize my setup:
ECMP-mangle.JPG

If you disable the entire mangle rule set and then use standard ECMP, does it work? If ECMP, as defined in the manual, isn’t even working as it should, then I wouldn’t bother getting the wiki stuff working yet. Test it in its basic form and see if that is even working; if not, then let’s figure out whether it’s a bug or not.

Also, is connection-tracking on ?

Sam

I have tested basic ECMP - as stated in my 1st & 2nd post. :)

Now - the funky thing happening is that all upload traffic (to the Internet) only goes through one route - whether we use the standard ECMP or improved ECMP with persistent connections.



As I said, we tried standard ECMP with multiple gateways (dst-address=0.0.0.0/0 gateway=, , ,),

Same thing happens, and tracking is on… I’ve set the tcp est timeout a bit shorter though - I’m testing with 1 hr on both that and address-list entries.

We’re going to try and see if downgrading to 2.9.49 helps at all tomorrow if we get an opportunity. Maybe reset & reprogram… although goodness knows I’ve looked for an error hard enough!

So disabling all routing marks and mangling gives you the same results (after a reboot, because connection marks / routing marks will sometimes still be present in the conntrack table)? If so, I would send a supout to support. ECMP in 2.9.x works fine, I’ve used it for years, but I haven’t had the chance to test it in 3.0 yet.

Sam
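
A minimal sketch of console commands that could be used for the checks discussed above (connection tracking state, leftover marks, and whether the mangle rules are matching). Exact output and available columns may differ between 2.9 and 3.0 builds, so treat this as a starting point rather than a definitive procedure.

```
# Show whether connection tracking is enabled, plus the current timeouts.
/ip firewall connection tracking print
# List tracked connections in detail; entries may still carry marks
# left over from earlier tests if the router has not been rebooted.
/ip firewall connection print detail
# Show packet and byte counters for each mangle rule.
/ip firewall mangle print stats
```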

yup - that’s it in a nutshell… reboot after big changes…

Right - did the downgrade to 2.9, and standard ECMP now works fine.

The ‘improved load balancing with persistent connections’ still does the same though - if set up as in the wiki, all upload traffic only goes through one router.

If I enable the basic ECMP rule AND the rules above with routing marks, then traffic looks to be correct, but now I’m not so sure anymore…

So I have the above config, with

/ip route rule #1 set as 0.0.0.0/0 gateway=, , ,

One more thing - shouldn’t the ‘connection state’ be ‘new’ for all mark-connection rules?

In my last rules I use ‘connection state=new’. But if you add the rule on a working server, your current connections will not be marked =) So first you can add the rules w/o ‘connection state=new’, and add that criterion later. The extra hardware load is not significant, as far as I can see. It’s just more graceful =)
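
The two-step approach described here could be sketched roughly as follows. It assumes the mark-connection rule was first added without connection-state=new, and it reuses the rule comment from the config posted earlier in the thread purely as a way to find the rule; the same change would be repeated for the other three rules.

```
# Step 1 (assumed already done): the mark-connection rule exists without
# connection-state=new, so connections that are already open get marked too.
# Step 2: once those are marked, restrict the rule to new connections only.
/ip firewall mangle
set [find comment="dsl1 - packet 1"] connection-state=new
```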

of course you would have to reboot to catch ALL new connections…

if you do as I said below - you do not have to reboot.
as for me, I don’t want to reboot working server with 200 users online without necessity =)