We have just installed a new router to handle 4 gateways with ECMP + persistent connections using v3.0rc10, as per the Wiki. Tried downgrading to rc7, and still the same…
The main change from the wiki is the 4 gateways, and of course the syntax for ‘nth’ in v3
Now - the funky thing happening is that all upload traffic (to the Internet) only goes through one route - whether we use the standard ECMP or improved ECMP with persistent connections.
When using the improved ECMP, no upload traffic will pass unless a default route to any one of the gateways that has no routing marks is added.
All the connection marks & address lists are there, look about right and equal in number.
Anyone have any insight as to why this is happening?
There’s nothing else set up on this router except the mangle & routing rules as in the Wiki…
Thanks - I know that. ‘nth’ only has two fields in 3.0 - ‘Every’ and ‘Packet’ - so with my 4 gateways, ‘every’ is set to ‘4’, and ‘Packet’ changes for the packet number…
That part is fine - all the connection marks are appearing as expected, as are the address-list entries, and these are based on the ‘nth’ rules…
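To make the ‘Every’ / ‘Packet’ arithmetic concrete, here is a minimal Python sketch of four nth rules with every=4 and packet=1..4. It is a toy model, not RouterOS code: the class name, the mark names (conn1..conn4) and the assumption that each rule keeps its own counter and sees every new connection are all illustrative.

```python
from collections import Counter


class NthRule:
    """Toy model of one 'nth' matcher with its own packet counter."""

    def __init__(self, every, packet, mark):
        self.every = every
        self.packet = packet
        self.mark = mark
        self.counter = 0

    def matches(self):
        # Counter cycles 1, 2, ..., every, 1, 2, ...
        self.counter = self.counter % self.every + 1
        return self.counter == self.packet


def mark_new_connections(count, gateways=4):
    """Return the connection mark given to each of `count` new connections."""
    # Hypothetical mark names conn1..connN, one rule per gateway.
    rules = [NthRule(gateways, k, f"conn{k}") for k in range(1, gateways + 1)]
    marks = []
    for _ in range(count):
        # Every rule sees every new connection, so all counters advance together
        # and exactly one rule matches each time.
        hits = [rule.mark for rule in rules if rule.matches()]
        marks.append(hits[0])
    return marks


marks = mark_new_connections(8)
print(marks)
print(Counter(marks))
```

Under these assumptions the marks rotate evenly, which matches the observation that the connection marks and address-list entries come out equal in number.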
It seems to be the routing marks, or ECMP is broken in 3.0?
As I said, we tried standard ECMP with multiple gateways (dst-address=0.0.0.0/0 gateway=, , ,), and get similarly broken performance…
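For reference, standard ECMP is usually described as picking a gateway per source/destination address pair rather than per packet. The sketch below is a toy Python model of that behaviour. The round-robin assignment, the class name and the gateway names are assumptions for illustration, and the real selection logic in RouterOS may differ.

```python
import itertools


class EcmpRoute:
    """Toy ECMP default route: one cached gateway per (src, dst) address pair."""

    def __init__(self, gateways):
        self._next = itertools.cycle(gateways)
        self._cache = {}

    def gateway_for(self, src, dst):
        key = (src, dst)
        if key not in self._cache:
            # New address pair: take the next gateway in turn (assumed policy).
            self._cache[key] = next(self._next)
        return self._cache[key]


route = EcmpRoute(["gw1", "gw2", "gw3", "gw4"])  # hypothetical gateway names

# One client talking to one server always leaves through the same gateway.
same_pair = {route.gateway_for("192.168.0.10", "203.0.113.5") for _ in range(100)}

# The same client talking to four other servers is spread across the gateways.
spread = {route.gateway_for("192.168.0.10", f"203.0.113.{i}") for i in range(6, 10)}

print(same_pair, spread)
```

In this model a test from a single client to a single destination would only ever exercise one gateway, so it is worth testing with several destinations before concluding that ECMP itself is broken.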
I am using dual gateways in ROS 3 and don’t seem to have the same issue you are describing so…post your mangle rules for connection / packet / routing marks…
Have you tried routing via ‘interface’ instead of IP? Well I guess a better question would be do you have multiple interfaces or just using multiple IPs?
Thanks for your assistance - it’s a multiple interface, multiple IP setup. I’ve tried your suggestion of searching, found some options and disabled interface 3 & 4.
I also changed the config to Janisk’s suggestion of only using one rule for nth, and using only one routing mangle rule… but still the same. Only have upload on one interface…
GWISA -
At first glance it looks like you are trying to mark the traffic coming in your dsl interfaces instead of the traffic coming in your local interface. Also you are marking the incoming traffic by the dsl IP addresses - this just won’t work as the source address is not the dsl address but that of the user coming in on the dsl interface…and really what you are trying to do is mark your local user traffic going out to the Internet via your dsl interfaces.
So…what you need to do is mark your users’ traffic coming in on your local interface and going out to the Internet: connection mark, packet mark and finally routing mark, all based on nth.
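The marking order described above can be sketched as a small Python model: only packets arriving on the local interface are considered, the first packet of a connection gets a connection mark in rotation, and the routing decision then follows that mark. The function names, interface names, addresses and gateway names are hypothetical, and a simple rotation stands in for the nth rules.

```python
import itertools


def make_mangle(local_if="local", gateways=("gw1", "gw2", "gw3", "gw4")):
    """Return a function that picks a gateway for a packet, or None if unmarked."""
    conn_marks = {}  # toy connection tracking: connection key -> mark index
    next_mark = itertools.cycle(range(len(gateways)))

    def pick_gateway(in_interface, src, sport, dst, dport):
        if in_interface != local_if:
            # Not user traffic from the LAN: no routing mark is set here.
            return None
        key = (src, sport, dst, dport)
        if key not in conn_marks:
            # First packet of a new connection: assign the next mark in rotation.
            conn_marks[key] = next(next_mark)
        # The routing mark follows the connection mark.
        return gateways[conn_marks[key]]

    return pick_gateway


pick = make_mangle()
first = pick("local", "192.168.0.10", 40000, "203.0.113.5", 80)
again = pick("local", "192.168.0.10", 40000, "203.0.113.5", 80)
other = pick("local", "192.168.0.11", 40001, "203.0.113.5", 80)
inbound = pick("dsl1", "203.0.113.5", 80, "192.168.0.10", 40000)
print(first, again, other, inbound)
```

Packets of the same connection stay on one gateway, a second connection moves to the next gateway, and traffic arriving on a dsl interface is left unmarked in this model.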
Not quite - all interfaces are defined as ‘in-interface=local’
You may be confusing my comments for the actual marking?
I did try removing some of the ‘in-interface’ definitions to see if it made any difference, and only left those where traffic was not previously marked with either a connection mark or address list entry.
If you disable the entire mangle rule set and then use standard ECMP, does it work? If ECMP, as defined in the manual, isn’t even working as it should, then I wouldn’t bother getting the wiki stuff working yet. Test it in its basic form and see if that is even working; if not, then let’s figure out if it’s a bug or not.
I have tested basic ECMP - as stated in my 1st & 2nd post.
Now - the funky thing happening is that all upload traffic (to the Internet) only goes through one route - whether we use the standard ECMP or improved ECMP with persistent connections.
As I said, we tried standard ECMP with multiple gateways (dst-address=0.0.0.0/0 gateway=, , ,),
Same thing happens, and tracking is on… I’ve set the tcp est timeout a bit shorter though - I’m testing with 1 hr on both that and address-list entries.
We’re going to try and see if downgrading to 2.9.49 helps at all tomorrow if we get an opportunity. Maybe reset & reprogram… although goodness knows I’ve looked for an error hard enough!
So disabling all route-marks and mangling gives you the same results (after a reboot, because sometimes connection-marks / routing marks will still be present in the conntrack table)? If so, I would send a supout to support. 2.9.X ECMP works fine, I’ve used that for years, but haven’t had the chance to test it in 3.0 yet.
Right - did the downgrade to 2.9, and standard ECMP now works fine.
The ‘improved load balancing with persistent connections’ still does the same though - if set up as in the wiki, all upload traffic only goes through one router.
If I enable the basic ECMP rule AND the rules above with routing marks, then traffic looks to be correct, but I’m not so sure anymore…
So I have the above config, with
/ip route rule #1 set as 0.0.0.0/0 gateway=, , ,
One more thing - shouldn’t the ‘connection state’ be ‘new’ for all mark-connection rules?
In my last rules I use ‘connection state=new’. But if you add the rule on a working server, your current connections will not be marked =) So first you may add the rules without ‘connection state=new’, and add this criterion later. The hardware usage is not significant anyway, as far as I can see. It’s just more graceful =)
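The point about existing connections can be illustrated with a small Python sketch. It is a toy model with hypothetical names. It assumes a rule only marks connections that have no mark yet, and that "new" simply means the connection is not yet in the tracking table.

```python
def apply_mark_rule(conntrack, key, mark, only_new):
    """Apply one toy mark-connection rule; return the connection's mark afterwards."""
    is_new = key not in conntrack
    entry = conntrack.setdefault(key, {"mark": None})
    # Mark only unmarked connections; with only_new set, skip established ones.
    if entry["mark"] is None and (is_new or not only_new):
        entry["mark"] = mark
    return entry["mark"]


old = ("192.168.0.10", "203.0.113.5")  # established before the rules were added
new = ("192.168.0.11", "203.0.113.5")  # opened after the rules were added
conntrack = {old: {"mark": None}}

with_new_only = apply_mark_rule(conntrack, old, "conn1", only_new=True)
fresh = apply_mark_rule(conntrack, new, "conn2", only_new=True)
without_filter = apply_mark_rule(conntrack, old, "conn1", only_new=False)
print(with_new_only, fresh, without_filter)
```

With the ‘new’ criterion the already established connection stays unmarked, while a connection opened afterwards is marked. Dropping the criterion lets the old connection pick up a mark as well.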