LatinSuD
Member Candidate
Topic Author
Posts: 174
Joined: Wed Jun 29, 2005 1:05 pm
Location: Spain

ECMP bug and workaround

Wed Oct 01, 2008 6:38 pm

ECMP does not work for locally generated packets when the gateways are on separate interfaces and the router's own external addresses are not masqueraded.

Example scenario:
There are 2 gateways: 192.168.16.1 and 192.168.17.1.
- Interface1: 192.168.16.2/24
- Interface2: 192.168.17.2/24
- Interface0: 192.168.0.1/24
- Default route has multiple gateways: 192.168.16.1,192.168.17.1
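
The scenario above could be entered roughly as follows. This is only a sketch: the interface names and addresses are the ones listed in the post, and the exact menu syntax may differ between RouterOS versions.

```routeros
# Addresses on the two uplink interfaces and the LAN interface
/ip address add address=192.168.16.2/24 interface=Interface1
/ip address add address=192.168.17.2/24 interface=Interface2
/ip address add address=192.168.0.1/24 interface=Interface0

# ECMP default route: one route with both gateways
/ip route add dst-address=0.0.0.0/0 gateway=192.168.16.1,192.168.17.1
```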

Each gateway works on its own, but once I configure the multi-gateway route some things stop working:
- DNS stops working (RouterOS can no longer resolve names), and telnet also fails (the /system telnet command cannot connect from the box).
- Ping and traceroute, on the other hand, still work.
- Traffic routed through the RouterOS box, but not generated by it, works.

After examining the traffic with torch and with a sniffer, I noticed that it was using the wrong IP address and interface.
That is, it was sending packets out Interface2 with src address 192.168.16.2!
Moreover, when this happens connections are not balanced: they were almost always sent through the same interface with the same (wrong) src address.

The workaround:
Masquerade either by outgoing interface or by the router's own external IP:
- option 1: out-interface=Interface1 action=masquerade, out-interface=Interface2 action=masquerade
- option 2: src-address=192.168.16.2 action=masquerade, src-address=192.168.17.2 action=masquerade
(I was masquerading only the client range, src=192.168.0.0/24 as set up by hotspot, but that is not enough.)
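
Option 1 above could be written as NAT rules along these lines. This is a sketch rather than a tested configuration: chain=srcnat is assumed because the post does not name the chain, and the interface names are those of the example scenario.

```routeros
# Option 1: masquerade everything leaving either uplink, including
# packets generated by the router itself with the "wrong" src address.
/ip firewall nat add chain=srcnat out-interface=Interface1 action=masquerade
/ip firewall nat add chain=srcnat out-interface=Interface2 action=masquerade

# Option 2 (alternative): match on the router's own external addresses instead.
# Assumes 192.168.16.2 and 192.168.17.2 from the scenario are the addresses meant.
# /ip firewall nat add chain=srcnat src-address=192.168.16.2 action=masquerade
# /ip firewall nat add chain=srcnat src-address=192.168.17.2 action=masquerade
```

With either option, a locally generated packet that leaves Interface2 carrying src address 192.168.16.2 is rewritten to the address of the interface it actually leaves through.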

The example in the reference manual might not work, depending on how you masquerade (the example does not specify it).
This has been tested on 2.9.51 and 3.14.
 
Nuke
newbie
Posts: 42
Joined: Mon Jul 31, 2006 7:35 pm
Location: South Africa

Re: ECMP bug and workaround

Wed Oct 01, 2008 6:59 pm

I also had that problem. Connections that go through the router also get TCP RST packets a lot of the time, causing winbox/ssh/telnet sessions to the router behind it to disconnect (sometimes they can't stay connected for more than 30 s). So I made sure OSPF has only one route to each destination by changing the path cost. Problem solved.
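
A minimal sketch of that kind of cost change, assuming RouterOS v3 to v6 style OSPF syntax. The interface names ether1 and ether2 and the cost values are hypothetical, since the post does not give the actual topology.

```routeros
# Hypothetical interfaces and costs: give the two links different OSPF costs
# so that only one path is installed per destination and no equal-cost
# multipath route is created.
/routing ospf interface add interface=ether1 cost=10
/routing ospf interface add interface=ether2 cost=20
```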

Trying to load-balance with Nth fails too; I'm guessing the root of the problem is the same. It works fine for a few packets, then starts dropping half the packets.
