bonding performance issues

Hi all.

I have several problems with bonding performance. The idea was to join 2 Mikrotik boxes with bonding in balance-rr mode (load balancing and fault tolerance). They will eventually be connected over a wireless link; for now I am testing with a pair of patch cords between them, so the boxes are directly connected with working cat5 patch cords.

I have box1 with 3 ethernet interfaces: box1lan1 and box1lan2, both 100mbps and configured as bonding1 with balance-rr. The other interface has a computer connected at 100mbps full duplex.
I have box2 with 3 ethernet interfaces: box2lan1 and box2lan2, both 100mbps and configured as bonding1 with balance-rr. The other interface has a computer connected at 100mbps full duplex.

The problem is that routing over the bonding interface works, but I do not get much more than 60mbps between the two computers when I run a TCP bandwidth test from one to the other. The CPU of the Mikrotik boxes is a Pentium 3 at 1000MHz, and it is not working at 100%. Also, nothing else is configured: just a couple of IP addresses for the lan and bonding1 interfaces and a static route to keep everything working. There are no lost packets either.

I would like to know if someone has had the same problem and knows the solution.
I think something is wrong with the software or the bonding implementation, because I am far from getting even half the theoretical available bandwidth, and if I look at CPU resources the Mikrotik boxes show unused CPU power.
The version running, as of today, is 2.29.37.

Thanks for your help and pardon my bad English.

Try running BT through your routers, not from router to router, since BT uses lots and lots of CPU.

I do not run the bandwidth test on the routers. I use a bandwidth utility on the computers. The bandwidth check is from computer to computer through the Mikrotik boxes. The routers' CPU does not exceed 50% at any point during the tests.

Then try doing a bandwidth test from the routers themselves; the NIC cards of your computers might be the bottleneck, especially if they’re the cheap or onboard type.

-EDIT- I just saw that you mention 60Mbps, this is usually the case if you are running in half-duplex (or worse still, mismatched duplex settings). Set all your interfaces (both on the computers and on MT) to run at 100Mbit, full duplex.

I did the same test from one computer to the other with no Mikrotik between them, to rule out the NICs and the cable, and I get 94mbps, so I do not think the computers are the problem. Also, duplex and speed are autodetected, and I think it is 100mbit full duplex.

Thanks for replying

That could be the problem. Disable autonegotiation on both Mikrotiks and the PCs, and force everything to 100mbit / full duplex.

I disabled autonegotiation on all interfaces, computers and Mikrotik; everything is 100mbps full duplex with no autonegotiation. I ran the same test again and the results are the same (near 55mbps) for a TCP connection between the two computers when using bonding on the Mikrotik boxes.

thanks again


You said the words. Bonding + TCP = low performance.
TCP doesn’t like packet reordering.
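The reordering effect is easy to see in a toy model (a sketch only, not the actual bonding driver): stripe packets round-robin over two links and give one link slightly higher latency — every other packet arrives out of sequence, which TCP's congestion control treats as loss.

```python
# Toy model: round-robin striping over links with per-link latency.
# One packet is sent per time unit; arrival time = send time + link latency.
def arrival_order(n_packets, latencies):
    """Return packet sequence numbers in the order they arrive."""
    sends = []
    for seq in range(n_packets):
        link = seq % len(latencies)                 # round-robin slave choice
        sends.append((seq + latencies[link], seq))  # (arrival time, seq)
    return [seq for _, seq in sorted(sends)]

# Equal links: packets arrive in order.
print(arrival_order(6, [1.0, 1.0]))   # [0, 1, 2, 3, 4, 5]
# One link slightly slower: every other packet is reordered.
print(arrival_order(6, [1.0, 2.5]))   # [0, 2, 1, 4, 3, 5]
```

Even a small latency difference between the slaves is enough to trigger this, which is why the problem shows up on wired links too.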

Well,
I tested with UDP too, and I do not get much more than 5mbps over the TCP results!

When I took a look at the manual, it said:
“balance-rr - round-robin load balancing. Slaves in bonding interface will transmit and receive data in sequential order. Provides load balancing and fault tolerance.” I get fault tolerance and load balancing, but the throughput is too bad for me.

If I bond two interfaces at 100mbps each, the theoretical maximum should be 200mbps, and yet, whether I use TCP or UDP, I hardly pass half the bandwidth provided by a single interface alone!

Maybe there is packet reordering and a lot of processing, but this is just a two-hop network with only two computers and nothing else, so it seems strange to me that packet reordering could reduce throughput that much.

Assuming these routers are going to pass traffic from computers connected “through” the routers you also need to test through the routers using extra computers for bandwidth testing.

By the way:

  • RR bonding will never in practice reach the aggregate bandwidth of the two links. (Packet size differences).

  • The settings you find to work now will probably work poorly when you actually put the radios in to do the work.

  • You should try these links with 10mbit half duplex ethernet connections to better simulate the conditions it will be run in.

  • Buy me a beer!

We’ve experimented with bonding quite a bit. We ran into packet retransmits quite a bit, because even though the 3 cable modems had the same QoS they did not have exactly the same speeds. If 1 of the 3 packets is out of order, you basically retransmit all 3 to get them into order again, I believe. So a 100mb + 100mb pipe is 200mb in theory, but once you add TCP on top of that you can probably only stuff 66% down it (TCP retransmit overhead).

There is a really good article here that explains this way better than I can:
http://linux-net.osdl.org/index.php/Bonding

“balance-rr - This mode is the only mode that will permit a single TCP/IP connection to stripe traffic across multiple interfaces. It is therefore the only mode that will allow a single TCP/IP stream to utilize more than one interface’s worth of throughput. This comes at a cost, however: the striping often results in peer systems receiving packets out of order, causing TCP/IP’s congestion control system to kick in, often by retransmitting segments.”

“For a four interface balance-rr bond, expect that a single TCP/IP stream will utilize no more than approximately 2.3 interface’s worth of throughput, even after adjusting tcp_reordering.”

“If you are utilizing protocols other than TCP/IP, UDP for example, and your application can tolerate out of order delivery, then this mode can allow for single stream datagram performance that scales near linearly as interfaces are added to the bond.”

I do think there is some optimizing that could be done on the MT bonding… but it looks like they took the one on the URL above and plunked it into ROS almost as is. At least it’s something huh : )

here you can also read some:
http://wiki.mikrotik.com/wiki/MUM_2006_USA/Bonding

thanks all for your help. :smiley:

I will take a look at the links and do more tests.

Well said mr ChangeIP! :slight_smile:

Tip to MT:

Since a lot of people would like to use bonding, but many are hitting out-of-order slowdowns, why not implement a buffer which reorders packets before handing them to the ‘outgoing’ interface? A counter can be added on the ‘sending’ side, and the ‘receiving’ side will use this counter to reorder the packets.

The buffer can be as small as one packet per bonded interface, since the ‘transmitting’ side will be sending the packets sequentially.
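A minimal sketch of such a reorder buffer (my own assumption of how it could work — RouterOS does not provide this): the sender stamps each packet with a counter, and the receiver holds packets until the next expected counter value arrives, then releases them in sequence.

```python
# Sketch of a receive-side reorder buffer keyed on a sender-side counter.
class ReorderBuffer:
    def __init__(self):
        self.next_seq = 0
        self.held = {}          # seq -> packet, waiting for earlier packets

    def receive(self, seq, packet):
        """Accept one packet; return the packets now deliverable in order."""
        self.held[seq] = packet
        out = []
        while self.next_seq in self.held:
            out.append(self.held.pop(self.next_seq))
            self.next_seq += 1
        return out

buf = ReorderBuffer()
print(buf.receive(0, "p0"))   # ['p0']
print(buf.receive(2, "p2"))   # []  (p1 still missing, so p2 is held)
print(buf.receive(1, "p1"))   # ['p1', 'p2']
```

A real implementation would also need a timeout so a lost packet cannot stall the buffer forever.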

Another approach that works a little better (just a little, mind you) is to do the round-robin routing manually. I’ve done this and it works very well. I won’t write the whole script for you, but here is the approach:

        |       |  -> link1 <-  |       |
LAN ->  |  MT1  |               |  MT2  |  <- Other LAN
        |       |  -> link2 <-  |       |

Forgive the ugly ascii art. :sunglasses:
SO, what you want to do in MT1 is this:

/ip route 
add gateway=MT2LINK1 routing-mark=LINK1
add gateway=MT2LINK2 routing-mark=LINK2
/ip route rule
add routing-mark=LINK1 action=lookup table=LINK1
add routing-mark=LINK2 action=lookup table=LINK2

/ip firewall mangle
add chain=forward action=mark-routing new-routing-mark=LINK1 nth=2,1 passthrough=no
add chain=forward action=mark-routing new-routing-mark=LINK2 nth=2,2 passthrough=no
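As a toy illustration of what those two nth rules accomplish (an approximation, not RouterOS’s actual matcher): odd-numbered forwarded packets get the LINK1 routing mark and even-numbered ones get LINK2, so consecutive packets alternate between the two gateways.

```python
# Approximate effect of the nth=2,1 / nth=2,2 mangle pair: mark every
# other forwarded packet for the other link.
def mark_packets(n_packets):
    marks = []
    for i in range(1, n_packets + 1):   # 1-based packet counter
        marks.append("LINK1" if i % 2 == 1 else "LINK2")
    return marks

print(mark_packets(4))   # ['LINK1', 'LINK2', 'LINK1', 'LINK2']
```

Note this still splits a single TCP stream per packet, so the reordering problem discussed above applies here too.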

On MT2, you do the same thing, but of course, the gateways are not the same. This works pretty well. The only problem with this approach is that it does not autodetect failures. To get THAT, you can do something like this (in the routes):

/ip route 
add gateway=MT2LINK1 routing-mark=LINK1 check-gateway=ping distance=0
add gateway=MT2LINK2 routing-mark=LINK1 distance=200
add gateway=MT2LINK2 routing-mark=LINK2 check-gateway=ping distance=0
add gateway=MT2LINK1 routing-mark=LINK2 distance=200
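A sketch of the failover behaviour those routes are meant to produce (assumed semantics, gateway and mark names taken from the example above): for each routing mark, the lowest-distance route whose gateway is still reachable wins, so when check-gateway declares the primary dead, the distance-200 backup via the other link takes over.

```python
# Toy model of distance-based route selection with gateway reachability.
def pick_gateway(routes, mark, reachable):
    """Return the gateway of the lowest-distance usable route, or None."""
    candidates = [r for r in routes
                  if r["mark"] == mark and r["gw"] in reachable]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["distance"])["gw"]

routes = [
    {"mark": "LINK1", "gw": "MT2LINK1", "distance": 0},
    {"mark": "LINK1", "gw": "MT2LINK2", "distance": 200},
    {"mark": "LINK2", "gw": "MT2LINK2", "distance": 0},
    {"mark": "LINK2", "gw": "MT2LINK1", "distance": 200},
]
print(pick_gateway(routes, "LINK1", {"MT2LINK1", "MT2LINK2"}))  # MT2LINK1
print(pick_gateway(routes, "LINK1", {"MT2LINK2"}))              # MT2LINK2 (failover)
```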

On various occasions, I’ve used this functionality and it does fine.