Resilient LAG and adding new config parameters to existing bond interface via CLI

Hello all,

I configured a basic LAG. I am having difficulty adding load-balancing and link-monitoring settings to the existing bond configuration.

The edit option only allows changing existing parameters, and the add option is for creating a new interface.

What is the recommended CLI command for adding parameters to the existing configuration of an interface or a bonded interface?

Thanks

Seems like I figured this out.

Using set along with the interface name works.

For my bond interface named LAG1 that had only two slave interfaces defined, something like:

interface bonding set transmit-hash-policy=layer-2-and-3 LAG1
interface bonding set link-monitoring=mii LAG1
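
To confirm that the new values were actually applied, printing the bond's full configuration should show them. This is only a small sketch, assuming the bond is still named LAG1:

```
# Show every parameter of the bond named LAG1, including the hash policy and link monitoring
/interface bonding print detail where name=LAG1
```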

I was hoping this would give me resiliency, but if I unplug either cable, pings through the LAG drop. I then changed the mode to LACP (802.3ad), and that showed the same behavior. Then I changed the link monitoring method to arp and specified the gateway IP at the other end as the target. The situation remains the same.

I have tried a static (protocol-less) LAG at both ends, as well as LACP at both ends, and failover does not happen in either case.

The only thing I have been able to make work is Active / Passive, transmit load balancing (TLB), or adaptive load balancing (which balances both directions but, as I understand it, requires both ends to have this setup). In my case of Internet access, TLB is adequate because upstream traffic will be low.
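
For reference, the two changes described above would look roughly like this. It is a sketch rather than the exact commands that were typed: the bond name LAG1 is from this thread, while 192.0.2.1 is only a placeholder for the gateway IP at the other end.

```
# Switch the bond to LACP; the switch ports at the other end must run LACP too
/interface bonding set LAG1 mode=802.3ad

# Use ARP link monitoring instead of mii
# 192.0.2.1 is a placeholder, replace it with the real gateway IP at the other end
/interface bonding set LAG1 link-monitoring=arp arp-ip-targets=192.0.2.1
```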

/interface bonding
add mode=balance-tlb name=LAG1 primary=sfp1 slaves=sfp1,sfp2 transmit-hash-policy=layer-2-and-3

With this, the switch at the other end does not need any special configuration, just two access or trunk ports matching the bond setup on the router end. This works flawlessly. My plan is to connect the router to two members of a stacked switch.

I thought I had found the correct CLI to get this simple portchannel / LAG working, but I was wrong.

  1. Previously I was testing by disabling and then enabling individual ports from within the router. I have now tested by unplugging and replugging the cables. Sometimes failover and failback take 20 to 30 dropped pings, other times only 4 to 5. And sometimes it never comes back up with only one port plugged in.

  2. Sometimes it fails back, but then I see only 1 ping in every 5 or 6.

Either 6.47.3 is buggy with respect to LAG / link aggregation, or I am doing something really wrong.

Is anyone else doing LAG on a router running newer code?

Thanks

@MT Support, can someone advise on what is going on in this case?

For the TLB load balancing mode, only mii is allowed for link monitoring. However, my SFPs only work with auto-negotiation off (which leaves the port light permanently on), full duplex set to yes, and speed set to 1Gbps. It could be that disabling auto-negotiation also disables link keepalives, so mii thinks the link is up (because the link light is on).

I just saw that 6.47.6 has been made available. I will try that version and report back.
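
The port settings described above, plus a way to see what link state RouterOS itself reports, might look like the following. This is a sketch assuming the port names sfp1 and sfp2 from the bond definition earlier in the thread:

```
# Force 1 Gbps full duplex with auto-negotiation off on both bond members
/interface ethernet set sfp1 auto-negotiation=no speed=1Gbps full-duplex=yes
/interface ethernet set sfp2 auto-negotiation=no speed=1Gbps full-duplex=yes

# Print the link status RouterOS reports for one port
# Run it once with the cable plugged in and once with it unplugged, then compare
/interface ethernet monitor sfp1 once
```

If the reported status does not change when the cable is pulled, mii monitoring has nothing to react to.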

Thanks

@MT Support and other members, I have upgraded to 6.47.6 and the issues are still the same.

When I disable and re-enable the ports, failover and failback work. But when I unplug either of the two cables, it does not work: the port remains lit and mii takes no action. ARP link monitoring is not available for TLB, so I am not sure what else to try. I had also tried LACP with ARP monitoring, using the IP address of the switch (the router's direct neighbor) as the target, and that does not work either, as if ARP monitoring were also broken.
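
One way to compare the two test methods is to look at which members the bond itself considers active after each one. A minimal sketch, assuming the bonding monitor command is available on this RouterOS version and the bond is named LAG1:

```
# Show the bond's current state, including which member ports it treats as active or inactive
# Run it after disabling a port, and again after unplugging a cable
/interface bonding monitor LAG1 once
```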

Thanks

Can someone in support please look into this? Should I conclude that bonding will not work over SFPs? Is it limited to built-in copper ports, so that models with only SFP / SFP+ ports cannot use bonding?

Thanks

To get a response from support, you have to open a support case, via an e-mail to support@mikrotik.com or, better, using the web interface at help.mikrotik.com.

Other than that - if I remember right, in LACP mode, the detection of link failure should not depend on the physical state of the link as the LACP PDUs should be informing the peer about protocol state on the links, not just about their physical fitness. But I may be wrong here. You can check by sniffing on the interfaces and using Wireshark to visualise the contents of the PDUs.
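
A capture for that check could be taken on the router itself along these lines. This is only a sketch: sfp1 is the bond member from earlier in the thread, and the file name is arbitrary.

```
# Capture traffic on one bond member into a file on the router
/tool sniffer set filter-interface=sfp1 file-name=lacp-sfp1.pcap
/tool sniffer start

# Unplug and replug the cable, wait a little, then stop the capture
/tool sniffer stop
```

The resulting file can be downloaded from the router and opened in Wireshark, where the display filter lacp should narrow the view to the LACP PDUs.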

It’s still strange that setting autonegotiation to off prevents the Loss of Signal from being reported by the SFPs. Or maybe they send it but RouterOS ignores it?

Thank you so much @sindy for your help here. I was away and did not get a chance to log in. I now know the process for getting support to look at an issue, so I will keep that in mind.

You are quite right that LACP is deliberately preferred over protocol-less port channels by Cisco and other vendors, precisely because on a fiber link one strand can be damaged without the other side noticing, and the LACP PDUs are then what reports the link failure. But I have tested this many times, with copper as well as fiber, and I find that to bring up any SFP I need to turn auto-negotiation off, which visibly forces the port up. It is just like forcing an interface up on Cisco with the no keepalive command under the interface: nothing is plugged in, yet the IP address on that port can be pinged from anywhere. So something similar seems to be happening here, and it prevents the port channel on RouterOS from detecting a link-down condition. Isn't it strange that no one else has seen this and reported it on the forum?

Appreciate again your help and support.