Hello, I have a WORKING MLAG configuration with four CRS326-24S+2Q+ switches running ROS 7.6, which I will call SW1, SW2, SW3 and SW4.
SW1/SW2 are in MLAG with each other via the 40 Gbps port.
SW3/SW4 are in MLAG with each other via the 40 Gbps port.
SW1 does LACP with SW3/SW4 via two SFP ports, one going to SW3 and the other going to SW4.
The same goes for SW2.
In practice, each pair of switches forms an MLAG, and the two MLAG pairs are then bonded to each other with LACP.
The problem:
With ROS 7.9 nothing works anymore: the MAC addresses in the ARP table keep switching between the MLAG port and the LACP one, which drives the network completely crazy.
So I downgraded to 7.7, which I see has several fixes for BRIDGE/RSTP etc. The problem also occurs with 7.7, so I think the malfunction starts with this version.
As soon as I downgrade to 7.6, the network starts working fine again.
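For anyone who wants to quantify the flapping described above, here is a minimal Python sketch. It assumes you have already collected (MAC, port) pairs in the order they were seen, for example by polling the host table. The function name, the `min_moves` threshold, the MAC addresses and the port names are all made up for illustration and are not taken from RouterOS.

```python
from collections import defaultdict

def find_flapping_macs(observations, min_moves=3):
    """Return {mac: moves} for MACs whose learned port changed at least min_moves times.

    observations: iterable of (mac, port) tuples in the order they were seen.
    """
    last_port = {}
    moves = defaultdict(int)
    for mac, port in observations:
        # Count a move only when the MAC shows up on a different port than last time.
        if mac in last_port and last_port[mac] != port:
            moves[mac] += 1
        last_port[mac] = port
    return {mac: n for mac, n in moves.items() if n >= min_moves}

# Hypothetical samples: MACs and port names are invented for this example.
samples = [
    ("AA:BB:CC:00:00:01", "peer-link"),
    ("AA:BB:CC:00:00:01", "bond-uplink"),
    ("AA:BB:CC:00:00:02", "bond-uplink"),
    ("AA:BB:CC:00:00:01", "peer-link"),
    ("AA:BB:CC:00:00:01", "bond-uplink"),
    ("AA:BB:CC:00:00:02", "bond-uplink"),
]
print(find_flapping_macs(samples))  # {'AA:BB:CC:00:00:01': 3}
```

A host that stays on one port never appears in the result, so on a healthy network the output should be empty or close to it.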
I have the same issue. Very annoying on a mission-critical network when they give you 30 minutes of downtime.
I noticed with 7.9, when I did a show lacp neighbour on a connected Cisco switch, it was showing the port address of one of the MTs for both neighbours, not the bond address. When I downgraded to 7.6, it showed the bond address of one of the MTs. The MT bond interfaces have different MACs, so I assume the one advertised is effectively virtualised across both MTs?
The problem starts with v7.7. If you look at its changelog, it contains many entries related to the bridge:
*) bridge - added support for static MDB entries;
*) bridge - disallow port-controller while the bridge has MSTP enabled;
*) bridge - fixed “edge=yes” setting for MSTP;
*) bridge - fixed MSTP compatibility with STP;
*) bridge - fixed R/M/STP bridge identifier on protocol-mode change;
*) bridge - fixed RSTP BCP with bridged PPP interfaces;
*) bridge - fixed STP blocking state on port-controller;
*) bridge - fixed host moving with fast-path;
*) bridge - fixed incorrect root port blocking for MSTP;
*) bridge - fixed master port conversion;
*) bridge - fixed mst-override port priority for MSTP;
*) bridge - fixed port priority for STP and RSTP;
*) bridge - improved port-controller system stability;
*) bridge - improved system stability when using MSTP and many VLAN mappings;
*) bridge - removed “age” monitoring property from the host table;
I have performed extensive tests in the laboratory and I confirm that if the configuration is MLAG on 2 switches everything works correctly even on version 7.9 (7.7 and 7.
If MLAG is done between 4 switches (as in the diagram), it stops working properly from version 7.7 onwards (including 7.8 and 7.9).
The 7.7 changelog lists several changes to the bridge.
Hi, MikroTik Support has reproduced and confirmed the bug:
Hello,
Thank you for the report!
We have managed to reproduce the issue locally in our labs and look forward to fixing it on upcoming RouterOS versions, unfortunately, I cannot provide a release date now.
Unfortunately, I cannot suggest any known workarounds.
Best regards,
Edgars P.
I haven't labbed your diagram yet, but I have some thoughts from a basic STP point of view.
Looking at your diagram purely in terms of basic STP operation, leaving the LACP/MLAG operation aside: each of SW1, SW2, SW3 and SW4 has 3 interfaces, and basic STP will allow each switch to have only 1 of those ports active.
Am I correct?
OK, now let's look at it from the LACP/bonding/MLAG point of view. Their operation has to modify that basic STP behaviour in some way, so that the bonded ports can be active at the same time (e.g. rr, master-slave, etc.).
Given the above assumption, did you see any port that still seemed to be in a disabled state in your lab/diagram?
If yes, then I would say the problem might be in the M/R/STP software?
If not, then maybe we need to adapt/adjust this setup to a working MT diagram?
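To make the "basic STP view" above concrete, here is a rough Python sketch that treats every cable from the first post as an independent link and builds a loop-free tree with a breadth-first search. This is a simplification and not how RouterOS or R/M/STP elects ports: the root choice (SW1) is an assumption, path costs and priorities are ignored, and the link list is only my reading of the description in the first post.

```python
from collections import deque

# Physical links as described in the first post (port names omitted).
LINKS = [
    ("SW1", "SW2"),  # MLAG peer link, 40 Gbps
    ("SW3", "SW4"),  # MLAG peer link, 40 Gbps
    ("SW1", "SW3"), ("SW1", "SW4"),
    ("SW2", "SW3"), ("SW2", "SW4"),
]

def spanning_tree(links, root):
    """Return (tree_links, redundant_links) using a breadth-first search from root."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    visited = {root}
    queue = deque([root])
    tree = set()
    while queue:
        node = queue.popleft()
        for peer in neighbours[node]:
            if peer not in visited:
                visited.add(peer)
                tree.add(frozenset((node, peer)))
                queue.append(peer)
    # Any link not in the tree would close a loop if it forwarded independently.
    redundant = [link for link in links if frozenset(link) not in tree]
    return tree, redundant

tree, redundant = spanning_tree(LINKS, root="SW1")
print(len(tree), "links in the tree,", len(redundant), "links redundant")
```

With 4 switches and 6 links, any loop-free tree keeps 3 links, so 3 links are redundant when each cable is seen on its own. In this simplified model the assumed root SW1 keeps all three of its links, so the result is not exactly one active port per switch. Bonding and MLAG are meant to present several of these cables as one logical link so that they can forward together.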
Good thing they confirmed it. I was going crazy.
I will install version 7.6 to see if it is stable on my network, and if new instabilities appear, I will remove MLAG until it is stable. I already knew I had to test it in my lab first… but they are always in a “rush”.
And I will wait for them to confirm that the problem is resolved in a later version before updating ANYTHING.
Hi everyone!
We have configured a similar HA setup: two pairs of MLAG switches.
But we connected them with ONE LACP bond of 4 links. Generally it works well, and guests connected to the switches work as expected.
But when we try to connect to a switch's management IP across this 4-link bond, the connection lags/freezes (without hanging completely).
As far as we can see, packets to the destination host constantly jump from the ICCP interlink to the bond and back again.
factory-firmware: 7.8
current-firmware: 7.10
upgrade-firmware: 7.6 <---------------- It was upgraded to the latest version, 7.10, but after that I downgraded it to 7.6
Hi connectlife, one quick question: after a reboot of one of the switches, is the traffic balanced normally between them? Or does just one carry the traffic while the rebooted one acts as backup?