Mikrotik MLAG HOWTO and Questions

I have 2 brand new, unconfigured CRS326-24S+2Q+RM switches running v7.9.1 stable and I have been trying to get MLAG to work. I followed the guide here:

but the resulting configuration was very unstable. Running a constant ping between 2 hosts, I tried disconnecting and reconnecting random ports and it would work sometimes and sometimes not.
Eventually, I was able to figure out a configuration that appears to be working 100%. The key was to create a UNIQUE MLAG ID for each client bond, instead of using the same one for all bonds. Using this configuration, I am able to plug and unplug any hosts to/from any ports and have full redundancy -- but I have 2 questions:

  1. Is this configuration correct?
  2. In the current configuration, I am using port qsfpplus1-1 as the uplink/peer port (40Gbps). What I would really like to do is create a LACP bond using qsfpplus1-1 and qsfpplus2-1 and use that bond as the peer uplink. However, if I do this, the MLAG does not work any more.

My working config is below. Can someone confirm that this is correct, and can anyone explain why it won't work if I LACP my peer ports?

For reference, the hosts are Proxmox (Linux) with 2x ports in LACP.

Thanks

On Switch 1 and 2 we execute:

/interface bonding

add mode=802.3ad name=uplink_80g slaves=qsfpplus1-1,qsfpplus2-1

/interface bonding
add mlag-id=1 mode=802.3ad name=client-bond1 slaves=sfp-sfpplus1
add mlag-id=2 mode=802.3ad name=client-bond2 slaves=sfp-sfpplus2
add mlag-id=3 mode=802.3ad name=client-bond3 slaves=sfp-sfpplus3
add mlag-id=4 mode=802.3ad name=client-bond4 slaves=sfp-sfpplus4
add mlag-id=5 mode=802.3ad name=client-bond5 slaves=sfp-sfpplus5
add mlag-id=6 mode=802.3ad name=client-bond6 slaves=sfp-sfpplus6
add mlag-id=7 mode=802.3ad name=client-bond7 slaves=sfp-sfpplus7
add mlag-id=8 mode=802.3ad name=client-bond8 slaves=sfp-sfpplus8
add mlag-id=9 mode=802.3ad name=client-bond9 slaves=sfp-sfpplus9
add mlag-id=10 mode=802.3ad name=client-bond10 slaves=sfp-sfpplus10
add mlag-id=11 mode=802.3ad name=client-bond11 slaves=sfp-sfpplus11
add mlag-id=12 mode=802.3ad name=client-bond12 slaves=sfp-sfpplus12
add mlag-id=13 mode=802.3ad name=client-bond13 slaves=sfp-sfpplus13
add mlag-id=14 mode=802.3ad name=client-bond14 slaves=sfp-sfpplus14
add mlag-id=15 mode=802.3ad name=client-bond15 slaves=sfp-sfpplus15
add mlag-id=16 mode=802.3ad name=client-bond16 slaves=sfp-sfpplus16
add mlag-id=17 mode=802.3ad name=client-bond17 slaves=sfp-sfpplus17
add mlag-id=18 mode=802.3ad name=client-bond18 slaves=sfp-sfpplus18
add mlag-id=19 mode=802.3ad name=client-bond19 slaves=sfp-sfpplus19
add mlag-id=20 mode=802.3ad name=client-bond20 slaves=sfp-sfpplus20
add mlag-id=21 mode=802.3ad name=client-bond21 slaves=sfp-sfpplus21
add mlag-id=22 mode=802.3ad name=client-bond22 slaves=sfp-sfpplus22
add mlag-id=23 mode=802.3ad name=client-bond23 slaves=sfp-sfpplus23
add mlag-id=24 mode=802.3ad name=client-bond24 slaves=sfp-sfpplus24


/interface bridge
add name=bridge1 vlan-filtering=yes
/interface bridge port
add bridge=bridge1 interface=qsfpplus1-1 pvid=99 # works

add bridge=bridge1 interface=uplink_80g pvid=99 ## does not work

add bridge=bridge1 interface=client-bond1
add bridge=bridge1 interface=client-bond2
add bridge=bridge1 interface=client-bond3
add bridge=bridge1 interface=client-bond4
add bridge=bridge1 interface=client-bond5
add bridge=bridge1 interface=client-bond6
add bridge=bridge1 interface=client-bond7
add bridge=bridge1 interface=client-bond8
add bridge=bridge1 interface=client-bond9
add bridge=bridge1 interface=client-bond10
add bridge=bridge1 interface=client-bond11
add bridge=bridge1 interface=client-bond12
add bridge=bridge1 interface=client-bond13
add bridge=bridge1 interface=client-bond14
add bridge=bridge1 interface=client-bond15
add bridge=bridge1 interface=client-bond16
add bridge=bridge1 interface=client-bond17
add bridge=bridge1 interface=client-bond18
add bridge=bridge1 interface=client-bond19
add bridge=bridge1 interface=client-bond20
add bridge=bridge1 interface=client-bond21
add bridge=bridge1 interface=client-bond22
add bridge=bridge1 interface=client-bond23
add bridge=bridge1 interface=client-bond24

/interface bridge vlan
add bridge=bridge1 tagged=qsfpplus1-1 vlan-ids=1 # works

add bridge=bridge1 tagged=uplink_80g vlan-ids=1 ## does not work

/interface bridge mlag
set bridge=bridge1 peer-port=qsfpplus1-1 # works

set bridge=bridge1 peer-port=uplink_80g ## does not work
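
For anyone copying this, the 24 near-identical bonding and bridge-port lines above could probably be generated with a short loop instead of being typed out. This is only a sketch, not something I have verified on a CRS326: it assumes bridge1 already exists, that the ports still have their default names sfp-sfpplus1 to sfp-sfpplus24, and that every port should get its own bond whose MLAG ID equals the port number.

```routeros
# Sketch only: create client-bond1..client-bond24, one bond per SFP+ port,
# each with its own unique MLAG ID, and add each bond to the bridge.
# Assumes bridge1 exists and the ports use their default names.
:for i from=1 to=24 do={
    /interface bonding add mode=802.3ad mlag-id=$i name=("client-bond" . $i) slaves=("sfp-sfpplus" . $i)
    /interface bridge port add bridge=bridge1 interface=("client-bond" . $i)
}
```

The same loop would have to be run on both switches so that the MLAG IDs match on each side.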

hello phil,

Were you able to solve the problem with the 2x40Gb LACP peer link?

I tried the same thing and can't get it working either.

thanks for your help.

pete

Running 7.14.2 and still unable to use a bonding interface as the ICCP/peer-port link.

I got it to work (at least it seems to be working!). You have to change the MLAG priority on one of the peers so that the two are not the same. That makes one of them the primary.

Steps:

  1. Set both switches to a blank config. I connect to the first copper Ethernet port on both using Winbox to maintain a connection while setting the switches up.

  2. If you want more than 1/10/40Gbps between the switches, create a bond interface using whichever two, three, four, etc. physical interfaces you wish to be bonded. Make them identical on both sides: 802.3ad, LACP timing 1s (30s would be fine but I use 1s), and Layer 3 & 4 for hashing (better load balancing that way).

  3. Create a bridge, then enable MLAG on both switches, using your chosen peer port: typically the fastest physical port on the switch, or an LACP bond (created in Step 2).

  4. Add the peer port to the bridge, ensuring that the peer port’s PVID is NOT VLAN 1 (anything but 1 is fine). Make sure the peer port is set to allow all frame types (tagged and untagged).

  5. Tag VLAN 1 to the peer port on both sides to ensure untagged traffic flows properly. Do NOT tag the VLAN you selected as the peer port PVID to anything else, ever.

  6. Make sure the STP settings on both bridges are identical.

  7. Add your other ports to the bridge as you need them, being sure to tag any VLANs to the necessary physical ports and ALWAYS to the peer port.

  8. If you are adding LAGs to the MLAG stack (which is kind of the point), then, as the OP found out, the MLAG ID must be UNIQUE per LAG client. You can have as many links in a LAG as you want, as long as they all have the same ID, but different clients MUST have different IDs.

  • I personally found issues with more than two links (one per switch) on versions between 7.15.3 and 7.19.x, so use 7.15.3 or 7.19.3 for maximum stability. I have some devices with four links (two per switch) in some of my MLAG stacks and it works fine on those versions (7.15 or 7.19).
  • You don’t have to have two links per downstream device. One will work, but you won’t have any redundancy to that device if the particular switch it’s connected to goes down. Yet, for 99% of what most of us do in home labs, that’s fine.
  • These are switches. Don’t try to do any fancy routing or additional CPU-bound work (unless you’re doing MLAG with a pair of 2116/2216 devices, but even then, it can make things complicated; and L3HW offload is not supported with MLAG anyway). It is likely to cause confusion and violates the concept of separation of functions. Use routers for routing and switches for switching at this point, even in a home lab.
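
As a rough illustration of steps 2 to 5 and 8 with a bonded peer link, something like the following should go on both switches. Treat it as a sketch rather than a tested config: the port names are borrowed from the first post, while the names mlag-peer and bridge1 and the peer PVID of 99 are just placeholders.

```routeros
# Step 2: bond two QSFP+ ports for the peer link (identical on both switches)
/interface bonding
add name=mlag-peer mode=802.3ad lacp-rate=1sec transmit-hash-policy=layer-3-and-4 slaves=qsfpplus1-1,qsfpplus2-1

# Step 3: create the bridge and enable MLAG with the bond as the peer port
/interface bridge
add name=bridge1 vlan-filtering=yes
/interface bridge mlag
set bridge=bridge1 peer-port=mlag-peer

# Step 4: add the peer port with a PVID other than 1, admitting tagged and untagged frames
/interface bridge port
add bridge=bridge1 interface=mlag-peer pvid=99 frame-types=admit-all

# Step 5: tag VLAN 1 on the peer port; VLAN 99 is not tagged anywhere
/interface bridge vlan
add bridge=bridge1 vlan-ids=1 tagged=mlag-peer

# Step 8: one client LAG with its own MLAG ID (the next client would use mlag-id=2)
/interface bonding
add name=client-bond1 mode=802.3ad mlag-id=1 slaves=sfp-sfpplus1
/interface bridge port
add bridge=bridge1 interface=client-bond1
```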

After more testing, I don’t think the MLAG priority really matters, as it seems to work fine when both are the same.
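
If you want to experiment with the priority anyway, this is roughly how to set it and see which switch ended up as primary. It is a sketch based on my understanding that a lower priority value is preferred and that the default is 128; double-check both against the documentation for your RouterOS version.

```routeros
# On the switch you want as primary: lower the priority (assumption: lower value wins)
/interface bridge mlag set priority=50

# On the other switch: leave the priority at its default

# On either switch: show the peering status and the active role
/interface bridge mlag monitor
```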

Could you explain #7 a bit more? I want to be sure I did not misunderstand this step. Maybe an example?

This is an example from an MLAG pair of switches with both QSFP+ and SFP+ ports.

For the most part, the config would be identical on both switches in the stack, except for devices that have only one link into the stack.

# The bond config could/should be mostly identical
# MLAG-ID's are the way the two switches determine which connections belong to which devices
# I use 1sec LACP rate and L3-4 for the hash; defaults are 30sec and Layer 2
/interface bonding
add lacp-rate=1sec mlag-id=101 mode=802.3ad name=bond-1-router slaves=sfp-sfpplus1-router transmit-hash-policy=layer-3-and-4
add lacp-rate=1sec mlag-id=102 mode=802.3ad name=bond-2-server slaves=sfp-sfpplus2-server transmit-hash-policy=layer-3-and-4

Step 7 from above:

Add switch ports to the bridge/MLAG

/interface bridge port
# PVID has to be the MLAG peer VLAN
# Can be any valid VLAN ID; I don't use 2-10 anywhere on my networks
add bridge=bridge interface=qsfpplus2-mlag-peer pvid=2
# Default PVID is 1 if you don't specify it
add bridge=bridge interface=bond-1-router
add bridge=bridge interface=bond-2-server

# These two are only on Switch 1:
add bridge=bridge interface=sfp-sfpplus3-lone-device
# For this machine we don't want to allow the native VLAN
add bridge=bridge interface=sfp-sfpplus8-macpro-esxi frame-types=admit-only-vlan-tagged

Tag the VLANs to their respective ports

/interface bridge vlan
# VLAN 1: untagged on whatever ports you want to be part of the "native" VLAN,
# and tagged across the MLAG peer link
add bridge=bridge vlan-ids=1 tagged=qsfpplus2-mlag-peer untagged=bridge,bond-1-router,bond-2-server,sfp-sfpplus3-lone-device

# VLAN 2: MLAG peer VLAN, untagged on the peer link and not tagged to anything else

add bridge=bridge vlan-ids=2 untagged=qsfpplus2-mlag-peer

# Example VLANs: 981-985 need to go between a Mac Pro running ESXi and a Linux server with a LAG into the stack

# On Switch 1: Mac Pro ESXi has only one port, which is connected to this switch;
# tag its port, the peer port, and the other server's port
add bridge=bridge vlan-ids=981-985 tagged=qsfpplus2-mlag-peer,bond-2-server,sfp-sfpplus8-macpro-esxi

# On Switch 2: Only the server has a link to both switches, so just tag the peer and the server
add bridge=bridge vlan-ids=981-985 tagged=qsfpplus2-mlag-peer,bond-2-server

I think I am following your example. Here is the config from one of the switches. Does this look correct? Am I missing anything? It seems to work. I have 5 servers connected to 2 CRS520-4XS-16Q-RM 100 Gb switches. I don’t have any VLANs other than the PVID of 99 on the ICCP link.

/interface bridge
add name=bridge vlan-filtering=yes

/interface ethernet
#SERVER PORTS
set [ find default-name=qsfp28-1-1 ] fec-mode=fec91 l2mtu=9000 mtu=9000
set [ find default-name=qsfp28-2-1 ] fec-mode=fec91 l2mtu=9000 mtu=9000
set [ find default-name=qsfp28-3-1 ] fec-mode=fec91 l2mtu=9000 mtu=9000
set [ find default-name=qsfp28-4-1 ] fec-mode=fec91 l2mtu=9000 mtu=9000
set [ find default-name=qsfp28-5-1 ] fec-mode=fec91 l2mtu=9000 mtu=9000

#ICCP PORTS
set [ find default-name=qsfp28-15-1 ] fec-mode=fec91 l2mtu=9000 mtu=9000
set [ find default-name=qsfp28-16-1 ] fec-mode=fec91 l2mtu=9000 mtu=9000

/interface bonding
#ICCP
add lacp-rate=1sec mode=802.3ad mtu=9000 name=ICCP slaves=qsfp28-16-1,qsfp28-15-1 transmit-hash-policy=layer-3-and-4

#SERVERS
add lacp-rate=1sec mlag-id=1 mode=802.3ad mtu=9000 name=SERVER1 slaves=qsfp28-1-1 transmit-hash-policy=layer-3-and-4
add lacp-rate=1sec mlag-id=2 mode=802.3ad mtu=9000 name=SERVER2 slaves=qsfp28-2-1 transmit-hash-policy=layer-3-and-4
add lacp-rate=1sec mlag-id=3 mode=802.3ad mtu=9000 name=SERVER3 slaves=qsfp28-3-1 transmit-hash-policy=layer-3-and-4
add lacp-rate=1sec mlag-id=4 mode=802.3ad mtu=9000 name=SERVER4 slaves=qsfp28-4-1 transmit-hash-policy=layer-3-and-4
add lacp-rate=1sec mlag-id=5 mode=802.3ad mtu=9000 name=SERVER5 slaves=qsfp28-5-1 transmit-hash-policy=layer-3-and-4

/interface bridge mlag
set bridge=bridge peer-port=ICCP priority=50

/interface bridge port
add bridge=bridge interface=ICCP pvid=99
add bridge=bridge interface=SERVER1
add bridge=bridge interface=SERVER2
add bridge=bridge interface=SERVER3
add bridge=bridge interface=SERVER4
add bridge=bridge interface=SERVER5

/interface bridge vlan
add bridge=bridge tagged=ICCP vlan-ids=1

That looks right to me. Compared it really quickly with a new MLAG I just spun up and it matches (in concept anyway).

I think the CRS devices with Marvell Prestera switch chips all have an L2MTU limit of 10218 (you can check by running /interface print and looking at the MAX-L2MTU column), and according to MikroTik’s video on L2MTU there is probably no difference in hardware resource usage between l2mtu=9000 and l2mtu=10218 (the same amount of buffer memory will be used).

However, if you limit the L2MTU to 9000 on the qsfp28 ports like in your current config, and then have VLANs on them, then those VLANs won’t be able to reach an MTU of 9000. It might be better if you set l2mtu=10218 for the ports under /interface ethernet.

The switches will not let me set the mtu that high…the maximum they will let me go is 9574.

The point is to set them as high as the hardware will allow so that 9000-byte packets will traverse the MLAG stack. If you put in a value higher than the hardware will allow, it will auto-adjust it to the maximum for you. I usually put in 12000, which pretty much covers anything MikroTik makes.
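
Something along these lines should do it. This is only a sketch: the port names come from the config posted above, and 9570 is a placeholder for whatever ceiling your own hardware reports in the MAX-L2MTU column.

```routeros
# List interfaces; the L2MTU and MAX-L2MTU columns show the current and highest allowed values
/interface print

# Placeholder value: replace 9570 with the MAX-L2MTU reported for your ports
/interface ethernet
set [ find default-name=qsfp28-1-1 ] l2mtu=9570 mtu=9000
set [ find default-name=qsfp28-15-1 ] l2mtu=9570 mtu=9000
```

Repeat the set line for every server port and every port in the peer bond, so that a 9000-byte packet plus a VLAN tag fits on each hop.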

Ok I see… I set them all for 9570 which is the max it allowed.