saarlan
just joined
Topic Author
Posts: 1
Joined: Thu Oct 06, 2022 9:42 pm

CRS326 LACP broadcast storm

Thu Oct 06, 2022 9:45 pm

Dears!

Twice a year, our club hosts an event for PC gaming enthusiasts. We provide the network infrastructure for 160-200 users for a whole weekend.

In our setup, we use a bunch of CSS326 with SwOS as the access switches, two CRS326-24S+2Q+RM as aggregation switches (with SwOS for our latest event, now with ROS) and a stacked Aruba 2930F as a core switch.

We use VLANs to segment the network for our different services (switch management, servers, wifi access points, devices, users, ...). The networks are routed through the Aruba switch. However, due to the nature of our event, we decided to put all users together in a single VLAN with a /23 netmask. This has the advantage that the users "see" each other in the server lists of their games and can host their own games on the LAN, because they are all part of the same broadcast domain. It also means that the user VLAN (VLAN 11) carries a lot of broadcast traffic.
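
For a sense of scale, here is a minimal Python sketch of how large a /23 broadcast domain is compared with the 160-200 users mentioned above. The prefix 10.11.0.0/23 is a made-up example; the post only states the netmask length, not the actual addresses.

```python
import ipaddress

# Hypothetical prefix: the post only says VLAN 11 uses a /23 netmask.
user_vlan = ipaddress.ip_network("10.11.0.0/23")

# Usable hosts = all addresses minus the network and broadcast address.
usable_hosts = user_vlan.num_addresses - 2

# Attendance figures taken from the post.
expected_users = (160, 200)

print(f"{user_vlan}: {usable_hosts} usable hosts, "
      f"broadcast address {user_vlan.broadcast_address}")
for users in expected_users:
    print(f"{users} users use {users / usable_hosts:.0%} of the address space")
```

Every frame sent to the broadcast address is flooded to all ports in VLAN 11, including both members of every LACP bond, which is why a forwarding problem on bonded links shows up here first.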

The CSS326 switches are configured with all copper ports untagged in the user VLAN (VLAN 11). The two SFP+ ports on each switch are LACP bonded and provide the uplink to the CRS326 SFP+ aggregation switches, which in turn have a 2x10 Gbit/s LACP bonded uplink to the Aruba switch.

With SwOS software versions 2.4 - 2.13 we experienced a serious network outage as soon as a few users were on the network. The Aruba shut down the uplink interfaces due to a broadcast storm. The only way to restore the network was to break the LACP bonding by disabling one of the interfaces on every switch and running the uplink over a single cable.
We experimented with Flow Control on and off on the CSS switches, but the results were the same. Since TX Flow Control is enabled by default on the CSS switches, we ended up leaving it on. As long as the LACP bonding was up and a few users were using the network, all of the switches eventually stopped working. We double-checked all of the cables but found no faults.

We discovered that downgrading the SwOS versions to version 2.3 allowed us to use the LACP bonding uplinks again. With this software version, we were able to utilize the switches to their full potential without any further issues.

There seems to be a software bug introduced in SwOS with version 2.4 which is still present in the later versions and even in the current version 2.13. We believe that there is a broadcast storm happening as soon as LACP bonding is used.

Can you give us information on what the problem with the newer software versions could be? We are happy to provide further diagnosis and investigation to help fix the issue, if you could tell us what data you need.

Cheers,
the IT team of SaarLAN e.V.
 
lanel
just joined
Posts: 1
Joined: Tue Mar 07, 2023 10:35 am

Re: CRS326 LACP broadcast storm

Tue Mar 07, 2023 10:43 am

Is there any update on this? We are experiencing exactly the same problem with the current latest firmware version, 2.13. DHCP broadcast packets cause a packet storm when 10G links are bonded into LACP trunks. It happens only with DHCP broadcasts; other types of broadcast or multicast traffic, such as ARP, are handled correctly.
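
For anyone who wants to check the same thing in their own capture, below is a rough Python sketch that sorts raw Ethernet frames into DHCP, ARP and other broadcasts. It is only an illustration: the function name `classify_broadcast` is made up, it handles IPv4 with at most one 802.1Q tag, and it assumes you already have the frame bytes from some capture tool.

```python
import struct

BROADCAST_MAC = b"\xff" * 6


def classify_broadcast(frame: bytes) -> str:
    """Return 'dhcp', 'arp', 'other-broadcast' or 'not-broadcast'."""
    # Only frames addressed to ff:ff:ff:ff:ff:ff are of interest.
    if len(frame) < 14 or frame[0:6] != BROADCAST_MAC:
        return "not-broadcast"

    ethertype = struct.unpack("!H", frame[12:14])[0]
    offset = 14

    # Skip a single 802.1Q VLAN tag if present (TPID 0x8100).
    if ethertype == 0x8100:
        if len(frame) < 18:
            return "other-broadcast"
        ethertype = struct.unpack("!H", frame[16:18])[0]
        offset = 18

    if ethertype == 0x0806:
        return "arp"

    # IPv4 + UDP between ports 67 and 68 is treated as DHCP.
    if ethertype == 0x0800 and len(frame) >= offset + 20:
        header_len = (frame[offset] & 0x0F) * 4
        protocol = frame[offset + 9]
        udp_start = offset + header_len
        if protocol == 17 and len(frame) >= udp_start + 4:
            sport, dport = struct.unpack("!HH", frame[udp_start:udp_start + 4])
            if {sport, dport} == {67, 68}:
                return "dhcp"

    return "other-broadcast"
```

Counting the results per second on each bond member should show whether only the "dhcp" bucket grows when the second link of the trunk is enabled.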

The setup can be reduced to two MikroTik CRS317-1G-16+ switches running SwOS 2.13 with two 10G SFP+ modules in each. Even such a basic configuration does not work.

Downgrading to an old version of SwOS helps, so it seems the problem could be fixed with a firmware update...
 
mjezierski
newbie
Posts: 36
Joined: Mon Jul 01, 2019 3:50 pm
Location: Racing Capital of the World
Contact:

Re: CRS326 LACP broadcast storm

Mon Mar 13, 2023 7:27 pm

Watching ....

I have a CRS317 (SwOS 2.13) with a twin-10G LAG going to a CRS328 (SwOS 2.14) but the particular VLAN that is carried on the LAG has no DHCP server enabled.
