The main switch in my rack is a CRS317-1G-16S+.
I also have a CSS326-24G-2S+ in that same rack, for gigabit copper stuff.
I wanted to link the CSS326-24G-2S+ to the CRS317-1G-16S+ using both of its SFP+ ports in an LACP configuration, in order to get almost non-blocking performance on the gigabit ports, but I am having a little trouble.
Back on my old HP ProCurves this was a manual process: tell the switch which ports to group together into what ProCurve called a "Trunk" (which was ambiguous, as Cisco uses the same term for a link carrying multiple VLANs), tell the switch which link aggregation mode to use, and then connect it to a similarly manually configured device on the other end.
Mikrotik's SwOS seems to automate things a little more....
From the Wiki:
Mode (default: passive)
    Specify LACP packet exchange mode or Static LAG mode on ports:
    Passive: Place port in listening state, use LACP only when it's contrary port uses active LACP mode
    Active: Prefer to start LACP regardless contrary port mode
    Static: Set port in a Static LAG mode
Group
    Specify a Static LAG group
Trunk (read only)
    Represents group number port belongs to.
Partner (read only)
    Represents partner mac-address.
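To make sure I am reading those mode descriptions correctly, here is how I understand the possible combinations, written out as a small Python sketch. This is only my interpretation of the wiki text, not anything taken from SwOS itself: the function name `link_result` is made up, and the "mismatch" outcome for a static port facing an LACP port is my assumption, since the wiki does not say what happens in that case.

```python
# Hypothetical model of the SwOS "Mode" setting as I read the wiki text.
# Not Mikrotik code; link_result is a name invented for illustration.
def link_result(local_mode: str, partner_mode: str) -> str:
    modes = {"passive", "active", "static"}
    if local_mode not in modes or partner_mode not in modes:
        raise ValueError("mode must be passive, active or static")
    # Both ends static: ports are grouped by hand, no LACP packets involved.
    if local_mode == "static" and partner_mode == "static":
        return "static LAG"
    # Assumption: one static end and one LACP end do not agree on a group.
    if "static" in (local_mode, partner_mode):
        return "mismatch"
    # At least one active end starts the LACP exchange.
    if "active" in (local_mode, partner_mode):
        return "LACP negotiated"
    # Two passive ends both wait for the other side to start.
    return "no LACP"


for pair in [("active", "active"), ("active", "passive"),
             ("passive", "passive"), ("static", "static")]:
    print(pair, "->", link_result(*pair))
```

If that reading is right, active on both ends (what I ended up using) should always negotiate, and passive on both ends would never form a group.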
The only way to manually select which ports are members of the LAG seems to be to select "static" mode; otherwise the Group column cannot be populated. My gut was to use this method, as I usually don't trust automated things, but the manual is ambiguous about whether this results in true link aggregation that provides extra bandwidth, or whether it is just failover.
Because of this I used the Active/Passive mode. I selected active on two SFP+ ports on both sides (CRS317 and CSS326), plugged in two short 1 ft DAC cables (Molex-branded), and to my astonishment it just worked. Both switches correctly auto-identified that they were in link-aggregated mode, each with the correct partner port.
I was pretty impressed, but that only lasted for 3-4 days.
Suddenly I had no connectivity across the switches. Troubleshooting ensued (at first I thought it was my pfSense router, but it checked out).
Finally, I figured out that it was being caused by my beautiful automated LAG. Somehow it had forgotten that its ports were part of a LAG, and the resulting loop was causing all sorts of problems network-wide. Nothing obvious occurred to trigger this, and no other configuration changes had been made.
A few questions:
1.) Did I do something wrong in configuring this? It seems possible, as good documentation seems difficult to find.
2.) Is forgetting aggregated links a common problem?
3.) What should I do to use link aggregation in the future without this happening again?
4.) If I use a manually configured (static) LAG, will I still get the full bandwidth-doubling benefit, or will it just act as failover?
I appreciate any help!