CRS326Q-24S+2Q+ packet drops

Hello all,

We have a setup with CRS326Q-24S+2Q+ as our core switch.
We booted it into SwOS.

When pinged from the LAN, it shows 23% packet loss and an average ping of 10 ms (ranging from 1 ms to 2022 ms).

Every few minutes (at random intervals) it completely stops forwarding packets for 2-5 seconds.

I have no idea how to debug it. Does anyone have ideas on where to start?

Maybe the CPU load is high when you lose those pings?

The switch becomes unresponsive during that period, so it’s not possible to check.

Please take screenshots of your configuration and post them here.

Well, it’s just default SwOS configuration…

We have a few more interesting discoveries. Here is our setup and an explanation.

We connected 10 CSS326-24G-2S+ switches via SFP+ to the core CRS326Q-24S+2Q+ switch.
All of them run default SwOS.
On each CSS326-24G-2S+ we connected 10 to 20 PCs over 1G copper.

We ran ping tests against those computers from the main router and got interesting results.

2 of the 10 switches have ping timeouts lasting 2-5 seconds every few minutes.
This happens on those two switches at the same time, while the other 8 switches have no ping timeouts.
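
A minimal sketch of how the "same time" observation could be checked, assuming each PC's ping results are logged as (timestamp, reply received) pairs. The host names, the sample data and the 1-second thresholds are made up for illustration.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, bool]   # (timestamp in seconds, reply received?)
Window = Tuple[float, float]  # (first lost ping, last lost ping)


def outage_windows(samples: List[Sample], min_duration: float = 1.0) -> List[Window]:
    """Group consecutive lost pings into windows of at least min_duration seconds."""
    windows: List[Window] = []
    start = last_lost = None
    for ts, ok in sorted(samples):
        if not ok:
            if start is None:
                start = ts
            last_lost = ts
        elif start is not None:
            if last_lost - start >= min_duration:
                windows.append((start, last_lost))
            start = None
    # A log may end in the middle of an outage.
    if start is not None and last_lost - start >= min_duration:
        windows.append((start, last_lost))
    return windows


def overlaps(a: Window, b: Window, slack: float = 1.0) -> bool:
    """True if two windows overlap, allowing `slack` seconds of clock skew."""
    return a[0] <= b[1] + slack and b[0] <= a[1] + slack


def shared_outages(logs: Dict[str, List[Sample]], slack: float = 1.0) -> Dict[Tuple[str, str], int]:
    """Count, for every pair of hosts, how many of their outage windows coincide."""
    windows = {host: outage_windows(samples) for host, samples in logs.items()}
    hosts = sorted(windows)
    shared: Dict[Tuple[str, str], int] = {}
    for i, a in enumerate(hosts):
        for b in hosts[i + 1:]:
            n = sum(1 for wa in windows[a] for wb in windows[b] if overlaps(wa, wb, slack))
            if n:
                shared[(a, b)] = n
    return shared


# Hypothetical logs, one ping per second per PC.
logs = {
    "pc-a": [(0, True), (1, False), (2, False), (3, False), (4, True)],
    "pc-b": [(0, True), (1, True), (2, False), (3, False), (4, True)],
    "pc-c": [(t, True) for t in range(5)],
}
print(shared_outages(logs))
```

Pairs that keep showing up together are PCs that drop out at the same moments, which narrows down where to look.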

We suspected a problem with the SFP+ modules and cables, but nothing changed after we replaced them.
Next we suspected that those two CSS326-24G-2S+ switches had failed, so we replaced one of them as a test.
After replacing it, we saw the same problem again on the new CSS326-24G-2S+ switch.

Next, we swapped the problematic CSS326-24G-2S+ switches with CSS326-24G-2S+ switches that had been working fine, and we got the same issue on those as well.
This means that the group of PCs is causing the network issue, no matter which switch they are connected to.

We will try to identify which PCs are causing the problems.
I suspect it’s something low-level, like a duplicate MAC address or something similar.
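
A rough sketch of how a duplicate MAC could be spotted, assuming the (IP, MAC) pairs are copied by hand from the router's ARP table. The addresses and function names below are hypothetical. A MAC that shows up under several IPs is not always a fault (a multi-homed host is legitimate), so the result is only a list of candidates to check.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC and use ':' separators so different notations compare equal."""
    digits = mac.lower().replace("-", "").replace(":", "").replace(".", "")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_conflicts(
    entries: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Return (MACs seen under several IPs, IPs seen with several MACs)."""
    ips_by_mac: Dict[str, Set[str]] = defaultdict(set)
    macs_by_ip: Dict[str, Set[str]] = defaultdict(set)
    for ip, mac in entries:
        mac = normalize_mac(mac)
        ips_by_mac[mac].add(ip)
        macs_by_ip[ip].add(mac)
    dup_macs = {mac: ips for mac, ips in ips_by_mac.items() if len(ips) > 1}
    dup_ips = {ip: macs for ip, macs in macs_by_ip.items() if len(macs) > 1}
    return dup_macs, dup_ips


# Hypothetical ARP entries: the first two PCs share one MAC address.
entries = [
    ("192.168.1.10", "AA-BB-CC-00-00-01"),
    ("192.168.1.11", "aa:bb:cc:00:00:01"),
    ("192.168.1.12", "aa:bb:cc:00:00:02"),
]
dup_macs, dup_ips = find_conflicts(entries)
print(dup_macs, dup_ips)
```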

Do you guys have any ideas?

Do these PCs have only a single connection to the switches?

Yes, a simple built-in gigabit NIC on a consumer motherboard.

If it matters, there is one server in the network that we connected directly to the core switch with two QSFP+ links, with its NICs teamed on the Windows Server side (not on the MikroTik side).

Maybe you have a broadcast storm in your environment. Configure LACP on the switch side too, or configure Broadcast Storm Control: by default it is set to 100%, so set it to 5%.
https://wiki.mikrotik.com/wiki/SwOS/CSS326#LAG

Maybe RSTP is playing rough with you, since you said all switches have the default configuration. You should configure RSTP properly or disable it. The core switch should be the root bridge.
https://wiki.mikrotik.com/wiki/SwOS/CSS326#RSTP
https://wiki.mikrotik.com/wiki/Manual:Spanning_Tree_Protocol
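
To illustrate why leaving every switch on its default RSTP settings matters: the root bridge is the one with the lowest bridge ID, which is the priority followed by the MAC address. Below is a rough Python sketch of that comparison. The switch names and MAC addresses are made up; 32768 is the standard default priority, and how SwOS displays the value should be checked in its RSTP tab.

```python
from typing import Dict, Tuple

Bridge = Tuple[int, str]  # (bridge priority, MAC address)


def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer so MACs compare numerically."""
    return int(mac.replace(":", ""), 16)


def elect_root(bridges: Dict[str, Bridge]) -> str:
    """Return the name of the bridge with the lowest (priority, MAC) pair."""
    return min(bridges, key=lambda name: (bridges[name][0], mac_to_int(bridges[name][1])))


# Hypothetical network: every switch is left on the default priority.
bridges = {
    "core":   (32768, "64:d1:54:00:00:30"),
    "edge-1": (32768, "64:d1:54:00:00:10"),
    "edge-2": (32768, "64:d1:54:00:00:20"),
}
print(elect_root(bridges))  # the lowest MAC wins, which here is an edge switch

# Lowering the priority on the core switch makes it the root regardless of MAC.
bridges["core"] = (4096, "64:d1:54:00:00:30")
print(elect_root(bridges))
```

With equal priorities the root ends up wherever the lowest MAC address happens to be, so setting a lower priority on the core switch is what pins the root there.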


Also look at the Errors tab in the switch menu. Is there anything interesting there?

Hi,

Maybe we have the same problem.
We first tested a 10 Gbit setup with a CRS305, the 4-port SFP+ MikroTik switch.
We use these SFP+ DAC cables: https://www.fs.com/de/products/74621.html
This worked fantastically.

We then bought 4x CRS326-24S+2Q+RM.
We wanted to upgrade our Hyper-V Cluster and Storage to 10Gbit.

We had a lot of problems: server freezes (until the network connection was terminated), ping timeouts, iSCSI volume connection errors, etc.
So we stopped the upgrade and stayed at 1 Gbit speed.

We noticed in the statistics tab of our Emulex network cards that we get between 300 and 6000 CRC errors within 10 seconds
of iperf3 testing, depending on the port and the cable used.
We see these errors with Emulex 10Gbit OCE11102 network cards and also with HP NC523SFP network cards.
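
To compare ports and cables, the counters can be turned into a rate. This is a small sketch, assuming the CRC error counter is read before and after each iperf3 run. The function name and the zero starting values are illustrative; 300 and 6000 are the counts reported above.

```python
def crc_errors_per_second(before: int, after: int, seconds: float) -> float:
    """Average CRC errors per second between two readings of the NIC counter."""
    if seconds <= 0:
        raise ValueError("test duration must be positive")
    return (after - before) / seconds


# Best and worst case from the 10-second iperf3 runs.
print(crc_errors_per_second(0, 300, 10))
print(crc_errors_per_second(0, 6000, 10))
```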

If we connect two servers directly with the same DAC cable, we have no errors.
If we use 2 m Cisco-compatible cables, we also have no errors.
If we use 5 m Fujitsu-compatible cables, we also have no errors.
If we use FS 10Gbit GBICs, Dell 10Gbit GBICs, or Intel 10Gbit GBICs, we have no errors.

So there is an incompatibility between the MikroTik switch and these cables.
As the cables cause no problems when connecting servers directly with them, the problem seems to lie with the MikroTik
switch.

  • CRS305 - FS.com generic DAC cable - no errors
  • CRS326 - FS.com generic DAC cable - CRC errors (errors occur with all 4 MikroTik switches we have)