CRS518-16XS-2XQ-RM dropping packets with 10G copper SFP+ transceiver

I am working on a layer 2 protocol that sends about 450 Mbit/s per stream. It’s a one-way protocol, going into the switch via 100G and out one of the SFP28 ports via the MikroTik S+RJ10 SFP+ 10GBASE copper transceiver, which is plugged into a 1G receiving device.

To save time in discussion, just understand that getting from QSFP down to 1G RJ45 is a hard requirement, and this is why I picked this switch. It seems like with the above SFP+ transceiver, it should work. But, as far as I can tell, not all of the packets are making it to the 1G receiving device.

After initially dealing with oddly high CPU use, I got it down to 1% by tweaking settings so that the packets move fully offloaded from the qsfp28 port to the sfp28 port. I’m not sure whether the tx-drop stats are accurate when traffic is fully offloaded, but either way I plugged the downstream 1G output into a Linux box and captured the packets to see what’s going on. I can clearly see that not all the packets are getting there, and it’s not just a few missing here or there: significant chunks, at least 15% of the packets, are just not showing up.

If I plug the 100G source directly into the same Linux box (it also has a 100G input), I can see via packet capture that all the packets are indeed getting there.

So somehow the packets are disappearing in the switch, or perhaps on the way out via the SFP+ 10G Copper Transceiver.

Just wondering if anyone has thoughts about how to troubleshoot this, as I have tried every possible avenue I could think of.

Considering purchasing a different switch that will go straight to RJ45 1G without the transceiver and this seems like a contender, but I wonder whether I may run into the same issue…

https://mikrotik.com/product/crs326_4c_20g_2q_rm

Thanks!



Cheers!

I think you can try some tuning of the queuing on the switch.

Also try RouterOS 7.16.2 so you have the WinBox GUI for queuing info and stats. Don't forget to upgrade RouterBOOT under System → RouterBOARD until the current firmware version matches your RouterOS version.

When you are done with the upgrades, try these commands and see if the situation improves:

/interface ethernet switch
set 0 qos-hw-offloading=yes

/interface ethernet switch qos settings
set shared-buffers=90%

Once you are on 7.16.2, using WinBox 3.41, the menu Switch → QoS → Port gives you access to queued and dropped packet stats. There you can detect whether the switch is actually discarding packets on egress on any interface.

Also, just in case, look at the rx-overflow counters on the involved ethernet interfaces to verify whether packets are being dropped on ingress.

Thanks for the tips! I managed to find a solution in software by pacing the timing of the packets I was sending. I suspect that even though the average bandwidth was only 400-500 Mbit/s, the bursts were exceeding 1G and overfilling some buffer somewhere. Pacing the traffic more evenly seemed to do the trick.
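
For anyone who lands here with the same problem, here is a rough Python sketch of the idea, not the actual code I used. The function names (gap_seconds, backlog_bytes, send_paced) and the frame and burst sizes in the usage note are made up for illustration, and `send` stands for whatever function actually puts a frame on the wire. The first helper gives the spacing between frames for a target rate, the second estimates how much egress buffer a burst needs when it arrives faster than the port can drain it, and the third spaces the frames out instead of sending them back to back.

```python
import time


def gap_seconds(frame_bytes, target_bps):
    """Spacing between frame starts so the average rate equals target_bps."""
    return frame_bytes * 8 / target_bps


def backlog_bytes(burst_bytes, in_bps, out_bps):
    """Peak queue depth when a burst arrives at in_bps and drains at out_bps."""
    if in_bps <= out_bps:
        return 0.0
    # While the burst is arriving, only out_bps/in_bps of it can leave again.
    return burst_bytes * (1 - out_bps / in_bps)


def send_paced(frames, send, target_bps,
               clock=time.monotonic, sleep=time.sleep):
    """Call send(frame) for each frame, never faster than target_bps."""
    next_time = clock()
    for frame in frames:
        now = clock()
        if now < next_time:
            sleep(next_time - now)
        else:
            # Fell behind schedule: restart from now rather than
            # bursting to catch up, which would defeat the pacing.
            next_time = now
        send(frame)
        next_time += gap_seconds(len(frame), target_bps)
```

As an example with assumed numbers: 1500-byte frames at 450 Mbit/s come out to one frame roughly every 27 microseconds, and a 1 MB burst arriving at 100G with a 1G port draining it leaves about 990 kB sitting in the queue. Note that time.sleep is usually too coarse for gaps that short, so a real sender would probably need to busy-wait or pace small groups of frames instead.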