General Performance Problems with CRS518

Hi All,

I have four CRS518 installed in our datacenter. The Uplink is done using the 100G Ports with QSFP28. I have fixed the port speed to 100G.

The SFPs seem to work fine, i.e. I can ping the other side.

The only problem: the throughput (with four different SFP+ / SFP28 modules) is less than 1GBit/s (at times only 100MBit/s).

I have the latest (7.14.3) release installed and see no error messages.

The same SFP+ (10G) modules work just fine on the CRS326 and CRS317, so I don’t believe we have a compatibility issue (same SFP+ on the CRS518 → very low bandwidth).

The current setup consists of one bridge and no VLANs (open for any VLAN). I do the VLANs on UniFi and other devices that are attached to the switches. For now I have stepped away from using MLAG, as it has not been stable enough across various releases.

ANY hints what could be wrong?

I don’t have ANY errors on any devices. I also tested multiple different fiber cables - all to no avail.

Any help highly appreciated!

Tobias

Have you tried switching rate-select?

/interface ethernet set {interface} sfp-rate-select=high

or

/interface ethernet set {interface} sfp-rate-select=low

Are the low bandwidths on the SFP+ ports or on the QSFP (100G) ports? On the 100G links there might be an improvement if you configure FEC on the CRS and the other switch (FEC must be the same on both link ends!)
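
A minimal sketch of what that could look like on the CRS side. The port name qsfp28-1-1 and the value fec91 are assumptions for illustration only; which FEC mode is appropriate depends on the optic and on what the other switch is set to.

```routeros
# Assumption: qsfp28-1-1 is the 100G uplink port; adjust to the real port name.
# fec91 is only an example value; the same mode must be set on the far end.
/interface ethernet set qsfp28-1-1 fec-mode=fec91

# Check the link afterwards; the "fec" field shows the mode in use.
/interface ethernet monitor qsfp28-1-1 once
```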

Hi,

Rate-Select is set to high on all devices (FEC-Mode is set to auto)

It’s the same on all interfaces - and no matter what I do…

  • CRS326 here: 6GBit/s with the bandwidth Tester
  • CRS518 there: 100MBit/s

awful.

I should add that we’re using single-mode optics for the 100G links (we have single-mode cabling). I have not seen any packet drops…

name: qsfp28-1-1
status: link-ok
auto-negotiation: disabled
rate: 100Gbps
full-duplex: yes
tx-flow-control: yes
rx-flow-control: yes
fec: off
supported: 10M-baseT-half,10M-baseT-full,100M-baseT-half,100M-baseT-full,1G-baseT-half,1G-baseT-full,
1G-baseX,2.5G-baseT,2.5G-baseX,5G-baseT,10G-baseT,10G-baseSR-LR,10G-baseCR,40G-baseSR4-LR4,
40G-baseCR4,25G-baseSR-LR,25G-baseCR,50G-baseSR2-LR2,50G-baseCR2,100G-baseSR4-LR4,
100G-baseCR4
sfp-supported: 1G-baseX,10G-baseSR-LR,25G-baseSR-LR,100G-baseSR4-LR4
sfp-module-present: yes
sfp-type: QSFP28/QSFP56
sfp-connector-type: LC
sfp-link-length-sm: 10km
sfp-vendor-name: FS
sfp-vendor-part-number: QSFP28-LR4-100G
sfp-vendor-revision: 01
sfp-vendor-serial: G2311058537
sfp-manufacturing-date: 20231123
sfp-wavelength: 1310nm
sfp-temperature: 44C
sfp-supply-voltage: 3.272V
sfp-tx-bias-current: 56mA
sfp-tx-power: 0.614dBm
sfp-rx-power: 0.768dBm
eeprom-checksum: good
eeprom: 0000: 11 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 … …
0010: 00 00 00 00 00 00 2c 9d 00 00 7f d0 00 00 00 00 …,. …
0020: 00 00 2e a1 2d 89 2b 02 2a f7 6e 19 6b 9d 6e 89 …-.+. *.n.k.n.
0030: 6e 7e 2d 00 36 91 2f a6 32 8b 00 00 00 00 00 00 n~-.6./. 2…
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 … …
*
0060: 00 00 ff 00 00 00 00 00 00 00 1f 00 00 00 00 00 … …
0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 … …
0080: 11 ce 07 80 00 00 00 00 00 00 00 03 ff 02 0a 00 … …
0090: 00 00 00 64 46 53 20 20 20 20 20 20 20 20 20 20 …dFS

The 25G DAC cables look like this:
name: sfp28-16
status: link-ok
auto-negotiation: disabled
rate: 25Gbps
full-duplex: yes
tx-flow-control: no
rx-flow-control: no
fec: off
supported: 10M-baseT-half,10M-baseT-full,100M-baseT-half,100M-baseT-full,1G-baseT-half,
1G-baseT-full,1G-baseX,2.5G-baseT,2.5G-baseX,5G-baseT,10G-baseT,10G-baseSR-LR,
10G-baseCR,25G-baseSR-LR,25G-baseCR
sfp-supported: 1G-baseT-full,1G-baseX,2.5G-baseT,2.5G-baseX,5G-baseT,10G-baseCR,25G-baseCR
sfp-module-present: yes
sfp-rx-loss: no
sfp-tx-fault: no
sfp-type: SFP/SFP+/SFP28/SFP56
sfp-connector-type: copper-pigtail
sfp-link-length-copper-active-om4: 1m
sfp-vendor-name: FS
sfp-vendor-part-number: S28-PC01
sfp-vendor-revision: A
sfp-vendor-serial: F2230383968-2
sfp-manufacturing-date: 24-01-25
sfp-dwdm-channel-spacing: 13Ghz
eeprom-checksum: good
eeprom: 0000: 03 04 21 00 00 00 00 00 04 00 00 00 ff 00 00 00 ..!.. …
0010: 00 00 01 00 46 53 20 20 20 20 20 20 20 20 20 20 …FS
0020: 20 20 20 20 0d 78 a7 14 53 32 38 2d 50 43 30 31 .x.. S28-PC01
0030: 20 20 20 20 20 20 20 20 41 20 20 20 00 00 00 44 A …D
0040: 00 00 68 00 46 32 32 33 30 33 38 33 39 36 38 2d ..h.F223 0383968-
0050: 32 20 20 20 32 34 30 31 32 35 20 20 00 00 00 e7 2 2401 25 …
0060: 80 00 1b 82 86 c2 8a 85 9b c4 bf 45 13 73 85 f1 … …E.s..
0070: 7f a5 d2 00 00 00 00 00 00 00 00 00 93 34 e9 f3 … …4..
0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 … …


EDIT:

are the low bandwidths on the SFP+ ports or on the QSFP (100G) ports? because on the 100G links there might be an improvement if you configure FEC on the CRS and the other switch (FEC must be the same on both link ends!)

No, they are not. But I’m not talking about 10 or 15GBit/s anyway - I am talking about MEGABITS. We currently have 300 MBit/s throughput. Not even one gigabit.

Something is completely off, and I have no clue what it is at the moment. No matter what ports I use (25GBit/s DAC, 10GBit/s SFPs, 10GBit/s DAC or 100GBit/s QSFP28), the CPU load goes up like crazy (and yes, the system shows that HW offload is on), but it’s like having a 20-year-old cable modem wedged in between (probably even worse).

Thanks
Tobias

I am having the same issue with a CRS518-16XS-2XQ connected via XQ+31LC10D to a CRS520-4XS-16XQ. Using the built-in ROS bandwidth tester, the best I get is 200 MBit receive and 50-75 MBit send from the CRS518. I am running ROS 7.17.1. The QSFP ports are set to FEC91 and rate select is set to high. L3HW is enabled for all of the ports. The CPU pegs at 100%.

Never run bandwidth test on the device itself.
It requires quite a bit of CPU power which CRS devices typically don’t have, as you have observed.

Always test THROUGH the device. Best to use iperf3 using 2 PCs or so.
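
While an iperf3 run between the two PCs passes through the switch, the CPU load on the CRS itself can be watched from a second terminal session. A minimal sketch using standard RouterOS tools; if traffic is really switched in hardware, the CPU should stay mostly idle:

```routeros
# Overall CPU load of the switch (see the cpu-load field).
/system resource print

# Live breakdown of what is consuming CPU time; stop it with Ctrl-C or q.
/tool profile
```

If the load climbs with the traffic, the frames are most likely being handled by the CPU rather than by the switch chip.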

Interesting, though I don’t see this particular bottleneck with CCR1009s or a hAP ac^3.

I’ll give iperf a spin and post an update

CCR1009 is a beast, no comparison there with CRS518 :laughing:
It’s also MEANT to be a router. CRS is a switch.

Even the AC3 has a more powerful processor than the CRS518 when it comes to these things.

I ran iperf3 through the CRS518; the throughput was about 115 MBit and the CPU was maxed out.

Connected the same PC running iperf3 to the CRS520 and tested to the same server and got close to gigabit (>900 MBit). CPU didn’t seem that affected.

Is the CRS518 used in that setup as a switch or as a router?
Are any VLANs at play which are not HW offloaded?
Config can make a big difference in performance... and since you indicate the CPU was maxed out, it had to do something which it probably shouldn't?

I doubt that the PC and the server were in the same subnet in both tests; if they were, something is really off with that CRS518.
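
A quick way to check the offload question is to look at the bridge ports and the bridge VLAN table. This is only a sketch of where to look, not a full diagnosis:

```routeros
# Bridge ports; the H flag marks ports that are hardware offloaded.
# A port without H is handled by the CPU.
/interface bridge port print

# VLAN table of the bridge, to see which VLANs are tagged/untagged on which ports.
/interface bridge vlan print
```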

Dot1x is in play on the CRS518. The QSFP ports are all trunks. The SFP28s are part of a Dot1X with radius that authenticates via MAC address and places the user on their assigned VLAN, and that works as expected. A CRS510-8XS-2XQ connects to that and that switch provides a NAT to the end user.

I also learned the hard way the management ethernet port is not part of the HW offload. So, I removed that from the bridge and plugged a PC into a port on the CRS518, was assigned the correct VLAN, and I measured close to a gigabit with iperf3 and moderate CPU activity.
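
For anyone who wants to do the same, a rough sketch of that change. The interface name ether1 and the address are assumptions (192.0.2.10/24 is a placeholder from the documentation range); make sure management access does not depend on the bridge membership before removing it.

```routeros
# Assumption: ether1 is the management port and currently a bridge member.
/interface bridge port remove [find interface=ether1]

# Hypothetical address so the port stays usable for out-of-band management.
/ip address add address=192.0.2.10/24 interface=ether1
```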

So to answer your question, the CRS518 is not acting as a router; it's just breaking out VLANs from trunks. The CRS510 is acting as a router.

I’ll start a new thread with an appropriate title for the CRS510.