CRS520-4XS-16XQ capped at 25Gb with ConnectX-6 cards

Hello,
We are unable to get the advertised speeds with the CRS520-4XS-16XQ: throughput always caps at about 25Gb, despite capable cables and ConnectX-6 cards in AMD EPYC Genoa servers.
We opened a support case months ago, but apart from a suggestion to upgrade iperf3 (which of course did not change anything) we have received no suggestions as to what to try.

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.88 GBytes  2.48 Gbits/sec  249            sender
[  5]   0.00-10.00  sec  2.88 GBytes  2.47 Gbits/sec                  receiver
[  7]   0.00-10.00  sec  2.88 GBytes  2.48 Gbits/sec   66            sender
[  7]   0.00-10.00  sec  2.88 GBytes  2.47 Gbits/sec                  receiver
[  9]   0.00-10.00  sec  2.88 GBytes  2.47 Gbits/sec  208            sender
[  9]   0.00-10.00  sec  2.87 GBytes  2.47 Gbits/sec                  receiver
[ 11]   0.00-10.00  sec  2.84 GBytes  2.44 Gbits/sec   63            sender
[ 11]   0.00-10.00  sec  2.84 GBytes  2.44 Gbits/sec                  receiver
[ 13]   0.00-10.00  sec  2.81 GBytes  2.41 Gbits/sec  326            sender
[ 13]   0.00-10.00  sec  2.80 GBytes  2.41 Gbits/sec                  receiver
[ 15]   0.00-10.00  sec  2.81 GBytes  2.42 Gbits/sec   55            sender
[ 15]   0.00-10.00  sec  2.81 GBytes  2.41 Gbits/sec                  receiver
[ 17]   0.00-10.00  sec  2.78 GBytes  2.39 Gbits/sec   60            sender
[ 17]   0.00-10.00  sec  2.78 GBytes  2.39 Gbits/sec                  receiver
[ 19]   0.00-10.00  sec  2.80 GBytes  2.40 Gbits/sec   67            sender
[ 19]   0.00-10.00  sec  2.79 GBytes  2.40 Gbits/sec                  receiver
[ 21]   0.00-10.00  sec  2.87 GBytes  2.46 Gbits/sec  332            sender
[ 21]   0.00-10.00  sec  2.87 GBytes  2.46 Gbits/sec                  receiver
[ 23]   0.00-10.00  sec  2.80 GBytes  2.40 Gbits/sec  299            sender
[ 23]   0.00-10.00  sec  2.80 GBytes  2.40 Gbits/sec                  receiver
[SUM]   0.00-10.00  sec  28.3 GBytes  24.3 Gbits/sec  1725             sender
[SUM]   0.00-10.00  sec  28.3 GBytes  24.3 Gbits/sec                  receiver

iperf Done.
[root@gaiadbgpu01 ~]#  /usr/local/bin/iperf3 -version
iperf 3.19.1+ (cJSON 1.7.15)
Linux gaiadbgpu01.astro.unige.ch 6.1.72-1.el9.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jan 10 13:04:42 EST 2024 x86_64
Optional features available: CPU affinity setting, IPv6 flow label, TCP congestion algorithm setting, sendfile /
zerocopy, socket pacing, authentication, bind to device, support IPv4 don't fragment, POSIX threads
Device #1:
----------

Device type:        ConnectX6
Name:               MCX653105A-ECA_Ax
Description:        ConnectX-6 VPI adapter card; 100Gb/s (HDR100; EDR IB and 100GbE); single-port QSFP56; PCIe3.0 x16; tall bracket; ROHS R6
Device:             0000:a1:00.0

Configurations:                                          Next Boot
        MEMIC_BAR_SIZE                              0
        MEMIC_SIZE_LIMIT                            _256KB(1)
        HOST_CHAINING_MODE                          DISABLED(0)
        HOST_CHAINING_CACHE_DISABLE                 False(0)
        HOST_CHAINING_DESCRIPTORS                   Array[0..7]
        HOST_CHAINING_TOTAL_BUFFER_SIZE             Array[0..7]
        FLEX_PARSER_PROFILE_ENABLE                  0
        FLEX_IPV4_OVER_VXLAN_PORT                   0
        ROCE_NEXT_PROTOCOL                          254
        ESWITCH_HAIRPIN_DESCRIPTORS                 Array[0..7]
        ESWITCH_HAIRPIN_TOT_BUFFER_SIZE             Array[0..7]
        PF_BAR2_SIZE                                0
        PF_NUM_OF_VF_VALID                          False(0)
        NON_PREFETCHABLE_PF_BAR                     False(0)
        VF_VPD_ENABLE                               False(0)
        PF_NUM_PF_MSIX_VALID                        False(0)
        PER_PF_NUM_SF                               False(0)
        STRICT_VF_MSIX_NUM                          False(0)
        VF_NODNIC_ENABLE                            False(0)
        NUM_PF_MSIX_VALID                           True(1)
        NUM_OF_VFS                                  0
        NUM_OF_PF                                   1
        PF_BAR2_ENABLE                              False(0)
        SRIOV_EN                                    False(0)
        PF_LOG_BAR_SIZE                             5
        VF_LOG_BAR_SIZE                             1
        NUM_PF_MSIX                                 63
        NUM_VF_MSIX                                 11
        INT_LOG_MAX_PAYLOAD_SIZE                    AUTOMATIC(0)
        PCIE_CREDIT_TOKEN_TIMEOUT                   0
        PHY_COUNT_LINK_UP_DELAY                     DELAY_NONE(0)
        ACCURATE_TX_SCHEDULER                       False(0)
        PARTIAL_RESET_EN                            False(0)
        RESET_WITH_HOST_ON_ERRORS                   False(0)
        DISABLE_SLOT_POWER_LIMITER                  True(1)
        ADVANCED_POWER_SETTINGS                     True(1)
        CQE_COMPRESSION                             BALANCED(0)
        IP_OVER_VXLAN_EN                            False(0)
        MKEY_BY_NAME                                False(0)
        PRIO_TAG_REQUIRED_EN                        False(0)
        UCTX_EN                                     True(1)
        PCI_ATOMIC_MODE                             PCI_ATOMIC_DISABLED_EXT_ATOMIC_ENABLED(0)
        TUNNEL_ECN_COPY_DISABLE                     False(0)
        LRO_LOG_TIMEOUT0                            6
        LRO_LOG_TIMEOUT1                            7
        LRO_LOG_TIMEOUT2                            8
        LRO_LOG_TIMEOUT3                            13
        LOG_TX_PSN_WINDOW                           7
        LOG_MAX_OUTSTANDING_WQE                     7
        ROCE_ADAPTIVE_ROUTING_EN                    False(0)
        TUNNEL_IP_PROTO_ENTROPY_DISABLE             False(0)
        SWITCH_COMPT_FEATURE_MASK                   0x76(118)
        ICM_CACHE_MODE                              DEVICE_DEFAULT(0)
        HAIRPIN_DATA_BUFFER_LOCK                    False(0)
        TX_SCHEDULER_BURST                          0
        LOG_MAX_QUEUE                               17
        LARGE_MTU_TWEAK_64                          False(0)
        AES_XTS_TWEAK_INC_64                        False(0)
        CRYPTO_POLICY                               UNRESTRICTED(1)
        RDE_DISABLE                                 False(0)
        PLDM_FW_UPDATE_DISABLE                      False(0)
        RBT_DISABLE                                 False(0)
        PCIE_SMBUS_DISABLE                          False(0)
        PCIE_IN_BAND_VDM_DISABLE                    False(0)
        LOG_DCR_HASH_TABLE_SIZE                     11
        MAX_PACKET_LIFETIME                         0
        DCR_LIFO_SIZE                               16384
        LINK_TYPE_P1                                ETH(2)
        NUM_OF_PLANES_P1                            0
        IB_PROTO_WIDTH_EN_MASK_P1                   0
        ROCE_CC_PRIO_MASK_P1                        255
        ROCE_CC_CNP_MODERATION_P1                   DEVICE_DEFAULT(0)
        CLAMP_TGT_RATE_AFTER_TIME_INC_P1            True(1)
        CLAMP_TGT_RATE_P1                           False(0)
        RPG_TIME_RESET_P1                           300
        RPG_BYTE_RESET_P1                           32767
        RPG_THRESHOLD_P1                            1
        RPG_MAX_RATE_P1                             0
        RPG_AI_RATE_P1                              5
        RPG_HAI_RATE_P1                             50
        RPG_GD_P1                                   11
        RPG_MIN_DEC_FAC_P1                          50
        RPG_MIN_RATE_P1                             1
        RATE_TO_SET_ON_FIRST_CNP_P1                 0
        DCE_TCP_G_P1                                1019
        DCE_TCP_RTT_P1                              1
        RATE_REDUCE_MONITOR_PERIOD_P1               4
        INITIAL_ALPHA_VALUE_P1                      1023
        MIN_TIME_BETWEEN_CNPS_P1                    4
        CNP_802P_PRIO_P1                            6
        CNP_DSCP_P1                                 48
        LLDP_NB_DCBX_P1                             False(0)
        LLDP_NB_RX_MODE_P1                          OFF(0)
        LLDP_NB_TX_MODE_P1                          OFF(0)
        ROCE_RTT_RESP_DSCP_P1                       0
        ROCE_RTT_RESP_DSCP_MODE_P1                  DEVICE_DEFAULT(0)
        DCBX_IEEE_P1                                True(1)
        DCBX_CEE_P1                                 True(1)
        DCBX_WILLING_P1                             True(1)
        KEEP_ETH_LINK_UP_P1                         True(1)
        KEEP_IB_LINK_UP_P1                          False(0)
        KEEP_LINK_UP_ON_BOOT_P1                     False(0)
        KEEP_LINK_UP_ON_STANDBY_P1                  False(0)
        DO_NOT_CLEAR_PORT_STATS_P1                  False(0)
        AUTO_POWER_SAVE_LINK_DOWN_P1                False(0)
        NUM_OF_VL_P1                                _4_VLs(3)
        NUM_OF_TC_P1                                _8_TCs(0)
        NUM_OF_PFC_P1                               8
        VL15_BUFFER_SIZE_P1                         0
        QOS_TRUST_STATE_P1                          TRUST_PCP(1)
        ETS_SCHED_MODE_P1                           device_default(0)
        DUP_MAC_ACTION_P1                           LAST_CFG(0)
        MPFS_MC_LOOPBACK_DISABLE_P1                 False(0)
        MPFS_UC_LOOPBACK_DISABLE_P1                 False(0)
        UNKNOWN_UPLINK_MAC_FLOOD_P1                 False(0)
        SRIOV_IB_ROUTING_MODE_P1                    LID(1)
        IB_ROUTING_MODE_P1                          LID(1)
        PHY_AUTO_NEG_P1                             DEVICE_DEFAULT(0)
        PHY_RATE_MASK_OVERRIDE_P1                   False(0)
        PHY_FEC_OVERRIDE_P1                         DEVICE_DEFAULT(0)
        PF_TOTAL_SF                                 0
        PF_SD_GROUP                                 0
        PF_SF_BAR_SIZE                              0
        PF_NUM_PF_MSIX                              63
        SILENT_MODE                                 False(0)
        MKEY_BY_NAME_RANGE                          DEVICE_DEFAULT(0)
        ROCE_CONTROL                                ROCE_ENABLE(2)
        PCI_WR_ORDERING                             per_mkey(0)
        MULTI_PORT_VHCA_EN                          False(0)
        PORT_OWNER                                  True(1)
        ALLOW_RD_COUNTERS                           True(1)
        RENEG_ON_CHANGE                             True(1)
        TRACER_ENABLE                               True(1)
        IP_VER                                      IPv4(0)
        BOOT_UNDI_NETWORK_WAIT                      0
        UEFI_HII_EN                                 True(1)
        BOOT_DBG_LOG                                False(0)
        UEFI_LOGS                                   DISABLED(0)
        BOOT_VLAN                                   1
        LEGACY_BOOT_PROTOCOL                        PXE(1)
        BOOT_INTERRUPT_DIS                          False(0)
        BOOT_LACP_DIS                               True(1)
        BOOT_VLAN_EN                                False(0)
        BOOT_PKEY                                   0
        P2P_ORDERING_MODE                           DEVICE_DEFAULT(0)
        ATS_ENABLED                                 False(0)
        DYNAMIC_VF_MSIX_TABLE                       False(0)
        EXP_ROM_UEFI_ARM_ENABLE                     True(1)
        EXP_ROM_UEFI_x86_ENABLE                     True(1)
        EXP_ROM_PXE_ENABLE                          True(1)
        ADVANCED_PCI_SETTINGS                       False(0)
        SAFE_MODE_THRESHOLD                         10
        SAFE_MODE_ENABLE                            True(1)
ethtool -g enp161s0np0
Ring parameters for enp161s0np0:
Pre-set maximums:
RX:             8192
RX Mini:        n/a
RX Jumbo:       n/a
TX:             8192
Current hardware settings:
RX:             8192
RX Mini:        n/a
RX Jumbo:       n/a
TX:             8192
RX Buf Len:     n/a
CQE Size:       n/a
TX Push:        off
TCP data split: off

The cards report a 100Gb link. The cables are new and worked well at 40-56Gb with an old Mellanox switch, so they are very unlikely to be the culprit.

Settings for enp161s0np0:
        Supported ports: [ Backplane ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                1000baseX/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                50000baseKR/Full
                                50000baseSR/Full
                                50000baseCR/Full
                                50000baseLR_ER_FR/Full
                                50000baseDR/Full
                                100000baseKR2/Full
                                100000baseSR2/Full
                                100000baseCR2/Full
                                100000baseLR2_ER2_FR2/Full
                                100000baseDR2/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Supported FEC modes: None        RS      BASER
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                1000baseX/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                50000baseKR/Full
                                50000baseSR/Full
                                50000baseCR/Full
                                50000baseLR_ER_FR/Full
                                50000baseDR/Full
                                100000baseKR2/Full
                                100000baseSR2/Full
                                100000baseCR2/Full
                                100000baseLR2_ER2_FR2/Full
                                100000baseDR2/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Advertised FEC modes: None
        Speed: 100000Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000004 (4)
                               link
        Link detected: yes

Would be grateful for any hints.
Krzysztof
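As a cross-check of the numbers above, here is a minimal Python sketch that re-adds the per-stream "sender" rows of the pasted iperf3 summary. The rows are copied from the output in this post; the regular expression and the function name `summarize` are my own choices for illustration, not part of iperf3. It only confirms that ten streams of roughly 2.4 Gbits/sec each add up to the reported total of about 24.3 Gbits/sec and 1725 retransmits, against a negotiated link speed of 100000Mb/s.

```python
import re

# "sender" rows copied verbatim from the iperf3 summary in the post above.
IPERF_SUMMARY = """\
[  5]   0.00-10.00  sec  2.88 GBytes  2.48 Gbits/sec  249            sender
[  7]   0.00-10.00  sec  2.88 GBytes  2.48 Gbits/sec   66            sender
[  9]   0.00-10.00  sec  2.88 GBytes  2.47 Gbits/sec  208            sender
[ 11]   0.00-10.00  sec  2.84 GBytes  2.44 Gbits/sec   63            sender
[ 13]   0.00-10.00  sec  2.81 GBytes  2.41 Gbits/sec  326            sender
[ 15]   0.00-10.00  sec  2.81 GBytes  2.42 Gbits/sec   55            sender
[ 17]   0.00-10.00  sec  2.78 GBytes  2.39 Gbits/sec   60            sender
[ 19]   0.00-10.00  sec  2.80 GBytes  2.40 Gbits/sec   67            sender
[ 21]   0.00-10.00  sec  2.87 GBytes  2.46 Gbits/sec  332            sender
[ 23]   0.00-10.00  sec  2.80 GBytes  2.40 Gbits/sec  299            sender
"""

# Matches one per-stream sender row; the [SUM] row is skipped because
# the stream id must be numeric.
ROW = re.compile(
    r"^\[\s*(\d+)\]\s+\S+\s+sec\s+[\d.]+\s+GBytes\s+"
    r"([\d.]+)\s+Gbits/sec\s+(\d+)\s+sender$"
)


def summarize(text):
    """Return (stream count, summed Gbits/sec, summed retransmits)."""
    rates = []
    retransmits = 0
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rates.append(float(match.group(2)))
            retransmits += int(match.group(3))
    return len(rates), sum(rates), retransmits


streams, total_gbps, total_retr = summarize(IPERF_SUMMARY)
negotiated_gbps = 100.0  # "Speed: 100000Mb/s" from the ethtool output above
print(f"{streams} streams, {total_gbps:.2f} Gbits/sec, {total_retr} retransmits")
print(f"share of negotiated link speed: {total_gbps / negotiated_gbps:.1%}")
```

The per-stream values are rounded to two decimals in the iperf3 output, so the re-added total can differ slightly from the [SUM] row.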

Hi,
This "smells" like the issue in this thread: MikroTik XQ+BC0003-XS+ with Broadcom P225p NX-E Dual - #2 by Otono

Thanks for the hint! Indeed, we see 4x25Gb links on the router, and the connections are copper.
The connection seems stable though, just capped.
We will check whether we can try other cables.
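To put the 4x25Gb observation next to the measured throughput, here is a small Python sketch. It assumes the 100Gb link is built from four 25Gb lanes, as the 4x25Gb links seen on the router suggest; the rate table and the names `utilisation` and `likely_ceiling` are hypothetical helpers for illustration, not anything reported by the switch or the card. It simply shows that 24.3 Gbits/sec is nearly the full capacity of one 25Gb lane and about a quarter of 100Gb, which fits "one lane's worth of traffic" but does not prove where the cap comes from.

```python
# Nominal rates in Gbit/s, assuming 25G lanes: one, two, or four lanes.
NOMINAL_RATES_GBPS = (25, 50, 100)


def utilisation(measured_gbps, nominal_gbps):
    """Fraction of a nominal rate that the measured throughput represents."""
    return measured_gbps / nominal_gbps


def likely_ceiling(measured_gbps, rates=NOMINAL_RATES_GBPS):
    """Smallest nominal rate that still fits the measured throughput.

    Returns None when the measurement exceeds every listed rate.
    """
    for rate in sorted(rates):
        if measured_gbps <= rate:
            return rate
    return None


measured = 24.3  # [SUM] row of the iperf3 run in the first post
for rate in NOMINAL_RATES_GBPS:
    print(f"{measured} Gbit/s is {utilisation(measured, rate):.1%} of {rate}G")
print(f"smallest nominal rate that fits: {likely_ceiling(measured)}G")
```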

Search the forum for the keyword "100GB"; there are several more topics on getting a 100Gb connection working.

We have had a ticket open for months now.
After the suggested cable checks, iperf upgrades, and direct connections between the nodes (which showed that the cables and cards were OK), the last suggestion was to use a single bridge (we did indeed have two). Removing the extra bridge did not change anything: we are still capped at 25Gb/s, which occasionally jumps to 30Gb/s. We have been waiting for weeks now for support to respond. This is very worrying.
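For anyone who wants to repeat the direct-versus-switch comparison in a form that is easy to attach to a ticket, here is a hedged Python sketch. It assumes each TCP run was saved with iperf3's JSON output (`iperf3 -c <server> -P 10 -J > run.json`) and that the report contains the usual `end.sum_received.bits_per_second` field. The file names and the helper names `load_report`, `received_gbps` and `compare` are hypothetical.

```python
import json


def load_report(path):
    """Load one iperf3 JSON report saved with the -J option."""
    with open(path) as handle:
        return json.load(handle)


def received_gbps(report):
    """Receiver-side throughput of a TCP run, in Gbit/s."""
    return report["end"]["sum_received"]["bits_per_second"] / 1e9


def compare(direct_report, switched_report):
    """Summarise a direct node-to-node run against a run through the switch."""
    direct = received_gbps(direct_report)
    switched = received_gbps(switched_report)
    return {
        "direct_gbps": direct,
        "switched_gbps": switched,
        # Share of the direct-link throughput that survives the switch path.
        "ratio": switched / direct,
    }
```

Usage would be along the lines of `compare(load_report("direct.json"), load_report("switched.json"))`, where both file names are placeholders. A ratio well below 1 with the same cards and cables points at the switch path rather than the hosts.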