I have a CRS504-4XQ-IN switch (firmware 7.19.2) connected to two different Linux servers using an XQ+BC0003-XS+ (QSFP28 to 4x SFP28 break-out cable). I am only using legs 1 and 2 of the break-out cable; the other two are not connected. Both servers use dual-port Broadcom NetXtreme-E 25Gb cards:
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)
On one of the servers everything works fine, but on the other the link to the switch drops frequently and comes back up in less than a second. There was no problem today, but yesterday, for instance, it happened five times:
2025-07-01T08:06:04.948423+01:00 srv-nas1-dc kernel: [5477790.914659] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T08:06:05.200466+01:00 srv-nas1-dc kernel: [5477791.165280] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
2025-07-01T09:20:52.220462+01:00 srv-nas1-dc kernel: [5482278.180902] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T09:20:52.472467+01:00 srv-nas1-dc kernel: [5482278.431620] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
2025-07-01T09:20:52.980422+01:00 srv-nas1-dc kernel: [5482278.940879] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T09:20:53.232472+01:00 srv-nas1-dc kernel: [5482279.191608] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
2025-07-01T10:18:18.040470+01:00 srv-nas1-dc kernel: [5485723.992554] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T10:18:18.292463+01:00 srv-nas1-dc kernel: [5485724.243030] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
2025-07-01T15:36:35.164382+01:00 srv-nas1-dc kernel: [5504821.090564] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T15:36:35.416365+01:00 srv-nas1-dc kernel: [5504821.340957] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
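To make it easier to count the flaps and see how long each outage lasts, the log can be summarised with a small script. This is only a minimal sketch: the names `LINK_RE` and `link_flaps` are made up for illustration, and the regular expression assumes the exact line format shown above (ISO timestamp, hostname, `kernel:`, then the `bnxt_en` message).

```python
import re
from datetime import datetime

# Assumption: lines look exactly like the bnxt_en messages quoted above.
LINK_RE = re.compile(
    r"^(?P<ts>\S+) \S+ kernel: \[\s*[\d.]+\] bnxt_en \S+ (?P<iface>\S+): "
    r"NIC Link is (?P<state>Up|Down)"
)


def link_flaps(lines):
    """Pair each 'Link is Down' with the next 'Link is Up' on the same interface.

    Returns a list of (interface, down_timestamp, outage_seconds) tuples.
    """
    flaps = []
    down_at = {}  # interface name -> timestamp of the pending Down event
    for line in lines:
        m = LINK_RE.match(line)
        if not m:
            continue  # not a link state message
        ts = datetime.fromisoformat(m["ts"])
        iface = m["iface"]
        if m["state"] == "Down":
            down_at[iface] = ts
        elif iface in down_at:
            start = down_at.pop(iface)
            flaps.append((iface, start, (ts - start).total_seconds()))
    return flaps
```

Passing the open log file (or any iterable of lines) to `link_flaps` gives one tuple per flap, which makes it straightforward to compare the timestamps against events logged on the switch side.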
The second ports of the Broadcom cards are connected directly to each other, server to server. This link works fine. The other ports of the switch are connected at 100Gb to other servers and also work fine.
As far as I can tell, the configuration is the same for both ports on the switch and on both servers.
Do you have any idea what could be happening, or can you recommend how to debug this?
Thank you.
RLM