CRS504-4XQ to Broadcom NetXtreme-E 25Gb unstable

I have a CRS504-4XQ-IN switch (firmware 7.19.2) connected to two Linux servers using an XQ+BC0003-XS+ (QSFP28 to 4x SFP28 breakout cable). I am only using cables 1 and 2; the other two are not connected. Both servers have dual-port Broadcom NetXtreme-E 25Gb cards:

01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)

On one of the servers everything works fine, but on the other the link goes down frequently and comes back up in less than a second. Today there was no problem but yesterday, for instance, it happened 5 times:

2025-07-01T08:06:04.948423+01:00 srv-nas1-dc kernel: [5477790.914659] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T08:06:05.200466+01:00 srv-nas1-dc kernel: [5477791.165280] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
2025-07-01T09:20:52.220462+01:00 srv-nas1-dc kernel: [5482278.180902] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T09:20:52.472467+01:00 srv-nas1-dc kernel: [5482278.431620] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
2025-07-01T09:20:52.980422+01:00 srv-nas1-dc kernel: [5482278.940879] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T09:20:53.232472+01:00 srv-nas1-dc kernel: [5482279.191608] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
2025-07-01T10:18:18.040470+01:00 srv-nas1-dc kernel: [5485723.992554] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T10:18:18.292463+01:00 srv-nas1-dc kernel: [5485724.243030] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
2025-07-01T15:36:35.164382+01:00 srv-nas1-dc kernel: [5504821.090564] bnxt_en 0000:01:00.0 en100g0: NIC Link is Down
2025-07-01T15:36:35.416365+01:00 srv-nas1-dc kernel: [5504821.340957] bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
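
To compare days or correlate these drops with anything logged on the switch side, it can help to pull the Down/Up pairs out of the kernel log with their durations. Below is a minimal sketch, assuming the log lines have exactly the format quoted above (ISO timestamp, hostname, `kernel:`, then the `bnxt_en` message). The function name `link_flaps` and the `SAMPLE` list are only illustrative, not part of any tool.

```python
import re
from datetime import datetime

# Matches bnxt_en link state lines in the format quoted above (assumed format).
LINK_RE = re.compile(
    r"^(?P<ts>\S+) \S+ kernel: \[\s*[\d.]+\] bnxt_en \S+ (?P<iface>\S+): "
    r"NIC Link is (?P<state>Down|Up)"
)


def link_flaps(lines):
    """Return a list of (iface, down_time, seconds_until_up) tuples."""
    flaps = []
    pending = {}  # iface -> timestamp of the last Down not yet matched by an Up
    for line in lines:
        m = LINK_RE.match(line)
        if not m:
            continue  # not a link state message
        ts = datetime.fromisoformat(m["ts"])
        if m["state"] == "Down":
            pending[m["iface"]] = ts
        elif m["iface"] in pending:
            down = pending.pop(m["iface"])
            flaps.append((m["iface"], down, (ts - down).total_seconds()))
    return flaps


# First Down/Up pair from the log above, used as sample input.
SAMPLE = [
    "2025-07-01T08:06:04.948423+01:00 srv-nas1-dc kernel: [5477790.914659] "
    "bnxt_en 0000:01:00.0 en100g0: NIC Link is Down",
    "2025-07-01T08:06:05.200466+01:00 srv-nas1-dc kernel: [5477791.165280] "
    "bnxt_en 0000:01:00.0 en100g0: NIC Link is Up, 25000 Mbps full duplex, "
    "Flow control: ON - receive & transmit",
]

for iface, down, secs in link_flaps(SAMPLE):
    print(iface, down.isoformat(), f"down for {secs:.3f}s")
```

On a real system the same function could be fed the lines of the kernel log file instead of `SAMPLE`; the path depends on the distribution, so it is left out here.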

The second port of each Broadcom card is connected directly to the other server. This link works fine. The other ports of the switch are connected at 100Gb to other servers and also work fine.

The configuration seems to be the same for both ports on the switch and on the servers.

Do you have any idea what could be happening, or any recommendations on how to debug this?

Thank you.

RLM

Does anyone know how we can debug this problem?

Thank you in advance for any help you can provide!

Rúben

Try connecting breakout cables #3 or #4 to the server that is having issues?

I will, of course, but unfortunately the equipment is 250 km away, so I was looking for some remote debugging tips before driving to the datacenter.

No logs on the switch?