Weird SFP 1Gb link issue between CRS328-4C-20S-4S+ (RouterOS) and netpower lite 7R

When it happened once, I thought it was just a bad device. But it happened again, in a different location, with the same device models and a similar setup.

Something causes the netpower lite 7R to issue DHCP requests very often, as if there were a very brief link down/up (none is seen in the CRS328-4C RouterOS logs). The intervals are somewhat random but average about 1-2 minutes. The weird part is that it depends on the link state of other SFP ports on the CRS328-4C.
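For anyone who wants to put numbers on "every 1-2 minutes", here is a minimal Python sketch that computes the gaps between consecutive DHCP requests. It assumes the request times have already been copied out of the DHCP server log as plain HH:MM:SS strings on the same day; the function name `request_intervals` and the sample timestamps are made up for illustration and are not from my logs.

```python
from datetime import datetime
from statistics import mean


def request_intervals(timestamps):
    """Return the gaps in seconds between consecutive HH:MM:SS timestamps.

    Assumes all timestamps fall on the same day and are in ascending order.
    """
    times = [datetime.strptime(t, "%H:%M:%S") for t in timestamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Hypothetical sample data, not taken from a real log.
sample = ["10:00:05", "10:01:10", "10:03:02", "10:04:01"]
gaps = request_intervals(sample)
print("gaps (s):", gaps)
print("mean gap (s):", round(mean(gaps), 1))
```

A steady gap would point at a timer (for example a lease renewal), while irregular gaps fit better with something event-driven such as occasional bad frames.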

The 328 had active links on ports sfp2, sfp3 and sfp4, with the netpower lite 7R connected to sfp4. This had been working for a long time. The issue (many DHCP requests from the 7R) only started after the sfp3 link went down (device disconnected at the other end). However, it only happens if multiple SFP modules are inserted in the 328, even with no active links on them; I couldn’t reproduce it with just one SFP module in one port.

Auto-negotiation versus forced 1Gb makes no difference. I think I may have seen “link down” on the 7R once, but only very briefly (the 7R has no logs, and the logs on the 328 at the other end don’t show any link-down events). Each time this happens, the Rx FCS error counter on the 7R SFP port also increments by one. Using just one port in the 328 fixed the issue, and so did using some other switch in place of the 328.
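The "one FCS error per DHCP request" observation could be checked more rigorously along these lines. This is only a sketch under assumptions: the Rx FCS counter is sampled by hand or by some poller into (time in seconds, counter value) pairs, DHCP request times are in seconds on the same clock, and the helper names and the 10-second window are hypothetical choices, not anything the devices provide.

```python
def fcs_increment_times(samples):
    """Return the sample times at which the FCS counter was seen to increase.

    samples: list of (time_seconds, counter_value) in ascending time order.
    """
    return [t1 for (t0, c0), (t1, c1) in zip(samples, samples[1:]) if c1 > c0]


def matched_requests(dhcp_times, fcs_times, window=10):
    """Return DHCP request times with an FCS increment within `window` seconds."""
    return [d for d in dhcp_times if any(abs(d - f) <= window for f in fcs_times)]


# Hypothetical data: counter polled every 30 s, three DHCP requests observed.
samples = [(0, 5), (30, 5), (60, 6), (90, 6), (120, 7)]
dhcp_times = [58, 119, 200]

fcs_times = fcs_increment_times(samples)
print("FCS increments seen at:", fcs_times)
print("DHCP requests near an increment:", matched_requests(dhcp_times, fcs_times))
```

The window has to be at least as wide as the polling interval's uncertainty, since an increment is only noticed at the next sample. Requests with no nearby increment would suggest a second cause.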

So it looks like the 328 sends frames with a bad FCS under some conditions (depending on the state of other, adjacent ports), and the 7R (SwOS Lite 2.20) over-reacts by sending a new DHCP request after each Rx FCS error. This happens on two different sets of hardware (328, 7R, SFP modules, single-mode fiber), so I no longer believe it was just one bad 328, as I previously assumed.

Has anyone else seen similar issues? It turned out I don’t need so many ports, so I will probably replace the 328 with a 310; hopefully the problem will not repeat there. It is a smaller board, so it should be easier to avoid long traces with signal integrity issues.