RB760iGS - Ethernet interfaces get locked when SFP is in use

This problem happens only when an SFP module is inserted, probably because the Ethernet ports then all share one lane while the SFP gets its own private lane to the CPU.
After some time (days or weeks, sometimes only hours) the other interfaces stop working: the RB still shows entries in its ARP table, but I can't ping or even arping the hosts connected to those interfaces.
If I do anything to those interfaces (like disable/enable), all ARP entries on the non-SFP interfaces are lost and never reappear.
Only a reboot fixes it.
This problem appears on several RB760iGS units (probably all that use SFP).
I created a support ticket more than a month ago and attached 2 supout.rif files. Do you think I got ANY reaction?
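
For anyone who wants to capture the locked state before rebooting, something like the following could be run from a terminal session over the still-working sfp1. This is only a sketch: `ether2` and `192.0.2.10` are placeholders for one of the locked ports and a host behind it, and if the port is part of a bridge the ARP entries will be listed under the bridge interface instead.

```
# Placeholders (assumptions): ether2 = a locked port, 192.0.2.10 = a host behind it
# ARP entries the router still holds for the affected interface
/ip arp print where interface=ether2
# ARP-level ping, to check whether the host answers at layer 2 at all
/ping 192.0.2.10 arp-ping=yes interface=ether2 count=3
# Per-port counters, to see whether rx/tx still increase on the locked ports
/interface ethernet print stats
# Link state and module details as seen by the SFP port
/interface ethernet monitor sfp1 once
```

Saving this output together with a supout.rif taken before the reboot should give support more to work with than a description alone.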

Is the SFP module a compatible one?
Can you share a network diagram?

The only way I have found to get near-stable operation on sfp1 on the hEX S is
to allow nothing but 1G full duplex, whatever compatible SFP module is used:

/interface ethernet
set [ find default-name=sfp1 ] advertise=1000M-full arp=enabled arp-timeout=\
    auto auto-negotiation=yes bandwidth=unlimited/unlimited disabled=no \
    full-duplex=yes l2mtu=1596 loop-protect=default loop-protect-disable-time=\
    5m loop-protect-send-interval=5s mtu=1500 \
    name=sfp1 rx-flow-control=off speed=1Gbps tx-flow-control=off

It should be, according to the wiki:

MikroTik devices and SFP/SFP+/QSFP+ modules do not have any restrictions for other vendor equipment. As long as the other vendor modules and devices comply with transceiver multi-source agreement (MSA) they should be compatible with MikroTik.

It's a common "Cisco compatible" SFP that we use in hundreds of other devices without any problem, including other RouterBOARD models.
But I can try a MikroTik SFP if it makes any difference…

Diagram is simple, something like this:
Juniper switch -- EdgeCore switch -- sfp1 on RB760iGS = WAN (172.17.21.5/24)
Bridge (ether1 - ether5) = LAN (/29 public IP routed via WAN using OSPF/BGP)
(or even without bridge, doesn’t make a difference)

Well, the connection is stable as far as sfp1 goes, but the other Ethernet ports get locked. I can still access the RB via SFP when that happens.
And we use only 1 Gbit full-duplex modules.

What SFP module do you use?

GigaLight 1.25G SM 1310/1550 (so like S-35/53LC20D)

I have the same problem with two RB760iGS.
It's strange, because I have six RB760iGS in total and the problem shows up on just two of them.
I cannot work out what the difference is between the routers that work well and the ones with the problem.

Has anyone found a solution?

I have fifteen RB760 in my network. They all worked fine using the SFP interface with firmware 6.48.6 until a few days ago.
Then suddenly a bunch of them started showing exactly the behavior you describe.
We upgraded the firmware to 6.49.10 and the problem persists.

Has anyone experienced this behavior with RouterOS v7?

I have the same problem. I have tried versions from v6 to v7.

V7 has more CPU resources
I do not recommend it with the SFP port in use until MikroTik gives us a version that fixes this.

Any news on this topic? I am using about 300-400 hEX S in my network and only some of them have the problem as described. Unfortunately I haven't found a solution yet.

Hi,

I encountered this issue this morning.

After updating a hEX S from 7.12.2 to 7.18.2, all ports started flapping, not just the SFP.

The router then starts rebooting with a kernel panic, and there's no way to get any information or support.

After a Netinstall back to v7.12.2, everything is back to normal.

I’m sure this line has something to do with it…

What’s new in 7.18 (2025-Feb-24 10:47):
*) sfp - fixed missing “1G-baseX” supported rate for NetMetal ac2 and hEX S devices;

Regards,

Did you upgrade both RouterOS and firmware, @wispmikrotik?

Yes, both (I always do).

I also tried updating only RouterOS, with the same result. Then the office opened for business and I had to leave it on v7.12.2.

Regards,
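
For reference, the RouterOS version and the RouterBOARD firmware are reported separately, so it is worth checking both after an upgrade. A minimal sketch of how that could be verified from the terminal:

```
# RouterOS version (see the "version" field)
/system resource print
# Bootloader firmware: compare "current-firmware" with "upgrade-firmware"
/system routerboard print
# If they differ, the firmware can be brought in line; it takes effect after a reboot
/system routerboard upgrade
/system reboot
```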

Hello, I have the same issue with the RB760iGS (hEX S). After updating to version 7.18.2, I get a kernel failure (kernel panic) and the router reboots continuously. I tried downgrading to versions 7.15.2 and 7.16.2; the kernel failure no longer occurs, but 1-2 times a week the router freezes and stops responding. The interfaces disappear, and I can't even see it in the neighbour list. The only solution is a physical reboot. Please let me know if there is any solution.

By the way, the problem also persists on the L009. After updating to v7.18, a kernel error occurs and the router reboots continuously when the SFP is connected. It works stably on version 7.16.2.

The support team asks for the supout.rif file… but it is impossible to extract.

I'm facing the same issue with RB760iGS units that use the SFP interface. I believe only one of them does not have this problem.
I tried several different firmware versions, and the only temporary solution I found was to use RJ45 media converters instead of the SFP port, which increases the cost and makes the device unviable.
That's unfortunate, because we planned to replace more than 200 RB951 with the 760. Now we are looking at similar solutions from competitors.

Please share any updates if you find a solution for this.

Odd, I have L009 using 7.18.2 connected via SFP to RB5009, no issues.

It's not that they don't want to help; they cannot without a supout file from your device.
If the device reboots due to a crash, there should be an autosupout.rif file.
Give them that.
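
A sketch of how that could look from the terminal once the router stays up long enough to log in. The file name `supout.rif` is just the usual default, and the file still has to be downloaded (for example via Winbox or FTP) before attaching it to the ticket:

```
# List any support output files already on the router, including autosupout.rif
/file print where name~"supout"
# Generate a fresh support output file manually
/system sup-output name=supout.rif
```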

So, I provided the supout file, and in return, I was given a development firmware version, which, after being installed on the MikroTik, seems to have resolved the issue. The watchdog timer issue still needs to be monitored over time, but for now, after about 2 hours of functionality, everything seems fine. At least the router no longer enters continuous reboot, and the interfaces no longer disappear. It only remains to be verified over time. I thank the support team for considering the issue and working on its resolution. I will provide an update after further testing.

Nice!

Which firmware are you using now? And is everything still fine?

Hi,

Version? 7.20ab2XX?

Yes

No, initially after installing the beta version, it worked for 2-3 hours, after which it restarted on its own with the log message ‘watchdog timer’. This continued to happen every 1-3 days, until on the 7th or 8th day it started restarting continuously, again showing a ‘kernel error’. I reported the issue to support on April 8th, but have not received a response to this day.

Right now I’m using version 7.16.2 — I get a reboot due to the watchdog every 2–3 days consistently, but it’s still better than continuous reboots and kernel errors…
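
To keep track of how often this happens, the watchdog-related entries and the watchdog settings could be checked after each reboot. This is only a sketch and assumes the reboot message contains the word "watchdog", as in the log line quoted above:

```
# Log entries written after a watchdog-triggered reboot
/log print where message~"watchdog"
# Current watchdog configuration, including whether an automatic supout is created
/system watchdog print
```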