millenium7
Member Candidate
Topic Author
Posts: 155
Joined: Wed Mar 16, 2016 6:12 am

CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Jun 12, 2019 12:02 pm

This has just happened out of the blue. All data to and from one of these routers goes through the SFPPlus1 port (connected with a direct attach cable to a MikroTik CRS328).
I went to site and logged into the router from a laptop over Ethernet before touching anything, and found the port had entirely stopped transmitting data. It was receiving fine but sat at 0 packets/s outbound.
It's not the cable, because it works when moved to port 2. We also have a second backup router in place with exactly the same configuration and physical setup; I moved its cable to the first router and the port again received packets but didn't transmit.

After rebooting the router, the port functioned fine for about 16 hours, then exactly the same thing happened (fortunately this time I had a script in place to check for it and reboot the router).
But this is a huge problem: this is a production router with a lot of traffic going through it. Obviously we'll RMA the device, but I want to know whether this has been reported before and whether there's any kind of known issue.
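
For anyone wanting to build a similar check, the detection logic can be sketched roughly as below. This is a minimal Python sketch of the idea only, not the actual script running on the router. The `Sample` type, the `tx_stalled` function and the three-sample threshold are all assumptions made for illustration, and the packet rates are expected to come from whatever polling method you already use.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One poll of an interface's packet rates (hypothetical structure)."""
    rx_pps: int  # packets per second received
    tx_pps: int  # packets per second transmitted


def tx_stalled(samples, min_consecutive=3):
    """Return True if the newest `min_consecutive` samples all show
    packets arriving (rx > 0) while nothing is being sent (tx == 0).

    Requiring several consecutive samples (an assumed threshold) avoids
    reacting to a single quiet polling interval. An idle link with
    rx == 0 is deliberately not treated as a stall.
    """
    if len(samples) < min_consecutive:
        return False
    recent = samples[-min_consecutive:]
    return all(s.rx_pps > 0 and s.tx_pps == 0 for s in recent)
```

With this, a history ending in three polls of "receiving but 0 packets/s outbound" returns True, while a healthy or completely idle port returns False. What to do on True (reboot, alert, cycle the port) is a separate decision.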

RouterOS version is 6.42.3 and has been stable for months.
The recent change prior to this happening was adding 2 more OSPF neighbors (and only about another 100 routes).
The router runs BGP + OSPF + MPLS and has about 152,000 routes in the routing table, so it shouldn't be overwhelmed. It has 16 GB of memory (most of it free) and 800 MB+ of free disk space, so I doubt it's a resource issue. Even if it were, I wouldn't expect the SFP+ port to just entirely stop transmitting data; if anything I'd expect the opposite if there were a routing loop or something going on.
Logs show nothing.
 
glueck05
just joined
Posts: 3
Joined: Fri Jan 26, 2018 12:49 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Jun 12, 2019 2:14 pm

Hello,
I have also seen this issue often, on a CCR1036 on sfp-sfpplus2 and on a CCR1072 (also connected via DAC). The port shows Running but does not send any data. When this happens I disable and re-enable the port via netwatch and it works again. A reboot is not necessary (in my case).

RouterOS 6.42.12.

regards
 
millenium7
Member Candidate
Topic Author
Posts: 155
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Jun 13, 2019 9:54 am

Cycling the interface isn't a solution, and for us it would still result in an extended outage as this router handles PPPoE connections.

I have replaced one router with the new CCR1036 revision that has dual power supplies, and updated both to 6.44.3 including firmware.
Will report back if it continues to lock up. If it does, we'll be ripping the routers out and replacing them with CCR1016s, as we've had fewer issues with those.
 
millenium7
Member Candidate
Topic Author
Posts: 155
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Jun 13, 2019 1:33 pm

Nope, new hardware revision and 6.44.3, still the same problem.

So it's very likely some bug in the hardware or underlying OS that produces no logs and gives us no information. I can't see how you could stop an SFP port from transmitting data through scripting or configuration, no matter what you tried, aside from a bridge filter rule (and that wouldn't clear itself after a reboot, whereas this does).

The recent changes were enabling OSPF (which was stable initially) and enabling MPLS to those peers. That shouldn't inherently cause the issue, but it may trigger some bug. I've reverted all changes for now; the router should still reboot overnight if the problem is still there. Regardless, we'll be swapping them out for 1016s.
 
millenium7
Member Candidate
Topic Author
Posts: 155
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Mon Jun 17, 2019 3:53 pm

Replaced with brand new CCR1016s and the same problem happens!
This is caused by either OSPF or MPLS in combination with what's already running (eBGP, iBGP, PPPoE, IPsec). When OSPF+MPLS are disabled it's fine. When they're enabled, the network is perfectly stable and looks totally fine for a few hours: literally no visible issues at all, definitely no routing loops or anything weird happening. Then, after a random interval of a few hours, the interface totally locks up in the transmit direction.

It's not a routing problem. I mean it totally locks up: not even Layer 2 neighbor hello packets go out. Dead. Disabling/enabling the interface doesn't seem to work; we have to reboot the router.
Routers were all running 6.44.3 and had /system routerboard upgrade applied as well

This is a big problem. Right now I'm thinking of throwing the CCRs in the bin and replacing them with Cisco. This has already cost way too much in downtime.
