millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Jun 12, 2019 12:02 pm

This has just happened out of the blue. All data is transmitted to/from one of these routers via the SFPPlus1 port (connected with a Direct Attach Cable to a MikroTik CRS328).
I went to site and logged into the router via ethernet/laptop before touching anything, and found the port had just entirely stopped transmitting data. It was receiving OK but sat on 0 packets/s outbound.
It's not the cable, because it works when moved to port 2. We also have a second backup router in place with exactly the same configuration and physical setup; I moved the cable from it to the first router, and again the port receives packets but doesn't transmit.

After rebooting the router, the port functioned fine for about 16 hours then exactly the same thing happened (fortunately this time I wrote a script to check for it and reboot the router)
But this is a huge problem, this is a production router with a lot of traffic going through it. Obviously we'll RMA the device but I want to know if this has been reported before, if there's any kind of known issue

RouterOS version is 6.42.3 and it has been stable for months.
The recent change prior to this happening was adding 2 more neighbors via OSPF (and only about another 100 routes).
The router runs BGP + OSPF + MPLS and has about 152,000 routes in the routing table, so it shouldn't be overwhelmed. It has 16 GB of memory (most of it free) and 800 MB+ of free disk space, so I doubt it's a resource issue. Even if it was, I wouldn't expect the SFP port to just entirely stop transmitting data; if anything it would be the opposite if there was a routing loop or something going on.
Logs show nothing
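
The check-and-reboot script mentioned above is not shown in the thread. As a rough sketch of the detection logic only, assuming the interface counters are polled at a fixed interval (how they are collected, e.g. via SNMP or the RouterOS API, is left out), a stall could be flagged when RX keeps advancing while TX stays flat for several consecutive polls. The names `Sample` and `tx_stalled` and the default of 3 intervals are hypothetical, not taken from the original script:

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    """One poll of an interface's cumulative packet counters."""
    rx_packets: int
    tx_packets: int


def tx_stalled(samples: Sequence[Sample], min_intervals: int = 3) -> bool:
    """Return True if each of the last `min_intervals` polling intervals
    shows RX advancing while TX did not move at all.

    Requiring RX to advance avoids flagging a link that is simply idle.
    """
    if len(samples) < min_intervals + 1:
        return False  # not enough history to judge
    recent = samples[-(min_intervals + 1):]
    for prev, cur in zip(recent, recent[1:]):
        rx_delta = cur.rx_packets - prev.rx_packets
        tx_delta = cur.tx_packets - prev.tx_packets
        # Any TX movement (or a counter reset) means the port is not stuck.
        if rx_delta <= 0 or tx_delta != 0:
            return False
    return True


if __name__ == "__main__":
    healthy = [Sample(100, 90), Sample(200, 180), Sample(300, 270), Sample(400, 360)]
    stuck = [Sample(100, 90), Sample(200, 90), Sample(300, 90), Sample(400, 90)]
    print("healthy port stalled?", tx_stalled(healthy))
    print("stuck port stalled?", tx_stalled(stuck))
```

Waiting for a few consecutive flat intervals is a deliberate trade-off: it delays the reaction slightly but makes a reboot on a briefly quiet link less likely.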
 
glueck05
newbie
Posts: 37
Joined: Fri Jan 26, 2018 12:49 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Jun 12, 2019 2:14 pm

Hello,
I have also seen this issue often on a CCR1036 on sfp-sfpplus2 and on a CCR1072 (also connected via DAC). The port shows Running but does not send any data. When this happens I disable and enable the port via netwatch and it works again. A reboot is not necessary (in my case).

RouterOS 6.42.12.

regards
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Jun 13, 2019 9:54 am

Cycling the interface isn't a solution, and for us it would still result in an extended outage as this router handles PPPoE connections.

Have replaced 1x router with the new CCR1036 revision that has dual power supplies, and updated both to 6.44.3 including firmware.
Will report back if it continues to lock up. If it does, we'll be ripping the routers out and replacing them with CCR1016s, as we've had fewer issues with those.
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Jun 13, 2019 1:33 pm

Nope, new hardware revision and 6.44.3 still same problem

So it's very likely some bug in the hardware or underlying OS that produces no logs and gives us no information. I can't see how you could possibly stop an SFP port from transmitting data no matter what you did via scripting or configuration, aside from a bridge filter rule (which wouldn't fix itself after a reboot).

The recent changes were enabling OSPF (which was stable initially) and enabling MPLS to those peers. That inherently won't cause the issue, but it may trigger some bug. I've reverted all changes for now; the router should still reboot overnight if the problem is still there. Regardless, we'll be swapping them out for 1016s.
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Mon Jun 17, 2019 3:53 pm

Replaced with brand new CCR1016s and the same problem happens!
This is caused by either OSPF or MPLS in combination with what's already running (eBGP, iBGP, PPPoE, IPSec). When OSPF+MPLS are disabled it's fine. But when enabling them the network is perfectly stable and looks totally fine for a few hours, literally no visible issues at all, definitely no routing loops or anything weird happening. Then after a random interval of a few hours the interface totally locks up in the transmit direction

Not a routing problem. I mean it totally locks up: not even Layer 2 neighbor hello packets go out, dead. Disabling/enabling the interface doesn't seem to work, I have to reboot the router.
Routers were all running 6.44.3 and had /system routerboard upgrade applied as well

This is a big problem. Right now I'm thinking throw the CCRs in the bin and replace with Cisco. This has already cost way too much in downtime.
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Tue Jun 18, 2019 2:04 am

We have this problem, but for us it happens every 30-90 days or so. It last happened 57 days ago. We have a ping watchdog to reboot the router when this happens. Disabling and re-enabling the interface might fix it too. Same CCR1036-8G-2S+, first generation. We have two CCRs connected to each other; one is a PPPoE concentrator, the other is not. The one that is not a PPPoE concentrator has no issues. Both run MPLS and OSPF.

We were soon going to be replacing the device with a CCR1072.
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Tue Jun 18, 2019 12:34 pm

mducharme wrote:
> We have this problem, but for us it happens every 30-90 days or so. It last happened 57 days ago. We have a ping watchdog to reboot the router when this happens. Disabling and re-enabling the interface might fix it too. Same CCR1036-8G-2S+, first generation. We have two CCRs connected to each other; one is a PPPoE concentrator, the other is not. The one that is not a PPPoE concentrator has no issues. Both run MPLS and OSPF.
>
> We were soon going to be replacing the device with a CCR1072.

This was happening to us every 1-12 hours, extremely disruptive to the network.

We have had this combination of technologies in various forms as we've changed the network layout over time. 1.5 years ago we did have OSPF + MPLS + PPPoE + BGP running on CCR1036s at 3 different sites and it was working fine.
Now if we try to do the same with everything running on the same router, the interface will lock up. The only differences are that previously BGP was only receiving default routes, whereas now we get much larger BGP routing tables, more PPPoE connections, a larger OSPF network, and we are using SFP+ instead of ethernet interfaces.

I don't know exactly what the problem is. If it's PPPoE in combination with everything else then great, we can simply remove PPPoE from it; that's the easiest thing to do.
My plan is to remove PPPoE from that router anyway and bring it as close to every customer as possible, because it provides for easier QoS, faster reconnection as PPPoE won't drop, and shorter paths to the closest edge router when there's a routing failure. Hopefully that will fix the problem, but I'm not going to do anything for a while except plan. We already have some very angry customers who have had to put up with continued disconnections for days.

I may end up just getting rid of the MikroTik routers at key locations and using something else instead. Starting to get too many issues with MikroTik. Good for distribution and customer equipment, not so good for the core.
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Mon Jun 24, 2019 8:16 am

I set up a lab using 1 of the existing routers, leaving the config exactly the same. Used other devices to simulate switches and other routers.
Set up BGP+OSPF+MPLS routers as well as I could, but obviously not as big as the actual network. Added 200 PPPoE sessions with traffic generator across several routers to just send traffic all over the place. Set up a fake BGP router to inject global routing tables and blackhole all traffic to it. Simulated flapping PPPoE connections every few seconds. Also had traffic generator send from the CCR to another CCR (in place of the CRS317 switch, as I don't have a spare lying around) and ran a total of 9.5 Gbit/s constantly through it, using the same DAC cable.

It has been stable for days. It's definitely not an issue with how things are configured.
Maybe it's very specific to having a CRS317 on the other end of the CCR. Maybe it only happens when there's a certain number of active routes in the OSPF or MPLS table, who knows. But I can't replicate the issue, so I can't find a workaround. That's a big problem, as I don't want to try this on a production network again unless I can be sure it's going to be stable.
 
WirelessDSL
newbie
Posts: 38
Joined: Thu Nov 24, 2011 12:43 pm
Location: Germany

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Fri Mar 20, 2020 8:32 pm

Did anyone find a solution for this problem?

We experience the same issue.
2x CCR1072 connected to each other (v6.46.4, also with updated firmware)
- 3m MikroTik S+DA0003 -> traffic suddenly stops (after between 5 min and approx. 16 h)
- MikroTik SFP+ S+31DLC10D with LC patch cable -> traffic suddenly stops (after between 5 min and approx. 16 h)
- Ports tested with and without autonegotiation

The routers are configured with OSPF+MPLS. Around 130 OSPF routes.

A third router, a CCR1016-12S-1S+ (v6.46.4, also with updated firmware), is connected to one of the CCR1072s with a DAC (S+DA0003). It also runs OSPF+MPLS and works without issues.

I can't reproduce it. The traffic just suddenly stops. No Layer 2/3 connectivity, no neighbour discovery possible. No log entries.

Sometimes disabling and re-enabling the interface works. But sometimes I have to reboot one device to get it working again.

Any ideas?
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Sat Mar 21, 2020 12:50 am

I found no solution, and given the number of outages and customer issues this caused, I'll never be trying it again.
We've had to keep those core routers entirely OSPF and MPLS free. As PPPoE is still terminated on those routers, this means we lose automatic failover if a major site goes down, and we have to manually move VPLS tunnels to another main site that links to the core.

This is not a great solution at all, but the network has been stable enough that it hasn't been a major problem. We get alerts immediately when a BGP session to those core routers goes down, so it only takes at most 5 minutes to move all the tunnels over

The plan is to eventually move 90% of our customers over to a DHCP Option 82 based system. The hurdle has been route injection, to allow /32 addresses to be assigned to customers anywhere in the network without having to do it manually (yuck). I managed to write a script to handle that recently.
And for the rest of the customers, no more VPLS tunnels. Their PPPoE will terminate on the closest router not at a centralized location
This way the core can only have eBGP to the internet, and iBGP to major distribution sites. All customer traffic will be regular IP traffic which solves the failover problem of VPLS tunnels
 
WirelessDSL
newbie
Posts: 38
Joined: Thu Nov 24, 2011 12:43 pm
Location: Germany

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Sat Mar 21, 2020 3:18 pm

Thanks for the reply.

I opened a ticket with MikroTik, with a link to this thread and with supouts.
Maybe they'll spend some time looking into it.

With 1G connections/routers we never saw this problem. I think it's related to the CCR1072, maybe a firmware issue. With the CCR1016 it hasn't happened so far.
But with no logs, no debugging is possible.

Your network concept with DHCP Option 82 sounds great, but I love the flexibility of MPLS, and 99% of the time it is stable enough.
I don't want to think about changing the network concept ;)
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Sun Mar 22, 2020 12:42 am

The biggest benefit of DHCP, both for us and for customers, is that they can just take any router straight out of the box, plug it in and bam, immediately have internet access, as almost all routers are configured for DHCP by default. They can factory reset it and it still works just fine. Because MikroTik routers don't have very good WiFi coverage, this makes life a whole lot easier when we can just tell them to go buy one with big antennas from a store and plug it in, problem solved. So far every router I've tried has had no problem with /32 assignments, meaning no wasted IPs.

The next biggest benefit is that the connection does not need to be 'established'. A brief outage or a change of data path on a PPPoE connection can mean that data stops flowing, and the circuit has to time out and then reconnect before it can flow again. MikroTik is the fastest at this, but even so it's often a few seconds, which is long enough to effectively kill a VoIP session, and most other vendors are painfully slow, upwards of 30 seconds. And when the PPPoE connection drops it flushes connections, which can result in website timeouts etc. The net experience for the customer is a bit worse.
Whereas with DHCP/straight IP, traffic is treated on a per-packet basis, with no need to re-establish a circuit. If the data path changes, even multiple times, it doesn't matter as the packets will still get to their destination. And the recovery time from a link failure is practically the same as the time it takes for the link to come back up, no additional waiting.

The final benefit is traffic engineering. You cannot separate traffic inside a PPPoE tunnel, it all flows exactly the same way. We have quite a few 24 GHz and 60 GHz links that go down in the rain. They are backed up by 5 GHz and the current failover is only ~200 ms, but now I can separate VoIP to always use the 5 GHz link when available, and I can reserve bandwidth for VoIP.

The last 2 can mostly be achieved by just moving the PPPoE session as close to the customer as possible, but it's still not as good.
 
WirelessDSL
newbie
Posts: 38
Joined: Thu Nov 24, 2011 12:43 pm
Location: Germany

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Sun Mar 29, 2020 11:34 pm

Update:

I changed the ports on CCR1072.

1st Router
Before: SFPPlus 1-3 (with OSPF+MPLS)
After: SFPPlus 6-8 (with OSPF+MPLS)

2nd Router
Before: SFPPlus 2 and 3 (with OSPF+MPLS)
After: SFPPlus 7 and 8 (with OSPF+MPLS)

It has now been working for two days without any issues. I'll have to wait a few more days to see how it holds up.

I found another thread which pointed me to this possible solution.
viewtopic.php?t=102946

So it seems there are some issues with OSPF+MPLS on ports 1-4. Maybe a hardware issue.
 
glueck05
newbie
Posts: 37
Joined: Fri Jan 26, 2018 12:49 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Mon Apr 06, 2020 9:59 am

@WirelessDSL: Has it been stable for the last week?

thanks,
glueck
 
WirelessDSL
newbie
Posts: 38
Joined: Thu Nov 24, 2011 12:43 pm
Location: Germany

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Mon Apr 06, 2020 10:56 am

So far everything is fine.

Hope for the best.
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Apr 08, 2020 10:23 am

This happened AGAIN in our network at a different location, but to 'ethernet' ports this time. So this bug seemingly doesn't care if it's ethernet or SFP modules.
This happened on a CCR1009-7G-1C-1S+.

That site has had issues with VPLS tunnels randomly dropping off over the past couple of months. I very thoroughly combed the MPLS labels on every router in the path, checked OSPF, checked everything I could. There's absolutely no issue, traceroutes show it's using MPLS just fine, yet the tunnels just will not come up. Reboot the router and they work....... for a short while, then they start to drop off 1 by 1 for absolutely no reason again.

A couple of months pass by and we reach today, where ether1 & ether3 just completely stop transmitting data, the exact same situation as in the original post. They receive fine and can see neighbors, but can't MAC ping or anything. TX bytes remain at 0 forever.
Those ports were not bridged and had nothing in common. After rebooting the router the ports work perfectly fine for 1-10 minutes, then suddenly just stop transmitting.

I've narrowed the issue down to either MPLS or LDP, one or the other. But since you can't viably use MPLS without LDP, I just have to abandon MPLS entirely.
This is a HUUUUUUUUUUUGE problem MikroTik, holy hell. I understand not having some enterprise features, but when you have a bug like this that cripples the network on a supposedly supported feature, and show absolutely zero response to it, it's just not acceptable (yep, we submitted multiple supouts and did all the troubleshooting ourselves).

Our immediate solution is to completely disable MPLS on that router and use EoIP tunnels to all the sites, and between all sites that transit through that router.
Our next move is to rip MPLS out of the entire network, as it's been slowly causing weird behavior like this for seemingly no reason. All PPPoE sessions will be terminated as close to the customer as possible to remove the need for VPLS tunnels.
Yet we have issues with data throughput rate on RB3011s (already made a topic about it). So our longer term solution may be to ditch MikroTik entirely in our distribution and core network and go elsewhere. It seems that at a certain scale things just break, and there are too many little problems with huge consequences. When this happened last time it cost us several customers, just not worth it.
 
WirelessDSL
newbie
Posts: 38
Joined: Thu Nov 24, 2011 12:43 pm
Location: Germany

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Apr 08, 2020 11:10 am

I got an answer from Mikrotik about the issue with stopping traffic on interfaces.

"It does not look like a hardware-related issue. Seems some similar issues have been reproduced in our labs, when suddenly Tx traffic stopped on a physical interface and it is related to L2MTU handling on the device, we will try to improve this in further RouterOS versions, but at moment we cannot say any ETA.
In our tests, it seems like work around helps if you simply increase the maximum L2MTU on some interface (it can be even an unused interface) and then restore it to a default value. For example, try to enter these commands:
/interface ethernet set sfp-sfpplus8 l2mtu=10222
/interface ethernet set sfp-sfpplus8 l2mtu=1580
It will create a short link down on all interfaces and after this procedure this issue should not appear more.
If you reboot or upgrade your router, then you should follow the same procedure again until we include a improvements in further RouterOS versions.
Please share your feedback if this stops the interface hang.
"

Maybe this could solve the issue temporarily.

I'll try that if it happens again.
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Apr 09, 2020 2:32 am

That's not a great fix.
But would simply increasing the L2MTU and not restoring it back down help? There is no harm in setting L2MTU to max. In fact I don't know why it isn't set to maximum by default (that goes for every single device on the market). Nothing will ever send larger L2 frames unless specifically told to, i.e. you start stacking on lots of VLAN tags, MPLS/VPLS, PPPoE, increasing L3MTU, using IP packing etc. Even L2 protocols like ARP use small packets; they don't suddenly become super large and pose an issue. Oversized L2 frames just get silently dropped with no communication mechanism, hence all the protocols are built around assuming a certain universally accepted L2 MTU size anyway.
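
To make the stacking argument concrete, here is a rough arithmetic sketch. It assumes the usual header sizes (4 bytes per 802.1Q VLAN tag, 4 bytes per MPLS label, 8 bytes for PPPoE plus PPP, and 14 bytes of inner Ethernet header plus an optional 4-byte control word for a VPLS pseudowire), and it assumes L2MTU counts everything after the outer MAC header. The function name and its parameters are hypothetical:

```python
VLAN_TAG = 4         # bytes per 802.1Q tag
MPLS_LABEL = 4       # bytes per MPLS label
PPPOE = 8            # 6-byte PPPoE header + 2-byte PPP protocol field
INNER_ETHERNET = 14  # customer MAC header carried inside a VPLS pseudowire
CONTROL_WORD = 4     # optional pseudowire control word


def required_l2mtu(ip_mtu, vlan_tags=0, mpls_labels=0, pppoe=False,
                   vpls=False, control_word=True):
    """Bytes needed after the outer MAC header to carry one full-size IP packet."""
    size = ip_mtu
    if pppoe:
        size += PPPOE
    if vpls:
        size += INNER_ETHERNET
        if control_word:
            size += CONTROL_WORD
    size += mpls_labels * MPLS_LABEL
    size += vlan_tags * VLAN_TAG
    return size


if __name__ == "__main__":
    cases = {
        "plain IP": required_l2mtu(1500),
        "VPLS, 2 labels, 1 VLAN tag": required_l2mtu(
            1500, vlan_tags=1, mpls_labels=2, vpls=True),
        "PPPoE inside VPLS, 2 labels, 1 VLAN tag": required_l2mtu(
            1500, vlan_tags=1, mpls_labels=2, pppoe=True, vpls=True),
    }
    for name, size in cases.items():
        print(f"{name}: {size} bytes, fits in 1580: {size <= 1580}")
```

Under these assumptions, a 1500-byte packet in a VPLS tunnel with two labels on a tagged link needs 1530 bytes, which on paper still fits within the 1580 value from MikroTik's example commands.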
 
emmabnt03
just joined
Posts: 1
Joined: Mon Aug 10, 2020 6:27 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Aug 13, 2020 3:23 am

I have the same problem. It happened after an update. When the CCR1036-8G-2S+ was installed (as a replacement for an RB1100) it had this fault from the beginning. It was solved by reinstalling with Netinstall, and then it worked for 4 months. I will try this interim solution and report back on whether it holds up. It has not given too many headaches.
 
jmallari76
just joined
Posts: 6
Joined: Sat Jun 11, 2011 8:51 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Sun Nov 29, 2020 6:46 pm

Did you guys find a solution to this? Please advise.
 
Noll26
just joined
Posts: 2
Joined: Sat Oct 02, 2021 2:26 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Mon Oct 04, 2021 4:10 pm

Does RouterOS have an update for this problem? I'm on 6.48.4 with a CCR1036 and the problem happens approximately every 30 days. The router runs OSPF and EoIP tunnels.
 
millenium7
Long time Member
Topic Author
Posts: 538
Joined: Wed Mar 16, 2016 6:12 am

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Oct 06, 2021 12:46 pm

That's not good to hear it still occurs......

I haven't touched the network topology and have been considering changing it all back to how it logically should be, but if this is still happening today then no chance..... this is hugely service impacting.
Think I lost 5 years of my life last time, not game to try again easily.
 
WirelessDSL
newbie
Posts: 38
Joined: Thu Nov 24, 2011 12:43 pm
Location: Germany

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Oct 06, 2021 1:08 pm

millenium7 wrote:
> That's not good to hear it still occurs......
>
> I haven't touched the network topology and have been considering changing it all back to how it logically should be, but if this is still happening today then no chance..... this is hugely service impacting.
> Think I lost 5 years of my life last time, not game to try again easily.

After changing ports 1-4 to 5-8 on the CCR1072 we never saw this problem again. Everything is running smoothly.
But we have never had this problem with the CCR1036 so far.

Noll26 wrote:
> Does RouterOS have an update for this problem? I'm on 6.48.4 with a CCR1036 and the problem happens approximately every 30 days. The router runs OSPF and EoIP tunnels.

Is it hardware rev 1 or 2? Which SFP+ modules are you using and what's the signal? Maybe a temperature issue?
Just a few questions that will hopefully point you in the right direction.......
 
mjoksimovic
newbie
Posts: 46
Joined: Thu Jun 19, 2008 1:19 pm
Location: Serbia

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Fri Oct 29, 2021 1:16 am

We have the same problem. CCR1036-8G-2S+. PPPoE connections drop every few hours or minutes. The router freezes constantly. We replaced it with a new one, same revision (REV2), and the problem remains even with a new configuration built step by step instead of restoring the old one. Only a physical restart helps, for the next several hours.

We put the old CCR1009-8G-1S-1S+ back and everything works perfectly stably. We will add one more 1009 and solve the problem that way. This is definitely a hardware problem with the 1036. Now I have two 1036 devices gathering dust in my room. Wasted money and lost several customers.

Shame.
 
oeyre
Member Candidate
Posts: 137
Joined: Wed May 27, 2009 12:48 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed May 17, 2023 6:54 am

Hi,

Did anybody get a proper resolution on this? Please let me know your ticket refs as I am dealing with MikroTik support ATM (SUP-116184).

I recently changed dark fibre providers and I am having a very similar issue for 2 of my CCR1036-8G-2S+ (original HW rev), both when the SFP is on sfpplus1. I also have a 3rd CCR (same specs/rev) which is using sfpplus2 that does not have the issue! In 1 case, sfpplus1 was already working successfully with a different vendor SFP but has this issue with the new one.

In my case the issue happens within 30-120 seconds of putting traffic on the link however, not after x days.

Very interesting...
 
oeyre
Member Candidate
Posts: 137
Joined: Wed May 27, 2009 12:48 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Fri May 26, 2023 7:19 pm

Quick update regarding my issue:
-moved SFP from sfpplus1 to sfpplus2, no change
-MikroTik support noted that FCS errors could cause this which was happening on 1 of my 2 1036 with problems, cleaned SFP/fibre/slot and stopped FCS errors, no change
-swapped in a known good and working (2 weeks) SFP from the 1 site of 3 with no issues, no change
-MikroTik support suggested upgrading to 6.48.7 (was running 6.48.4), no change
-tried setting the port to sfp-rate-select=low, no change
 
oeyre
Member Candidate
Posts: 137
Joined: Wed May 27, 2009 12:48 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Jun 21, 2023 4:22 am

I just thought I'd share another update about this problem...

After accidentally triggering this problem on a different 1036 with a SFP that had been running fine for years, I did some testing regarding the MTU settings. This was a hunch based on comments from this thread with a similar issue: viewtopic.php?t=84824

The MTU settings before my testing were:
Physical interface - L2 MTU: 1580
VLAN on physical interface with IP/LDP - IP MTU: 1500
MPLS - MPLS interface MTU: 1580

Ultimately I found that by changing only the L2MTU of the physical interface to the max value (10222 on CCR1036) I have managed to stop the transmit lockup from happening now for several weeks and counting.

MikroTik advises they are investigating the problem. I suspect there is some kind of weird interaction between RouterOS and the underlying hardware wherein after the OSPF/LDP table gets larger than a certain size (for example, the IP MTU) it somehow crashes the interface when trying to reassemble. This is just my theory, not based in any quantifiable or data backed research.

For reference my feature mix is as follows:
  • RouterOS v6 long-term
  • IPv4
  • OSPF (v2) with BFD
  • MPLS/LDP
  • BGP route reflector client
  • BGP signaled VPLS
 
WirelessDSL
newbie
Posts: 38
Joined: Thu Nov 24, 2011 12:43 pm
Location: Germany

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Wed Jun 21, 2023 4:32 pm

WirelessDSL wrote:
> I got an answer from Mikrotik about the issue with stopping traffic on interfaces.
>
> "It does not look like a hardware-related issue. Seems some similar issues have been reproduced in our labs, when suddenly Tx traffic stopped on a physical interface and it is related to L2MTU handling on the device, we will try to improve this in further RouterOS versions, but at moment we cannot say any ETA.
> In our tests, it seems like work around helps if you simply increase the maximum L2MTU on some interface (it can be even an unused interface) and then restore it to a default value. For example, try to enter these commands:
> /interface ethernet set sfp-sfpplus8 l2mtu=10222
> /interface ethernet set sfp-sfpplus8 l2mtu=1580
> It will create a short link down on all interfaces and after this procedure this issue should not appear more.
> If you reboot or upgrade your router, then you should follow the same procedure again until we include a improvements in further RouterOS versions.
> Please share your feedback if this stops the interface hang.
> "
>
> Maybe this could solve the issue temporarily.
>
> I'll try that if it happens again.

That's what MikroTik told us .... 3 years and still counting.
 
oeyre
Member Candidate
Posts: 137
Joined: Wed May 27, 2009 12:48 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Jun 22, 2023 6:12 am

WirelessDSL wrote:
> That's what MikroTik told us .... 3 years and still counting.

If I am reading your post correctly, the advice from MikroTik was to change the L2MTU to 10222, and then immediately change it back again to 1580? This is effectively the same thing as disabling/enabling the interface and would clear the problem.

What I did was set the L2MTU to 10222 and then leave it that way. So far after nearly 3 weeks the problem has not happened again where it used to occur within 2 minutes of putting traffic on the link.

Do you mind telling what features you are using on your network and if it is 100% MikroTik? Mine is mostly MikroTik but with some Juniper acting as a core/P router participating in OSPF/BFD/LDP but not BGP or VPLS.
 
mkx
Forum Guru
Posts: 11433
Joined: Thu Mar 03, 2016 10:23 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Jun 22, 2023 9:05 am

oeyre wrote:
> What I did was set the L2MTU to 10222 and then leave it that way. So far after nearly 3 weeks the problem has not happened again where it used to occur within 2 minutes of putting traffic on the link.

I was always wondering what is the benefit of setting L2MTU to some "tight" value instead of leaving it at maximum value, supported by hardware. Since this value varies pretty wildly between different vendors (and device models), I don't see any practical way of getting it set to the same value on all devices. And if it's not the same on all devices, then setting it to anything but maximum value is futile.
 
WirelessDSL
newbie
Posts: 38
Joined: Thu Nov 24, 2011 12:43 pm
Location: Germany

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Thu Jun 22, 2023 1:53 pm

oeyre wrote:
> WirelessDSL wrote:
> > That's what MikroTik told us .... 3 years and still counting.
>
> If I am reading your post correctly, the advice from MikroTik was to change the L2MTU to 10222, and then immediately change it back again to 1580? This is effectively the same thing as disabling/enabling the interface and would clear the problem.
>
> What I did was set the L2MTU to 10222 and then leave it that way. So far after nearly 3 weeks the problem has not happened again where it used to occur within 2 minutes of putting traffic on the link.
>
> Do you mind telling what features you are using on your network and if it is 100% MikroTik? Mine is mostly MikroTik but with some Juniper acting as a core/P router participating in OSPF/BFD/LDP but not BGP or VPLS.

I think the point is that just changing the L2MTU value makes the software "do things", and it will then work until the next reboot. The actual value doesn't matter.

For us the problem was solved by changing ports (CCR1072) from 1-4 to 5-8, and it has worked without problems since. We have never had it again so far.
Our core/backbone network is MikroTik only and we use OSPF/MPLS/LDP (no BFD), plus BGP to speak to the world.
 
oeyre
Member Candidate
Posts: 137
Joined: Wed May 27, 2009 12:48 pm

Re: CCR1036-8G-2S+ - SFP+ port stops transmitting data?

Sat Jun 24, 2023 2:15 pm

mkx wrote:
> I was always wondering what is the benefit of setting L2MTU to some "tight" value instead of leaving it at maximum value, supported by hardware.

There isn't really any valid reason I can think of not to max out the L2/bridging MTU on your network. You are only causing yourself potential problems later (see above).

I think 1580 might have been some default at some point, who knows... This particular network was inherited with those MTU settings. If I had been the one setting it all up in the beginning I would have gone max everything from day 1, but you play the hand you got dealt. Given that fiddling around with the MTU causes all ports to bounce, and given that "everything was working fine", it was a low priority on my todo list to go around and raise all the MTUs network wide.

WirelessDSL wrote:
> I think the point is that just changing the L2MTU value makes the software "do things", and it will then work until the next reboot. The actual value doesn't matter.

Riddle me this, then. If the value is not important, why does mine always trigger again within 30-120 seconds at 1580, but when setting the max value I am now at 3 weeks and counting of it not happening? I think there is more to it than just freshening up the memory, or whatever you suggest might be happening when flipping the MTU back and forth.
