Yesterday I replaced an RB1100AH running v6.30.2 with a brand new CCR1009-8G-1S-1S+ router.
Looked in the log today and found it filled with FCS errors.
All Ethernet cables are almost new and connect to new routers; on the RB1100AH they showed no errors and linked at gigabit (auto-negotiation on).
Now these failures?
I installed the CCR with 6.32.2 and then reverted to 6.30.2 as the most stable version, but the problem didn’t disappear…
It can’t be the cables, they were all fine on the rb1100AH and basically all new…
It can’t be the software? 6.32.2 or 6.30.4 makes no difference. And most of my other routers now work fine with either one of these two versions…
I ran 2 ping tests with a 100ms timeout to the adjacent router and that is fine. The 1st uses 100ms and a 50 byte packet size, the 2nd uses 100ms but a 1500 byte packet size. On the latter I see an occasional timeout and about 1% packet loss…
The interface stats also show the errors: 2100 errors in 25 minutes of uptime…
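To put that counter in perspective, here is a minimal sketch of the arithmetic in Python. The function name `errors_per_minute` is made up for illustration; the only inputs are the two figures quoted above (2100 errors, 25 minutes of uptime).

```python
def errors_per_minute(error_count: int, uptime_minutes: float) -> float:
    """Average number of errors per minute over the observed uptime."""
    if uptime_minutes <= 0:
        raise ValueError("uptime_minutes must be positive")
    return error_count / uptime_minutes


# Figures from the post: 2100 errors in 25 minutes of uptime.
rate = errors_per_minute(2100, 25)
print(f"{rate:.0f} errors/min, one roughly every {60 / rate:.2f} s")
# prints "84 errors/min, one roughly every 0.71 s"
```

This is only an average; it says nothing about whether the errors arrive evenly or in bursts.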
Do I have a failing CCR? Where is this error coming from?
Nobody? MikroTik? Support Ticket#2015100466000162 was sent last week Thursday, still no answer?
Traffic seems to flow normally. When I run a ping over the link there are no delays and no packet loss…
The router on the other end shows no issues on the same connection.
1) After the 1st install the router showed FCS errors on 6 out of the 8 gigabit connections. Later most of them dropped to a very occasional error, but port 7 keeps going.
This router served as a replacement for the slower RB1100AH, but that one didn’t have a single FCS error on the same cables…
2) Since the ports are gigabit, they have to be left on auto-negotiation or the link simply doesn’t work. (MT and internet documentation state it has to be ‘auto’!)
3) I already reverted to 6.30.4 to no avail, then went back up to 6.32.2, and the latest firmware is running.
In our lab we have a CCR1009 connected to MikroTik, Cisco and HP network equipment on copper and fiber without issue…unfortunately you probably got a bad CCR.
There was a batch of bad CCR1009s early on and it’s possible you got one that’s been on the shelf for a while as all the new CCR1009s shipping seem to be pretty solid.
Is this router plugged into an AirFiber? I have a CCR1036 that spews this error no matter what when an AirFiber is attached. Almost the same scenario: I replaced an 1100AHx2 with the CCR1036.
All links / Ethernet cables leave from the CCR1009 over some 15 meters of FTP cable (all well grounded on both ends…) to a tower cabinet.
In the cabinet a 24V battery-fed midspan PoE power inserter with lightning/surge protector is fitted. From here some 8-10 meters of FTP (all well grounded on both ends…) run up the tower to connect to the radios. Mostly NetMetals.
The midspan PoE injector is from Cyberteam, Poland. It is a managed gigabit version. This is actually the 3rd box that has given me problems. One just died, the second lost its web server and became useless because of that, and this last one, after 2 months, again lost its web server control. So my suspicion went to this device. Last night I replaced it with a brand new unmanaged gigabit netprotector and almost all FCS errors disappeared from the logs…
So this is the 3rd managed netprotector heading for the bin (with 2 unmanaged ones going the same way before… poor quality stuff…)
I still suspect the last remaining errors, now only on one link, are due to this new netprotector. But to be honest there are not a lot of 24V multi-port passive PoE midspan injectors in a gigabit version with IP management on the market. The nearest solution I can find is the Netonix WISP switch, but I would need to turn it into a strict 1:1 port-to-port switch, since the several routes have different DHCP servers and hence these ‘pipes’ need to stay separated.
To reply to some of your comments:
I am 100% sure it’s not the cabling. It also happened on almost every port, and they all have Gb connections.
This is true only on copper (and you can set 100M Full on each side just for testing). SFP ports can be hard coded to 1000/Full as well.
Because of the midspan PoE inserter and the fact that the CCR sits as a router at a crosspoint of links, 1:1 SFP connections are not possible. Also, we don’t have the tools to make fibre cables, nor can I buy prefab SFP cables longer than 5 meters at affordable prices…
In our lab we have a CCR1009 connected to MikroTik, Cisco and HP network equipment on copper and fiber without issue…unfortunately you probably got a bad CCR.
There was a batch of bad CCR1009s early on and it’s possible you got one that’s been on the shelf for a while as all the new CCR1009s shipping seem to be pretty solid.
I agree, I am now looking more at another midspan PoE solution.
I replaced the midspan PoE netprotector last night at 1 am.
I opened the log of the CCR this morning at 10 am. During the 9 hours it ran, it shows only 4 FCS errors, all on one and the same link.
But while posting on this forum, in just 20 mins I had another 5 errors on that same port… weird…
Update:
This is what I sent to support (some grammar edits…):
Ok,
Last night at around 1 am we replaced the gigabit netprotector (PoE midspan injector) with another, simpler one (also gigabit, also Cyberteam, but no remote access, no control whatsoever), brand new out of the box.
Immediately after the change the FCS errors stopped. Well, instead of one every 30 seconds, when I logged in again this morning at 10 am the log showed only 4 FCS errors in total since the change… good!
… You would think… =>
While looking over a remote winbox session in this CCR1009’s log this morning I saw the errors coming back! First every 10 mins, then every 5 mins, every 3-4 mins and now they are back to every 2 mins…
How is this possible? Overnight the problem vanished just to come back in the morning…? Very, very weird…
I don’t know what to do now. I cannot go to a 100Mbps setting on each end, since this is my main backhaul that at times carries more than 100Mbps of traffic. Fast Ethernet will create a bottleneck again. (I spent a lot of time, money and energy converting a previously dual bonded link with double antenna sets etc. high in the towers into a single ‘ac’ 40MHz-wide link to get up to 300Mbps over this new link… tested up to 280M!)
I also cannot fit the gigabit power inserters from MikroTik, because the PoE inserter sits in the middle of a cable run and both legs need to be at least 10 meters…
I don’t seem to be able to find any passive PoE midspan inserters that work on 24V battery power, have gigabit ports and are remotely controllable…
So, main question: until I find a solution, how bad is it that these errors are there? They fill the log, but do they do any other harm? I don’t seem to notice problems (yet?). Pings with long and short packets see no losses, ping times are good and traffic flows…
But out of the 12 NetMetals installed over the last weeks, all running the latest ROS and firmware in bridge mode (fastpath enabled), I have 3 that just stop passing traffic once every 2 or 3 weeks… only a power reboot brings these back into working order. Can this have anything to do with the FCS errors too? (Since the NetMetals don’t show these errors in their logs I never looked for them…)
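One rough way to judge "how bad" is to compare the FCS error counter with the number of frames the port received in the same period, since a frame that fails the FCS check is discarded by the receiving port. The sketch below is only that, a sketch: the function name is made up, it assumes the receive counter holds good frames only (whether errored frames are included depends on the device), and the 10,000,000 good frames figure is a placeholder, not a number from this thread.

```python
def frame_error_ratio(fcs_errors: int, good_rx_frames: int) -> float:
    """Fraction of frames arriving at the port that failed the FCS check.

    Assumes good_rx_frames excludes the errored frames (an assumption;
    check how your device counts).
    """
    total = fcs_errors + good_rx_frames
    if total == 0:
        return 0.0
    return fcs_errors / total


# 2100 errors is the figure from the first post; 10_000_000 good frames
# is a made-up placeholder for the same interval.
ratio = frame_error_ratio(2100, 10_000_000)
print(f"{ratio:.4%} of received frames were dropped for a bad FCS")
```

A small ratio would fit with pings looking clean, because each lost frame is one packet that either gets retransmitted by TCP or is silently lost.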
While it doesn’t sound like the cabling itself is the issue, there does appear to be a physical issue in the copper path (which is the type of error that FCS is designed to indicate) possibly with your midspan PoE.
I would probably do two things to isolate this given the info you provided:
Swap in a new CCR1009 and see if the FCS still show up.
Take the CCR1009 you removed and set it up as a test to a netmetal (on a port that was getting FCS) without the midspan PoE injection device you mentioned. If it stays clean then you know where the problem is.
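For anyone wondering why FCS points at the physical path: the FCS is a CRC-32 appended to every Ethernet frame, and the receiver recomputes it and drops the frame on a mismatch. Here is a minimal Python sketch using `zlib.crc32`, which uses the same CRC-32 polynomial as Ethernet. The helper names and the 64-byte stand-in payload are made up for illustration, and the little-endian packing of the checksum is the convention commonly used in software, so treat that detail as an assumption rather than a statement about what the CCR hardware does internally.

```python
import struct
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Return the frame with a 4-byte CRC-32 appended (hypothetical helper)."""
    return frame + struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC-32 over the body and compare with the trailing 4 bytes."""
    if len(frame_with_fcs) < 4:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == fcs


payload = bytes(range(64))      # stand-in for a real frame
sent = append_fcs(payload)

damaged = bytearray(sent)
damaged[10] ^= 0x01             # flip a single bit, as noise on the wire might

print(fcs_ok(sent), fcs_ok(bytes(damaged)))  # prints "True False"
```

A single flipped bit anywhere between the two ports (cable, connector, surge protector, PoE inserter) is enough to fail the check, which is why isolating each element of the copper path is the useful test.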
FCS on the same second of every minute, whether that’s every 30 seconds or every 60 seconds, may be an AirFiber thing. The AF developers may have identified the issue and be fixing it in the next firmware.
I see most FCS errors on port 8 of my CCR1009 routers. That is the PoE in port. I have a hypothesis that grounding issues at the tower site and / or just static build up on the cable may cause link issues, perhaps because the power connections to the port might make for a better ground loop path. That may just happen to be where I like to put my highest bandwidth radios.
On one router, a CCR1009-8G-1S-1S+, we were seeing a lot of FCS errors on ether8, which connects to an AF24. The port would also drop and re-negotiate at 100Mbps during the warmest part of the day. That hurts when you normally run 500+Mbps. There are 3 other AF devices plugged into this router in ether7, ether6, and ether5. We only had issues, which couldn’t be resolved by fixing the cabling, with this one. We moved the AF24 from ether8 to ether4 and have not had any further issues, yet. The board temperature has been hitting the same highs as were associated with the renegotiation issues with ether8.
I see occasional FCS errors on other ports, but the most common occurrences are on ether8 on the CCR1009s (the originals and the PCs). Maybe I’m just lucky, I don’t have a large sample set, only about 6 devices.