I have swapped about 10 of my CCR1072s for CCR2216s, and since then I have been seeing a lot of packet drops on my MPLS network.
The CPU appears to be the problem. After a long period of observation, it looks like whenever a single core out of the 16 goes above 90%,
the device randomly drops packets, even though the other 14 or 15 cores are largely idle.
Is there some configuration I need to apply so that CPU utilization is balanced across the cores? I need assistance.
I’ve seen such uneven CPU load distribution on ARM (RB3011, RB1100) and ARM64 (CCR2004) devices running MPLS on ROSv6.
/tool/profile may give some indication of the cause of this behavior.
I actually included the /tool/profile stats in the initial post. There isn't much there.
You'll see items like “networking”, which doesn't really point out what in networking is
causing the high utilization.
The first thing that stands out to me is that your MTU settings are incorrect for the pseudowire MTU you’re trying to support.
Have you checked for fragmentation?
You need a minimum of 26 bytes of overhead between the VPLS MTU and the MPLS MTU with untagged traffic, plus an additional 4 bytes per VLAN tag. So, if you want to pass 9216 bytes inside a VPLS pseudowire, you need at least 9242 bytes of MPLS MTU and L2 MTU to support it.
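The overhead rule above can be turned into a quick check. This is only a minimal Python sketch: the function name `required_mpls_mtu` and the constant names are my own, and the 26-byte and 4-byte figures are taken straight from this post rather than measured on a device.

```python
# Overhead figures as stated in the post above (assumed, not measured).
VPLS_BASE_OVERHEAD = 26   # bytes between VPLS MTU and MPLS MTU, untagged traffic
VLAN_TAG_OVERHEAD = 4     # extra bytes per VLAN tag


def required_mpls_mtu(vpls_mtu: int, vlan_tags: int = 0) -> int:
    """Return the minimum MPLS MTU / L2 MTU needed to carry vpls_mtu unfragmented."""
    if vpls_mtu <= 0 or vlan_tags < 0:
        raise ValueError("vpls_mtu must be positive and vlan_tags non-negative")
    return vpls_mtu + VPLS_BASE_OVERHEAD + VLAN_TAG_OVERHEAD * vlan_tags


if __name__ == "__main__":
    print(required_mpls_mtu(9216))               # untagged: 9242
    print(required_mpls_mtu(9216, vlan_tags=1))  # one VLAN tag: 9246
```

If the MPLS MTU or L2 MTU configured on any hop is below the returned value, fragmentation is a likely suspect.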
I would also consider disabling hw-offload and testing purely on CPU to determine if the problem is related to hw-offload and MPLS running together since MPLS traffic is not yet supported in hw-offload.
The ISPs we’ve worked with that run MPLS across both ROSv6 and ROSv7 have had mixed results. Some we had to roll back to ROSv6 only, and others we moved entirely to ROSv7. We weren’t able to pinpoint a specific root cause; we could only observe the symptoms, namely performance issues and unstable tunnels.
How much traffic is going through the CCR2216 when the CPU reaches 100%?
My experience with ROSv6 on ARM equipment (RB3011 and RB1100) used as MPLS PE (VPNv4) tells me that the approximate traffic the equipment can handle is on the order of the test result for “25 ip filter rules” divided by the number of CPU cores. I have found it impossible to get the load divided evenly between all the CPU cores.
I haven’t tested the CPU of the CCR2x16, but it seems to be a pretty interesting traffic value. As I wrote above, ROS does not seem to be able to distribute among different CPU cores the traffic coming from the MPLS core (labeled packets).
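The rule of thumb above (MPLS capacity is roughly the “25 ip filter rules” test result divided by the core count) can be written out as follows. This is a rough Python sketch of my own estimate, not anything official; the function name and the example figures are hypothetical placeholders, not numbers from any MikroTik test sheet.

```python
def estimated_mpls_capacity_mbps(filter25_result_mbps: float, cpu_cores: int) -> float:
    """Rule-of-thumb estimate: the '25 ip filter rules' test result divided by
    the number of CPU cores, i.e. roughly what one core could handle if
    labeled traffic is not spread across cores."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return filter25_result_mbps / cpu_cores


if __name__ == "__main__":
    # Hypothetical input: substitute the real test result for your model.
    print(estimated_mpls_capacity_mbps(filter25_result_mbps=16000.0, cpu_cores=16))
```

Comparing this estimate with the actual traffic at the moment one core saturates would show whether the same pattern holds on the CCR2216.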
I guess you are right. All my customers are complaining, the packet drops are severe, and my monitoring software keeps alerting me on CPU > 90%.
I have tried the MTU change and disabled L3 HW offloading, yet the issue still persists.
I have installed 13 of these CCR2216 routers on my network, and it is starting to look like they are no match for the CCR1072.
I have no choice now but to retire the CCR2216s and go back to my CCR1072s, which were working perfectly until I made the change.
I will probably reinstall them when MikroTik fixes the CPU load balancing.
Goodness, this is heartbreaking.
I am starting to believe that the high CPU utilization comes from ROSv7.
I pushed 6 Gbps through a CCR1072 running ROSv6 and the CPU stayed around 1-2%.
I did the same on a CCR1072 running ROSv7.7 and the CPU started shooting up, with individual cores hitting about 70%
and the CPU temperature reaching 66C. I think the CCR2216 CPU problem has to do with ROSv7.
It looks like a lot of people are having CPU issues on this new flagship device, the CCR2216.
Is MikroTik really doing anything about it?
You don't buy a high-end device like this to have to tweak things just to get it working properly.
The CCR1072 with ROSv6 works OK out of the box.