CCR2216 CPU UNBALANCED LOAD AFFECTING TRAFFIC

Hello

I have swapped about 10 of my CCR1072s for CCR2216s, and since then I have been seeing a lot of packet drops on my MPLS network.
The CPU appears to be the problem. After a long period of observation, it looks like whenever a single core out of the 16 goes above 90%,
the device randomly drops packets even though the other 14 or 15 cores are mostly idle.

Is there some configuration I need to apply so that CPU utilization is balanced across the cores? I need assistance.
MCH-CPU.png
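
The symptom described above (one core above 90% while the rest sit mostly idle) is easy to miss if monitoring only looks at average CPU load. Below is a minimal Python sketch of a per-core check. The function names, the 30% "idle" threshold, and the sample values are assumptions made for illustration; only the 90% figure and the 16-core count come from the post.

```python
from statistics import mean


def find_hot_cores(per_core_load, hot_threshold=90.0):
    """Return the indexes of cores whose load (percent) exceeds hot_threshold."""
    return [i for i, load in enumerate(per_core_load) if load > hot_threshold]


def is_unbalanced(per_core_load, hot_threshold=90.0, idle_threshold=30.0):
    """True if at least one core is hot while the remaining cores average
    below idle_threshold. idle_threshold is an assumed value, not from the post."""
    hot = find_hot_cores(per_core_load, hot_threshold)
    if not hot:
        return False
    others = [load for i, load in enumerate(per_core_load) if i not in hot]
    return bool(others) and mean(others) < idle_threshold


# Made-up sample for illustration: 16 cores, one saturated, 15 nearly idle.
sample = [95.0] + [8.0] * 15
print(find_hot_cores(sample))  # [0]
print(is_unbalanced(sample))   # True
```

The average load of this sample is only about 13%, so an alert on average CPU would stay silent while core 0 is saturated.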

Without more information, it’s hard to point to a root cause.

Are you using hardware offload?
Do you have a mixed ROSv6 and ROSv7 MPLS environment?
Have you verified MTU is set correctly after the migration?

Please post your config and throughput numbers.

Hello

Yes, I have a mixed ROSv6 and ROSv7 MPLS environment, and I have enabled HW offloading.
I keep all my interface MTUs at 9220 and the VLAN interfaces for MPLS at 9216.

Please find the config below:


# feb/08/2023 13:43:26 by RouterOS 7.7
# software id = 4MCY-2MPJ
#
# model = CCR2216-1G-12XS-2XQ
# serial number = 
/interface ethernet
set [ find default-name=ether1 ] l2mtu=9220 mtu=9220 name=Ether1
set [ find default-name=sfp28-1 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus1-UPLINK-TO-MCH-CSW1 speed=10Gbps
set [ find default-name=sfp28-2 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus2-MCH-CANTONMENT speed=10Gbps
set [ find default-name=sfp28-3 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus3-MCH_RACK speed=10Gbps
set [ find default-name=sfp28-4 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus4-MCH-MADINA speed=10Gbps
set [ find default-name=sfp28-5 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus5-DOLPHIN-NNI speed=10Gbps
set [ find default-name=sfp28-6 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus6-WINNEBA-PRIM-NNI speed=10Gbps
set [ find default-name=sfp28-7 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus7-MCH-OLT speed=10Gbps
set [ find default-name=sfp28-8 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus8-AMASAMAN-NNI speed=10Gbps
set [ find default-name=sfp28-9 ] auto-negotiation=no l2mtu=9220 mtu=9220 name=sfp-sfpplus9
set [ find default-name=sfp28-10 ] l2mtu=9220 mtu=9220 name=sfp-sfpplus10-kojo-antwi
set [ find default-name=sfp28-11 ] l2mtu=9220 mtu=9220 name=sfp-sfpplus11
set [ find default-name=sfp28-12 ] l2mtu=9220 mtu=9220 name=sfp-sfpplus12
/interface vpls
add arp=enabled disabled=yes mac-address=02:9E:CF:82:1E:1B mtu=9216 name=DD-VOBISS-INTERNET-UPLINK-XC-SEC-SFP1-VLAN2626 peer=100.64.100.4 pw-l2mtu=9216 pw-type=\
    raw-ethernet vpls-id=2626:0
add arp=enabled disabled=no mac-address=02:BE:12:D4:5A:C8 mtu=9216 name=xconnect-3D-tuba-etix-mtk-pe1-vlan601 peer=100.64.100.14 pw-l2mtu=9216 pw-type=raw-ethernet \
    vpls-id=601:0
add arp=enabled disabled=no mac-address=02:7B:08:2B:5D:89 mtu=9216 name=xconnect-at-sunda-L2-cc-gil-mtk-pe1-sfp1-vlan3291 peer=100.64.100.27 pw-l2mtu=9216 pw-type=\
    raw-ethernet vpls-id=3291:0


/interface ethernet switch
set 0 l3-hw-offloading=yes
/interface list
add name=mpls-interface-list
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik

/mpls traffic-eng path
add name=auto-path use-cspf=yes
add hops=100.64.200.17/strict name=cantonment-mpls-TE-path record-route=yes
add hops=100.64.200.29/strict name=rack-mpls-TE-path record-route=yes
add hops=100.64.200.194/strict name=tak-mpls-TE-path record-route=yes
add hops=100.64.200.170/strict name=amasaman-mpls-TE-path record-route=yes
add hops=100.64.200.14/strict name=madina-mpls-TE-path record-route=yes
add hops=100.64.200.142/strict name=capecoast-mpls-TE-path record-route=yes


/routing ospf instance
add disabled=no mpls-te-address=100.64.100.7 mpls-te-area=0.0.0.0 name=ospf-backbone0 router-id=100.64.100.7
/routing ospf area
add disabled=no instance=ospf-backbone0 name=ospf-area0
/routing bgp template
set default address-families=ip,vpnv4 as=328659 cisco-vpls-nlri-len-fmt=auto-bits disabled=no router-id=100.64.100.7 routing-table=main
add address-families=ipv6,vpnv4 as=328659 cisco-vpls-nlri-len-fmt=auto-bits disabled=no name=DD-INTERNET-UPLINK-v6 router-id=100.64.100.7 routing-table=INTERNET \
    vrf=INTERNET


/interface bridge settings
set use-ip-firewall=yes use-ip-firewall-for-pppoe=yes use-ip-firewall-for-vlan=yes
/interface ethernet switch l3hw-settings
set ipv6-hw=yes
/ip neighbor discovery-settings
set discover-interface-list=all
/interface list member
add interface=cantonment-mch-mpls-nni-sfp1-vlan22 list=mpls-interface-list
add interface=rack-mch-mpls-nni-sfp1-vlan24 list=mpls-interface-list
add interface=tak-mch-mpls-nni-sfp6-vlan75 list=mpls-interface-list
add interface=amasaman-mch-mpls-nni-sfp4-vlan57 list=mpls-interface-list
add interface=madina-mch-mpls-nni-sfp4-vlan79 list=mpls-interface-list
add interface=winneba-mch-mpls-nni-sfp6-vlan59 list=mpls-interface-list
/ip address
add address=100.64.200.18/30 interface=cantonment-mch-mpls-nni-sfp1-vlan22 network=100.64.200.16
add address=100.64.200.30/30 interface=rack-mch-mpls-nni-sfp1-vlan24 network=100.64.200.28
add address=100.64.200.13/30 interface=madina-mch-mpls-nni-sfp4-vlan79 network=100.64.200.12
add address=100.64.200.141/30 interface=capecoast-mpls-mch-mpls-nni-sfp6-vlan63 network=100.64.200.140
add address=100.64.200.169/30 interface=amasaman-mch-mpls-nni-sfp4-vlan57 network=100.64.200.168
add address=100.64.200.193/30 interface=tak-mch-mpls-nni-sfp6-vlan75 network=100.64.200.192
add address=100.64.200.125/30 interface=winneba-mch-mpls-nni-sfp6-vlan59 network=100.64.200.124
add address=100.64.100.7 interface=loopback0 network=100.64.100.7

/ip firewall mangle
add action=mark-routing chain=output new-routing-mark=main passthrough=yes port=161 protocol=udp


/mpls interface
add disabled=no input=yes interface=capecoast-mpls-mch-mpls-nni-sfp6-vlan63 mpls-mtu=9216
add disabled=no input=yes interface=rack-mch-mpls-nni-sfp1-vlan24 mpls-mtu=9216
add disabled=no input=yes interface=tak-mch-mpls-nni-sfp6-vlan75 mpls-mtu=9216
add disabled=no input=yes interface=amasaman-mch-mpls-nni-sfp4-vlan57 mpls-mtu=9216
add disabled=no input=yes interface=cantonment-mch-mpls-nni-sfp1-vlan22 mpls-mtu=9216
add disabled=no input=yes interface=madina-mch-mpls-nni-sfp4-vlan79 mpls-mtu=9216
add disabled=no input=yes interface=all mpls-mtu=9216
/mpls ldp
add disabled=no hop-limit=255 lsr-id=100.64.100.7 path-vector-limit=255 transport-addresses=100.64.100.7
/mpls ldp interface
add accept-dynamic-neighbors=yes disabled=no interface=amasaman-mch-mpls-nni-sfp4-vlan57 transport-addresses=100.64.100.7
add accept-dynamic-neighbors=yes disabled=no interface=cantonment-mch-mpls-nni-sfp1-vlan22 transport-addresses=100.64.100.7
add accept-dynamic-neighbors=yes disabled=no interface=capecoast-mpls-mch-mpls-nni-sfp6-vlan63 transport-addresses=100.64.100.7
add accept-dynamic-neighbors=yes disabled=no interface=madina-mch-mpls-nni-sfp4-vlan79 transport-addresses=100.64.100.7
add accept-dynamic-neighbors=yes disabled=no interface=rack-mch-mpls-nni-sfp1-vlan24 transport-addresses=100.64.100.7
add accept-dynamic-neighbors=yes disabled=no interface=tak-mch-mpls-nni-sfp6-vlan75 transport-addresses=100.64.100.7
add accept-dynamic-neighbors=yes disabled=no interface=winneba-mch-mpls-nni-sfp6-vlan59 transport-addresses=100.64.100.7
/mpls traffic-eng interface
add bandwidth=10Gbps disabled=no interface=cantonment-mch-mpls-nni-sfp1-vlan22
add bandwidth=10Gbps disabled=no interface=rack-mch-mpls-nni-sfp1-vlan24
add bandwidth=10Gbps disabled=no interface=amasaman-mch-mpls-nni-sfp4-vlan57
add bandwidth=10Gbps disabled=no interface=madina-mch-mpls-nni-sfp4-vlan79
/mpls traffic-eng tunnel
add bandwidth=10Gbps disabled=no name=cantonment-mpls-TE-Tunnel primary-path=cantonment-mpls-TE-path secondary-paths=auto-path to-address=100.64.100.2
add bandwidth=10Gbps disabled=no name=rack-mpls-TE-Tunnel primary-path=rack-mpls-TE-path secondary-paths=auto-path to-address=100.64.100.4
add bandwidth=10Gbps disabled=no name=madina-mpls-TE-Tunnel primary-path=madina-mpls-TE-path secondary-paths=auto-path to-address=100.64.100.5
add bandwidth=10Gbps disabled=no name=amasaman-mpls-TE-Tunnel primary-path=amasaman-mpls-TE-path secondary-paths=auto-path to-address=100.64.100.25
add bandwidth=10Gbps disabled=no name=Tak-mpls-TE-Tunnel primary-path=tak-mpls-TE-path secondary-paths=auto-path to-address=100.64.100.8
/routing bgp connection
add address-families=ip,vpnv4 as=328659 cisco-vpls-nlri-len-fmt=auto-bits disabled=no local.address=100.64.100.7 .role=ibgp name=iBGP-KUM-ASANTEMAN remote.address=\
    100.64.100.26/32 .as=328659 router-id=100.64.100.7 routing-table=main templates=default vrf=main
add address-families=ip,vpnv4 as=328659 cisco-vpls-nlri-len-fmt=auto-bits disabled=no local.address=100.64.100.7 .role=ibgp name=iBGP-PEER-AKUSE-MTK-PE1 \
    remote.address=100.64.100.18/32 .as=328659 router-id=100.64.100.7 routing-table=main templates=default vrf=main
add address-families=ip,vpnv4 as=328659 cisco-vpls-nlri-len-fmt=auto-bits disabled=no local.address=100.64.100.7 .role=ibgp name=iBGP-PEER-CAPECOAST-MTK-PE1 \
    remote.address=100.64.100.13/32 .as=328659 router-id=100.64.100.7 routing-table=main templates=default vrf=main

Hello

Please, can anyone here assist me?

I’ve seen such uneven CPU load distribution on ARM (RB3011, RB1100) and ARM64 (CCR2004) devices running MPLS on ROSv6.
/tool/profile may give some indication of the cause of this behavior.

Thanks @clambert

I actually added the /tool/profile stats to the initial post. There isn't much there.
You'll see items like "networking", which doesn't really point to what in networking is
causing the high utilization.

The first thing that stands out to me is that your MTU settings are incorrect for the pseudowire MTU you’re trying to support.

Have you checked for fragmentation?

You need a minimum of 26 bytes of overhead between the VPLS MTU and the MPLS MTU with untagged traffic, plus an additional 4 bytes per VLAN tag. So, if you want to pass 9216 inside a VPLS pseudowire, you need at least 9242 in both MPLS MTU and L2 MTU to support it.

Here is the breakdown:

https://help.mikrotik.com/docs/display/ROS/MTU+in+RouterOS
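
The overhead rule above can be written out as a small Python sketch. The 26-byte and 4-byte figures are the ones quoted in this post, and 9216 and 9220 are the mpls-mtu and l2mtu values from the posted config. The function and constant names are made up for illustration.

```python
VPLS_OVERHEAD_BYTES = 26  # overhead for untagged traffic, as quoted in the post
VLAN_TAG_BYTES = 4        # additional bytes per VLAN tag carried in the pseudowire


def required_mpls_mtu(vpls_mtu, vlan_tags=0):
    """Minimum MPLS MTU / L2 MTU needed to carry vpls_mtu without fragmentation."""
    return vpls_mtu + VPLS_OVERHEAD_BYTES + VLAN_TAG_BYTES * vlan_tags


def mtu_shortfall(vpls_mtu, configured_mtu, vlan_tags=0):
    """Bytes missing from configured_mtu; 0 means the configured value is enough."""
    return max(0, required_mpls_mtu(vpls_mtu, vlan_tags) - configured_mtu)


print(required_mpls_mtu(9216))               # 9242
print(required_mpls_mtu(9216, vlan_tags=1))  # 9246
print(mtu_shortfall(9216, 9216))             # 26 (mpls-mtu=9216 in the config)
print(mtu_shortfall(9216, 9220))             # 22 (l2mtu=9220 in the config)
```

With the posted values, both the MPLS MTU and the L2 MTU fall short of what a 9216-byte pseudowire needs.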

I would also consider disabling hw-offload and testing purely on CPU to determine if the problem is related to hw-offload and MPLS running together since MPLS traffic is not yet supported in hw-offload.

The ISPs we’ve worked with that run MPLS across a mix of ROSv6 and ROSv7 have had mixed results. For some of them we had to roll back to ROSv6 only or move everything to ROSv7. We weren’t able to pinpoint a specific root cause; we could only observe the symptoms, which were performance issues and tunnel instability.

How much traffic is going through the CCR2216 when the CPU reaches 100%?

My experience with ROSv6 on ARM equipment (RB3011 and RB1100) used as MPLS PE (VPNv4) tells me that the approximate traffic that the equipment can handle is in the order of the test results for “25 ip filter” rules divided by the number of CPU cores. I have found it impossible to get the load to be divided evenly between all the CPU cores.
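
As a rough sketch of the rule of thumb above (the "25 ip filter" test result divided by the number of CPU cores), in Python. The function name is made up, and the input numbers are placeholders chosen only to show the arithmetic. They are not real test results for any device.

```python
def per_core_estimate(filter25_result_mbps, cpu_cores):
    """Rule-of-thumb MPLS capacity: '25 ip filter' test result / number of cores.

    This is the poster's empirical estimate for ROSv6 on ARM, not a vendor formula.
    """
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return filter25_result_mbps / cpu_cores


# Placeholder inputs for illustration only (NOT measured values).
print(per_core_estimate(16000.0, 16))  # 1000.0
```

To apply it, take the "25 ip filter" figure from the published test results for your own model and compare the estimate with the traffic you actually carry.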



@IPANetEngineer Thanks a lot for the pointers. I will apply those changes and see what results I get.

I have 3-4 Gbps of traffic per POP.

I haven’t tested the CPU of the CCR2x16, but it seems to be a pretty interesting traffic value. As I wrote above, ROS does not seem to be able to distribute among different CPU cores the traffic coming from the MPLS core (labeled packets).

I guess you are right. All my customers are complaining, the packet drops are crazy, and my monitoring software keeps alerting me on CPU > 90%.
I have tried the MTU change and disabled L3 HW offloading, and yet the issue still persists.

I have installed 13 of these CCR2216 routers on my network, and it is starting to look like they are no match for the CCR1072.
I have no choice now but to retire the CCR2216s and go back to my CCR1072s, which were working perfectly until I made the change.

I will probably reinstall them when MikroTik fixes the CPU load balancing.
Goodness, this is heartbreaking.
CPU-UTILIZATION-NEW-Screenshot 2023-02-17 053455.png

Is your stable CCR1072 setup running RouterOS 6 or 7?

It's running RouterOS 7.

This is the CPU utilization on one of the POPs that I have changed back to a CCR1072. The packet drops are gone
and CPU utilization looks OK.
cpu_new--Screenshot 2023-02-19 085332.png

I am starting to believe that the high CPU utilization comes from ROSv7.

I pushed 6 Gbps through a CCR1072 with ROSv6 and CPU was around 1-2%.
I did the same on a CCR1072 with ROSv7.7 and CPU started shooting up, with individual cores hitting about 70%
and CPU temperature hitting 66C. I think the CCR2216 CPU problem has to do with ROSv7.
CPU Health.png
ROV6-Good CPU usageScreenshot 2023-03-12 001432.png
ROV7-Not Good CPU usageScreenshot 2023-03-12 001432.png

Is there a way to downgrade the CCR2216 to ROSv6?

It’s not possible to run ROSv6 on CCR2216.

It looks like a lot of people are having CPU issues on this new flagship device, the CCR2216.
Is MikroTik really doing anything about it?
You don't buy a high-end device like this to be tweaking things just to get it working properly.
The CCR1072 with ROSv6 works OK out of the box.