SignumFera
just joined
Topic Author
Posts: 10
Joined: Tue May 23, 2023 2:41 pm

High CPU on 2216 with intervlan routing

Wed May 24, 2023 3:54 pm

Hi All,

We have an issue, also logged with support (SUP-116528), with high CPU usage, including an extra 10% when switching to the documented setup.

We have a pair of CCR2216s, let's call them R1 and R2, currently routing between 30 Gbps and 60 Gbps of inter-VLAN traffic. The traffic is load-balanced across the two using ECMP.

Both routers have both of their 100G legs connected to the same switch, as shown below:
----------   ----------
|   R1   |   |   R2   |
----------   ----------
 |     |      |     |
 |     |      |     |
 |     |      |     |
-----------------------
|         S1          |
-----------------------
The initial idea was to let traffic enter the switch on one port and exit on the other. For this we have "downstream" VLANs on qsfp28-1-1 and "upstream" VLANs on qsfp28-2-1.

This, obviously, does not allow for L3HW offloading, so we redid the config on R2. We placed both interfaces in an LACP bond, put the bond into a bridge, and used VLAN filtering to bring the VLANs to the CPU for routing.
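For readers following along, the bonded layout described above might look roughly like the sketch below. Only the port names qsfp28-1-1 and qsfp28-2-1 come from this post; bond1, bridge1, the VLAN IDs 100 and 200, and the addresses (documentation ranges) are assumptions for illustration, not the actual config on R2.

```routeros
# Sketch only: bond1, bridge1, VLAN IDs and addresses are assumed names/values.

# Bond both 100G legs with LACP (802.3ad)
/interface bonding
add name=bond1 mode=802.3ad slaves=qsfp28-1-1,qsfp28-2-1

# Create the bridge with VLAN filtering off first, to avoid losing access mid-change
/interface bridge
add name=bridge1 vlan-filtering=no
/interface bridge port
add bridge=bridge1 interface=bond1

# Tag the VLANs on the bond and on the bridge itself so they reach the CPU
/interface bridge vlan
add bridge=bridge1 tagged=bridge1,bond1 vlan-ids=100,200

# One routed VLAN interface per VLAN, on top of the bridge
/interface vlan
add name=vlan100 interface=bridge1 vlan-id=100
add name=vlan200 interface=bridge1 vlan-id=200
/ip address
add address=192.0.2.1/24 interface=vlan100
add address=198.51.100.1/24 interface=vlan200

# Enable VLAN filtering, then L3 hardware offloading on the switch chip
/interface bridge
set bridge1 vlan-filtering=yes
/interface ethernet switch
set 0 l3-hw-offloading=yes
```

After this, `/ip route print` should show the "H" flag on routes that were installed into the switch chip.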

This allowed our prefixes to be installed into the switch chip, and all the routes got their "H" flag. However, CPU usage increased by 10% from its normal level of around 43% when routing 15 Gbps. Once we suppress HW offloading on our BGP routes, the CPU usage comes down to 17%. In hindsight this makes sense, as the bulk of our traffic will be towards the customer, whose routes will be in the IGP.
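One way to suppress offloading for BGP-learned routes is a routing filter on the BGP input chain. The sketch below assumes a filter chain named bgp-in and applies it to every BGP connection; both are assumptions, and the exact filter syntax should be checked against the routing filter documentation for the RouterOS version in use.

```routeros
# Sketch only: the chain name bgp-in is an assumption.

# Mark every route accepted from BGP as not eligible for hardware offload
/routing filter rule
add chain=bgp-in rule="set suppress-hw-offload yes; accept"

# Attach the chain as the input filter of the BGP connections
/routing bgp connection
set [find] input.filter=bgp-in
```

Routes handled this way stay in the software routing table without the "H" flag, while IGP routes remain candidates for offloading.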

After running with HW offloading for about an hour, R2 stopped responding on the LACP bond and the switch reported both ports as suspended. Rebooting R2 brought the bond up again, but only for a few seconds; then the switch suspended the ports again.

Unsuppressing the BGP routes and rebooting R2 again brought the router back up and stable, albeit not offloading the routing.

I saw nothing in R2's logs to indicate why the ports would suspend on the switch side. The switch merely reported that the ports got suspended.

I suspect that we were hitting the chip too hard for it to send LACP PDUs, which caused the switch to suspend the bond. This, however, is merely a theory. I would not mind if someone could test it.

I should note that R1 is running ROS 7.6 and R2 is running ROS 7.9.
