CCR2216 has terrible I/O performance, much worse than CCR10xx! Maybe it has no DMA?

We found that the new CCR2216 has very poor I/O performance. The CPU is fast (e.g. BGP convergence is fast), but the input/output of packets uses a HUGE AMOUNT of CPU!

This is a screenshot of a CCR2216 used as a BGP router. It has only 3 Ethernet interfaces, FastTrack active, no firewall, no queues, nothing: for a total of 7-8 Gbps of traffic it uses about 25% of CPU!!! The same task performed by a CCR1072 uses less than 1% CPU!!!

Screenshot_2023-11-13_17-19-12.png
As you can see, the interesting part is that ALMOST ALL CPU power is used in IRQ!

It seems that all transfers of packets from the switch chip to the CPU are done by the CPU itself instead of by DMA (Direct Memory Access).

If this is true, it is a basic design flaw of the platform and we have no hope of great improvements in the future.

Otherwise, there could be some bug in the driver that handles CPU-switch transfers, and so we can hope for some great improvement in the future.

Please, MikroTik, let us know what the situation is and whether we have hope for the future…

Thanks.

You cannot directly compare the CPU usage of the CCR2216 with that of the CCR1072; most of the time the calculation on the CCR1072 tends to zero or a very low value.

Additionally,

a suggestion: the CCR2216 is mostly designed to be used with HW L3 offload; otherwise you will be wasting all the money you paid for it.

Keep in mind: if you are focused on CPU forwarding, CCR2116 has the same CPU at a fraction of the price

I don’t understand this sentence. We use a good number of CCR1072s and similar; we know what they can do, and we know the CPU load they can reach.

The point here is that FOR I/O OPERATIONS the CCR2216 uses a HUGE amount of CPU in IRQ. The CCR10xx don’t, so they can give you much, much higher I/O throughput.

It’s a real shame that the CCR2216 WASTES almost all of its CPU power in IRQ for reading/writing packets from/to the switch.


First of all, at least until a couple of months ago (when we tried), HW L3 had a lot of bugs and was unusable in production. Moreover, there are situations where it’s not usable: full internet BGP, MPLS, queues, etc…

If the CPU were not important, we would have used a CRS518 instead of the CCR2216 (we need the 25G and 100G (actually 40G) ports).

Finally, I would like to point out that MikroTik REMOVED the CCR1072-CCR1036 from production and REPLACED them with the CCR2216-CCR2116, but there are situations where the CCR1072 is still MUCH, MUCH more powerful, so we are left without a REAL replacement.

Anyway, we are asking MikroTik not what the current situation is, but whether something will change in the future.

Can we hope that in CCR2216 there will be significant improvements in the usage of the CPU for the I/O?

Thanks.

Do you also have an actual impact/problem in this case, or are you just seeing 25% CPU usage and that’s all? Because from what I can tell, you are not using 75% of your CPU.

Not in that case, where there is BGP with only two 10 Gbps upstreams, but it could become a problem when we use bigger lines.
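To put a rough number on the "bigger lines" worry, here is a minimal back-of-the-envelope sketch. It assumes that CPU usage grows linearly with traffic, which nothing in this thread confirms, and it uses only the figures quoted above (7-8 Gbps at about 25% CPU). The function name is made up for illustration.

```python
def saturation_throughput_gbps(observed_gbps: float, observed_cpu_pct: float) -> float:
    """Estimate the traffic level at which CPU usage would reach 100%.

    ASSUMPTION: CPU usage scales linearly with throughput. Real routers
    rarely behave this cleanly, so treat the result as a rough ceiling.
    """
    if not 0 < observed_cpu_pct <= 100:
        raise ValueError("observed_cpu_pct must be in (0, 100]")
    return observed_gbps * 100.0 / observed_cpu_pct


# Figures quoted in the thread: 7-8 Gbps of traffic at about 25% CPU.
low = saturation_throughput_gbps(7.0, 25.0)
high = saturation_throughput_gbps(8.0, 25.0)
print(f"Estimated CPU saturation between {low:.0f} and {high:.0f} Gbps")
```

Under that assumption the box would run out of CPU at roughly 28-32 Gbps in total, and individual cores may well saturate earlier than the average suggests.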

Instead, we have problems NOW with MPLS! We have a situation with a total of about 20 Gbps of MPLS traffic in several VPLS that we could not handle with TWO CCR2216s without starting to lose packets. Mean CPU usage is around 30-40%, with some individual CPU cores at 100% (we think the packet loss is due to this)! We replaced one of them with an old CCR1072 and we have NO packet loss, and mean CPU is less than 1%!!!
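The gap between the mean and the per-core figures is the important part here: a mean of 30-40% can hide a few cores pinned at 100%. A minimal sketch of that arithmetic follows. The per-core numbers are invented for illustration (16 cores, two of them saturated) and are not measurements from these routers; the function name and the 95% threshold are also assumptions.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def summarize_cores(per_core_pct: Sequence[float],
                    saturation_pct: float = 95.0) -> Tuple[float, List[int]]:
    """Return the mean load and the indices of cores at or above the threshold."""
    saturated = [i for i, load in enumerate(per_core_pct) if load >= saturation_pct]
    return mean(per_core_pct), saturated


# HYPOTHETICAL sample: two cores saturated, fourteen lightly loaded.
sample = [100.0, 100.0] + [25.0] * 14
avg, hot = summarize_cores(sample)
print(f"mean load: {avg:.1f}%, saturated cores: {hot}")
```

With this made-up sample the mean lands in the 30-40% range even though two cores have no headroom left, so any traffic that ends up being handled on those cores can be dropped while the router looks mostly idle overall.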

Please try 7.13beta1, we made several optimizations for your case (this CPU + MPLS)

So, if there is no hope for a significant improvement in I/O CPU usage with CCR2216, we have to start looking around for some used CCR1072 to replace some of our CCR2216.

Thanks

see above post

OK, thank you, we’ll try it.

Anyway, the question about the CPU used for I/O remains; we need an answer to correctly plan future designs/purchases.

Thanks

100% sure this user failed to properly configure L3 offloading, single bridge config approach with VLAN segregation. So traffic is going via control plane instead of the data plane (ASIC).
http://forum.mikrotik.com/t/ccr2116-disappointing-cant-do-2gbps-pppoe-single-cpu-95/170531/4

@normis if you read this, I think it’s high time MikroTik abstracted the config away across all legacy and new hardware to ensure users don’t face L3 offloading/fast-path misconfigurations. This is similar to how we work on JunOS, for example: I don’t need to think “single bridge + this + that” to make sure things are offloaded correctly.

As we already said, we CANNOT use L3HW here: it uses MPLS/VPLS!

In another case, we have a BGP router with the full Internet table (and around 300,000 connections), so I have some doubts that L3HW would work. Until today we haven’t had the courage to activate it (also due to bad previous experiences). Does anybody have experience with these cases? Can L3HW give some relief with a 1M-route table? Or will the router continuously change/swap the routes that it puts in HW, maybe slowing it down? And is it now stable enough for production?

And there are other cases where L3HW cannot be used: queues, complicated filtering and NAT, etc…

I don’t get the fascination of having two or three beefy routers do everything for the whole network. To me that’s a really bad single point of failure. Use each type of router for the things it does best, or design the network around the limitations of each.

CCR1000’s are good at processing lots of packets. CCR2000’s are good at switching/routing lots of packets. Have the 1072’s handle the CGNAT, queueing, filtering, etc. and let the 2216’s handle your BGP peering with L3HW offload enabled. Make them each redundant, and the core becomes bulletproof.

I have four 2116’s (same CPU, just smaller L3HW switch chip) running as BGP border routers. For a while I had them load up the full Internet routing table from four different peers. They work just fine with L3HW acceleration turned on and full BGP tables. They don’t spend a lot of time swapping routes back and forth. They have very basic firewalls on the input chain, mainly to protect the router itself from external access. They do not have any queues and do not run NAT. They are BGP border routers, after all. With L3HW offload enabled, their CPU hovers around 5-10%. Recently I set the input filters to cull out anything beyond 3 or 4 ASN’s away, which helps simplify troubleshooting routing issues, and almost all routes are HW-offloaded. I’ve also experimented with L3HW off and on, and have seen no performance degradation leaving it off, even if the CPU usage jumps to 15% or more.
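For readers wondering what "cull out anything beyond 3 or 4 ASNs away" amounts to, here is a minimal sketch of the selection logic only. It is not RouterOS routing-filter syntax, and it assumes that "distance" means AS-path length, which is one plausible reading of the post. The `Route` type, the function name, and the sample prefixes and ASNs (taken from the documentation ranges) are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Route(NamedTuple):
    prefix: str
    as_path: Tuple[int, ...]  # first element is the neighbouring AS


def within_as_distance(routes: List[Route], max_hops: int) -> List[Route]:
    """Keep only routes whose AS path is at most max_hops entries long.

    ASSUMPTION: AS-path length is used as the distance. Prepending
    inflates this length, so a prepended route looks further away.
    """
    return [r for r in routes if len(r.as_path) <= max_hops]


# HYPOTHETICAL routes using documentation prefixes and ASNs.
table = [
    Route("192.0.2.0/24", (64496,)),
    Route("198.51.100.0/24", (64496, 64497, 64498)),
    Route("203.0.113.0/24", (64496, 64497, 64498, 64499, 64500)),
]
for route in within_as_distance(table, max_hops=4):
    print(route.prefix, len(route.as_path))
```

In this sketch traffic for the culled prefixes would have to follow a default or aggregate route instead, and the smaller table is easier to fit in hardware.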

I have another 2116 connected to those four external routers, and it aggregates everything before handing traffic to the next router. That next machine is a 1036 which handles NAT and some traffic shaping. It then hands traffic off to our core CRS317, which “switches” at Layer 3 to the distribution routers, which consist of a bunch of CRS300’s doing L3HW offload or quad-core routers (CCR2004, RB4011, RB5009).

@sirbryan Thank you for sharing your experience.

We’ll give L3HW another chance.
Last time we tried, we also found some problems with ECMP (equal-cost multi-path). Does anybody know whether that is now solved?

Thanks

Single bridge for MPLS/VPLS with VLAN filtering and segregation using PVID. Read the link I shared and then read MikroTik official docs. Only single bridge can exist for max performance.

L3HW is supported for NAT/FastTracked etc.

OP clearly never read this:
https://stubarea51.net/2021/11/14/isp-design-guide-separation-of-network-functions-introduction-and-overview/

https://stubarea51.net/2022/05/02/webinar-isp-design-separation-of-network-functions/

giannici, try the latest v7.13beta2 and see if there are improvements.

The problem is NOT the single bridge, we know it. The problem is that Mikrotik has not yet implemented handling of MPLS in HW.


We are not talking about a “user” router. As we said, there can be many problems: we have more than 500,000 connections, many more than can be handled in HW. Some kinds of NAT cannot be handled in HW, etc…

And there are other situations where HW cannot work: queues, VLAN nesting, etc…
So it’s a pity that the CCR1072 has been retired…

Thank you.
In exactly which area should we see improvements?

We’ll try it as soon as we can; these are “production” routers, and testing these conditions in a lab is not easy.

The chip is not fabricated anymore.

Unfortunately, the Tile-Gx architecture did not gain traction in the market; it is 10 years old, with no iterations or improvements.