I am sitting in front of a CCR1072 (clocked at 1000 MHz) running ROS 6.49.13, which is used as a company edge router. It has around 120 firewall rules (in different chains), 50 NAT rules, 130 mangle rules, 220 routes, 200 simple queues and 1 bridge with 40 VLANs, though I would say one third of everything is disabled (until needed).
This device is able to push around 280-300 Mbit/s routed (VLANx to VLANy) through a single TCP stream. I tested with different performance tools and got the same results every time. Because I have no reference: is this an acceptable value for such a configuration and device? This table: https://mikrotik.com/product/CCR1072-1G-8Splus#fndtn-testresults is not very helpful, because it shows that with 512-byte packets and 25 IP firewall filter rules, it's able to do 20691 Mbps. But in the real world, devices are configured in a far more complex way than 25 firewall rules.
On the CCRs single stream = single core. And the TILE cores aren’t very fast; the CCR1072 just has a lot of them. Dividing the 20691 Mbps by 72 gets you 287 Mbps per core, matching your result. If you want better single stream performance, get something with beefier cores, like a CCR2xxx.
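To make that back-of-the-envelope division explicit, here is a minimal Python sketch. The function name per_stream_mbps is made up for illustration, and it assumes the published aggregate figure splits evenly across cores and that one stream is pinned to exactly one core, which is a simplification.

```python
def per_stream_mbps(aggregate_mbps: float, cores: int) -> float:
    """Rough single-stream ceiling, assuming one stream is handled by one core
    and the aggregate test result is spread evenly over all cores."""
    return aggregate_mbps / cores


# Figures quoted in this thread: 20691 Mbps (512 byte, 25 filter rules), 72 cores.
ccr1072 = per_stream_mbps(20691, 72)
print(f"CCR1072: roughly {ccr1072:.0f} Mbps per stream")
```

It is only an estimate: the test-results table uses 25 filter rules, so a heavier ruleset could land lower in practice.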
We learned this the hard way. We only use the 1072 for edge routing (BGP/OSPF, no NAT/firewall and no queues), and for that they are fine. For access concentrators we stick to a dozen 1036s, with a rule of thumb of 950 customers per 1036 doing NAT and queues. For some of our PoPs with premium customers, to whom we hand over routable /32 public IPs, it is between 1500 and 1700 customers per 1036, with no NAT and connection tracking disabled (bandwidth management is done in the OLT).
We have simply accepted the fact that NAT and queues kill the 1036 if you give each customer a 50, 60, 70, 80 or 100MB plan.
I know this is not the answer the OP is looking for; I just want to share the reality we see in the field.
That's a good point, this correlates almost 100% with what I see!
Just for clarification, let's take the CCR2216: https://mikrotik.com/product/ccr2216_1g_12xs_2xq#fndtn-testresults
It has a routing performance (512 byte, 25 firewall rules) of 15552 Mbps (compared to the 20691 Mbps of the CCR1072). This does not seem better to me? Or is a single TCP stream distributed over more cores? Then it would be 15552 Mbps compared to 287 Mbps?
No, it’s still 1 stream = 1 core, AFAIK. But the cores of the 2216 are a lot more powerful (as it has only 16 of them). Looking at the numbers, I’d expect about 1 Gbps per stream for a 2216. So if you have a lot of TCP streams, the 1072 may actually be faster than the 2216 overall. For a small number of ‘elephant streams’, the 2216 wins. And if you have a simple routing setup (which you do not), the 2216 crushes the 1072 because of the L3 HW offload it can do.
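A rough way to see the "many streams vs. a few elephant streams" trade-off is the toy model below. It is only a sketch under loud assumptions: each stream saturates exactly one core, streams spread perfectly over the cores, throughput scales linearly, and L3 HW offload is ignored. The Router class and its methods are hypothetical names, and the aggregate figures are the 512 byte / 25 rule numbers quoted in this thread.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    aggregate_mbps: float  # published 512 byte / 25 filter rule result
    cores: int

    def per_core_mbps(self) -> float:
        # Assumption: the aggregate figure divides evenly over all cores.
        return self.aggregate_mbps / self.cores

    def throughput_mbps(self, streams: int) -> float:
        # Assumption: 1 stream = 1 core, so extra streams beyond the
        # core count add nothing.
        return min(streams, self.cores) * self.per_core_mbps()


ccr1072 = Router("CCR1072", 20691, 72)
ccr2216 = Router("CCR2216", 15552, 16)

for n in (1, 4, 16, 55, 72):
    print(n, round(ccr1072.throughput_mbps(n)), round(ccr2216.throughput_mbps(n)))

# First stream count at which the 1072 pulls ahead in this model.
crossover = next(
    n for n in range(1, ccr1072.cores + 1)
    if ccr1072.throughput_mbps(n) > ccr2216.throughput_mbps(n)
)
print("1072 ahead from", crossover, "streams")
```

In this simplified model the 2216 leads until about 55 concurrent streams, after which the 1072's extra cores win. Real traffic will not balance this neatly, so treat the crossover as an order-of-magnitude hint only.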
I understand, 15552 Mbps / 16 cores = 972 Mbps vs. 287 Mbps.
That would be excellent! Personally, I consider the 287 Mbps of the 1072 in the complex configuration mentioned quite good too; I'm not complaining. But I had no reference, thank you!