Rfulton
Frequent Visitor
Topic Author
Posts: 99
Joined: Tue Aug 08, 2017 2:17 am

CCR2004 High CPU Usage ROS7

Mon Sep 27, 2021 1:54 am

Just like the title says.

900 Mbps on ROS6: ~5-10% CPU usage.

Same config on ROS7: ~35% CPU usage.

With Cake I'm at ~60% CPU usage.

Can we expect improvements in the future, or will I need a more powerful model to get 2 Gbps consistently with Cake?
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: CCR2004 High CPU Usage ROS7

Mon Sep 27, 2021 2:25 am

I found similar CPU increases across the board with all devices that I have tested, and have been wondering the same thing. My RB4011 at home has similar results - a speedtest where the highest load CPU core is at ~15% on ROS 6, moving to ROS 7 with the same / equivalent config causes 28% usage on the highest load CPU.

MikroTik has actually quantified some of these performance changes in the brochure for the RB5009 where it shows the percentage performance increase above the RB4011. That's actually the performance increase of the RB4011 running RouterOS v7, not v6, whereas the product page for the RB4011 has the RouterOS v6 values. The RB5009 on RouterOS v7 supposedly has similar performance to the RB4011 on RouterOS v6.

I have some devices in the field nearing their limits that I may have to upgrade in order to sustain the same throughput with RouterOS v7.

CAKE is a more complex queuing mechanism that certainly will be a higher load on the router. It might be best to compare apples to apples, do testing with RED (random early drop) on RouterOS v6 vs RouterOS v7, and leave the newer queuing mechanisms out of it.
 
raimondsp
MikroTik Support
Posts: 267
Joined: Mon Apr 27, 2020 10:14 am

Re: CCR2004 High CPU Usage ROS7

Mon Sep 27, 2021 10:52 am

RouterOS v6 uses a routing cache while ROS v7 doesn't. While ROS6 performs routing faster in the happy-path scenario (cache hit), it is significantly slower in all other cases (cache miss). Synthetic speed tests usually use a small number of routes and therefore always hit the cache. However, studies have shown that 80-90% of lookups miss the cache in real-life router usage. In other words, only 10-20% of real-life cases may show routing speed as high as in the tests. Moreover, there are specific DDoS attacks on routers that invalidate the routing cache and make routing drastically slower.

ROS7 has a completely different route lookup algorithm. While ROS7 routes slower and/or uses more CPU than ROS6 in the happy-path scenario (usually tests), it does a much better job in the other cases (real-life usage) and is not vulnerable to the DDoS mentioned above. By the way, recent Linux kernels have also dropped the routing cache.

And as @mducharme correctly said, CAKE is a completely different story.
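
To make the cache-hit versus cache-miss point above concrete, here is a minimal toy sketch. It is not how RouterOS or the Linux kernel implements routing; the three-entry routing table, the per-destination LRU cache, the cache size of 256 and the two synthetic traffic patterns are all assumptions chosen for illustration. It only shows why a test with a handful of destinations hits such a cache almost every time, while widely spread destinations almost never do.

```python
import ipaddress
import random
from collections import OrderedDict

# Hypothetical routing table (prefix, next hop); not taken from any real router.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "wan"),
    (ipaddress.ip_network("10.0.0.0/8"), "core"),
    (ipaddress.ip_network("10.1.0.0/16"), "lan1"),
]


def full_lookup(dst):
    """Slow path: longest-prefix match over the whole table."""
    matches = [(net.prefixlen, hop) for net, hop in ROUTES if dst in net]
    return max(matches)[1]


class RouteCache:
    """Toy per-destination LRU cache in front of full_lookup()."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, dst):
        if dst in self.entries:
            self.hits += 1
            self.entries.move_to_end(dst)  # mark as most recently used
            return self.entries[dst]
        self.misses += 1
        hop = full_lookup(dst)
        self.entries[dst] = hop
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return hop


def hit_ratio(destinations, capacity=256):
    """Fraction of lookups served from the cache for a traffic pattern."""
    cache = RouteCache(capacity)
    for dst in destinations:
        cache.lookup(dst)
    return cache.hits / (cache.hits + cache.misses)


rng = random.Random(0)
# Speed-test-like traffic: four destinations repeated many times.
few = [ipaddress.ip_address("10.1.0.%d" % rng.randint(1, 4)) for _ in range(10_000)]
# Busy-network-like traffic: destinations spread over the whole IPv4 space.
many = [ipaddress.ip_address(rng.getrandbits(32)) for _ in range(10_000)]

print(f"few destinations:  {hit_ratio(few):.1%} cache hits")
print(f"many destinations: {hit_ratio(many):.1%} cache hits")
```

Real traffic sits somewhere between these two extremes, which is why the 10-20% figure quoted above depends on connection count, traffic pattern and cache size.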
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: CCR2004 High CPU Usage ROS7

Mon Sep 27, 2021 1:03 pm

Thanks @raimondsp for the clarification. I suspected that this could be related to the removal of the route cache in newer Linux kernel versions, which is something that is out of MikroTik's control. I wasn't sure if there were some other differences in the kernel as well that might account for some performance changes. For instance, if I took an ordinary desktop PC system that was several years old and ran Windows XP or Windows 7 on it, and then compared my experience with running Windows 10 or Windows 11, I would probably find that Windows XP or Windows 7 seemed to run faster than modern Windows versions on that system, since newer desktop operating systems would assume a higher performance CPU and other components.
 
Rfulton
Frequent Visitor
Topic Author
Posts: 99
Joined: Tue Aug 08, 2017 2:17 am

Re: CCR2004 High CPU Usage ROS7

Mon Sep 27, 2021 2:51 pm

I want to get as much of this in writing as I can.

Just to clarify, you claim that the CCR2004, which was built FOR ROS7 with all its 25 and 10gb SFP+ ports, can't do 2 Gbps because of route caching, something that you didn't expect to exist on ROS7.

To be clear, my CCR2004 won't reach 2 Gbps even without Cake, so I have no idea why you brought up Cake; maybe it's a coping mechanism, or an attempt to once again deflect the problem to my end.
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: CCR2004 High CPU Usage ROS7

Mon Sep 27, 2021 3:30 pm

Rfulton wrote: "Just to clarify, you claim that the CCR2004, which was built FOR ROS7 with all its 25 and 10gb SFP+ ports, can't do 2 Gbps because of route caching, something that you didn't expect to exist on ROS7."
I don't see why the lack of route caching would prevent the CCR2004 from reaching 2 Gbps.

If you look at the test results for the CCR2004-16G-2S+, the benchmark I usually go by is the Mbps rate with 25 ip filter rules and 512-byte packets, which shows 3525.5 Mbps.

I rely on this benchmark because it most consistently represents the performance of the device in the real world, where you are not just using the default configuration but adding a few extra things.

The CCR2004-16G-2S+ is only available with ROS v7, so you should be safe routing up to about 3500 Mbps on it in RouterOS v7. CAKE may change things.

Also, you never said that your router wouldn't reach 2Gbps without CAKE. You only mentioned hitting 35% CPU usage with 900Mbps traffic, which certainly doesn't indicate that you can't reach 2Gbps or higher speeds.
 
macgaiver
Forum Guru
Posts: 1764
Joined: Wed May 18, 2005 5:57 pm
Location: Sol III, Sol system, Sector 001, Alpha Quadrant

Re: CCR2004 High CPU Usage ROS7

Mon Sep 27, 2021 5:28 pm

Rfulton wrote: "I want to get as much of this in writing as I can. Just to clarify, you claim that the CCR2004, which was built FOR ROS7 with all its 25 and 10gb SFP+ ports, can't do 2 Gbps because of route caching, something that you didn't expect to exist on ROS7."
1) On any non-ASIC device (basically anything other than switches), in my experience a 10G port is just a "more than 1G" port, in case you need 1.5 Gbps, for example. The same applies to 25G: it is just a "more than 10G" port.
2) If you take a look at the block diagram of that device, you will see that the CPU is connected to the device via 2x25G lines, so 50 Gbps total is the theoretical maximum.
3) The performance table for the same device shows that with 1518-byte packets and fastpath (bare minimum of configuration) it was able to reach 40 Gbps, with traffic consisting of lots of stateless (non-TCP) connections.

So combining all 3 of these things, having 2x "more than 10G" and 12x "more than 1G" ports makes sense.

About your 2 Gbps: if your testing involves a limited number of connections (<100) and uses stateful connections (like TCP), those 2 Gbps are close to the max any non-ASIC device in the world can do.
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: CCR2004 High CPU Usage ROS7

Wed Sep 29, 2021 12:43 am

There is now further information in the latest newsletter about the performance of the CCR1009 model in RouterOS v7, which we can compare to RouterOS v6:

RouterOS v6 - 25 ip filter rules, 512 bytes, Mbps: 3251.8 Mbps
RouterOS v7 - 25 ip filter rules, 512 bytes, Mbps: 2618 Mbps

This is about a 20% decrease. Is this entirely due to the lack of route caching, or is it due to something else? I suppose I am wondering about the benchmark methodology, and how much the route caching feature benefited v6 in these benchmark situations. Also, I know that v6 route caching for the most part can't be turned off without losing a ton of features, but I wonder if it might allow for a third point of comparison.
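
As a quick check of the "about a 20% decrease" figure, the relative change between the two newsletter numbers quoted above can be computed directly. The helper name `percent_change` is just an illustrative choice.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent (negative means a decrease)."""
    return (new - old) / old * 100.0


v6_mbps = 3251.8  # RouterOS v6, 25 ip filter rules, 512-byte packets
v7_mbps = 2618.0  # RouterOS v7, same benchmark

change = percent_change(v6_mbps, v7_mbps)
print(f"{change:.1f}%")  # prints -19.5%
```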
 
raimondsp
MikroTik Support
Posts: 267
Joined: Mon Apr 27, 2020 10:14 am

Re: CCR2004 High CPU Usage ROS7  [SOLVED]

Wed Sep 29, 2021 12:27 pm

Every Test Results page has the following side note:
"Test results show maximum device performance and are reached using mentioned hardware and software configuration; different configurations most likely will result in lower results."
Since we use an identical setup to test performance on all devices, Test Results is a reliable source for comparing MikroTik devices with each other. The one with bigger numbers has higher performance. However, those numbers are not what you get in real-life usage. Think of it as car acceleration and top-speed data: the car with the lower 0-100 km/h time and higher top speed is faster, but those are not the numbers you get on public roads.


Now let's go deeper into the numbers and try to understand what those mean.
  • (ROS v6) There is almost 100% routing cache hit during the tests.
  • (ROS v6) Usually, there is only 10-20% cache hit (and, respectively, 90-80% cache miss) in real-life usage. It is hard to tell precisely due to the number of connections, traffic patterns, and router's RAM (more RAM = bigger routing cache).
  • There is no routing cache in ROS v7. V7 Test Results are closer to real-life performance.

We have tested v6 routing cache hit vs. miss performance. As already said, the results may vary due to conditions, but roughly those are (for 512-byte packets):
  • Cache hit gives 4x (400%) performance boost on the fast path.
  • Almost double (200%) performance with 25 simple queues.
  • ~60% performance increase with 25 ip filter rules. There is a significant firewall processing overhead here (unaffected by routing), so routing speed plays a smaller role in total.

If we compare v7 test results with v6 (on the same device):
  • V7 has totally reworked fast path, which brings a ~15% performance increase even over v6 cache-hit case (or ~4.6x vs. cache-miss). Moreover, some devices support fast path HW offloading (e.g., CRS317), boosting it close to wire speed.
  • 25 simple queues: v7 performs ~35% slower than v6 cache-hit but ~30% faster than cache-miss.
  • 25 ip filter rules: v7 performs ~25% slower than v6 cache-hit but ~20% faster than cache-miss.

Now let's combine the above data with the "There is only 10-20% cache hit in real-life usage" statement. Let's take the most positive (for v6) case: 20% cache hit.
Total performance ROSv6 (using v6 cache-miss case as a reference):
  • Fast path: 80% + 4 * 20% = 160%
  • 25 simple queues: 80% + 2 * 20% = 120%
  • 25 ip filter rules: 80% + 1.6 * 20% = 112%
Total performance ROSv7 (using v6 cache-miss case as a reference):
  • Fast path: 460% (almost 3 times faster than v6). And that's without HW offloading.
  • 25 simple queues: 130% (8% faster than v6)
  • 25 ip filter rules: 120% (7% faster than v6)
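
The totals above can be reproduced with a short sketch. It uses only the rough multipliers quoted in this post, with the v6 cache-miss case as the 1.0 baseline, and it follows the post's simple linear weighting of hit and miss performance rather than modelling per-packet processing time. The function and dictionary names are illustrative.

```python
# v6 cache-hit performance relative to a v6 cache miss (= 1.0), from the post.
V6_HIT_SPEEDUP = {
    "fast path": 4.0,
    "25 simple queues": 2.0,
    "25 ip filter rules": 1.6,
}
# v7 performance relative to a v6 cache miss, from the post.
V7_RELATIVE = {
    "fast path": 4.6,
    "25 simple queues": 1.3,
    "25 ip filter rules": 1.2,
}


def v6_blended(hit_speedup, hit_ratio):
    """Linear mix of miss (1.0) and hit (hit_speedup) performance."""
    return (1.0 - hit_ratio) * 1.0 + hit_ratio * hit_speedup


def compare(hit_ratio=0.20):
    """Return {case: (v6 total, v7 total, v7 / v6)} for a given hit ratio."""
    rows = {}
    for case, speedup in V6_HIT_SPEEDUP.items():
        v6 = v6_blended(speedup, hit_ratio)
        v7 = V7_RELATIVE[case]
        rows[case] = (v6, v7, v7 / v6)
    return rows


for case, (v6, v7, ratio) in compare().items():
    print(f"{case:20s} v6={v6:.0%}  v7={v7:.0%}  v7/v6={ratio:.2f}")
```

Calling `compare(0.10)` instead gives the less favourable (for v6) end of the 10-20% range.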

We can also calculate the cache-hit ratio (x) beyond which v6 routing could perform faster than v7 in the "512-byte / 25 ip filter rules" case:
(100 - x) + 1.6x = 120
0.6x = 20
x ≈ 33
Hence, v6 and v7 routing performance match if 33% of the packets hit the routing cache on v6. This is clearly unrealistic, unless there is a small number of connections going through a high-end device (e.g., using CCR1036 in SOHO network).
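
The same break-even equation can be solved for the other two test cases as well. This is a sketch under the post's own linear model and rough multipliers; the function name is illustrative. A result above 100% means v6 cannot catch up with v7 at any cache-hit ratio.

```python
def break_even_hit_ratio(hit_speedup, v7_relative):
    """Solve (1 - x) + hit_speedup * x = v7_relative for the hit ratio x."""
    return (v7_relative - 1.0) / (hit_speedup - 1.0)


# (v6 cache-hit speed-up, v7 performance), both relative to a v6 cache miss.
cases = {
    "fast path": (4.0, 4.6),
    "25 simple queues": (2.0, 1.3),
    "25 ip filter rules": (1.6, 1.2),
}

for name, (hit_speedup, v7_relative) in cases.items():
    x = break_even_hit_ratio(hit_speedup, v7_relative)
    print(f"{name:20s} break-even cache-hit ratio: {x:.0%}")
```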
 
mducharme
Trainer
Posts: 1777
Joined: Tue Jul 19, 2016 6:45 pm
Location: Vancouver, BC, Canada

Re: CCR2004 High CPU Usage ROS7

Wed Sep 29, 2021 4:32 pm

@raimondsp Thanks very much for the detailed explanation!
 
Zock
just joined
Posts: 9
Joined: Wed Mar 27, 2019 3:07 am

Re: CCR2004 High CPU Usage ROS7

Mon Jul 25, 2022 10:37 pm

Hello,

Even with that explanation, I still don't understand why the bandwidth capacity (Mbps) is lower with v7 than with v6.
 
paulct
Member
Posts: 336
Joined: Fri Jul 12, 2013 5:38 pm

Re: CCR2004 High CPU Usage ROS7

Tue Jul 26, 2022 12:31 pm

As it is not a real life test behind a busy network...
 
woobilicious
just joined
Posts: 3
Joined: Sat Dec 24, 2022 12:04 am

Re: CCR2004 High CPU Usage ROS7

Sat Dec 24, 2022 12:49 am

raimondsp wrote: "Hence, v6 and v7 routing performance match if 33% of the packets hit the routing cache on v6. This is clearly unrealistic, unless there is a small number of connections going through a high-end device (e.g., using CCR1036 in SOHO network)."
Sorry to necro an old thread, but my hAP ac2 gets about 30% of its v6 performance: I get 200 Mbps instead of the 600 Mbps I got previously. This is on single-file downloads like games, or speed tests.

I noticed that one CPU core is fully loaded with IRQ requests from the 78b5000.spi driver (the counter doesn't seem to climb, though). I'm no expert coder, but I know just enough kernel stuff to know that you're not supposed to do a lot of processing inside an IRQ handler. Packets are self-contained; you don't need to process them all on one core to maintain ordering or synchronization, so why aren't we processing across multiple cores? Again, why is so much heavy lifting done inside a single IRQ handler, and why is it always on one core?

The only other difference I know of that might affect performance is that my old 1 Gbit connection was IPoE and my new 300 Mbps connection is PPPoE. Is there any way to have my CAKE and eat it too? Can I limit fasttrack to just the PPP connection?

I currently have fasttrack disabled on downloads so I get my full speed, but it feels like a somewhat inelegant solution to what feels like bad drivers.
