There is now further information in the latest newsletter about the performance of the CCR1009 model in RouterOS v7, which we can compare to RouterOS v6:
RouterOS v6 - 25 ip filter rules, 512 bytes, Mbps: 3251.8 Mbps
RouterOS v7 - 25 ip filter rules, 512 bytes, Mbps: 2618 Mbps
This is about a 20% decrease. Is this entirely due to the lack of route caching, or is it due to something else? I suppose I am wondering about the benchmark methodology, and how much the route caching feature benefited v6 in these benchmark situations. Also, I know that v6 route caching for the most part can't be turned off without losing a ton of features, but I wonder if it might allow for a third point of comparison.
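For reference, the "about 20%" figure can be checked directly from the two newsletter numbers quoted above. This is only a small sketch; the helper name `percent_change` is made up for illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a decrease)."""
    return (new - old) / old * 100.0


# Figures quoted above: 25 ip filter rules, 512-byte packets, Mbps
v6_mbps = 3251.8  # RouterOS v6
v7_mbps = 2618.0  # RouterOS v7

change = percent_change(v6_mbps, v7_mbps)
print(f"v7 vs v6: {change:.1f}%")
```

The result is a decrease of a little under 20%, which matches the rounded figure in the question.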
Every Test results page has the following side note:
Test results show maximum device performance and are reached using mentioned hardware and software configuration; different configurations most likely will result in lower results
Since we use the identical setup to test performance on all devices, Test Results is a reliable source for comparing MikroTik devices with each other. The one with bigger numbers has higher performance. However, those numbers are not the same as what you get in real-life usage. Think of it as car acceleration and top speed data: the car with the lower 0-100 km/h time and the higher top speed is faster. However, those numbers are not the same as what you get on public roads.
Now let's go deeper into the numbers and try to understand what those mean.
- (ROS v6) There is almost 100% routing cache hit during the tests.
- (ROS v6) Usually, there is only a 10-20% cache hit rate (and, respectively, 90-80% cache miss) in real-life usage. It is hard to tell precisely because it depends on the number of connections, traffic patterns, and the router's RAM (more RAM = bigger routing cache).
- There is no routing cache in ROS v7. V7 Test Results are closer to real-life performance.
We have tested v6 routing cache hit vs. miss performance. As already said, the results may vary with conditions, but roughly they are as follows (for 512-byte packets):
- Cache hit gives 4x (400%) performance boost on the fast path.
- Almost double (200%) performance with 25 simple queues.
- ~60% performance increase with 25 ip filter rules. There is a significant firewall processing overhead here (unaffected by routing), so routing speed plays a smaller role in total.
If we compare v7 test results with v6 (on the same device):
- V7 has a totally reworked fast path, which brings a ~15% performance increase even over the v6 cache-hit case (or ~4.6x vs. cache-miss). Moreover, some devices support fast path HW offloading (e.g., CRS317), boosting it close to wire speed.
- 25 simple queues: v7 performs ~60% slower than v6 cache-hit but ~30% faster than cache-miss.
- 25 ip filter rules: v7 performs ~25% slower than v6 cache-hit but ~20% faster than cache-miss.
Now let's combine the above data with the "There is only 10-20% cache hit in real-life usage" statement. Let's take the most positive (for v6) case: 20% cache hit.
Total performance
ROSv6 (using v6 cache-miss case as a reference):
- Fast path: 80% + 4 * 20% = 160%
- 25 simple queues: 80% + 2 * 20% = 120%
- 25 ip filter rules: 80% + 1.6 * 20% = 112%
Total performance
ROSv7 (using v6 cache-miss case as a reference):
- Fast path: 460% (almost 3 times faster than v6). And that's without HW offloading.
- 25 simple queues: 130% (8% faster than v6)
- 25 ip filter rules: 120% (7% faster than v6)
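The blended figures above are a weighted average of the cache-miss and cache-hit cases. A minimal sketch of that arithmetic follows, using only the relative figures given in this post; the function and variable names are made up for illustration, and the 20% hit ratio is the "most positive for v6" assumption stated above.

```python
def v6_effective(hit_ratio: float, hit_speedup: float) -> float:
    """Blended v6 throughput relative to the v6 cache-miss case (= 1.0)."""
    return (1.0 - hit_ratio) * 1.0 + hit_ratio * hit_speedup


# (v6 cache-hit speedup, v7 throughput), both relative to v6 cache-miss
cases = {
    "fast path": (4.0, 4.6),
    "25 simple queues": (2.0, 1.3),
    "25 ip filter rules": (1.6, 1.2),
}

HIT_RATIO = 0.20  # most favourable real-life case for v6, per the post

results = {}
for name, (hit_speedup, v7) in cases.items():
    v6 = v6_effective(HIT_RATIO, hit_speedup)
    results[name] = (v6, v7, v7 / v6)
    print(f"{name}: v6 {v6:.0%}, v7 {v7:.0%}, v7/v6 = {v7 / v6:.2f}x")
```

Changing `HIT_RATIO` to 0.10 shows the other end of the quoted 10-20% range, where the v7 advantage is larger.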
We can also calculate the cache-hit ratio (x, in percent) for the "512-byte / 25 ip filter rules" case, beyond which v6 routing could perform faster than v7:
(100 - x) + 1.6x = 120
0.6x = 20
x ≈ 33
Hence, v6 and v7 routing performance match if 33% of the packets hit the routing cache on v6. This is clearly unrealistic, unless there is a small number of connections going through a high-end device (e.g., using a CCR1036 in a SOHO network).
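The same break-even equation can be solved for any of the test cases. The sketch below generalises the calculation above; the function name is made up for illustration, and applying it to the fast path and simple queue cases is an extrapolation from the rough figures in this post, not a separately measured result.

```python
from typing import Optional


def break_even_hit_ratio(hit_speedup: float, v7: float) -> Optional[float]:
    """Solve (1 - x) + hit_speedup * x = v7 for the hit ratio x (0..1).

    Both arguments are relative to the v6 cache-miss case (= 1.0).
    Returns None when no hit ratio between 0 and 1 makes v6 match v7.
    """
    if hit_speedup == 1.0:
        return None  # cache hits give no benefit, so no break-even exists
    x = (v7 - 1.0) / (hit_speedup - 1.0)
    return x if 0.0 <= x <= 1.0 else None


ip_filter = break_even_hit_ratio(1.6, 1.2)      # case worked out above
simple_queues = break_even_hit_ratio(2.0, 1.3)  # extrapolated
fast_path = break_even_hit_ratio(4.0, 4.6)      # extrapolated

print(f"25 ip filter rules: {ip_filter:.0%}")
print(f"25 simple queues:   {simple_queues:.0%}")
print(f"fast path:          {fast_path}")
```

With these inputs the fast path has no break-even point at all: even a 100% cache hit rate on v6 stays below the v7 figure, which agrees with the "~15% over v6 cache-hit" statement above.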