CRS 3xx - L3 ASIC performance testing

Did some work on testing the L3 performance last week in 7.1beta2 and published it today.

https://stubarea51.net/2020/10/12/mikrotik-routerosv7-first-look-l3-asic-performance-testing/

Thanks, higher than I expected. Nicely done.

Regarding small MTU tests (tests #1 and #2), I suppose that the bottleneck is on the packet generator or receiver side, not the CRS317. As you see, PPS (packets per second) value is almost the same in all three cases, and the transfer speed depends purely on packet size. That is a typical case for CPU, where each packet causes an interrupt, which, in turn, adds performance overhead. ASIC doesn’t care much about the packet count.
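To make the PPS argument concrete, here is a minimal Python sketch of the arithmetic. The frame sizes are illustrative assumptions, not the sizes used in the article; the 20-byte per-frame overhead is the standard Ethernet preamble plus inter-frame gap. It shows that at a fixed packet rate the throughput scales linearly with frame size, and how far 1 Mpps is from 10G line rate at small frames.

```python
# Ethernet adds 20 bytes per frame on the wire:
# 8 bytes (preamble + start-of-frame delimiter) + 12 bytes (inter-frame gap).
WIRE_OVERHEAD_BYTES = 20


def throughput_bps(pps: float, frame_bytes: int) -> float:
    """Layer 2 throughput in bits per second at a given packet rate."""
    return pps * frame_bytes * 8


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second a link can carry at a given frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


if __name__ == "__main__":
    # Frame sizes below are examples only, not the article's test parameters.
    for size in (64, 512, 1518):
        gbps = throughput_bps(1_000_000, size) / 1e9
        max_mpps = line_rate_pps(10e9, size) / 1e6
        print(f"{size:>5} B: {gbps:.3f} Gbit/s at 1 Mpps, "
              f"10G line rate = {max_mpps:.2f} Mpps")
```

If the generator tops out near 1 Mpps, small-frame results will sit far below line rate no matter how fast the switch chip is.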

Thanks for the feedback… I’ll check the hypervisor and see if it’s creating a bottleneck somewhere.

This. In fact you did well to get 1 Mpps from a Linux box (Proxmox/KVM) without any tuning. CloudFlare had to put a lot of effort into tuning to get that number - https://blog.cloudflare.com/how-to-receive-a-million-packets/

+++

Hello to all.
Can a CRS running RouterOS be used as a BGP router that forwards packets at close to wire speed?

I think it’s possible, but only as long as the routes fit in the hardware memory.

CRS with RouterOS can be used as a BGP router unless the number of routes exceeds the hardware memory capabilities.

Refer to “List of supported devices and their limits” table on the link below:
https://wiki.mikrotik.com/wiki/Manual:CRS3xx_series_switches#L3_Hardware_Offloading

On the link it says:
Depending on the complexity of routes in routing table, max HW accelerated route count could change (see table below for min-max supported route count for each hardware). Whole-byte IP prefixes (/8, /16, /24, etc.) occupy less HW space than others (e.g., /22).
If HW route limit is reached new routes will fall back to CPU, except cases when newly added route overlaps with already existing routes processed by hardware. In this case destinations that were processed in hardware will continue to be processed in hardware. The user should choose the device with HW capability large-enough to store all the routes
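As a rough way to gauge how "complex" a routing table is in the sense quoted above, the following Python sketch counts whole-byte prefixes versus the rest. It only classifies prefixes; how much hardware space each kind actually occupies is not stated in the thread, so no sizes are modelled. Treating /32 as whole-byte and excluding /0 are assumptions of this sketch, and the sample routes are made up.

```python
import ipaddress


def is_whole_byte(prefix: str) -> bool:
    """True when the IPv4 prefix length falls on a byte boundary (/8, /16, /24, /32)."""
    net = ipaddress.ip_network(prefix, strict=False)
    # Assumption: /0 (default route) is not counted as a whole-byte prefix.
    return net.prefixlen > 0 and net.prefixlen % 8 == 0


def summarize(prefixes):
    """Count whole-byte versus other prefixes in a list of routes."""
    whole = sum(1 for p in prefixes if is_whole_byte(p))
    return {"whole_byte": whole, "other": len(prefixes) - whole}


# Hypothetical routing table for illustration.
routes = ["10.0.0.0/8", "172.16.0.0/16", "192.168.1.0/24",
          "10.20.0.0/22", "10.30.0.0/19"]
print(summarize(routes))
```

A table dominated by "other" prefixes would be expected to land nearer the minimum of the min-max range in the device table.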


Yes. I have seen that doc.
Is there a way to raise that limit?

@IPANetEngineer
Very Nice L3 Forwarding test … hopefully in 2021 the production stable version of ROS7 will be completed.

Unfortunately, it is the hardware limitation. There is not enough internal memory in the switch chip to offload the full BGP table. However, if possible, there is an option to limit the incoming BGP route prefixes via

/routing/filter/
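To illustrate what limiting incoming prefixes achieves, here is a small Python sketch. It is not RouterOS filter syntax; it only models the effect of accepting prefixes up to a maximum length and checking the result against a hardware route limit. The function name, the `max_len` default and the `hw_limit` value are hypothetical, and the prefixes are documentation ranges.

```python
import ipaddress


def limit_prefixes(prefixes, max_len=24, hw_limit=None):
    """Keep IPv4 prefixes no longer than max_len and report whether they fit hw_limit."""
    kept = [p for p in prefixes
            if ipaddress.ip_network(p).prefixlen <= max_len]
    fits = hw_limit is None or len(kept) <= hw_limit
    return kept, fits


# Hypothetical received BGP prefixes.
received = ["203.0.113.0/24", "198.51.100.0/25", "192.0.2.128/26", "10.0.0.0/8"]

# hw_limit=3 is a made-up figure; take the real one from the device table.
kept, fits = limit_prefixes(received, max_len=24, hw_limit=3)
print(kept, fits)
```

The actual limit for a given device should come from the "List of supported devices and their limits" table.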

Also, we are working on an option to filter out the prefixes for offloading, i.e., to offload routes with potentially the most traffic while the rest gets processed by the CPU.

If the router needs to handle the full BGP table, I suggest looking at CCR devices rather than CRS.

Very, very interesting.
Using RouterOS, we could use BGP to carry some internal routes (fewer than 1000).
We could route them at L3 in hardware…
Is fastpath involved here? Or can we use some firewall filters?
We won’t need conntrack or anything similar.

Hello.
The notes are missing from your article.
I mean the table of the max number of connections refers to notes, but they are not on the page.

There are two distinct L3HW modes in RouterOS v7.1beta2:

  • l3hw=yes (a.k.a. full routing or l3-switching) - the entire routing table gets offloaded to the hardware; traffic gets routed entirely by HW; nothing goes through CPU, and therefore, ROS stateful firewall does not work.
  • l3hw=fw - Firewall-compatible routing. Initially, packets go through CPU/Firewall, then Fasttrack connections get offloaded to the hardware. Consider this as a hardware-accelerated L4 stateful firewall. Unfortunately, the number of hardware connections is strictly limited by the capacity of the internal hardware memory.

Please note that we are talking about a stateful firewall here. Stateless firewall still can be set in l3hw=yes mode via switch ACL rules:

/interface/ethernet/switch/rule/

Hello
Perfect.
But the question is:
a) l3hw=yes (a.k.a. full routing or l3-switching) - the entire routing table gets offloaded to the hardware; traffic gets routed entirely by HW; nothing goes through CPU, and therefore, ROS stateful firewall does not work.
Will fastpath be enabled in RouterOS then?
If we set some rules on the INPUT chain just to protect the router, do we lose the hardware feature?

b) l3hw=fw - Firewall-compatible routing. Initially, packets go through CPU/Firewall, then Fasttrack connections get offloaded to the hardware. Consider this as a hardware-accelerated L4 stateful firewall. Unfortunately, the number of hardware connections is strictly limited by the capacity of the internal hardware memory.

Is there a table? I have seen the one in the link in the first post, but it is not clear what the number means… 3750 connections, really? That is very low…

Thank you.

No, ROS firewall (/ip/firewall) does not work simply because packets never enter CPU.


The traffic to the router itself (packet destination IP = router IP; INPUT chain) is unaffected by the l3hw. The firewall stays fully functional here. The same applies to outgoing traffic (OUTPUT chain).
Regarding routed traffic (FORWARD chain, or PRE/POSTROUTING chains for forwarded packets), in the case of l3hw=yes, setting those rules does nothing because the firewall (/ip/firewall) does not get triggered. You need to set l3hw=no or l3hw=fw to make the stateful firewall work. However, a stateless firewall is still an option via switch ACL rules. For example, you can allow/block specific IP addresses/prefixes or TCP/UDP ports. More info here: https://wiki.mikrotik.com/wiki/Manual:CRS3xx_series_switches#Switch_Rules_.28ACL.29
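The combinations described above can be summarized as a small lookup. This Python sketch is only a restatement of the explanation in this thread, not RouterOS code; the function name and the returned strings are made up for illustration, and it ignores the case where routes exceed the hardware limit and fall back to the CPU.

```python
def firewall_path(l3hw: str, chain: str) -> str:
    """Describe how /ip/firewall treats a packet for a given l3hw mode and chain.

    l3hw:  "no", "yes" or "fw"
    chain: "input", "output" or "forward"
    """
    if l3hw not in ("no", "yes", "fw"):
        raise ValueError(f"unknown l3hw mode: {l3hw}")
    if chain in ("input", "output"):
        # Traffic to/from the router itself is redirected to the CPU in every mode.
        return "cpu firewall"
    if chain != "forward":
        raise ValueError(f"unknown chain: {chain}")
    if l3hw == "yes":
        # Routed entirely in hardware; only stateless switch ACL rules apply.
        return "hardware only"
    if l3hw == "fw":
        # First packets pass the CPU firewall, Fasttrack connections are then offloaded.
        return "cpu firewall, then hardware"
    return "cpu firewall"


for mode in ("no", "yes", "fw"):
    for chain in ("input", "output", "forward"):
        print(f"l3hw={mode:<3} {chain:<7} -> {firewall_path(mode, chain)}")
```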


Yes, unfortunately, the number of hardware connections is limited. Actually, it is 4500 if used without MPLS. Mikrotik smart offloading algorithm picks the heaviest (traffic-wise) connections for offloading at any given time. Other (slower) connections get processed by the CPU. So the number of connections can be much greater. For instance, we tested CRS317 with 10k connections, and it worked fine.
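The idea of offloading only the heaviest connections can be sketched in a few lines of Python. This is not MikroTik's algorithm, whose details are not given here; it is a simple top-N selection by byte count, with the 4500 slot figure taken from the post above and the traffic data invented for the example.

```python
import heapq

HW_CONNECTION_SLOTS = 4500  # figure quoted in the thread for use without MPLS


def split_by_traffic(conn_bytes, slots=HW_CONNECTION_SLOTS):
    """Return (offloaded, cpu) sets of connection ids, heaviest connections offloaded."""
    offloaded = set(heapq.nlargest(slots, conn_bytes, key=conn_bytes.get))
    cpu = set(conn_bytes) - offloaded
    return offloaded, cpu


# 10,000 hypothetical connections; connection i has carried i * 1000 bytes.
connections = {i: i * 1000 for i in range(10_000)}
offloaded, cpu = split_by_traffic(connections)
print(len(offloaded), len(cpu))
```

A real implementation would have to re-evaluate the selection over time as traffic shifts, which this sketch does not attempt.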

Please take into account that CRS (Cloud Router Switch) series are more “switch” than a “router”. Consider the ability to run an L4 hardware-accelerated firewall more like a bonus feature rather than a common use-case. For heavy routing, please look into the CCR series.

Currently, Mikrotik engineers are working on a “hybrid l3hw mode” which allows running both l3hw=yes + l3hw=fw on the same device. For example, it will allow hardware inter-VLAN routing (with an unlimited number of connections) while running Firewall/NAT on the upstream port(-s).

Thank you for your explanations.
The idea was to use a CRS to route L3 between interfaces at FAAAAST speed via BGP.
The issue is: how can I protect the router itself then?
Never tried the switch rules…

@raimondsp: can you kindly compare the different l3hw modes of operation to HW-offloaded L2? I can imagine many parallels, but as I don’t have any experience with CRS3xx L3 offloading, I can’t say whether those parallels are real or imaginary.

I’m sorry for the confusion. INPUT/OUTPUT chains are unaffected by l3hw because the hardware redirects those packets to/from the CPU. The firewall (which runs on the CPU) stays fully functional in such cases. Hence, enabling l3hw does not affect your ability to protect the router itself.

What I really meant (but originally failed to explain) is that, in the case of l3hw=yes, you cannot apply the firewall to forwarded traffic, for example, to protect a server behind the router.

I edited my original post to avoid confusion.