A CCR handling around 9 Gbps, only 3 firewall RAW rules, BGP peering with 2 upstreams, downstream traffic only (no full table), and the CPU usage is already hitting 50%+. Has anyone else experienced similar behavior?
Which value exactly bothers you?
Which value increases unpredictably?
Is total load jumping unexpectedly?
Is the load stable?
What values would satisfy you? 9 Gbps is not "low" throughput. To be honest, it's not "high" either.
Any other observations?
A better approach is to look at all 16 cores and see whether any of them is pegged at 100%. The overall CPU load doesn’t scale directly with the actual activity. Since 50% corresponds to 9 Gbps in your workload, you might expect another 10 Gbps to push the average load to 100%, but it might only move it to 70%.
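To illustrate why the average alone can mislead, here is a minimal Python sketch. The per-core readings are made up for the example (they are not from the router in question), and `pegged_cores` is a hypothetical helper, not a RouterOS tool.

```python
def pegged_cores(per_core_load, threshold=95.0):
    """Return the indices of cores whose load (in percent) is at or above threshold."""
    return [i for i, load in enumerate(per_core_load) if load >= threshold]

# Hypothetical per-core readings for a 16-core router, in percent.
loads = [100.0, 98.0] + [43.0] * 14

average = sum(loads) / len(loads)
print(f"average load: {average:.1f}%")         # average load: 50.0%
print(f"pegged cores: {pegged_cores(loads)}")  # pegged cores: [0, 1]
```

With these invented numbers the average reads 50% even though two cores are saturated, which is why the per-core view matters more than the total.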
It is stable, not jumping unpredictably. When traffic goes above 7–8 Gbps, CPU stays consistently between 45% and 55%.
What really stands out is the impact of RAW firewall rules. With RAW disabled, CPU sits around ~25%. As soon as I enable only 3 very simple RAW rules, CPU usage immediately increases to ~55%, with the same traffic profile.
So the issue is not instability, but the CPU cost of RAW processing at this throughput level.
For this kind of traffic (receive-only, no full BGP table, minimal firewalling), I would expect significantly lower CPU utilization.
There is no single CPU core hitting 100%. The load is distributed across cores.
My requirement, however, is to be able to handle up to ~20 Gbps of traffic, with no more than ~3 Gbps of outbound traffic.
Given that at 7–9 Gbps inbound the system is already at ~50% CPU with only 3 simple RAW rules enabled, this raises concerns about how much headroom is realistically available to reach 20 Gbps under similar conditions.
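The concern can be put in numbers with a naive linear extrapolation. This is only a sketch: it assumes CPU usage grows proportionally with throughput, which an earlier reply argues is likely too pessimistic, and `linear_cpu_estimate` is a name made up for the example.

```python
def linear_cpu_estimate(cpu_now, gbps_now, gbps_target):
    """Scale CPU usage proportionally with throughput (a naive assumption)."""
    return cpu_now * gbps_target / gbps_now

# 45-55% CPU at roughly 9 Gbps, extrapolated to the 20 Gbps target.
for cpu_now in (45.0, 55.0):
    est = linear_cpu_estimate(cpu_now, gbps_now=9.0, gbps_target=20.0)
    print(f"{cpu_now:.0f}% at 9 Gbps -> about {est:.0f}% at 20 Gbps")
```

Under that assumption the estimate lands at or above 100%, so whether 20 Gbps is reachable depends on how far the real scaling departs from linear.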
With the default settings and no firewall rules, connection tracking is turned off completely. In that same default setting, adding even a single firewall rule enables the whole connection tracking mechanism.
Which means the jump you saw from 25% to 55% CPU usage is not the cost of 3 rules, but the cost of connection tracking + 3 RAW rules.
The load will not jump to 85% if you add 3 more RAW rules, but will only increase minimally.
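The two readings of the 25% to 55% jump can be compared in a small Python sketch. The 25% and 55% figures come from this thread; the per-rule cost of 1 percentage point in the second model is a hypothetical value picked for illustration, not a measurement.

```python
BASELINE = 25.0      # CPU % with RAW disabled (from the thread)
WITH_3_RULES = 55.0  # CPU % with 3 RAW rules enabled (from the thread)

def per_rule_model(n_rules):
    """Naive reading: the whole 30-point jump is split evenly across the 3 rules."""
    per_rule = (WITH_3_RULES - BASELINE) / 3
    return BASELINE + per_rule * n_rules

def fixed_cost_model(n_rules, per_rule=1.0):
    """Alternative reading: a one-time cost once any rule exists, plus a small per-rule cost.

    per_rule is a hypothetical value chosen for illustration only.
    """
    if n_rules == 0:
        return BASELINE
    fixed = WITH_3_RULES - BASELINE - 3 * per_rule
    return BASELINE + fixed + per_rule * n_rules

print(per_rule_model(6))    # 85.0
print(fixed_cost_model(6))  # 58.0
```

Both models reproduce the observed 55% at 3 rules; they only diverge when more rules are added, which is where the 85% figure comes from under the naive reading.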
That's expected (even favorable) CPU usage for your device. As others have suggested, even a single firewall rule has the effect of disabling fastpath.
Your device is optimized for l3hw offloaded use.
Even if l3hw is not something you want, the switch chip can execute most of the rules you would put in the raw table; they are configured in the switch->rules menu. This is done without affecting throughput at all.
If you want a proper firewall, the switch chip also provides flow offloading (referred to in the documentation as fasttrack offloading), but this requires fasttrack to be in use.
Overall it seems that you are probably somewhat misusing/misconfiguring the device.