Hello, I am testing different L3 HW offload configurations on a CCR2216 (RouterOS v7) and would like to confirm with the community which design is best.
🔹 Network scenario
- Device: CCR2216-1G-12XS-2XQ (RouterOS v7.x)
- Traffic: ~10 Gbps transit + ~600k BGP routes received (full table from transit + IXs)
- Interfaces:
  - sfp28-1: transit (BGP with provider, local IP 10.0.0.0/30, VLAN 601)
  - sfp28-2: IX #1 (185.1.90.0/24, VLAN 602)
  - sfp28-3: IX #2 (193.149.1.0/24, VLAN 603)
  - sfp28-4: output to customer servers (own ASN + /24)
🔹 Option 1: all uplinks inside the bridge with L3HW
- All ports (uplinks and clients) are inside bridge SWICH with `vlan-filtering=yes` and `l3-hw-offloading=yes`.
- Each uplink is placed into a dedicated internal VLAN (601, 602, 603) as an access port, and clients are in VLAN 1 (untagged).
- Results:
  - CPU: ~20% while forwarding 10 Gbps.
  - `/routing/route/print count-only where afi=ip and active and hw-offloaded` → ~500k routes.
  - `/interface/ethernet/switch/l3hw-settings/monitor`:
    - `ipv4-routes-total: 521361`
    - `ipv4-routes-hw: 197064`
    - `ipv4-routes-cpu: 324296`
    - `nexthop-cap: 8192`
    - `nexthop-usage: 136`
- Question: Is it normal that RouterOS shows ~500k “hw-offloaded” routes but the L3HW monitor only reflects ~197k installed in the ASIC? I understand that very large aggregates (/0–/21) stay in CPU by design, but I want to confirm.
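To put the two monitor counters side by side, here is a minimal Python sketch that parses the values quoted above and computes what share of the table the ASIC is actually holding. The helper names (`parse_monitor`, `hw_share`) are hypothetical, not part of RouterOS, and the `key: value` layout is assumed from the pasted output.

```python
# Counter values copied from the Option 1 monitor output in this post.
MONITOR_TEXT = """
ipv4-routes-total: 521361
ipv4-routes-hw: 197064
ipv4-routes-cpu: 324296
nexthop-cap: 8192
nexthop-usage: 136
"""


def parse_monitor(text: str) -> dict[str, int]:
    """Parse 'key: value' lines into a dict of integer counters."""
    counters = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        counters[key.strip()] = int(value.strip())
    return counters


def hw_share(counters: dict[str, int]) -> float:
    """Fraction of IPv4 routes that the monitor reports as installed in hardware."""
    return counters["ipv4-routes-hw"] / counters["ipv4-routes-total"]


counters = parse_monitor(MONITOR_TEXT)
print(f"routes in ASIC:      {hw_share(counters):.1%}")
print(f"next-hop table used: {counters['nexthop-usage'] / counters['nexthop-cap']:.1%}")
```

With these numbers, roughly 38% of the routes are in hardware and the next-hop table is under 2% used, so in this snapshot the next-hop capacity does not look like the limiting factor.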
🔹 Option 2: uplinks outside the bridge (no L3HW)
- Only the customer port (sfp28-4) is inside the bridge with `l3-hw-offloading=yes`.
- The 3 uplinks are outside the bridge with `l3-hw-offloading=no`.
- Results:
  - CPU: ~1% while forwarding 10 Gbps.
  - `/routing/route/print count-only where afi=ip and active and hw-offloaded` → ~10k routes.
  - `/interface/ethernet/switch/l3hw-settings/monitor`:
    - `ipv4-routes-total: 10`
    - `ipv4-routes-hw: 10`
    - `ipv4-routes-cpu: ~400k`
- Here forwarding is done in software (CPU), not the ASIC.
🔹 My questions
- Which option is really the most production-optimal for future scaling (40–100 Gbps, millions of PPS)?
- Is it correct to assume that Option 1 is the only design that really uses the ASIC, even if it shows ~20% CPU instead of 1%?
- Is it expected to see such a difference between the “hw-offloaded” routes in `/routing/route` vs those in `l3hw-settings/monitor` (e.g., 500k vs 197k)?
- Are there any recommended tweaks to further reduce CPU load in Option 1? (firewall/conntrack, bridge settings, shortest-hw-prefix, etc.)
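For a sense of what "millions of PPS" means at 40–100 Gbps, here is a rough Python sketch of the theoretical maximum packet rate on an Ethernet link. It assumes the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap). The function name `max_pps` and the chosen frame sizes are only for illustration; real traffic mixes will sit somewhere between these rows.

```python
# Per-frame overhead on the wire: 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def max_pps(link_gbps: float, frame_bytes: int) -> float:
    """Theoretical maximum frames per second for a given link speed and frame size."""
    bits_per_frame = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame


for gbps in (10, 40, 100):
    for frame in (64, 512, 1518):
        mpps = max_pps(gbps, frame) / 1e6
        print(f"{gbps:>3} Gbps, {frame:>4}-byte frames: {mpps:7.2f} Mpps")
```

At 64-byte frames, 10 Gbps works out to about 14.88 Mpps, and the figure scales linearly with link speed.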
🔹 Extra info (under load)
- Option 1: `/tool profile cpu=all duration=30s` → all cores show ~1–8% usage, but there is usually one core close to 90%.
- Option 2: `/tool profile cpu=all duration=30s` → all cores ~0–1%.
- `/interface/bridge/settings/print` → `use-ip-firewall=no`, `use-ip-firewall-for-vlan=no`.
- No NAT or firewall rules on transit traffic.
- `ipv4-shortest-hw-prefix` is usually ~22.
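To illustrate how I am reading `ipv4-shortest-hw-prefix`, here is a small Python sketch that splits a list of prefixes by length against that threshold. It assumes my understanding above is right: prefixes at least as long as the threshold go to the ASIC, and shorter ones stay on the CPU. The real selection logic in RouterOS may differ, and the function name and sample prefixes are made up for the example.

```python
import ipaddress


def split_by_hw_prefix(prefixes, shortest_hw_prefix):
    """Split prefixes into (hw, cpu) lists by prefix length.

    Assumption: a prefix is hardware-eligible when its length is
    greater than or equal to shortest_hw_prefix.
    """
    hw, cpu = [], []
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix)
        if network.prefixlen >= shortest_hw_prefix:
            hw.append(prefix)
        else:
            cpu.append(prefix)
    return hw, cpu


# Hypothetical sample; the two /24s are the IX subnets from this post.
sample = [
    "0.0.0.0/0",
    "10.0.0.0/8",
    "172.16.0.0/21",
    "198.51.100.0/22",
    "185.1.90.0/24",
    "193.149.1.0/24",
]

hw, cpu = split_by_hw_prefix(sample, shortest_hw_prefix=22)
print("hardware:", hw)
print("cpu:     ", cpu)
```

Running the same split over an export of the real table would show how many routes fall on each side of /22, which could then be compared with `ipv4-routes-hw` and `ipv4-routes-cpu`.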
Has anyone else with a CCR2216 and full tables observed the same? Which design is the recommended practice: all uplinks in the bridge (Option 1) or leaving them outside (Option 2)?
Thanks in advance.