Several routing and firewall vendors provide settings that define the minimum and/or maximum number of CPU (or vCPU) cores reserved for control-plane processes.
Example: in A10's ACOS, one CPU thread is dedicated exclusively to the control plane by default. Nothing from the data plane runs on that core, and nothing from the control plane runs outside it. The command multi-ctrl-cpu 2 (which requires a reboot) raises the number of CPUs dedicated to the control plane to 2.
Palo Alto, Fortinet, and others offer very similar mechanisms.
Is there any way to do something similar to this with RouterOS?
While preparing this I was reminded of BGP, which has received a lot of attention to make it work well on multiple cores. I also remembered some processes that are still left behind as single-threaded.
Another thing to remember: back in the RouterOS v6 days with the CCR1036 and CCR1072, tuning affinity, IRQ, and RPS was a mandatory trick for high-throughput scenarios.
And in current versions I don't remember needing to touch /system/resource/irq/ to deal with packet forwarding issues.
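For anyone who never had to do that tuning by hand: on a generic Linux system, IRQ affinity and RPS are both expressed as hexadecimal CPU bitmasks (written to /proc/irq/<n>/smp_affinity and /sys/class/net/<dev>/queues/rx-<q>/rps_cpus). The sketch below only shows how such a mask is built from a list of cores. It is not RouterOS code, the function name cpu_mask is made up for illustration, and RouterOS exposes this through its own menu rather than through these files.

```python
def cpu_mask(cpus):
    """Build a Linux-style hex CPU bitmask from an iterable of core numbers.

    The kernel expects comma-separated groups of up to 8 hex digits
    (32 bits each), most significant group first, which matters on
    boxes with more than 32 cores.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("core numbers must be non-negative")
        mask |= 1 << cpu  # set the bit that corresponds to this core

    hex_digits = f"{mask:x}"
    groups = []
    while hex_digits:
        groups.append(hex_digits[-8:])   # take the lowest 32 bits
        hex_digits = hex_digits[:-8]
    return ",".join(reversed(groups))


if __name__ == "__main__":
    # Hypothetical example: steer an interrupt to cores 0 and 1 only.
    print(cpu_mask([0, 1]))   # 3
    # Core 32 falls into the second 32-bit group.
    print(cpu_mask([32]))     # 1,00000000
```

Steering interrupts this way decides where kernel packet processing lands, which is why it mattered so much on the many-core CCRs.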
My only remaining pain is finding a way to ensure that data-plane processes don't grow so much that they eat all the resources and starve the control plane, for example OSPF and BGP not responding properly because the box is under some kind of DDoS attack.
Is there any way to ensure that the box's control plane doesn't die when it is receiving a denial of service attack?
Not exactly... But if you tell me that it's possible to use "nice" in RouterOS to achieve my objective... I'm all ears!
I imagine this is more a matter of cgroups and the PID hierarchy: which process spawns which.
But talking about this from the outside is pure guesswork.
What I need is to be able to connect via Winbox, SSH, or the API, and be sure that OSPF and BGP won't crash and that SNMP keeps responding, even when 10M pps is trying to enter through one or more interfaces.
And you're totally correct that cgroups is the appropriate toolset to ensure that no process (or process group) can eat up all the CPU. Though I would be much more interested in the memory-limiting part, because currently any process leaking memory can crash the router.
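To make the cgroups idea concrete, here is a minimal sketch of what capping CPU and memory looks like with cgroup v2 on a generic Linux system. The file names cpu.max, memory.max and cgroup.procs are the standard cgroup v2 interface; the helper names, the example values and the cgroup path in the comment are assumptions for illustration only. Nothing here is known to be available or configurable on RouterOS.

```python
from pathlib import Path


def cgroup_limits(cpu_fraction, memory_bytes, period_us=100_000):
    """Return the cgroup v2 file contents for a CPU and memory cap.

    cpu_fraction is in units of one core: 0.5 means half a core,
    2.0 means two full cores. cpu.max takes "<quota> <period>" in
    microseconds; memory.max takes a byte count.
    """
    if cpu_fraction <= 0 or memory_bytes <= 0:
        raise ValueError("limits must be positive")
    quota_us = int(cpu_fraction * period_us)
    return {
        "cpu.max": f"{quota_us} {period_us}",
        "memory.max": str(memory_bytes),
    }


def apply_limits(cgroup_dir, limits, pid=None):
    """Write the limits into an existing cgroup v2 directory.

    Needs root and a mounted cgroup v2 hierarchy, e.g. a hypothetical
    /sys/fs/cgroup/dataplane-helpers. Optionally moves one PID into it.
    """
    base = Path(cgroup_dir)
    for name, value in limits.items():
        (base / name).write_text(value)
    if pid is not None:
        (base / "cgroup.procs").write_text(str(pid))


if __name__ == "__main__":
    # Half a core and 256 MiB for some leaky user-space daemon.
    print(cgroup_limits(0.5, 256 * 1024 * 1024))
```

With memory.max set, a leaking process in that group hits its own limit instead of exhausting the memory of the whole box.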
The reason this can't be done is that MikroTik uses the in-kernel networking stack as its data plane (as opposed to the other vendors you mention, who use VPP, DPDK, etc.). Cgroups are strictly a user-space thing.
I already imagined something like that would be said.
And it is indeed true.
P.S.: And perhaps that's why I've lost a bit of hope of seeing eBPF on RouterOS anytime soon.
But I reiterate the topic I started the thread on: min/max CPU cores dedicated exclusively to control-plane functions.
It's a crude way of guaranteeing some level of resource allocation. But it's simple and effective.
It is widely used by several other vendors who also ventured into ring 0.
It won't solve the memory allocation problem the way you dream, but it will guarantee CPU resources.
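As a rough illustration of what I mean, this is how reserving cores for the control plane looks on a generic Linux system using CPU affinity. It is a sketch under assumptions: split_cores and pin_process are made-up names, os.sched_setaffinity is the standard Python wrapper for the Linux call, and I have no idea whether RouterOS could expose anything like it. Note that affinity only pins user-space processes; keeping kernel packet processing off the reserved cores would additionally need the interrupts and RPS steered away from them.

```python
import os


def split_cores(total_cores, control_cores):
    """Split core numbers into a control-plane set and a data-plane set.

    The first `control_cores` cores are reserved for the control plane,
    the rest are left for packet forwarding.
    """
    if not 0 < control_cores < total_cores:
        raise ValueError("need at least one core on each side")
    control = set(range(control_cores))
    data = set(range(control_cores, total_cores))
    return control, data


def pin_process(pid, cpus):
    """Restrict a process to the given cores (Linux only).

    pid 0 means the calling process. Pinning another user's process
    requires the appropriate privileges.
    """
    os.sched_setaffinity(pid, cpus)


if __name__ == "__main__":
    # Hypothetical 4-core box with one core reserved for routing daemons.
    control, data = split_cores(4, 1)
    print(sorted(control), sorted(data))   # [0] [1, 2, 3]
```

A routing daemon would then be pinned with something like pin_process(daemon_pid, control), where daemon_pid is whatever PID the OSPF or BGP process has.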
I imagine they use rc.local, and that already comes with a built-in hierarchy of processes. So it would basically be a matter of saying which processes should be children of which. Is that really so far off?