You imply that FastPath and/or FastTrack is a hard requirement for any kind of traffic forwarding at scale. When the hardware is sufficiently spec’d out, I don’t find that to be the case; I’m running plenty of x86 installations that are doing just fine without it. It’s far more important for the lower-end RouterBOARD models with slower processors than for beefy x86 boxes.
All the guides I’ve read say that multiqueue should be set equal to the number of CPUs you’ve given the VM. I assume you’ve tested that on your setup?
My personal setup has eight cores on a 13900H with multiqueue = 8. All I know is that performance increased when I set it higher. I haven’t played around with it to find an optimal value, though.
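For what it’s worth, on a Linux hypervisor you can sanity-check how many TX queues the guest’s virtio NIC actually got by counting its queue directories in sysfs. A minimal sketch; the interface name "vnet0" is a hypothetical tap device backing the CHR guest, not from this thread:

```python
import glob
import os

def tx_queue_count(iface: str) -> int:
    """Count the tx-* queue directories a NIC exposes in sysfs."""
    return len(glob.glob(f"/sys/class/net/{iface}/queues/tx-*"))

if __name__ == "__main__":
    # Compare the queue count against the vCPUs given to the VM.
    print("queues:", tx_queue_count("vnet0"), "host CPUs:", os.cpu_count())
```

If the count is lower than the vCPU count, the multiqueue setting on the host side never took effect.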
You are right. I feel stupid relying on a company whose main focus is selling hardware devices, which makes the software side kind of a Cinderella.
Unfortunately, the company hides the hardware compatibility list for Intel hardware, so I found it on an old archived version of the page and decided to assume the best of the company (my bad, won’t do that again).
Meanwhile, in the v7.19 changelog:
*) conntrack - improved stability on busy systems;
*) system - improved system stability when sending TCP data from the router;
*) x86 - i40e updated driver to version 2.27.8;
Microsoft sells software called Windows, and making it run flawlessly on any hardware is a challenge even for them. So you really want to hold MikroTik responsible for not supporting every custom-built hardware combination in the world? But you made one good point: MikroTik should publish hardware requirements for the x86 platform.
That looks a bit weird with that steady increase between 12:30 and 14:30, almost like a memory leak. Do a full export so we can take a look, and maybe “someone” might even have time to do a quick test in the lab. What kind of traffic is going on?
Regarding hardware support: since there’s no official list of supported hardware, you pretty much have to email support and ask. As a rule of thumb, you can assume most mainstream x86 drivers from Linux 5.6.3 are included, plus a few legacy drivers that have been ported over from ROS v6.
Have you tried 7.15.3 or 7.16.x? My CRS300s had issues on 7.15.3 causing random reboots due to a memory problem, which 7.16.2 fixed, but 7.16.x on my CCR2116s had random BGP “stuck route” problems, so those were happy on 7.15.3 (and now 7.19.1).
I have also encountered your problem. When a router has more than 32 cores, various strange and unusual issues arise.
Try disabling hyper-threading or taking some cores offline, and most likely the problem will be resolved.
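If hyper-threading can’t be disabled in the BIOS (e.g. on a hosted box), the sibling threads can also be taken offline from a root shell on the underlying Linux system. This is a sketch under that assumption, not something RouterOS itself exposes; the sysfs paths are the standard kernel ones:

```python
import glob

def parse_cpu_list(s: str) -> list:
    """Parse a kernel CPU list like '0-3,8,10-11' into sorted ints."""
    cpus = set()
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return sorted(cpus)

def offline_ht_siblings():
    """Keep the first thread of each physical core, offline the rest."""
    for path in sorted(glob.glob(
            "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list")):
        with open(path) as f:
            siblings = parse_cpu_list(f.read())
        for cpu in siblings[1:]:  # cpu0 can never be taken offline anyway
            with open(f"/sys/devices/system/cpu/cpu{cpu}/online", "w") as f:
                f.write("0")

# Usage (as root): offline_ht_siblings()
```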
Re: … I have also encountered your problem. When a router has more than 32 cores …
I’ve seen this many times when I increase the CPU cores on a CHR VM (such as increasing from 24 cores to 40).
I never took the time to find the magic number where a CHR starts to fall apart when increasing CPU cores.
Question - Are you finding that it happens at greater than 32 cores or somewhere around 32 cores?
Question - Are all CHR ROS versions (6.x and 7.x) affected?
Yes, I have a large number of x86 physical machines running RouterOS. The basic problem shows up when a machine has more than 32 cores and a single core hits 100%. I also had a 72-core x86 machine that would immediately restart itself under even a slight amount of traffic; disabling the watchdog didn’t help until hyper-threading was turned off. There are also some machines with dual 20-core / 40-thread CPUs where the single-core saturation problem couldn’t be solved at all, and only changing the CPU made them work normally. These problems appear extremely easily on machines with more than 32 cores.

Recently I tried a cracked build of RouterOS that has a shell giving access to the underlying system. I then wrote a Python script to optimize IRQ interrupt handling: it binds interrupts to CPUs, enables XPS, and tunes the kernel’s net.core.netdev_budget and net.core.rps_sock_flow_entries values. My 72-core x86 router now works normally, without a single core being fully occupied and without automatic restarts.
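For anyone wanting to try the same kind of tuning, here is a minimal sketch of the technique, assuming root shell access on the underlying Linux system. The interface name "eth0" and the sysctl values are illustrative assumptions, not the exact ones used above:

```python
import glob
import os

def cpu_mask(cpu: int) -> str:
    """Hex affinity mask for a single CPU, e.g. cpu 4 -> '10'."""
    return format(1 << cpu, "x")

def irqs_for(iface: str) -> list:
    """IRQ numbers whose /proc/interrupts line mentions the interface."""
    irqs = []
    with open("/proc/interrupts") as f:
        for line in f:
            parts = line.split()
            if parts and parts[0].rstrip(":").isdigit() and iface in line:
                irqs.append(int(parts[0].rstrip(":")))
    return irqs

def pin_irqs(iface: str, first_cpu: int = 0):
    """Spread the interface's IRQs across CPUs, one queue per core."""
    for i, irq in enumerate(irqs_for(iface)):
        with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
            f.write(cpu_mask(first_cpu + i))

def enable_xps(iface: str):
    """Map each TX queue of the interface to one CPU via xps_cpus."""
    queues = sorted(glob.glob(f"/sys/class/net/{iface}/queues/tx-*"),
                    key=lambda p: int(p.rsplit("-", 1)[1]))
    for i, q in enumerate(queues):
        with open(os.path.join(q, "xps_cpus"), "w") as f:
            f.write(cpu_mask(i))

def tune_sysctls():
    """Raise the softirq budget and the RPS flow table size (example values)."""
    for key, val in {"net.core.netdev_budget": "600",
                     "net.core.rps_sock_flow_entries": "32768"}.items():
        with open("/proc/sys/" + key.replace(".", "/"), "w") as f:
            f.write(val)

# Usage (as root): pin_irqs("eth0"); enable_xps("eth0"); tune_sysctls()
```

Note that these writes do not survive a reboot, so the script has to be rerun from an init hook each time the box comes up.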
Because I need IPv6, I have abandoned v6 and only tried the v7 version. I have also run into the situation in v7 CHR where a 32-core machine’s network cards only work on some of the CPUs. My environment has 32 VLAN-based WAN interfaces on two ixgbe 10G interfaces, with CPUs 0–31 bound to the network card queues; only CPUs 0–15 work normally.