We had an RB750Gr3 that was starting to max out its CPU. It was being used as a PPPoE server with about 180 PPPoE clients, and was also running some queues, filters, etc., with about 400 Mbps of data flowing through at peak times. We decided to try an RB5009 and an RB4011 to see how they would handle the CPU load. You can see on the chart where the RB750Gr3 was in use from 12/25 through 1/12, then the RB5009 from 1/13 through 1/16, and then the RB4011 from 1/17 through 1/23. The RB4011 seems to handle the load about the same, but for some reason something is spiking Processor 1 much more than it did on the RB5009. Does anyone have any ideas what might be causing this?
But if you look at Processor 2, 3 and 4, they are under almost the exact same load on the 4011 as they were on the 5009. It is only Processor 1 that is spiking now on the 4011 that wasn’t on the 5009. What does Processor 1 handle that the other processors do not?
Some tasks in RouterOS are not multi-threaded and will be handled only by a single CPU.
That is why in some cases you need a router where 1 CPU is fast (e.g. RB5009) and other solutions (like the CCR which has many slower CPUs) are not going to cut it.
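To put rough numbers on that, here is a minimal Python sketch. Everything in it is an assumption for illustration: the work units, the core counts and the per-core speeds are made up and are not measurements of any MikroTik model. It only shows why one core carrying a single-threaded task can saturate even when the box has plenty of total CPU.

```python
def busiest_core_load(serial_work, parallel_work, cores, core_speed):
    """Utilisation (1.0 = 100%) of the core that runs the single-threaded task.

    serial_work   -- work that can only run on one core (units/s, hypothetical)
    parallel_work -- work that spreads evenly over all cores (units/s, hypothetical)
    cores         -- number of cores
    core_speed    -- units/s a single core can process (hypothetical)
    """
    per_core_parallel = parallel_work / cores
    return (serial_work + per_core_parallel) / core_speed


# Same workload on two made-up devices with equal-ish total capacity:
few_fast = busiest_core_load(serial_work=60, parallel_work=120, cores=4, core_speed=100)
many_slow = busiest_core_load(serial_work=60, parallel_work=120, cores=16, core_speed=40)

print(f"4 fast cores:  busiest core at {few_fast:.0%}")
print(f"16 slow cores: busiest core at {many_slow:.0%}")
```

With these invented figures the device with many slow cores overloads the one core stuck with the single-threaded work, while the device with fewer, faster cores still has headroom.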
On the other hand, I never cease to be amazed that a user (I presume a company) handles 180 PPPoE clients and buys a RB750Gr3 to do it… unbelievable!
The intention wasn’t to put that many connections on this RB750Gr3. We had a second unit that failed, and we were planning to upgrade to either the RB4011 or the RB5009 but had trouble getting them because of the supply chain issues, so all the clients were temporarily put on this one router. Also, as you can see, the RB750Gr3 actually handled the load fairly well and only recently started hitting 100% during peak load.
Under a pretty good load, the other processors are only showing 10-20% usage, while CPU0 (which I assume is what PRTG calls Processor 1) is showing 62%, with the following details in Profile.
Looking at the profile we can see it: the problem is related to how the load is balanced between cores. But checking the RB750Gr3 graph we see similar behavior, with one thread/core loaded more than the remaining three. The added twist on the RB750Gr3 (MediaTek SoC with 2 cores, 4 threads) is that two threads share the resources of one core. Because of this we get the impression of a more balanced load, but in the end the behavior is very similar.
The novelty here is the RB5009 doing a better job of balancing the load between cores.
Which version of RouterOS were these 3 devices running?
Footnote: some configurations tend to create this situation, where the load falls mostly on a single thread/core. For example, when most of the traffic goes across a single tunnel, that one tunnel ends up loading a single thread/core unevenly.
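A small Python sketch of the single-tunnel effect, under a loud assumption: it pretends each flow is pinned to one core by hashing a flow key, which is how receive-side steering commonly works in general. Whether RouterOS distributes packets exactly this way on these boards is not confirmed in this thread, and the flow names and Mbps figures are invented.

```python
import zlib


def core_for_flow(flow_key: str, cores: int) -> int:
    """Pick a core for a flow by hashing its key (assumed scheme, not RouterOS's)."""
    return zlib.crc32(flow_key.encode()) % cores


def load_per_core(flows: dict, cores: int) -> list:
    """Sum the Mbps of every flow onto the core its key hashes to."""
    load = [0.0] * cores
    for key, mbps in flows.items():
        load[core_for_flow(key, cores)] += mbps
    return load


# 400 Mbps as many small, independent flows (hypothetical).
many_flows = {f"client-{i}": 2.0 for i in range(200)}
# The same 400 Mbps carried inside one tunnel, seen as a single flow (hypothetical).
one_tunnel = {"pppoe-tunnel": 400.0}

print("many flows:", load_per_core(many_flows, 4))
print("one tunnel:", load_per_core(one_tunnel, 4))
```

In this model the single tunnel always lands entirely on one core no matter how many cores exist, while the many small flows get spread across all of them.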
This router does have a bridge across 5 of the ports, which a lot of the traffic and PPPoE connections go through. One thing I read about the RB5009 is that it has a better switching processor. Could this be the difference?