10 Gbit CHR performance on Proxmox not what it should be?

Hi everyone,

I’m testing 10 Gbit networking with CHR on Proxmox, running a bandwidth test between two CHR VMs, and I’m only able to hit about 3-3.5 Gbit/s on both UDP and TCP (UDP is a little higher, but suffers stalls: the transfer randomly drops to zero for a second before recovering). Each CHR has 4x 2.6 GHz cores and 1 GB of DDR4 RAM. CPU utilization is around 20-25% on both ends during the transfer. The CHR instances are connected through a Linux bridge using VirtIO NICs. The firewall is disabled. I tried multiqueue and changing the CPU type to “host”, with no difference. VMXNET3 and Intel E1000/E1000E all perform worse.
To double-check, I tried transferring a large file from one virtual Windows machine to another over a network share and hit a similar speed (~300 MB/s reported). Disk (SSD) performance is not a factor; the storage is capable of 4+ GB/s.
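In case it matters, multiqueue on the VirtIO NIC was enabled roughly like this on the Proxmox side (VM ID, MAC, bridge name and queue count are just examples, not my exact values):

    # /etc/pve/qemu-server/101.conf (excerpt) - VM ID, MAC and bridge are examples
    net0: virtio=DE:AD:BE:EF:00:01,bridge=vmbr0,queues=4
    # or equivalently from the host CLI
    qm set 101 --net0 virtio,bridge=vmbr0,queues=4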

This is a server with an Intel Xeon 6142 CPU.
VM specs:
image_2025-01-13_225156851.png
Bandwidth test
image_2025-01-13_225719920.png
I also tested the exact same setup on a different server with a Xeon Silver 4210R, and on that one I can hit double the speed on TCP at 6+ Gb/s and nearly triple on UDP at 10 Gb/s, while also not encountering any connection stalls. It seems strange that this lower-performing CPU (albeit about 2 years newer) can do this while the other one can’t:
Bandwidth test
image_2025-01-13_230742683.png
Does anyone have a clue or tips on how I can improve performance, or on what I’m doing wrong?

Don’t use the built-in bandwidth test, as it can hammer the CPU, affecting performance and not giving realistic results.

Use something like iperf on a device on each side of your CHR to measure traffic passing through it.
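For example, with a test VM on each side of the CHR (addresses and options below are just placeholders):

    # on the receiving VM
    iperf3 -s
    # on the sending VM, pointing at the receiver (example IP), 4 parallel TCP streams for 30 s
    iperf3 -c 10.0.1.10 -P 4 -t 30
    # UDP variant, asking for 10 Gbit/s of offered load
    iperf3 -c 10.0.1.10 -u -b 10G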

Also, your choice of NIC driver for the VM can affect performance. For best performance you should use PCI passthrough of the physical NIC.
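On Proxmox that would look something like the following (the PCI address and VM ID are examples, check your own with lspci; it also needs IOMMU enabled on the host):

    # find the NIC's PCI address on the Proxmox host
    lspci | grep -i ethernet
    # pass the NIC (or an SR-IOV VF) through to the CHR VM
    qm set 101 --hostpci0 0000:03:00.0,pcie=1   # pcie=1 generally requires the q35 machine type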

Borderline necroposting but I was wondering if you ever fixed this.
I’m in the same boat with Proxmox + 10G NIC and my CHR can’t really do more than 4G.
The network itself (Linux bridge on Proxmox, etc.) has been thoroughly tested with other Linux VMs and iperf, and it fully saturates the 10G link.

The original poster’s issue isn’t a bug to fix. Using the in-built bandwidth test means the device is both generating and routing the traffic, which yields lower performance numbers than if the device were simply acting as a router / firewall.

The proper way to test higher levels of bandwidth is to use a separate testing device (or devices).

I’m assuming that the OP wasn’t testing this for shits and giggles; rather, he had an actual use case, found the bottleneck, and proceeded with further testing.

Additionally, his screenshots don’t show the CPU being hit very hard, so theoretically it should go higher.

I’m using iperf3 between Linux VMs routed via this CHR and getting speeds similar to the built-in bandwidth test, around 3.5G.
This has been tested with VMs on the same Proxmox node and the same Linux bridge, with VMs on different Linux bridges, and even with VMs on different nodes (where the actual NIC throughput comes into play).
In all cases, routing via the CHR yields the same ~3.5G performance, while routing through a physical device works properly and completely saturates the link.
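Roughly, the routed test looks like this (addresses and bridge names are placeholders, not my actual subnets):

    VM A (10.0.1.10/24, gw 10.0.1.1) --- vmbr1 --- CHR --- vmbr2 --- VM B (10.0.2.10/24, gw 10.0.2.1)
    # on VM B:  iperf3 -s
    # on VM A:  iperf3 -c 10.0.2.10 -P 4 -t 30   (traffic has to cross the CHR, not just the bridge)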

Then start a new thread rather than necroposting, because your issue is clearly different. The OP showed screenshots of the in-built bandwidth test, not of separate iperf tests. And without knowing extensive details of the hardware you’re running (processor, memory speed / CAS latency, NICs, etc.) or the system configuration (host and guest), there’s no way of knowing why your performance is not very high. Even something as simple as PPPoE can have a drastic impact on your system throughput.

You seem pretty set on the opinion that the OP just booted up two CHRs to run bandwidth tests between them, rather than having an actual bandwidth issue that led him to run those tests, glossing over the fact that he said he also tried a file transfer that was similarly limited.

I replied here because it’s a fairly similar issue, and maybe he fixed something that could apply to my case.

Still, I followed your advice and opened a new thread:

> You seem pretty set on the opinion that the OP just booted up two CHRs to run bandwidth tests between them, rather than having an actual bandwidth issue that led him to run those tests, glossing over the fact that he said he also tried a file transfer that was similarly limited.

This is not at all what I’ve said. But hey, good on you for at least doing the right thing and opening a new thread.