CHR and Proxmox bandwidth issues

Hello all,

I’m running CHR on Proxmox, with a 10G link and I’m seeing bad performance in terms of bandwidth for anything routing through the CHR.

iperf3 tests from a VM routing through the CHR max out at 5G.
Similarly, iperf3 tests between VMs that route through the CHR without leaving via the physical interface, only crossing linux bridges, again max out at 5G.
The moment I enable a fasttrack-connection rule these drop to ~3G, in both cases.
Bandwidth tests between the CHRs, both with the same licenses and similar configs, also max out at ~4G.
No CHR goes over 45% total on /tool/profile.

Replacing the CHR with a physical CCR2004 unit sees all problems go away.
iperf3 tests from the Proxmox machine itself are flawless.

Any ideas?


Extra info:

CHR license is a p10.

Proxmox (v7.4-17, kernel 5.15.149-1-pve)

Mobo: ASRockRack ROMED8-2T
CPU: AMD EPYC 7252 8-Core Processor 
RAM: 8x 32GB DDR4-3200 Registered DIMM CL22 1Rx4 1.2V 16Gbit Hynix C w/Rambus
NIC: Mellanox Technologies MT27710
Drive: Kingston KC3000 PCIe 4.0 NVMe M.2 SSD

CHR VM has gone through multiple configurations while testing, increasing resources and changing NIC types.
The current config is:

bios: seabios
boot: order=scsi0
cores: 4
cpu: host
machine: q35
memory: 1024
name: CHR
net0: virtio=8A:94:E5:4B:E8:F7,bridge=vmbr0,queues=4
net1: virtio=8A:94:E5:4B:E8:F8,bridge=vmbr102,queues=4
net2: virtio=8A:94:E5:4B:E8:F9,bridge=vmbr103,queues=4
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-101-disk-0,size=512M
scsihw: virtio-scsi-pci
smbios1: uuid=92ffc649-a694-45e9-a721-67e7c248e144
sockets: 1
vga: virtio
vmgenid: f906f731-ae47-4881-b4c9-4ae290e8cc40

CHR config has been dumbed down for testing. Here’s what it’s currently like:

# 2025-07-09 18:09:37 by RouterOS 7.19.3
# system id = tvZPQLDiyLG
#
/interface ethernet
set [ find default-name=ether1 ] disable-running-check=no
set [ find default-name=ether2 ] disable-running-check=no
set [ find default-name=ether3 ] disable-running-check=no
/ip pool
add name=kb-dmz ranges=10.220.3.100-10.220.3.200
/ip dhcp-server
add address-pool=kb-dmz interface=ether3 lease-time=2w1d name=KB-DMZ
/ip address
add address=10.220.2.254/24 comment=WAN interface=ether2 network=10.220.2.0
add address=10.220.0.253/24 interface=ether1 network=10.220.0.0
add address=10.220.3.254/24 interface=ether3 network=10.220.3.0
/ip dhcp-server network
add address=10.220.3.0/24 dns-server=10.220.3.254 gateway=10.220.3.254
/ip dns
set allow-remote-requests=yes servers=10.220.3.1,10.220.3.2,10.220.3.3
/ip firewall address-list
add address=10.220.3.0/24 list=MASQ
add address=10.220.2.0/24 comment=dev-dmz list=DMZ
add address=10.220.0.0/16 comment=dev list=INTERNAL
add address=10.220.0.0/16 disabled=yes list=NOMASQ
/ip firewall filter
add action=accept chain=input src-address-list=MANAGEMENT
add action=accept chain=input connection-state=established,related
add action=drop chain=input in-interface=ether2
add action=fasttrack-connection chain=forward connection-state=\
    established,related disabled=yes hw-offload=no
add action=accept chain=forward connection-state=established,related
add action=accept chain=forward connection-nat-state=dstnat
add action=reject chain=forward dst-address-list=INTERNAL src-address-list=\
    DMZ
/ip firewall nat
add action=masquerade chain=srcnat dst-address-list=!NOMASQ out-interface=ether2 \
    src-address-list=MASQ
/ip route
add dst-address=0.0.0.0/0 gateway=10.220.2.100
/ip ssh
set forwarding-enabled=both
/system clock
set time-zone-name=Europe/Athens
/system identity
set name=TEST-CHR

iPerf3 is single threaded. What test parameters are you using? For instance:

  • How many streams are you running (-P flag)?

  • Are you using iPerf with TCP or UDP?

If TCP:

  • What window size are you running?

If UDP:

  • What’s your targeted bitrate? (-b flag)

If you run tests comparing between the two, what’s the performance difference?
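For anyone wondering why the window size matters here: a single TCP stream can have at most one window of data in flight per round trip, so the window and the RTT together cap per-stream throughput. Below is a minimal sketch of that arithmetic. The 0.5 ms RTT and the window sizes are assumptions picked for illustration, not values measured in this thread.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one TCP stream: one window in flight per round trip."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(target_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to sustain target_bps."""
    return target_bps * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.0005  # ASSUMPTION: 0.5 ms round trip; measure your own with ping
    for window_kib in (64, 256, 1024):
        gbps = max_tcp_throughput_bps(window_kib * 1024, rtt) / 1e9
        print(f"{window_kib:>5} KiB window -> at most {gbps:.2f} Gbit/s per stream")
    needed_kib = window_needed_bytes(10e9, rtt) / 1024
    print(f"10 Gbit/s at this RTT needs about {needed_kib:.0f} KiB in flight")
```

If the per-stream bound comes out well below the link speed, more parallel streams (-P) or a larger window would be the first things to try.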

I’ve tested with multiple options. Parallel streams ranging from 4 to 128 (which seems to be a hard limit for iperf3). I’ve tested both TCP (up to 256k window size, although I may have tested with higher values) and UDP with unlimited bitrate (-b 0).
I’m currently using iperf3 -c <ip> -p 5201 -P 16 as, if not routing via the CHR, it can completely saturate the 10G link and when testing “locally” between VMs on the same linux bridge, it goes up to 20G.
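To make runs like this easier to compare, iperf3 can emit JSON with -J. Here is a rough sketch of pulling the aggregate throughput and retransmit count out of a TCP report. The helper names (summarize, run_iperf3) are made up for this example, and the field layout is what iperf3 prints for TCP tests on Linux, so treat it as an assumption for other platforms or UDP.

```python
import json
import subprocess


def summarize(report: dict) -> dict:
    """Extract headline numbers from a parsed `iperf3 -J` TCP client report."""
    end = report["end"]
    return {
        "gbps": end["sum_received"]["bits_per_second"] / 1e9,
        # Retransmits are reported on the sender side and only for TCP.
        "retransmits": end["sum_sent"].get("retransmits"),
        "streams": len(end.get("streams", [])),
    }


def run_iperf3(server: str, streams: int = 16, port: int = 5201) -> dict:
    """Run the same test as in the post above, with -J added for JSON output."""
    cmd = ["iperf3", "-c", server, "-p", str(port), "-P", str(streams), "-J"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return summarize(json.loads(out))
```

Calling run_iperf3("<ip>") once per configuration and logging the returned dict gives a simple before/after record.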

At the moment I don’t care about comparing between TCP and UDP. I only want to see performance reaching close to 10G, in any kind of test or scenario, through the CHR.

“At the moment I don’t care about comparing between TCP and UDP. I only want to see performance reaching close to 10G, in any kind of test or scenario, through the CHR.”

Fair enough. Since you’re clearly frustrated, I’ll politely bow out here and let someone else try to help you.

Not sure where you got “frustrated”, but sure. Thanks for your help.

Can you test the 60-day free P-Unlimited trial license? P10 has a 10G upload limit per interface but I don’t know how MikroTik enforces that. Do they drop packets if your links are > 10G? If that’s the case then it might affect TCP congestion avoidance algorithms and achievable throughputs might be way slower than 10G per direction.
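To put a rough number on the congestion-avoidance concern: the classic Mathis approximation bounds a Reno-style TCP stream at about (MSS / RTT) * sqrt(3/2) / sqrt(loss rate). The sketch below is only an illustration of that formula. Whether the P10 limiter actually drops packets is unknown, the MSS, RTT and loss rates are assumptions, and CUBIC or BBR will not follow this curve exactly.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float) -> float:
    """Mathis et al. approximation for steady-state TCP throughput under random loss."""
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_seconds) * math.sqrt(1.5) / math.sqrt(loss_rate)


if __name__ == "__main__":
    mss = 1460    # ASSUMPTION: standard 1500-byte MTU
    rtt = 0.0005  # ASSUMPTION: 0.5 ms round trip
    for loss in (1e-6, 1e-5, 1e-4, 1e-3):
        gbps = mathis_throughput_bps(mss, rtt, loss) / 1e9
        print(f"loss {loss:.0e} -> roughly {gbps:.2f} Gbit/s per stream")
```

The point is only that throughput falls with the square root of the loss rate, so even a small amount of policer-induced loss can cap a stream far below line rate.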

If this was the case, then UDP tests should get close to the 10Gbps limit. If @sin3vil tests show UDP performance similar to TCP performance, then it’s not TCP congestion avoidance which limits the throughput.

However: using UDP doesn’t guarantee better test results than TCP. In the past I did see iperf perform worse with UDP traffic than with TCP traffic. I could never figure out which detail made the UDP test perform badly, though. I guess the network technology was the bottleneck (LTE with its multi-layered backhaul).

Hi guys, thanks for the input.
I did test with an unlimited license and there’s some slight improvement, although I can’t be sure it’s not random. I’m still not getting anything close to 10G when routing out the physical link, nor the ~20G between linux bridges that I get without the CHR in place.

Hello, I have similar issues. I’m looking for a hypervisor to replace an old ESXi. On the old ESXi and on XCP-ng I got 10G. On Proxmox I’m getting results similar to yours (roughly 5G). Therefore, I believe it has to be something between Proxmox and CHR. Is anyone successfully running CHR on a 10G network using Proxmox? I’m new to Proxmox, so this could very well be an error on my side.

so what is important when doing this kind of testing is that you get into a ‘known good’ state

for example if you set up an LXC container on proxmox, which uses a veth pair you are very likely to get the speeds you want (but that doesn’t really help that much)

virtio across VMs is significantly less performant, whenever possible i recommend people use SR-IOV because at least that is proper drivers with proper hardware access (I dont have CHR so i cant comment on whether it supports SR-IOV very well but most of the time they’ll work out the box because the drivers are natively in the linux kernel)

additionally, check what version of iperf3 you are using - the stuff from the repo in proxmox is very old and they only somewhat added proper multithreaded support so make sure to version match iperf3 as much as possible (that one caught me out, i didn’t expect a huge performance difference in iperf3 versions) so build from source / install directly the newest versions when doing these tests
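A quick way to check this on both ends: as far as I know, iperf3 services each -P stream in its own thread from release 3.16 onward, and older builds push all streams through a single thread. The sketch below parses the first line of `iperf3 --version`. The helper names are invented for this example, and the 3.16 cutoff is my reading of the release notes, so verify it against your build.

```python
import re

# ASSUMPTION: per-stream threads arrived in iperf 3.16 (check the release notes).
MULTITHREAD_SINCE = (3, 16)


def parse_iperf3_version(banner: str) -> tuple:
    """Parse a banner such as 'iperf 3.12 (cJSON 1.7.15)' into (3, 12)."""
    match = re.search(r"iperf\s+(\d+)\.(\d+)(?:\.(\d+))?", banner)
    if not match:
        raise ValueError(f"unrecognised iperf3 banner: {banner!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def is_multithreaded(banner: str) -> bool:
    """True when the major.minor version is at or above the assumed cutoff."""
    return parse_iperf3_version(banner)[:2] >= MULTITHREAD_SINCE
```

Feed it the first line of `iperf3 --version` from the client and the server; if either side reports False, a single core may be the ceiling rather than the router.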

theres a variety of flags you can play about with in relation to the host too, but when figuring out all of this its best to only turn one dial at a time and observe the changes and then start turning other dials and logging those changes or else you end up where you’ve changed tons of stuff and have no clue what helped / hindered.

also just noticed your proxmox is pretty old at this point, i’m not going to suggest upgrading because thats yet another variable that you would need to account for but its something to consider too in the future

finally looks like you have a connectx 4, i’ve got a connectx 5 and set up ASAP2 / DOCA networking offloads (switchdev) semi recently - thats pretty nifty and you’d likely benefit from doing that too (but again more effort and time spent figuring out edge cases)

also wait wtf i just realised you have a connectx 4, why are you bothering with virtio - you have a decent nic, set up SR-IOV. things will perform a lot better and you’ll see lower cpu utilisation as well

iperf3 for me, VM to VM using latest iperf3 and using ASAP2 / DOCA networking offloads lets me route @ 25gbit with very minimal cpu usage

Thanks for the general proxmox tips mrpops2ko but at the moment I’m trying to focus specifically on CHRs performance. This 7.4 cluster is a dev setup, production (using similar hardware) is on 8.4 and has the same issues.

For the record, iperf3 between two VMs on separate linux bridges, routing through the proxmox host itself using ip-forwarding in the kernel, sees performance go up to 22Gbps. While that might be less than expected with this setup, it’s still 4 times the performance routing through the CHR is getting.
I also tested other virtual routers, like VyOS and OPNsense, which go up to 10Gbps with the same VM configuration (although not the 22G that I’m seeing when routing directly).

I also tried SR-IOV, ran into a similar issue to this, and generally had issues setting it up properly while maintaining some semblance of the existing setup (using linux bridges).
I also had a very hard time installing OFED/Mellanox tools as the installers seem to be tied to specific distros and kernels. Can I ask how you got these installed?

your other thread mentioned that it turned out to be a motherboard issue - did you do any further testing re: sr-iov after that?

if its legacy sr-iov (which it looks like it is based upon what you’ve posted) that’ll just work out the box for most stuff

switchdev is all the newer stuff and you do things slightly different, if you want to make use of the full suite of asap2 / doca offloads you need to make use of OVS bridges rather than linux ones but none of this took more than an afternoon to set up

when doing the DOCA switchdev installer focus on having 1) your firmware up to date for the nics, 2) your kernel up to date / matching as best as possible on the host, to get the most chance of running into the least errors, 3) do similar to what i mentioned before and only change 1 thing at a time

i recently started to get kernel panics with it, and it was super hard to diagnose but i think i’ve managed to root cause it (been stable for quite a few days now) in that it doesn’t like when you limit cpu affinity on an LXC container because the representors aren’t limited like that - so you can end up in scenarios where it tries to access memory that it shouldn’t have access to and causes a kernel panic (or at least thats the working theory, that 1 change seems to have resolved it)

i built a script to monitor offloads to verify, but its easy enough to check it with text using ovs-appctl dpctl/dump-flows type=offloaded
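for anyone who wants something in between, here is a rough sketch of what such a check could look like (not the actual script mentioned above). it runs the same ovs-appctl command and counts the flows and packets in the dump. the helper names are made up, and the parsing only assumes that each flow is one line containing a packets:N counter.

```python
import re
import subprocess


def summarize_flows(dump: str) -> dict:
    """Count flows (one per non-empty line) and sum their packets:N counters."""
    flows = [line for line in dump.splitlines() if line.strip()]
    packets = sum(
        int(match.group(1))
        for line in flows
        for match in re.finditer(r"packets:(\d+)", line)
    )
    return {"flows": len(flows), "packets": packets}


def dump_offloaded() -> str:
    """Ask OVS for the datapath flows that are currently offloaded to hardware."""
    cmd = ["ovs-appctl", "dpctl/dump-flows", "type=offloaded"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

running summarize_flows(dump_offloaded()) before and during an iperf3 test should show the packet counter climbing if traffic is really taking the hardware path.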

but this all i’d suggest is something you look at after getting into a stable state, legacy sr-iov like you linked in that thread - how did you get on with that?

mrpops2ko, no that thread isn’t mine. It’s describing a similar issue to what I’m facing with SR-IOV, being unable to create virtual functions on the secondary NIC of the CX4. I don’t have a different type of motherboard to test with.
I didn’t really get SR-IOV working. I could create multiple VFs on the primary NIC, and passing them to a VM worked (in the sense that they appeared as Mellanox interfaces), but we couldn’t also add them as interfaces on a linux bridge, which is kind of a deal breaker.

We can’t readily change to Open vSwitch. We had some issues with OVS previously, around VE6. Proxmox defaults to linux bridges and we didn’t have 10G requirements then, so we committed and are now sort of tied to them.

In any case, all this is great if you need true 25G throughput with minimal CPU between VMs and the real world. But right now we’re getting 22G with the CPU (and vCPUs) not breaking a sweat, unless we try to route traffic through the CHR (and admittedly other virtual routers although with slightly better performance), without taking into consideration the physical interfaces, just pure linux bridges.

Do you have the Mellanox user-space tools installed? mlxconfig, mst etc? I had much trouble getting them to install and run while troubleshooting the secondary NIC issue, before giving up and trying on the primary.

ah right, yeah the general way you engineer your infrastructure when using SR-IOV is that you give SR-IOV interfaces to everything and they all communicate that way - you can bridge it by say slapping another router in place (say having some virtual machine that has 1x sr-iov interface and 1x virtio interface) and do it that way, or the more proper way is to update the forwarding database of the nic itself and using hook scripts - but both of them kind of defeat the purpose of SR-IOV because the whole point is that you offload the utilisation

not sure what to suggest on your problem about the secondary port, only time i’ve seen something like that happen was due to it being actively used or in IB mode

i have all the userspace tools installed on the host, the guests and lxc containers all make use of the default out of the box drivers (well the lxc’s inherit the drivers from the host) but mlx5_core (the mlx5 ethernet driver) is in the linux kernel by default

the only issue i had was that i had to disable secure boot because mst complains about that, i tried signing those kernel drivers myself for some stuff which worked but mst in particular i think was a problem so i just turned secure boot off in the end - i’ll probably revisit enabling it again if that single change solved my kernel panic issues (which im reasonably sure it did)

from everything you’ve described, that just seems like normal virtio performance - an lxc will perform better with them because that uses a different virtio driver and doesn’t have to deal with the same emulation but engineering your setup to make use of SR-IOV is the play imo

Can I ask how you got the Mellanox tools on the host (assuming that’s indeed Proxmox)?
The official installers whined about the distro and kernel. Supplying --skip-distro-check made them whine about existing packages (like pve-manager) that they wanted to remove. Supplying other switches to force user-space package installation sort of worked, but then mst complained that it couldn’t “open” the port or some such.

The practical issue is that we’re seeing performance all over the place. Ubuntu 24.04 VMs using virtIO go up to 22G, which is “fine”. VyOS and OPNsense with virtIO, surely using vastly different drivers than Ubuntu 24.04, go up to 10G, and then CHR tops out at 5G, with minimal firewall/nat rules and very low CPU utilization. It’s surely a CHR issue at this point, but I’d like to know what and why, and possibly bring it up to 10G, which will cover us for the time being and allow for more research and testing later.

i’m using the latest proxmox, so it was just a case of installing DOCA and job done - it’ll run through everything and work fine, i used the ubuntu one because it had a higher kernel version

maybe you were trying to install the OFED stuff which is now EOL? its all under the nvidia DOCA installer

root@pve:~# pveversion -v
proxmox-ve: 8.4.0 (running kernel: 6.14.5-1-bpo12-pve)
pve-manager: 8.4.1 (running version: 8.4.1/2a5fa54a8503f96d)
proxmox-kernel-helper: 8.1.1
proxmox-kernel-6.14.5-1-bpo12-pve: 6.14.5-1~bpo12+1
proxmox-kernel-6.8.12-11-pve-signed: 6.8.12-11
proxmox-kernel-6.8: 6.8.12-11
proxmox-kernel-6.8.8-3-pve-signed: 6.8.8-3
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.5.13-5-pve-signed: 6.5.13-5
amd64-microcode: 3.20240116.2+nmu1
ceph-fuse: 17.2.7-pve1
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.1
libpve-cluster-perl: 8.1.1
libpve-common-perl: 8.3.2
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.2
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.6
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.2-1
proxmox-backup-file-restore: 3.4.2-1
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.1
proxmox-mail-forward: 0.3.3
proxmox-mini-journalreader: 1.5
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.11
pve-cluster: 8.1.1
pve-container: 5.2.7
pve-docs: 8.4.0
pve-edk2-firmware: 4.2025.02-3
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.2
pve-firmware: 3.15-4
pve-ha-manager: 4.0.7
pve-i18n: 3.4.5
pve-qemu-kvm: 9.2.0-6
pve-xtermjs: 5.5.0-2
qemu-server: 8.4.0
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve2
root@pve:~# dpkg -l
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                 Version                                       Architecture Description
+++-====================================-=============================================-============-==================================================================================
ii  doca-all                             3.0.0-058000                                  amd64        doca-all meta-package
ii  doca-apsh-config                     3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Tool
ii  doca-bench                           3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Tool
ii  doca-caps                            3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Tool
ii  doca-comm-channel-admin              3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Tool
ii  doca-devel                           3.0.0-058000                                  amd64        doca-devel meta-package
ii  doca-dms                             3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Service
ii  doca-flow-tune                       3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Tool
ii  doca-host                            3.0.0-058000-25.04-debian121                  amd64        Doca repo bundle package
ii  doca-ofed                            3.0.0-058000                                  amd64        doca-ofed meta-package
ii  doca-openvswitch-common              3.0.0-0056-25.04-based-3.3.5                  amd64        Open vSwitch common components
ii  doca-openvswitch-switch              3.0.0-0056-25.04-based-3.3.5                  amd64        Open vSwitch switch implementations
ii  doca-pcc-counters                    3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Tool
ii  doca-perftest                        1.0.1                                         amd64        RDMA benchmark application
ii  doca-runtime                         3.0.0-058000                                  amd64        doca-runtime meta-package
ii  doca-samples                         3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Samples
ii  doca-sdk-aes-gcm                     3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-apsh                        3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-argp                        3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-comch                       3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-common                      3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-compress                    3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-devemu                      3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-dma                         3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-dpa                         3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-dpdk-bridge                 3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-erasure-coding              3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-eth                         3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-flow                        3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-pcc                         3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-rdma                        3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-sha                         3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-sta                         3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-telemetry                   3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-telemetry-exporter          3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sdk-urom                        3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) SDK
ii  doca-sha-offload-engine              3.0.0058-1                                    amd64        DOCA SHA OpenSSL Offload Engine
ii  doca-socket-relay                    3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Tool
ii  doca-sosreport                       4.9.0-1                                       amd64        Set of tools to gather troubleshooting data from a system
ii  doca-telemetry-utils                 3.0.0058-1                                    amd64        Data Center on a Chip Architecture (DOCA) Tool
ii  dosfstools                           4.2-1                                         amd64        utilities for making and checking MS-DOS FAT filesystems
ii  dpa-gdbserver                        25.04.2725                                    amd64        Nvidia DPA gdbserver tool.
ii  dpa-resource-mgmt                    25.04.0169                                    amd64        Single Point for Resource Distribution Package
ii  dpa-stats                            25.04.0169                                    amd64        Nvidia DPA performance counters library and tools.
ii  dpacc                                1.11.0.6                                      amd64        DPACC is a high-level compiler for the DPA processor
ii  dpacc-extract                        1.11.0.6                                      amd64        dpacc-extract is a tool used for extracting
ii  dpkg                                 1.21.22                                       amd64        Debian package management system
ii  dpkg-dev                             1.21.22                                       all          Debian package development tools

the mst part could have been related to it trying to toggle switchdev on, which can fail if you’ve got active VMs etc using the nic

like i mentioned these are 2 different things, legacy SR-IOV and the newer switchdev both work differently and it depends on what you want and are trying to aim for as to what you need to do

it sounds like its a virtio issue from what you’ve said, virtio is all over the place and thats why from the very start i’ve suggested you eliminate that as the cause and get some real data using real hardware (as well as that iperf3 issue, get that sorted too)

I am entering the CHR+proxmox crowd and also suffering from abysmal performance.

Setup:

  • Proxmox on HP EliteDesk
  • two bridges vmbr91 and vmbr92 (not connected to any ports)
  • Testmachine-1: Linux CT via vmbr91
  • Testmachine-2: Linux CT via vmbr92
  • Testmachine-3: Linux VM via vmbr91
  • Test-Router-1: vanilla Debian with vmbr91 and vmbr92, IP forwarding enabled, 4 vCPUs and queues=4 for the VirtIO devices
  • Test-Router-2: CHR with activated P10 license, connected to vmbr91 and vmbr92, 4vCPUs and queues=4 for the VirtIO devices

Test 1: iperf3 from Testmachine-1 to Testmachine-3: 16.5GBit/s (!)

Test 2: iperf3 from Testmachine-1 to Testmachine-2 via Test-Router-1: 6.5GBit/s

Test 3: iperf3 from Testmachine-1 to Testmachine-2 via Test-Router-2: 1.25GBit/s

Test 4: iperf3 from Testmachine-1 to Testmachine-2 via Test-Router-2 with 5 parallel connections (-P 5): 730MBit/s

When going over CHR, iperf3 shows many retransmits (>4000) which is odd when the traffic never leaves the machine. No retries when using Test-Router-1

Profiling in CHR during iperf3: total CPU usage increases from ~3% to 9-10%. The largest components are virtio_net and networking (~3% each) and bridging (~1.5%).

CHR is more than 5x slower than a plain Linux VM doing the same routing (1.25 vs 6.5 Gbit/s). This can’t be real.

Is there any possible explanation?
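For reference, here is the same comparison as a few lines of arithmetic, using only the four results posted above. The labels and the slowdown helper are just names chosen for this sketch.

```python
# Throughput in Gbit/s, copied from Tests 1-4 above.
results_gbps = {
    "Test 1: CT to VM on the same bridge": 16.5,
    "Test 2: via Debian router": 6.5,
    "Test 3: via CHR, single stream": 1.25,
    "Test 4: via CHR, -P 5": 0.73,
}


def slowdown(baseline_gbps: float, measured_gbps: float) -> float:
    """How many times slower the measured path is than the baseline."""
    return baseline_gbps / measured_gbps


if __name__ == "__main__":
    debian = results_gbps["Test 2: via Debian router"]
    for name, gbps in results_gbps.items():
        print(f"{name}: {gbps} Gbit/s, {slowdown(debian, gbps):.1f}x slower than the Debian router")
```

Using the Debian router as the baseline keeps the comparison fair, since both routers sit on the same bridges with the same vCPU and queue settings.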

Update:

I've since opened a ticket with MikroTik and recently heard back that they've made a number of improvements to how the VirtIO drivers interact with rOS, available in 7.20beta, that improve performance by a lot.

I haven't been able to test yet but will do this week.

Thanks for the info. Because of your comment I checked out that rOS update.

I had a similar issue in my setup (CHR + OVS on Proxmox). With v7.20 (changelog: “chr - improved virtio_net performance”) the problem is almost gone. I get about 10x better performance on my 25G and 50G network (about 85 to 90% of what I expected). I’m super happy with that.

So, for everyone else with this problem: just update to ≥ v7.20 and give it a try :)

Edit: I did another few tests.

Setup:

  • 3x Proxmox cluster; each host has a 2x25G mlx5 card
  • Configured as an OVS bond in an OVS bridge
  • The CHR sits in a VM as main router/gateway (with firewall rules & NAT)
  • CHR manages all the traffic between 15 VLANs and a 25G ISP uplink
  • Network behind is based on CRS504 and other Mikrotik devices

Results (between VLANs and also to ISP):

  • v7.18.2 = 2.5-3.0 Gbps
  • v7.20 = 22.0 - 24.0 Gbps

Indeed, there’s a huge performance improvement.
I did a simple test, performed between the same pair of RouterOS VMs (without cold shutdown/boot on PVE) on the same host machine, using the built-in /tool/speedtest command.
First, I tested version 7.20, then downgraded to 7.19.
Here are the results I got:

I picked a few of the headline numbers:

RPS disabled
7.19: TCP 4~5 Gbps -> 7.20: 14 Gbps+

RPS enabled
7.19: TCP ~10 Gbps+ -> 7.20: 60~70 Gbps+

Note: that's just a speedtest between the VM NICs, not forwarding performance. Maybe not a very professional method, but it at least shows something.
