SR-IOV with CHR - What hypervisors are you using ?

I’ve been using CHRs under VMware ESXi for a long time now. I want to change from VMware ESXi to another hypervisor that supports SR-IOV so that I can get my CHRs running faster (10-Gig to 40-Gig or faster if possible).

I would like to ask: what non-VMware-ESXi hypervisors (and physical computers) are others running CHRs on with SR-IOV configurations to achieve similar CHR throughput speeds?
Also, what was the speed-limit wall you hit with your CHRs prior to implementing SR-IOV, and what results did you get? How much faster were you with SR-IOV, and how much did the hypervisor and CHR CPU loads drop?

I am currently running four BGP servers with full routes and want to upgrade from 10-Gig to 100-Gig full IPv4 and IPv6 BGP peering sessions.
And depending on other things, my CHR OSPF routers are often sustaining about 10-Gig or more during busy peak hours.
I’ve also got nearly a dozen other CHRs doing things in my ISP network.
All of my CHRs are very busy with network I/O.

North Idaho Tom Jones

SR-IOV is supported by almost all virtualization hosts, including ESXi.

It’s up to the NIC device driver to implement the capabilities. I’d start by checking the SR-IOV capabilities of your NICs and drivers with the manufacturers. The same goes for specialized variants such as DirectPath I/O, DirectIO, etc., which only work in specific virtual environments. If you are running on your own dedicated hardware, you often also need to enable SR-IOV in the system BIOS.
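A quick sanity check on a Linux host (Proxmox, KVM, etc.) is to ask the PCI device itself whether it advertises the SR-IOV capability. This is only a sketch; the PCI address and interface name below are placeholders for your own hardware:

# Does the NIC expose the SR-IOV PCIe capability? (PCI address is a placeholder)
lspci -s 01:00.0 -vvv | grep -i "Single Root I/O Virtualization"
# How many virtual functions does the driver allow on this port? (name is a placeholder)
cat /sys/class/net/enp1s0f0/device/sriov_totalvfs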

You should notice a significant improvement in I/O performance, with reduced CPU load and interrupt latency, once SR-IOV, DirectPath or similar is working.

I prefer Proxmox VE. Non-server platforms often do not have an SR-IOV setting in the BIOS, which can lead to compatibility issues with ESXi. Proxmox VE does not have this limitation. I can easily turn on SR-IOV on an Intel J4125 with Proxmox VE, but I get a lot of errors with ESXi. The J4125 is just a low-power laptop platform that is not normally considered to have SR-IOV functionality.
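For reference, enabling it on a Proxmox VE (or any Linux/KVM) host usually comes down to creating virtual functions on the physical NIC. This is only a sketch; the interface name and VF count are examples:

# Create two virtual functions on the physical NIC (interface name is an example)
echo 2 > /sys/class/net/enp1s0/device/sriov_numvfs
# The VFs should now show up as extra PCI devices
lspci | grep -i "Virtual Function"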

Proxmox VE can definitely be a performant open-source solution if you are willing to invest time in learning how to configure PCIe passthrough and SR-IOV, and to analyze and fix potential issues yourself. However, if you need data-center features such as high-end performance, central administration and monitoring, live migration, real-time backup using snapshots, 24/7 support, etc., it might be justified to consider solutions like vSphere/Hyper-V.
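In broad strokes, the host-side preparation for PCIe passthrough / SR-IOV on Proxmox VE looks roughly like this - a sketch assuming an Intel CPU and GRUB (AMD systems use amd_iommu instead):

# /etc/default/grub - enable the IOMMU on the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

# apply the change, load the VFIO modules at boot, then reboot
update-grub
printf 'vfio\nvfio_iommu_type1\nvfio_pci\n' >> /etc/modules
reboot

# afterwards, confirm the IOMMU is active
dmesg | grep -e DMAR -e IOMMU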

Btw, I forgot to mention there are some really good papers online on how to optimize network performance using ESXi.

Is it just me, or does it seem that Tom “North Idaho” is currently using VMware / vSphere / … and is looking into replacing it with another virtualization platform? Quite probably due to how Broadcom/VMware is changing licensing policies right now.

Hence suggesting that he “consider vSphere” is going backwards in this context.

Certainly, license costs might be decisive, but the original question was primarily about performance and which platforms are available with SR-IOV. However, my comment was aimed more at a general recommendation: consider Proxmox VE or proprietary solutions like vSphere/Hyper-V. When it comes to picking a virtualization platform/ecosystem, it’s a matter of sacrificing one of the following: fast, good, or cheap. In general, FOSS = invest time, proprietary = invest money.

Regardless of what you choose, one must be well-versed in the virtualization world to configure maximum IO throughput. I’d say it’s pretty rare that plug-and-play with default settings works well enough to achieve a well-optimized environment. SR-IOV and interrupt latency are two of several important settings you need to deal with.

Yesterday, I found out that the VMware ESXi Essentials license does not support SR-IOV. The SR-IOV feature is available with a VMware ESXi Enterprise license. So, let's do some math: what's 18 times the cost of a VMware ESXi Enterprise license??? I also have a funny feeling about this whole Broadcom/VMware purchase thing - what's going to change?

I have about 18 VMware ESXi physical servers. The cost of purchasing 18 VMware ESXi Enterprise licenses would be a killer.

So, I am looking for alternative hypervisors that support SR-IOV, preferably something low-cost - as in free open source.

I did manage to get SR-IOV working when I removed the VMware ESXi Essentials license - which reverted my test ESXi license to the free 60-day unlimited trial license - and yes, I did get a CHR running with SR-IOV, but the CHR could only hear traffic and not send any network traffic. I figure I was doing something wrong.
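(Side note: on Linux/KVM hosts, a VF that can receive but not transmit is often caused by MAC spoof checking on the virtual function - the PF driver drops frames whose source MAC it did not assign. A rough sketch of the usual host-side fix; the interface name, VF index and MAC are placeholders, and I don't know whether the same applied in my ESXi test.)

ip link set enp1s0 vf 0 mac 52:54:00:12:34:56   # pin the MAC the CHR actually uses
ip link set enp1s0 vf 0 spoofchk off            # stop dropping "foreign" source MACs
ip link set enp1s0 vf 0 trust on                # allow the guest to change MAC/promisc mode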

So, now you know why I am asking …

North Idaho Tom Jones

OpenNebula, Proxmox VE, KVM, Xen, XCP-ng, Virt-Manager, oVirt … and others all utilize more or less the same fundamental Linux kernel capabilities. However, they differ in their integration methods for installation/configuration, admin GUI, Docker support, tools for operations, monitoring, online tuning and so forth.

Thus, the optimal choice entirely depends on your specific use case and requirements.

If this is business-critical and you are unwilling to pay for a commercial ESXi license, you need to prepare yourself to allocate a significant amount of manpower to ensure it will succeed using open-source solutions.

This thread might possibly be interesting for you to get started with: https://forum.level1techs.com/t/which-hypervisor-would-you-suggest-for-my-situation/186653

Yeah, the news around VMware has not been good. But are your network cards on the HCL and marked with SR-IOV support? https://www.vmware.com/resources/compatibility/search.php


I've long used VMware myself, but it feels like its days are numbered, becoming ever more over-priced and subscription-based. I used Xen (and the associated XCP-ng) for a large project long ago, so I know it works. I just don’t have any recent practical experience with modern alternatives.

But for networking, most Linux VM hosts use Open vSwitch in between, AFAIK.
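If it helps, a minimal Open vSwitch setup on a Linux host is just a bridge with the physical uplink added to it; the package name and interface name below are assumptions for a Debian-like host:

apt install openvswitch-switch   # Debian/Proxmox package name
ovs-vsctl add-br br0             # create the virtual switch
ovs-vsctl add-port br0 eno1      # attach the physical uplink
ovs-vsctl show                   # verify the bridge and its ports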

Anyway, I'd be curious what other folks think.

Btw, here is the new Broadcom VMware licensing model for those unlucky ones who lack the original perpetual licenses. “Foundation” is needed to enable SR-IOV, DirectPath (PCI passthrough), etc. A one-year subscription is about 40% more expensive.
[Attached image: VMware lic.jpg]

Another interesting platform is Nutanix Acropolis Hypervisor (AHV) which is based on the open-source KVM hypervisor and includes standard features such as live migration and VM-centric snapshots. Nutanix has tools to migrate ESXi to their platform.

Read more about it in the article “All About Hypervisors: ESXi vs Hyper-V, XenServer, Proxmox, KVM, and AHV” and in the Reddit discussion on moving from VMware to alternative platforms: “Move from VMware to…what?”

Wow - trying to get MikroTik CHRs with SR-IOV enabled under VMware ESXi is somewhat confusing.
The VMware ESXi license levels and costs are often an “I don't know”.
If I need VMware Cloud Foundation for 3 years at $350.00 per core, and I have 10 physical VMware servers, each with two 20-core CPUs, then that is (10 x 2 x 20 x $350.00) $140,000.00 for 3 years. Is my math wrong, or is my understanding of what is called a core wrong? Also, what happens after the 3 years are up - does it suddenly stop working if the license is not renewed? Is a 10-year run more than triple the $140,000.00 license fees? (Note: I also have some VMware ESXi servers with four physical CPUs - 4x the count over a single physical CPU.)

At those possible expenses, I would rather just buy bare-metal servers and install a free open-source hypervisor that supports SR-IOV for a MikroTik CHR - or directly install a router operating system that has an ISO install for bare metal.

I don’t mind doing the work of managing other hardware/hypervisors/software, but I do mind when the costs are through the roof.

I am really starting to think a KVM hypervisor (open source) might be a possible SR-IOV solution to get my MikroTik CHRs running at the speeds I am looking for.

  • Footnote - I know this is a MikroTik forum; the topic is centered around MikroTik CHR and SR-IOV functionality and how to implement it.

No problem using SR-IOV on KVM, provided the NIC and drivers support it. We have some 10-year-old legacy servers (HP DL380 G5/G6, IIRC) in our test lab to play with, and they run just fine using SR-IOV.
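A quick way to confirm the VFs exist and see which MAC addresses they were given is (interface name is just an example):

ip link show enp1s0                        # the PF lists its VFs with MAC, VLAN and spoof-check state
lspci -nn | grep -i "Virtual Function"     # each VF is its own PCI function you can hand to a guest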

Regarding the new licensing model, and considering all the frustrated comments from people who feel completely overridden by Broadcom, many have already started looking for alternative solutions. Given the lock-in effect it will probably take some time, but I suspect that Broadcom will experience a significant loss of customers within a year or so. I mean, it’s just a sound business decision if one has been hit by a cost increase of over 300%.

And yeah, when you paid your hard-earned money (upfront, that is) and the three-year subscription ends, that’s it, finito, unless you sign up for another three years.

If you’re not using the vSphere HA stuff (like vMotion etc.), then it should be easy to switch away from VMware.

Personally, I really only use the snapshot feature in ESXi to be able to roll back something. I’ve used ESXi for, well, decades. But there's just not much value in VMware at those prices, and the “premium” to allow passthrough is ridiculous! In reality, the modern CPU does most of the work in all hypervisors.

I’d probably start with Proxmox, which wraps KVM - mainly because “raw” KVM does not do disk snapshots, and IMO snapshots are a key value of CHR over native/x86. Plus Proxmox has a UI. And MikroTik has done videos featuring it, so if there were an issue… they should be able to look at it. Anyway, that’s where I’d start, since you can always then try KVM without the Proxmox tooling.
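To illustrate the snapshot point, Proxmox gives the CHR VM the usual snapshot workflow in both the GUI and the CLI; the VM ID and snapshot name below are placeholders:

qm snapshot 101 pre-upgrade    # take a snapshot of VM 101
qm listsnapshot 101            # list existing snapshots
qm rollback 101 pre-upgrade    # roll back if the change went wrong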

Amm0,
I do not use any vMotion etc.
I do use TrueNAS with NFS mounts to my VMware ESXi servers. My TrueNAS servers run multiple snapshot schedules and multiple rsync backup schedules to other TrueNAS servers that hold the rsync backups they receive. All of my backups via the TrueNAS NFS servers run without any configuration in my VMware ESXi servers.
I believe Proxmox also supports NFS mounts, so my TrueNAS snapshots and rsync backups should still function the same after I begin using Proxmox.
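If I understand the Proxmox docs correctly, adding a TrueNAS export as Proxmox VM storage should look something like this (server address, export path and storage ID are placeholders):

# register the TrueNAS NFS export as a Proxmox storage pool
pvesm add nfs truenas-vmstore --server 192.0.2.10 --export /mnt/tank/vmstore --content images,iso
pvesm status    # confirm the storage is online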

Re: … I’ve used ESXi for, well, decades. …
ditto

When I finally get a hypervisor (Proxmox?) running a CHR with SR-IOV enabled, I will start a new topic in the virtualization section of this MikroTik forum with some how-to documentation for others to do the same - plus my before-SR-IOV CHR speed/CPU-load results and my after-SR-IOV results. I am really looking forward to finding out how much sustained routing/BGP/OSPF throughput an SR-IOV-enabled CHR can handle. I have some 10-Gig, 40-Gig and 100-Gig network interfaces. So, how fast can an SR-IOV-enabled CHR go with 100-Gig network interfaces? Don't know yet - but I will find out and document it.
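One way to keep the before/after numbers comparable is to push the same synthetic load through the CHR in both setups - for example iperf3 between two hosts on opposite sides of the router (the address and stream count below are just examples):

iperf3 -s                            # on a receiver host behind the CHR
iperf3 -c 198.51.100.20 -P 8 -t 60   # on the sender: 8 parallel TCP streams for 60 s, routed through the CHR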

North Idaho Tom Jones

Hyper-V Server 2019 is totally free, will continue to be supported under its lifecycle policy until January 2029, and fully supports SR-IOV.

Just one option; it may not be the best one for you, but it does meet the stated requirements.

You're missing something: 16 cores per socket is the minimum amount you need to buy - even if you don't have that much CPU-core capacity:

2x (10-core CPU) = 2x 16-core license
2x (18-core CPU) = 2x 16-core license + additional core licenses

Live migration (vMotion in VMware terms) with SR-IOV is not supported by any hypervisor - and never will be.
I can't see the advantage of SR-IOV (broken, many bugs): you lose the VM's mobility. If you want performance, use PCI passthrough (which is also not supported with live migration).

KVM with virtio is the best combo for that, but other factors come into play: MikroTik's software lacks true multi-core optimization (so it mostly uses one core only), so you won't be able to push 10G.
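One knob that at least lets KVM spread packet processing across host CPUs is multiqueue virtio-net; whether RouterOS actually takes advantage of the extra queues is another matter. A sketch using libvirt - the domain name, bridge and queue count are placeholders:

# define a multiqueue virtio NIC and add it to the guest's configuration
cat > chr-virtio-nic.xml <<'EOF'
<interface type='bridge'>
  <source bridge='br0'/>
  <model type='virtio'/>
  <driver name='vhost' queues='4'/>
</interface>
EOF
virsh attach-device chr-router chr-virtio-nic.xml --config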

Performance improvement with an SR-IOV VF or a PCI-passthrough NIC isn't likely to be all that great. You'll bypass the hypervisor network stack (which removes one performance bottleneck), but you'll still only have Linux networking in the CHR, as it doesn't use acceleration technology such as DPDK or even eBPF/XDP.

For large packets, I doubt the hypervisor network stack would be a bottleneck.

Neither DPDK nor eBPF/XDP is in any way related to SR-IOV, which is a standard hardware-level technology for I/O virtualization offering near-bare-metal throughput.

Additionally, RouterOS uses the Linux kernel's netfilter/nftables, not eBPF (Berkeley Packet Filter) or DPDK, the latter being a set of user-space network drivers and libraries.