Throwing this out here to see if anyone else has experienced tx-drops on VLAN interfaces in CHR running on ESXi.
The configuration is basically a vSwitch with a single 10G uplink port from an Intel XL710 adapter, MTU set to 1600 and promiscuous mode set to accept. On top of that is a PortGroup configured with VLAN ID 4095, so all VLANs are passed through tagged to the guest. The PortGroup is then added to the CHR instance as a VMXNET3 device.
In CHR the VLANs are defined as needed on the parent interface. RPS is disabled and the interface queues are set to multi-queue-ethernet-default.
In my test lab I am running two CHR instances on two different physical host systems, both on ESXi 6.7 U3. One box is a Vengeance 2 and the other is a Lanner 6210. CHR is on version 6.47.9. I also have an RB4011 with 10G connected to each CHR instance for running traffic-generator. Both boxes have HyperThreading disabled.
As soon as you push any load over about 2 Gbps, the VLAN interface starts to clock up tx-drops. The parent interface does not. The higher the traffic load, the more drops there are and the more unstable the CHR instance becomes.
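If anyone wants to compare numbers, here is a minimal Python sketch of how the drop rate could be worked out from two readings of the VLAN interface counters taken a few seconds apart. The `CounterSample` structure and `drop_stats` function are hypothetical names made up for illustration, and the sketch assumes the tx-packet counter only counts packets that were actually sent (so drops are counted separately), which I have not confirmed for CHR.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One reading of an interface's counters (hypothetical structure)."""
    timestamp_s: float  # when the counters were read, in seconds
    tx_packets: int     # cumulative packets sent (assumed to exclude drops)
    tx_drops: int       # cumulative tx-drops


def drop_stats(prev: CounterSample, curr: CounterSample) -> dict:
    """Return drops per second and drop ratio between two counter readings."""
    interval = curr.timestamp_s - prev.timestamp_s
    if interval <= 0:
        raise ValueError("samples must be in increasing time order")

    sent = curr.tx_packets - prev.tx_packets
    dropped = curr.tx_drops - prev.tx_drops
    attempted = sent + dropped  # assumption: drops are not part of tx_packets

    return {
        "drops_per_s": dropped / interval,
        "drop_ratio": dropped / attempted if attempted else 0.0,
    }


if __name__ == "__main__":
    # Made-up numbers purely to show the calculation, not real measurements.
    before = CounterSample(timestamp_s=0.0, tx_packets=1_000, tx_drops=0)
    after = CounterSample(timestamp_s=10.0, tx_packets=10_000, tx_drops=1_000)
    print(drop_stats(before, after))
```

Running the same calculation against the parent interface and the VLAN interface over the same interval should make it easy to see whether the drops really are confined to the VLAN interface at a given load.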
'esxtop' doesn't show any sort of excessive %DRPTX or %DRPRX counters.
If you use VST to define the VLAN (instead of defining it in CHR), the problem goes away. However, this is a poor workaround because it limits your ability to dynamically add VLANs without stopping the VM. Not to mention that every VLAN you need means exposing another interface to CHR, and with it more IRQs.
I have SUP-37609 open with MikroTik about this, but so far no resolution.
edit 1: my imgur links didn't work. Uploaded the images instead and placed them inline.