The high latency on download is awful, way higher than a hardware appliance, and the jitter and average suck but are at least not killing throughput. The upload is a different story. 1.7 seconds of stall.
Running the same tests over a CCR2116, the download high latency is in single figures; the upload can spike, but nowhere near as badly.
I have the 10 Gb license and am not sure whether the extra shaper is causing the issues. ESXi version is 7.0u3, with an Intel 82599 NIC connecting via a 10GBase-T SFP+ module to the XGSPON ONU.
The CHR is showing this:
Tx Queue Drops 31 998
This rapidly increments when I run a test, incrementing hugely while the upload stalls and killing throughput due to packet loss.
I don’t understand why it would be dropping anything: the license is 10G, the link is 8G. Is it dropping because it can’t drain the packets to the hypervisor or because there’s something up with the CHR and overly obsessive policing of the license?
The current queue on the interface is ‘Ethernet Default’. Changing it to ‘only-hardware-queue’ results in drops both during download and upload but no change to the result. Changing the size of the Ethernet default queue does nothing. I don’t actually want any queues here, I want it to just get the frames on the wire as upstream devices will manage it. Any words of wisdom or is this genuinely Mikrotik licensing and a queue built into the software? That it’s breaking upload more points to shaping somewhere in the chain.
If it’s not already on, try enabling flow control on the ether port. Or, if it’s on, disable it.
The problem is the interaction between the CHR and the ONU: the latter has to squeeze the 10 Gbps it receives from the CHR into an 8 Gbps connection upstream. Your speedtest client bursts at more than 8 Gbps for sure, and if there are no buffers somewhere (or if they’re insufficient), some packets are guaranteed to be dropped.
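To put rough numbers on that buffering argument, here is a minimal Python sketch. It assumes a simple fluid model (constant ingress rate, constant egress rate, no TCP pacing or flow control), and the 1 ms burst length is an illustrative value, not something measured on this setup. The function names are made up for the example.

```python
def backlog_bytes(ingress_bps, egress_bps, burst_seconds):
    """Bytes that pile up while a burst arrives faster than it can drain.

    Fluid-model assumption: both rates are constant for the whole burst.
    """
    excess_bps = max(ingress_bps - egress_bps, 0)
    return excess_bps * burst_seconds / 8


def drain_seconds(backlog, egress_bps):
    """Time needed to empty a backlog (in bytes) once the burst has ended."""
    return backlog * 8 / egress_bps


if __name__ == "__main__":
    # Assumed illustration: a 1 ms burst at 10 Gbps into an 8 Gbps uplink.
    queued = backlog_bytes(10e9, 8e9, 0.001)
    print(f"backlog after burst: {queued:.0f} bytes")
    print(f"time to drain it:    {drain_seconds(queued, 8e9) * 1000:.3f} ms")
```

Under those assumptions a 1 ms burst leaves roughly 250 KB queued, which an 8 Gbps link drains in about a quarter of a millisecond. If the ingress rate is below the egress rate, the model predicts no backlog at all.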
It’s not even reaching 6 Gbps, and even if it were bursting, a second or more of zero throughput is a little more than a microburst being shaped. I see the stream ramp up and stop at ~5 Gbps maximum.
I can actually manipulate speeds by adding streams to iPerf. 3 streams are bad, 2 are okay, 4 are not. None of the one-second measurements at any point in the iPerf run come near 8G, and as often as not I’ll get a full second of nothing passing. I’m not sure microbursts would be dealt with so harshly that the kit just stops transmitting for 1.7 seconds.
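For anyone wanting to compare runs with different stream counts, a small Python sketch like the following could pick the stalled seconds out of iperf3's JSON output. It assumes the report was produced with `iperf3 -J` and relies on the per-interval `sum` object (`start`, `end`, `bits_per_second`) in that output; the 1 Mbps threshold and the function names are arbitrary choices for the example.

```python
import json


def load_report(path):
    """Read an iperf3 JSON report saved to a file."""
    with open(path) as fh:
        return json.load(fh)


def find_stalls(report, threshold_bps=1e6):
    """Return (start, end, bits_per_second) for every interval whose
    combined throughput across all streams falls below the threshold."""
    stalls = []
    for interval in report.get("intervals", []):
        total = interval["sum"]  # aggregate over all parallel streams
        if total["bits_per_second"] < threshold_bps:
            stalls.append(
                (total["start"], total["end"], total["bits_per_second"])
            )
    return stalls


def peak_bps(report):
    """Highest per-interval aggregate throughput seen in the run."""
    rates = [i["sum"]["bits_per_second"] for i in report.get("intervals", [])]
    return max(rates) if rates else 0.0
```

Usage would be along the lines of saving a run with `iperf3 -c <server> -P 3 -J > run.json`, then calling `find_stalls(load_report("run.json"))` and repeating for 2 and 4 streams to see whether the empty seconds line up with the stream count.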
It will be quite hard to achieve consistent performance using software bridging / virtual NICs (vmxnet3), whereas with PCIe passthrough it will be easy.
Just FYI, I modified the host so that everything is in PCI passthrough. I’m still getting stalls of almost exactly 1.7 seconds during uploads that impact everything going through the CHR.
Think the host itself is the problem. I’ve reinstalled ESX, reinstalled software, changed network cards, nothing changes. Same stalling and it’s so long I’ve no idea what could be up.
The only thought remaining is the 10GBase-T SFP+ modules I’m using, though I’ve tried 3 different ones without success.