Unrecorded packet drops and low level UDP packet loss

Hi there,

I’ve been testing the routing performance of a 750G routerboard (between port 1 and port 2, as ports 2-5 are switched), and have come across two problems (both with UDP traffic):

  1. Packet drops between interfaces don’t seem to be recorded properly.

For example, when transferring UDP traffic between port 1 and port 2, the tx traffic from port 1 may be 70mbit/s, but the rx traffic on port 2 only shows 40mbit/s. From what I have found, this is caused by an ethernet interface queue that is too small - boosting it from 50 packets to, say, 500 packets mostly fixes the issue.
Still, I think that if packets are being dropped between interfaces, that should at least be recorded (in the interface stats) so you can tell for sure what’s going on.

As an aside, is it possible to find out what the current queue depth of the ethernet interfaces is (i.e. how full the queue is)?
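For reference, the queue-size workaround I described can be applied from the RouterOS console along these lines (the queue type name here is just an example I made up, and the exact menu paths may vary between RouterOS versions):

```
/queue type add name=big-pfifo kind=pfifo pfifo-limit=500
/queue interface set ether1 queue=big-pfifo
```

This creates a 500-packet pfifo queue type and assigns it to ether1, instead of the default 50-packet queue.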

  2. Low-level UDP packet loss is present at ‘non-small’ traffic levels.

When the UDP traffic is above a certain level (say about 2mbit/s), the router seems to start randomly dropping packets on the order of 0-1%, irrespective of how big the queue is set or the amount of traffic. This loss tends to stay at around the same percentage from low to high traffic levels (until the previous problem starts to kick in). My understanding is that the routerboard shouldn’t really be dropping packets at all until the CPU is maxed out (as long as the interface queue isn’t full), so I’m wondering where this might be coming from. I understand that UDP delivery isn’t guaranteed by any means and that 1% loss isn’t huge, but these packet loss ‘glitches’ tend to turn up in VoIP, for example, and it gets a little annoying when your caller chops in and out occasionally.
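To double-check the loss independently of iperf, a simple way is to send sequence-numbered UDP datagrams and count which sequence numbers arrive on the far side. Below is a rough sketch of that idea in Python (the port number 5005, the 160-byte payload, and the loopback default are all just illustrative choices, not anything from the router); run the receiver side on the machine behind the router and point the sender at it:

```python
import socket
import struct
import threading

def udp_loss_probe(count=1000, payload_size=160, host="127.0.0.1", port=5005):
    """Send `count` sequence-numbered datagrams and report how many were lost."""
    received = set()
    ready = threading.Event()

    def receiver():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # Enlarge the receive buffer so the probe itself doesn't drop packets.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)
        sock.bind((host, port))
        sock.settimeout(1.0)  # stop after 1s of silence
        ready.set()
        try:
            while True:
                data, _ = sock.recvfrom(2048)
                # First 4 bytes of each datagram carry the sequence number.
                received.add(struct.unpack("!I", data[:4])[0])
        except socket.timeout:
            pass
        finally:
            sock.close()

    t = threading.Thread(target=receiver)
    t.start()
    ready.wait()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    pad = b"x" * (payload_size - 4)
    for seq in range(count):
        sender.sendto(struct.pack("!I", seq) + pad, (host, port))
    sender.close()
    t.join()

    lost = count - len(received)
    return lost, 100.0 * lost / count

lost, pct = udp_loss_probe()
print(f"lost {lost} of 1000 packets ({pct:.2f}%)")
```

Because the datagrams are numbered, this tells you not only the loss percentage but exactly which packets went missing, which is handy for spotting whether the drops come in bursts or are spread evenly.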

I am running firmware 4.17 and testing with iperf 1.7 between Windows computers. The loss I referred to above does not occur when I connect the PCs with a crossover cable. Conntrack, firewall rules, and all queues are disabled, and the CPU is not maxing out according to the resources readout. TCP traffic on this hardware seems to be fine - running iperf with TCP, the CPU immediately maxes out, so as far as I know there’s no problem with the hardware itself.

I’m wondering if anyone else has run into these problems before, and if not if you could test on your own networks and see if these problems turn up for you.

Thanks, Michael.

EDIT: For reference, the iperf settings I’ve been testing with are as follows:
TCP
iperf -s -w 64K -p portnumber
iperf -c serverip -w 64K -p portnumber
UDP
iperf -s -u -p portnumber -l packetlength -i 0.5
iperf -c serverip -u -p portnumber -l packetlength -b xxxmbit

Note that iperf sends from client to server, and the port number and packet length need to be the same on both client and server. -i sets the interval (in seconds) at which the current test’s stats are displayed (otherwise it only spits out the overall result), and -b allows a bandwidth selection (as xxxmbit or xxxkbit). The interesting information is the difference between what the client puts out and what the server receives (which is displayed under the client result).

iperf is available at http://sourceforge.net/projects/iperf/ for linux users, and https://nocweboldcst.ucf.edu/files/iperf.exe for windows users.

We are also seeing this across various RouterOS platforms. In one example, we have an x86 RouterOS box running 3 KVMs. One KVM is running a custom hotspot and the other two (OpenWrt) are each connected to a WAN port. The main RouterOS instance is doing load balancing. If I ping Google’s DNS servers from the hotspot or the load balancer, I get intermittent timeouts, but pinging from one of the OpenWrt guests, I get no loss whatsoever. RouterOS is randomly dropping UDP regardless of traffic load.