Slow Network Speeds via MikroTik CRS304-4XG

The Issue:

  • When connecting my NAS directly to my Ubuntu server (using 5GbE USB adapters), I achieved 3.37 Gbps in iperf3, proving that my cables and devices support high speeds.
  • When connecting through my MikroTik CRS304-4XG switch, performance dropped to 236 Kbps, which is extremely slow.

The MikroTik CRS304-4XG is running the default configuration; the only thing I have changed is the password. Any help would be appreciated.


iperf - with MikroTik CRS304-4XG

thomas@prodesk:~$ iperf3 -c 10.0.0.1
Connecting to host 10.0.0.1, port 5201
[ 5] local 10.0.0.2 port 36778 connected to 10.0.0.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 288 KBytes 2.36 Mbits/sec 3 8.74 KBytes
[ 5] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec 1 8.74 KBytes
[ 5] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 0 8.74 KBytes
[ 5] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec 1 8.74 KBytes
[ 5] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec 0 8.74 KBytes
[ 5] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec 0 8.74 KBytes
[ 5] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec 1 8.74 KBytes
[ 5] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec 0 8.74 KBytes
[ 5] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec 0 8.74 KBytes
[ 5] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec 0 8.74 KBytes


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 288 KBytes 236 Kbits/sec 6 sender
[ 5] 0.00-10.00 sec 0.00 Bytes 0.00 bits/sec receiver


iperf - without MikroTik CRS304-4XG

thomas@prodesk:~$ iperf3 -c 10.0.0.1
Connecting to host 10.0.0.1, port 5201
[ 5] local 10.0.0.2 port 52832 connected to 10.0.0.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 404 MBytes 3.39 Gbits/sec 0 979 KBytes
[ 5] 1.00-2.00 sec 403 MBytes 3.38 Gbits/sec 0 979 KBytes
[ 5] 2.00-3.00 sec 401 MBytes 3.37 Gbits/sec 0 979 KBytes
[ 5] 3.00-4.00 sec 402 MBytes 3.37 Gbits/sec 0 979 KBytes
[ 5] 4.00-5.00 sec 402 MBytes 3.37 Gbits/sec 0 1022 KBytes
[ 5] 5.00-6.00 sec 401 MBytes 3.36 Gbits/sec 0 1.05 MBytes
[ 5] 6.00-7.00 sec 402 MBytes 3.38 Gbits/sec 0 1.19 MBytes
[ 5] 7.00-8.00 sec 401 MBytes 3.37 Gbits/sec 0 1.19 MBytes
[ 5] 8.00-9.00 sec 402 MBytes 3.38 Gbits/sec 0 1.19 MBytes
[ 5] 9.00-10.00 sec 401 MBytes 3.37 Gbits/sec 0 1.19 MBytes


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 3.93 GBytes 3.37 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 3.92 GBytes 3.37 Gbits/sec receiver

I changed the MTU on the adapters from 9000 to 1500. I guess the CRS304-4XG doesn’t like jumbo frames? This change increased my bitrate, but now I have an issue with excessive retransmissions. Any help would be appreciated.

thomas@prodesk:~$ iperf3 -c 10.0.0.1
Connecting to host 10.0.0.1, port 5201
[ 5] local 10.0.0.2 port 41846 connected to 10.0.0.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 404 MBytes 3.39 Gbits/sec 2186 136 KBytes
[ 5] 1.00-2.00 sec 402 MBytes 3.37 Gbits/sec 2075 122 KBytes
[ 5] 2.00-3.00 sec 404 MBytes 3.39 Gbits/sec 2159 140 KBytes
[ 5] 3.00-4.00 sec 401 MBytes 3.37 Gbits/sec 2278 222 KBytes
[ 5] 4.00-5.00 sec 402 MBytes 3.37 Gbits/sec 2183 148 KBytes
[ 5] 5.00-6.00 sec 404 MBytes 3.39 Gbits/sec 2260 113 KBytes
[ 5] 6.00-7.00 sec 402 MBytes 3.37 Gbits/sec 2310 105 KBytes
[ 5] 7.00-8.00 sec 403 MBytes 3.38 Gbits/sec 2411 126 KBytes
[ 5] 8.00-9.00 sec 403 MBytes 3.38 Gbits/sec 2326 106 KBytes
[ 5] 9.00-10.00 sec 402 MBytes 3.37 Gbits/sec 2401 156 KBytes


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 3.93 GBytes 3.38 Gbits/sec 22589 sender
[ 5] 0.00-10.00 sec 3.93 GBytes 3.37 Gbits/sec receiver

iperf Done.

Retransmissions are one of the ways for TCP to throttle back, and they indicate that it’s not the first leg from the transmitter that has the (performance) problem.
You can try UDP connectivity … start with a modest bandwidth setting (e.g. 2 Gbps) and gradually increase it. At a certain point you’ll start seeing missing packets in the receiver’s report.
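A sketch of such a sweep, assuming the iperf3 server is still listening on 10.0.0.1 (the step sizes are just an example):

```shell
# UDP bandwidth sweep (sketch): ramp up the offered load until the
# receiver's report starts showing lost datagrams.
for bw in 2G 2.5G 3G 3.5G 4G; do
  echo "=== UDP test at $bw ==="
  iperf3 -c 10.0.0.1 -u -b "$bw" -t 10
done
```

The "Lost/Total Datagrams" column in the server-side report is the one to watch.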
It’s hard to tell where the bottleneck lies, but it’s quite likely the receiver itself; even basic processing of multi-Gbps traffic is not exactly a piece of cake. Definitely check the port stats on the switch: if they start to show dropped packets, that will help narrow down the location of the bottleneck.
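On RouterOS the per-port counters can be inspected along these lines (a sketch; the exact counters shown vary by model and RouterOS version):

```
# per-port error/drop counters; look for drops and rx-too-long
/interface/ethernet/print stats
# live view of a single port (ether1 is an assumption)
/interface/ethernet/monitor ether1 once
```

A climbing rx-too-long counter in particular would point at oversized (jumbo) frames being discarded at ingress.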
Another possibility is to run TCP tests with multiple parallel streams. That gives both sender and receiver the opportunity to process data on multiple CPU cores in parallel; you then check the sum of all parallel streams.
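For example (stream count is arbitrary, server address assumed as before):

```shell
# Parallel-stream TCP test (sketch): -P 4 opens four concurrent
# streams; the [SUM] lines at the end are the figure to compare.
iperf3 -c 10.0.0.1 -P 4 -t 10
```

If the summed throughput is markedly higher than the single-stream result, a per-core bottleneck on one of the endpoints is the likely culprit.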

Re. MTU sizes: switches mostly operate in “store and forward” mode: they first receive the full frame, store it in a buffer, and then forward it through the egress port. So switches have to assume some maximum frame size in order to establish some granularity of buffer space, and in the MikroTik world this granularity is named L2MTU. Any frame larger than that arriving on an ingress port will get dropped due to the inability to store it. It can be adjusted according to needs, but there’s an upper limit, and it depends on the switch chip (some switch chips don’t even support jumbo frames, some support sizes up to around 4 kB, etc.).
So when going for jumbo frames, one has to check the maximum L2MTU of all L2 devices … and then set them to equal values. When choosing the (L3) MTU, one has to account for the Ethernet headers (a minimum of 18 bytes).
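As a quick sanity check of that arithmetic (assuming the standard 18 bytes of Ethernet header + FCS, plus 4 bytes if VLAN-tagged frames are in play):

```shell
# Minimum L2MTU needed to carry a given L3 MTU.
mtu=9000
echo "untagged:    $((mtu + 18))"      # 14-byte header + 4-byte FCS
echo "VLAN-tagged: $((mtu + 18 + 4))"  # plus an 802.1Q tag
```

So an L3 MTU of 9000 needs every L2 device on the path to accept frames of at least 9018 bytes (9022 with VLAN tagging).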

Another thing to consider: the most straightforward approach would be to set the L2MTU everywhere to the switch chip’s maximum supported value. That would work, but it would waste buffer space due to granularity. E.g. setting the L2MTU to 9 kB instead of 4.5 kB, while actual frame sizes are up to 4 kB, halves the maximum number of buffered frames.
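On RouterOS that would look something like the following (the 4608-byte value and the use of all ports are illustrative; check your switch chip’s actual limit first):

```
# set L2MTU only as large as the frames actually in use (~4.5 kB here)
/interface/ethernet/set [find] l2mtu=4608
```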