I recently purchased a MikroTik CCR2004 and set up my network with the following specifications:
ISP Connection: 1 Gbps down / 700 Mbps up (IPv4 is provided through IPv6 tunneling by Free, my ISP in France).
Issue Description:
I am experiencing very slow download speeds when trying to clone a Git repository or download large files over HTTP. The download speed caps at around 100 KB/s, while the router’s CPU usage remains around 0%.
Interestingly, while running speedtest from various devices, I consistently get results over 500 Mbps. But file downloads during that same test are still capped (downloading from a browser or with wget changes nothing: still ~100 KB/s, from any machine, Linux or macOS).
Additional Observations:
Enabling a VPN (such as ProtonVPN) boosts the download speed to over 30 MB/s, which sounds crazy to me.
Disabling all firewall rules, including fasttrack, does not change the slow download behavior.
When using iperf3 I achieve good results, but with the -R option speeds are also capped at about 100 KB/s.
Streaming 4K videos on YouTube works seamlessly, with no buffering, even while two 4K Netflix streams are active.
Testing Environment:
I’ve tested downloads from multiple devices connected to the network.
Within my VLAN setup, iperf tests between VLANs show that traffic saturates the Ethernet link without issues, with low CPU usage on the router.
Configuration:
I will include my MikroTik configuration in the attachments for reference.
Questions:
Does anyone have suggestions on what could be causing this issue?
Are there specific settings or configurations I should check to resolve the slow download speeds without the VPN?
I am also somewhat concerned about my MTU configuration, but that part is a bit obscure to me.
SpeedGuide.net TCP Analyzer Results
Tested on: 2024.11.21 13:41
IP address: xx.xx.xxx.xx
Client OS/browser: Mac OS (Firefox 132.0)
TCP options string: 020405b4010303060101080a89a5c79c0000000004020000
MSS: 1460
MTU: 1500
TCP Window: 131712 (not multiple of MSS)
RWIN Scaling: 6 bits (2^6=64)
Unscaled RWIN : 2058
Recommended RWINs: 64240, 128480, 256960, 513920, 1027840
BDP limit (200ms): 527 Mbps (53 Megabytes/s)
BDP limit (500ms): 211 Mbps (21 Megabytes/s)
MTU Discovery: OFF
TTL: 54
Timestamps: ON
SACKs: ON
IP ToS: 00000000 (0)
I tried what you proposed, something like iperf3 -R -M 1420 -c 185.93.2.193, but it fails with: iperf3: error - unable to set TCP/SCTP MSS: Invalid argument.
Interesting… according to SpeedGuide you have a normal MTU - but this is strange, as tunneling IPv4 in IPv6 (without tricks along the way) will definitely make it lower.
As to iperf3 - I'm not sure why it fails. Did you run it on Linux (or another *nix-like OS) or on Windows? It could also be version-dependent.
As to changing the MTU on an interface - set it to something relatively low (~1400) on the interface that carries your default gateway for IPv4. If that works, you can then increase it a bit at a time to find the maximum that still works.
If you are on Linux (behind your MikroTik), you could adjust the MTU for the default route only - something like "ip route replace default via <gateway> mtu 1400" - instead of changing the interface MTU on the router itself.
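For example (the interface name and gateway address below are only placeholders, adjust them to your setup):

# on the MikroTik, lower the IP MTU of the WAN-facing interface
/interface ethernet set sfp-sfpplus1 mtu=1400

# or, on a Linux host behind the router, lower the MTU for the default route only
ip route replace default via 192.168.1.1 mtu 1400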
Another potential issue is that your router or provider blocks ICMP unreachable messages - under normal circumstances there is no need to fiddle with the MTU, as it is auto-discovered, but some providers block such ICMPs, or they get blocked/dropped by your router.
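A quick way to check whether path MTU discovery is working is to ping with the don't-fragment bit set and vary the payload size (the target below is just an example; 1472 bytes of payload + 28 bytes of headers = 1500):

# Linux
ping -M do -s 1472 8.8.8.8
# macOS
ping -D -s 1472 8.8.8.8

If 1472 never gets a reply but a smaller size does, and you never see a "frag needed" error, PMTUD is probably broken somewhere on the path.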
I disabled all hardware offloading, but unfortunately, it did not resolve the issue.
Regarding my iperf testing, I found that it performed well when run on Linux, in contrast to my earlier tests on macOS. Without the -R option I achieved 500 Mbps, but using -R resulted in a drop to just 2 Mbps, so it's the same as before.
Moreover, I ran further iperf tests with the -P 20 option, i.e. 20 parallel streams, each carrying about 2 Mbps. Does this information help with troubleshooting?
Interestingly, when I connect through a VPN, the slow download issue disappears entirely. I suspect this might be related to my use of WireGuard, which operates over UDP, whereas the issue appears to be isolated to RX TCP traffic. Could this indicate a potential MTU issue? If so, would you mind explaining why?
EDIT: After some reading on MTU, I now understand that the issue is indeed at the MTU level, since it works over UDP and, by extension, over WireGuard. Now I don't know where to adjust the MTU properly.
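From what I've read, a common workaround for this kind of 4-in-6 setup is to clamp the TCP MSS on the router so that SYN packets advertise a segment size that fits the tunnel. Something like the following is what I'm thinking of trying (vlan836 stands for my WAN VLAN here, I haven't verified this rule yet, and a second rule for the incoming direction may also be needed):

/ip firewall mangle add chain=forward protocol=tcp tcp-flags=syn \
    out-interface=vlan836 action=change-mss new-mss=clamp-to-pmtu \
    passthrough=yes comment="clamp TCP MSS to path MTU on the WAN"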
If IPv6 is not affected, then it explains why your tests with YouTube and Netflix are doing well, because they would use IPv6. Same with many speedtest.net test servers nowadays. And it looks like the problem only affects IPv4 TCP traffic.
When you perform the iperf3 test with -R (the test that produced only 100 KB/s), can you check on the sender side (the remote side) whether a Cwnd column is available, and what values that column shows?
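If you can't watch the remote server's console directly, iperf3 also has a --get-server-output option that returns the server-side report to the client at the end of the test, e.g.:

iperf3 -c 185.93.2.193 -R --get-server-output

The sending side's interval lines should then show the Retr and Cwnd columns.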
Yes, thank you. The congestion window (for the sender side; on your side this is further limited by the receive window) is way too small, and it has no chance to increase before running into a lot of retries (probably due to packet loss). It looks like the round-trip delay from you to the iperf3 server is about 18-20 ms. When there are retries, the sender has to reduce the congestion window (cf. congestion avoidance algorithms), as you can observe. It looks like retries happen whenever the Cwnd grows beyond about 30 KB.
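To put rough numbers on it: even if the window could sit at 30 KB the whole time, at ~20 ms RTT the ceiling per connection is about

30 KB / 0.020 s ≈ 1.5 MB/s ≈ 12 Mbit/s,

and in practice the window collapses after each loss, so you see even less. To sustain 500 Mbit/s at 20 ms, the window would need to reach roughly 500 Mbit/s × 0.020 s ≈ 10 Mbit ≈ 1.25 MB.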
If you use a bandwidth delay product calculator, like this one https://calculator.academy/bandwidth-delay-product-calculator/ or this one https://www.speedguide.net/bdp.php, you'll see that if the congestion window can't get bigger than those numbers, you won't get higher bandwidth. That's also why, when you make 20 connections with -P 20, the sum of the bandwidths is larger: the Cwnd limit appears to be about the same for each individual connection. For comparison, here is a test from my place to the same server you used. I'm located on a different continent with a much higher ping time (193 ms), and the congestion window can steadily increase to 8.74 MBytes without retries:
My bet is that when you remove the -R option, the sender side (you) can achieve much larger Cwnd sizes. When you perform iperf3 tests between your VLANs, do you see large or small Cwnd values? Because the delay within your LAN is much smaller (sub-ms), you might not notice the effects of a limited congestion window on the final bandwidth.
The problem with using public servers (including iperf3 servers) is that there might be bottlenecks other than the "last mile". I tried the iperf3 server from the screenshots of @CGGXANNX and I got shitty performance in both directions. In both directions I see a fair amount of retransmissions… and for TCP, retransmissions are a sure way to kill any kind of performance (a retransmission both means re-sending data and shrinks the TCP window to a fraction of what it could be, so throughput over the next few seconds is low even without further retransmissions). When I tried a few other public iperf3 servers, I got varying results (but all of them were better and more consistent than the ones I got from the server in the mentioned screenshot).
So when determining the performance of the router, it's important to be 200% sure that there are no other bottlenecks.
I agree with you, Mkx, but I ended up doing this test with iperf because I had very slow internet speed over IPv4, such as when pulling from GitHub. The issue disappeared the moment I used a VPN, even in the same location (city).
Also, I tested the same iperf server with the equipment provided by my internet service provider instead of the CCR, and I got good results with no retries. The issue is that I don’t want to keep this equipment because it’s very old, and I have my own gear.
I have the same problem on a CCR2216, with L3 HW offloading enabled or disabled. I even went so far as to connect the web server directly to the CCR2216, and the file download runs at 200 Kbps, whereas previously it downloaded at my full link speed at home, which is 400 Mbps.
Multi-stream HTTP downloads such as speedtest.net give full line speed. Streaming is absolutely perfect, even in 4K/8K.
I suspect that something in the recent OS releases has a bug. I'm on 7.16.1.
Thanks for sharing, I was going crazy… Did you try downgrading?
On my side, I dumped the packets between the ONU and my ISP's hardware, and compared that with the ONU-to-CCR connection.
With my ISP's hardware the packets are clearly identified as an ipipv6 tunnel, whereas coming from the CCR they are only IPv6 packets, and I see some TCP RSTs.
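For what it's worth, the capture on the CCR side can also be done directly on the router, e.g. with something like this (the interface name stands for whatever faces the ONU in my setup):

/tool sniffer quick interface=sfp-sfpplus1 ip-protocol=tcp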
Something I plan to try, which should not impact anything, is to put my SFP port in a bridge and handle VLAN 836 using bridge VLAN filtering instead of an L3 VLAN interface (rough sketch below). I don't have much hope for this part, though…
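In case it helps anyone following along, the change I have in mind would look roughly like this (interface and bridge names are placeholders, and I still need to adapt it to my actual config):

/interface bridge add name=bridge-wan vlan-filtering=yes
/interface bridge port add bridge=bridge-wan interface=sfp-sfpplus1
/interface bridge vlan add bridge=bridge-wan tagged=bridge-wan,sfp-sfpplus1 vlan-ids=836
/interface vlan add name=vlan836 interface=bridge-wan vlan-id=836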