RB760iGS as wireguard client - very slow upload

Hello dear friends,
I have an RB760iGS (hEX S) running RouterOS 7.8 stable (fresh Netinstall).
I also have a remote VPS (tried CentOS 7 and Ubuntu 22) with a WireGuard server (no MTU set in the config) and a Libreswan IPsec+L2TP server (MTU 1280 hardcoded in the configs).
The RB760iGS is set up as:

  • WireGuard client.
  • IPsec/L2TP client, MTU 1280, MRU 1280.

I use one mangle rule and switch its mark routing to either the IPsec interface or the WireGuard interface.
The RB760iGS is connected to the Internet through PPPoE, 100/100 Mbit, MTU = 1492.
The VPS is connected to the Internet at 200/200 Mbit.
Ping RB760iGS -> remote VPS: ~40 ms.

  1. Mangle (mark routing) to IPsec, Ookla speedtest to the server nearest to the VPS: 70-80 Mbit download, 65-70 Mbit upload. CPU on the RB760iGS ~50%, CPU on the VPS ~30%.
  2. Mangle (mark routing) to WireGuard, Ookla speedtest to the server nearest to the VPS: 85-90 Mbit download, ~40 Mbit upload. CPU on the RB760iGS ~40%, CPU on the VPS ~20%.

(Speed tests were run on a wired 1 Gbit Windows 10 box with Chrome.)
Varying the MTU gives 20-40 Mbit upload, but I have never seen more than 40 Mbit.

If I use the same WireGuard client config in the mobile app on an Android smartphone that is connected over Wi-Fi (wAP ac) to the RB760iGS, I see 95/95 Mbit download and upload.

Why is WireGuard upload so slow when the WireGuard client is the RB760iGS?

I tried automatic clamping and manual MSS clamping with 20 different values (by changing the MSS mangle rule), but no luck.

Thank you in advance.
RB760iGS config attached.
Truly yours,
–Alexander.
Simple config with WireGuard only:
24-04-2023-hex-s-wireguard-only_o.rsc (7.25 KB)
Simple config with IPsec:
24-04-2023-hex-s-empty-wgireguard_and_ipsec_o.rsc (7.76 KB)

CharGPT error, bad input!

(irrelevant info was here)

CharGPT error, bad input!

We are talking about WireGuard?

Why then a screenshot of something IPsec and then ZeroTier?
Or did I miss something?

Please remove your serial number from export. Not that it makes that much difference but better to stay on the safe side.

  • The default MTU for WireGuard should be 1420, why did you change it?
  • What’s the story with CountryIPBlocks? How many times is that rule being hit? It can be pretty resource intensive.
  • During the WireGuard testing, run tool/profile, all. Sort by CPU usage, top down. What’s on top?

done.

My guess is that 1420 is calculated as 1500-80. Since my PPPoE MTU is 1492, I used 1492-80=1412.
Anyway, I tried many different values with no effect (1420, 1412, 1492, 1450, 1432, 1392, 1350, 1320, 1300, 1280, 1200, etc.)
The best upload speed is with 1412.
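
For reference, the 80 bytes behind the 1420 default break down as an outer IPv6 header (40) + UDP header (8) + WireGuard data-message overhead (32). Below is a minimal Python sketch of that arithmetic. The function and constant names are made up for illustration, and the IPv4-only figure (60 bytes) only applies if the tunnel endpoint is never reached over IPv6.

```python
# Header sizes in bytes; names are illustrative, not from any library.
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
# WireGuard data message: 4 (type + reserved) + 4 (receiver index)
# + 8 (counter) + 16 (authentication tag).
WG_DATA_OVERHEAD = 32


def wireguard_mtu(link_mtu: int, outer_ipv6: bool = True) -> int:
    """Largest inner packet that fits into one outer packet of link_mtu bytes."""
    outer_ip = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return link_mtu - outer_ip - UDP_HEADER - WG_DATA_OVERHEAD


for name, link_mtu in (("Ethernet", 1500), ("PPPoE", 1492)):
    print(name,
          "IPv6-safe:", wireguard_mtu(link_mtu, outer_ipv6=True),
          "IPv4-only:", wireguard_mtu(link_mtu, outer_ipv6=False))
```

With a PPPoE MTU of 1492 this gives 1412 (IPv6-safe) or 1432 (IPv4-only outer transport), both of which are among the values tried above.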

The purpose of this address list (8000 records) is: if the destination (Internet address) is in the CountryIPBlocks list, do nothing and go out directly, with no routing or packet marks.
(Please note that 192.168.0.0/24 is added to CountryIPBlocks, so LAN networking works fine.)
If the destination (Internet address) is outside the CountryIPBlocks list, the mangle rule applies mark routing and the traffic goes out through the VPS (through WireGuard).
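
The decision the mangle rule makes can be sketched in Python as follows. This only illustrates the logic, not how RouterOS implements address-list matching. The second prefix in the list is a documentation range standing in for the real 8000 entries, and `routing_table_for` is a hypothetical helper name; `g_wg` is the routing table used for the WireGuard path.

```python
import ipaddress

# Hypothetical stand-in for the CountryIPBlocks address list.
COUNTRY_IP_BLOCKS = [
    ipaddress.ip_network("192.168.0.0/24"),  # LAN, added so local traffic stays direct
    ipaddress.ip_network("203.0.113.0/24"),  # placeholder for the real country prefixes
]


def routing_table_for(dst: str) -> str:
    """Return the routing table a packet to dst would use."""
    addr = ipaddress.ip_address(dst)
    if any(addr in net for net in COUNTRY_IP_BLOCKS):
        return "main"   # in the list: no mark, goes out directly via PPPoE
    return "g_wg"       # not in the list: mark routing, goes out via WireGuard


print(routing_table_for("192.168.0.10"))   # LAN destination
print(routing_table_for("198.51.100.7"))   # destination outside the list
```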

The CPU load and upload/download speeds do not change (!!) if I simply remove the comparison against the CountryIPBlocks list.
Exactly the same speeds! Same CPU loads!
It seems that comparing against 8000 numeric records is a walk in the park for this CPU.
I just tried this to be sure (and updated the config above so that attention is not focused on the CountryIPBlocks list):
simple_mangle.png

Here is download, 85 mbit:
download.png
Here is Upload, 30 mbit:
upload.png

We are talking about why WireGuard upload (20-40 Mbit) is about half as fast as IPsec+L2TP upload (65 Mbit), with both servers installed on the same remote VPS and both clients configured on the same hEX S.

Sorry, my bad regarding the ZeroTier part, I’d say I was way out of line this time! :zany_face:

And once again. If I use an Android smartphone connected to a wAP ac that is wired to this RB760iGS:

  1. WireGuard mobile app using exactly the same WireGuard client config (same remote WireGuard VPS), speedtest to exactly the same server = 95/95 Mbit!
  2. RB760iGS as the client to the remote WireGuard VPS, speedtest to exactly the same server = 90/40 Mbit!

Mikrotik bug or misconfiguration ?

Most likely misconfiguration. Bug would have been notified by a lot more users, I’d say.
But where ?

Or … it might be a bug. A sneaky one.
How was your device brought to ROS7-level coming from ROS6 ? Upgrade, upgrade, upgrade, … ? Never did netinstall in between ?

Wild option (since I have been chasing some WG issues myself early last year and have seen various reports of others needing to start CLEAN after chasing ghosts)

  • make digital backup, save outside device (just to make your life easier afterwards to restore if the following test fails)
  • export config, make sure it is COMPLETE (check the export !)
  • netinstall device, default config
  • re-import the earlier exported config block by block, ONLY taking over the WireGuard config (back to the default 1420 MTU) and the things needed for router services (DHCP, DNS, firewall, …). Leave out L2TP for now.
  • don’t change anything else.

What happens then ?

Just did a test with a hEX lying around here (not the same model, but exactly the same CPU/RAM/storage as the hEX S).

Tests were done from Hex → AX3 → RB5009, all 1Gb links.
1420 MTU as default
Wireguard tunnel between Hex and RB5009, route rule for direct connection between Hex and RB5009 via Wireguard.

And yes, I know I shouldn’t be testing on the devices themselves but for this context, it may do.

Observations:

  • On all 4 tests one of the CPU cores on the Hex was maxing out. Networking was one of the highest processes; wireguard was way down (never above 20).
  • For both TCP and UDP, download is higher than upload
  • TCP up went way over 100Mb, down pretty close to it.
  • what I did not expect was that UDP download was even lower than TCP ??

Also did a test from PC via Hex to iperf server on RB5009, about 136Mb, again one core on Hex maxing out (and kept an eye on the interfaces to be sure it was using Wireguard :laughing: )
Disabled route rule between Hex and RB5009 to have reference, reran iperf test from PC to RB5009, 940-ish as expected.

So … 3 things I conclude:
1- There is effectively a discrepancy visible between upload and download. Can’t explain right away why.
2- for TCP (and given your 100/100 line) it should be able to get a bit higher than what you see, I think.
3- You may be hitting the limit of what that device is capable of. It IS still a wonderful device for the price range it’s in, but we cannot expect miracles either. Most smartphone processors are a lot more powerful than the MT7621A CPU the Hex has.
2023-04-21_08-59-03.jpg

upgrade, never netinstall

You are great. It’s my honor to speak with you.
Doing such things just for people on the Internet is very noble.

What I am afraid of is the IP route settings, which I made by feel, since I am not a pro.
I have PPPoE and I do mark routing to another table (g_wg),
while the other examples around show tests with WireGuard directly, I mean with IP routes specified in the main table.
I tried different distances, no luck.
ip-routes.png

I’ve never conducted any performance tests myself using WG on MT units since we mostly use it for OOB management. Keep in mind, though, that ChaCha20 encryption is performed purely in software, so it will mainly load the CPU and is most likely the root cause of the bottleneck, especially at higher speeds.

Btw, that’s also why ZT sometimes greatly outperforms WG in terms of throughput provided there is support for AES hardware offload using e.g. Intel/AMD and high end ARM procs. Sadly, ZT/AES offload is currently not implemented on MT hardware (but soon I hope)

As long as the protocol is purely handled in SW, it’s CPU and nothing but the CPU.

AX Lite (using ARM32 IPQ-5010, quite a bit beefier than MT7621A) in that same test setup as above using PC for iperf testing (see here):
TCP: 195 Mbps down, 211Mbps up.
UDP: 405 Mbps down, 400 Mbps up

RB5009 was at that moment still picking its nose waiting for something to do :laughing:

BTW Larsa, as you know I also tested zerotier on that same AX Lite and wireguard runs circles around zerotier as far as performance is concerned on the same platform.
You need to compare apples with apples.

Use a decent CPU and your bottleneck will almost always become your internet connection at which point the whole HW offloading discussion becomes useless.

That’s what I tried to explain but apparently failed miserably! I’ll try to do better next time.. :slight_smile:

No, you did make it clear zerotier can run quite a bit faster provided correct HW is available, but then we are not comparing on the same base anymore.

Ok, beat me, kick me while lying down and chop my head off but I hereby do promise I’ll try do better next time. Sorry please please with sugar on top! Cheers! :face_with_tongue:

OK, I started from scratch.

  1. I did a fresh Netinstall of 7.8 + default config.
    Manually (no auto imports) I set up PPPoE and WireGuard ONLY. Nothing else. A couple of clicks. Extremely simple config, see attached.
    Same situation: WireGuard download 85-90 Mbit, upload 40-45 Mbit. The CPU is far from heavily loaded during upload.
  2. I added IPsec to this very simple config. See attached config.
    Same situation: IPsec is much faster than WireGuard for upload!
    IPsec download 75-80 Mbit, upload 75 Mbit. CPU load is higher than with WireGuard.

See simple configs in the post #1 attached…
https://forum.mikrotik.com/viewtopic.php?p=998234#p997254

Hi @sas2k (Alexander),

I have the same issue with WireGuard upload speed, but in my case the difference between download and upload is far greater because my ISP gives me more speed. My setup is quite simple: in one city I have a hAP ax3 (500/500 link) and in another an RB4011 (1G/1G), with 10 ms overall latency (5 ms each way). In this case, my hAP ax3 struggles to saturate the upload and tops out at ~130-150 Mbps, while download is about 410-420 Mbps. The CPU is not the bottleneck: it sits around 50% during download and 20-25% during upload.

I’m testing with my raspberry pi 4b and connection looks like this:
Client device → HAP AX3 → Wireguard → RB4011 → Raspberry

MTU is 1420, and in Mangle I have a TCP MSS rule with “clamp to pmtu” on both MikroTiks, so there should be no fragmentation with TCP. However, when I connect my iPhone to the RB4011 WireGuard server and browse the Internet (run speedtests), upload and download both saturate at the full 430-440 Mbps. What’s more interesting, when I use Bandwidth Test directly on the MikroTiks, download and upload speeds are the same, so the hAP ax3 is actually capable of using the full speed over WireGuard and over the existing Internet link.
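
As a sanity check on the clamping, the MSS that should appear in TCP SYN packets for a given tunnel MTU is simply the MTU minus the inner IP and TCP headers. A small Python sketch of that calculation (the function name is made up, and TCP options are ignored):

```python
TCP_HEADER = 20  # bytes, without options


def clamped_mss(tunnel_mtu: int, inner_ipv6: bool = False) -> int:
    """MSS that lets a full TCP segment fit into one packet of tunnel_mtu bytes."""
    ip_header = 40 if inner_ipv6 else 20
    return tunnel_mtu - ip_header - TCP_HEADER


print(clamped_mss(1420))                   # IPv4 inside the tunnel
print(clamped_mss(1420, inner_ipv6=True))  # IPv6 inside the tunnel
```

If a packet capture on the client shows a larger MSS than this in SYN packets going through the tunnel, the clamp rule is probably not matching that traffic.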

I’m really wondering what can cause such a problem, and I just wanted to let you know that you are not alone with this weird behavior. I have already tried disabling fasttrack and disabling the mangle rule, and everything else is pretty much at defaults. In any case, I’m glad to help with any tests and config setups to figure out what the problem might be. The next thing I was thinking of testing is to run an iperf3 container on the hAP ax3 and run the upload to the Raspberry server instead of the RB4011; hopefully the CPU is not the bottleneck here.

Regards