WireGuard throughput depends on whether torch is running

Hi folks.

I have been using a hAP ax³ since 2023-10 and have the following issue:

I basically followed this blog post (https://scholz.ruhr/blog/mullvad-as-second-wan-on-mikrotik/, thanks to the author) to set up a WG tunnel to my friend's place. Everything was working like a charm with RouterOS v7.11.2. Yesterday I updated my router to v7.13.4 and noticed that my WG throughput to/from the remote site is nearly nonexistent (iperf server running at my friend's site, client on my end):

$ iperf3 -c 192.168.100.12 -t 5
Connecting to host 192.168.100.12, port 5201
[  5] local 192.168.20.10 port 39308 connected to 192.168.100.12 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  94.3 KBytes   772 Kbits/sec    5   1.33 KBytes       
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    1   2.66 KBytes       
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    2   1.33 KBytes       
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    2   1.33 KBytes       
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    1   1.33 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.00   sec  94.3 KBytes   154 Kbits/sec   11             sender
[  5]   0.00-5.03   sec  2.66 KBytes  4.32 Kbits/sec                  receiver

And here is the fun part:

I was investigating the issue with torch, and when I started torch on the WG interface the throughput increased immediately (and dropped again immediately when I stopped torch).

When I start:

[admin@MikroTik] > /tool/torch mullvad-upstream

… and run the iperf test again:

$ iperf3 -c 192.168.100.12 -t 5
Connecting to host 192.168.100.12, port 5201
[  5] local 192.168.20.10 port 45150 connected to 192.168.100.12 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.18 MBytes  9.90 Mbits/sec    0   71.7 KBytes       
[  5]   1.00-2.00   sec   999 KBytes  8.18 Mbits/sec    2   47.8 KBytes       
[  5]   2.00-3.00   sec  1.10 MBytes  9.21 Mbits/sec    0   62.4 KBytes       
[  5]   3.00-4.00   sec  1.04 MBytes  8.69 Mbits/sec    1   55.8 KBytes       
[  5]   4.00-5.00   sec   999 KBytes  8.18 Mbits/sec    0   66.4 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.00   sec  5.26 MBytes  8.83 Mbits/sec    3             sender
[  5]   0.00-5.06   sec  4.95 MBytes  8.22 Mbits/sec                  receiver

… I get acceptable throughput and my connection works as expected. All applications run smoothly, no issues at all, just like before the update.

Here I started torch at ~4 s and stopped it at ~12 s:

$ iperf3 -c 192.168.100.12 -t 20
Connecting to host 192.168.100.12, port 5201
[  5] local 192.168.20.10 port 54992 connected to 192.168.100.12 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  94.3 KBytes   772 Kbits/sec    5   1.33 KBytes       
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    1   2.66 KBytes       
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    2   1.33 KBytes       
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    2   1.33 KBytes       
[  5]   4.00-5.00   sec  65.1 KBytes   534 Kbits/sec    5   9.30 KBytes       
[  5]   5.00-6.00   sec   377 KBytes  3.09 Mbits/sec    0   26.6 KBytes       
[  5]   6.00-7.00   sec   936 KBytes  7.67 Mbits/sec    0   66.4 KBytes       
[  5]   7.00-8.00   sec  1.28 MBytes  10.7 Mbits/sec    1   67.7 KBytes       
[  5]   8.00-9.00   sec  1.10 MBytes  9.21 Mbits/sec    1   57.1 KBytes       
[  5]   9.00-10.00  sec  1.10 MBytes  9.20 Mbits/sec    0   69.1 KBytes       
[  5]  10.00-11.00  sec   936 KBytes  7.67 Mbits/sec    1   61.1 KBytes       
[  5]  11.00-12.00  sec  1.10 MBytes  9.20 Mbits/sec    0   69.1 KBytes       
[  5]  12.00-13.00  sec   375 KBytes  3.07 Mbits/sec    2   2.66 KBytes       
[  5]  13.00-14.00  sec  0.00 Bytes  0.00 bits/sec    2   1.33 KBytes       
[  5]  14.00-15.00  sec  0.00 Bytes  0.00 bits/sec    2   1.33 KBytes       
[  5]  15.00-16.00  sec  0.00 Bytes  0.00 bits/sec    1   1.33 KBytes       
[  5]  16.00-17.00  sec  0.00 Bytes  0.00 bits/sec    2   1.33 KBytes       
[  5]  17.00-18.00  sec  0.00 Bytes  0.00 bits/sec    1   1.33 KBytes       
[  5]  18.00-19.00  sec  0.00 Bytes  0.00 bits/sec    2   1.33 KBytes       
[  5]  19.00-20.00  sec  0.00 Bytes  0.00 bits/sec    1   1.33 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.00  sec  7.29 MBytes  3.06 Mbits/sec   31             sender
[  5]   0.00-20.03  sec  6.96 MBytes  2.91 Mbits/sec                  receiver

To me this behavior looks quite strange; live-monitoring my WG interface shouldn't be what makes my tunnel work/usable.
I double-checked the configuration and changed nothing. The only change I made was the RouterOS update from v7.11.2 to v7.13.4.

The MTU for the WG interface is set to 1412.
I also tested with v7.14beta10 (testing channel), but with no success.
My current workaround is an SSH session from my Raspberry Pi (online 24/7) that keeps torch running on the router, roughly as sketched below.
I am also thinking about downgrading back to v7.11.2.
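
(Untested sketch of that workaround; the router address and admin user here are just placeholders for this example.)

$ ssh admin@192.168.20.1 "/tool/torch mullvad-upstream"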

Does anybody have a similar experience, or can anyone explain why it behaves like this and what I am doing wrong?

Let me know if further configuration details are required to investigate the issue.

Thanks.

Not aware of any reason why it went wonky on you… but I can suggest the following.

Change the MTU back to the default. Then simply add this rule on your router (change wireguard1 to your actual WireGuard interface name):

/ip firewall mangle
add action=change-mss chain=forward comment="Clamp MSS to PMTU for Outgoing packets" \
    new-mss=clamp-to-pmtu out-interface=wireguard1 passthrough=yes protocol=tcp tcp-flags=syn
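
If the rule is doing anything, its counters should increase whenever a new TCP connection goes out through the tunnel; something along these lines (matching on the comment used above) should show whether it is being hit:

/ip firewall mangle print stats where comment="Clamp MSS to PMTU for Outgoing packets"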

Thanks for the fast reply. I changed it to your suggestion:

[admin@MikroTik] > /ip/firewall/mangle/print
Flags: X - disabled, I - invalid; D - dynamic
 0  D ;;; special dummy rule to show fasttrack counters
      chain=prerouting action=passthrough

 1  D ;;; special dummy rule to show fasttrack counters
      chain=forward action=passthrough

 2  D ;;; special dummy rule to show fasttrack counters
      chain=postrouting action=passthrough

 3    chain=prerouting action=mark-routing new-routing-mark=mullvad in-interface=vlan-20

 4    ;;; Clamp MSS to PMTU for Outgoing packets
      chain=forward action=change-mss new-mss=clamp-to-pmtu passthrough=yes tcp-flags=syn protocol=tcp
      out-interface=mullvad-upstream

Unfortunately, no improvement.

Hmm, it seems we’ll have to educate @Mesquite (just like we had to educate @anav): torch disables fasttrack. And that prompts a look at the tutorial @rooterle linked … which introduces mangle rules. And we all know that fasttrack and mangle rules aren’t exactly friends, right Mesquite?

In short: disable the firewall filter rule which enables fasttrack. In principle it should be possible to adjust this rule so that it doesn’t act on packets/connections that have to be mangled, but it’s questionable whether that is actually necessary on a hAP ax3 …
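
Untested, but roughly what I mean, using the names from the config shown above (the mullvad-conn connection mark is just an example name):

# option 1: simply disable the default fasttrack rule
/ip firewall filter disable [ find comment="defconf: fasttrack" ]

# option 2: keep fasttrack, but only for connections that don't get mangled:
# mark the connections coming in from vlan-20, then let the fasttrack rule
# act only on connections without a mark
/ip firewall mangle add chain=prerouting action=mark-connection \
    new-connection-mark=mullvad-conn in-interface=vlan-20 passthrough=yes
/ip firewall filter set [ find comment="defconf: fasttrack" ] connection-mark=no-mark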

:laughing:

Priceless !

So torch is the clue that mangling or queueing, a.k.a. a fasttrack disruptor, is being used.
Très cool, and of course, as my spouse says, I have to be told something 5 times. :stuck_out_tongue_winking_eye:

Therefore, my conclusion is that the issue is not my lack of knowledge of how torch affects fasttrack (which is rather good to know in general), but that
the real problem is the lack of a process to ensure new posters provide the required information, so support can be more efficient and based on presented evidence.
The OP didn’t know better; presenting the config (a basic tenet of any post) would have made this clearer rather than a guessing game, or relevant only to those who know torch intimately.

So holovoetn, what you find priceless must be the realization that we need a better process for new posters as I have suggested for some time. ;-PPPPP

No, the fact that quite a few others can see through your disguise quite easily.
So why hide? Because it’s carnival season?

You’re caught up on the name? Get serious, I am talking about the OP and his issue… geez

@rooterle: if you are mangling/queuing traffic, then it’s likely your fasttrack rule needs to be either disabled or modified… which is the logical conclusion pointed out by mkx.

Please clarify… Forgetting torch completely: “Everything was working like a charm with RouterOS v7.11.2.”

So when you upgraded to v7.13.4: “I noticed that my WG throughput to/from the remote site is nearly nonexistent.”

Confirm that you were happy with the throughput under v7.11.2, at your site and your client’s site,
BUT after the upgrade to v7.13.4 you observed a dramatic difference???

Yeah, it’s working now with Fasttrack disabled:

[admin@MikroTik] > /ip/firewall/filter/print
Flags: X - disabled, I - invalid; D - dynamic
 0  D ;;; special dummy rule to show fasttrack counters
      chain=forward action=passthrough

...

 8 X  ;;; defconf: fasttrack
      chain=forward action=fasttrack-connection hw-offload=yes connection-state=established,related

 9 X  ;;; defconf: accept established,related, untracked
      chain=forward action=accept connection-state=established,related,untracked
      
...



Please clarify… Forgetting torch completely: “Everything was working like a charm with RouterOS v7.11.2.”

So when you upgraded to v7.13.4: “I noticed that my WG throughput to/from the remote site is nearly nonexistent.”

Confirm that you were happy with the throughput under v7.11.2, at your site and your client’s site,
BUT after the upgrade to v7.13.4 you observed a dramatic difference???

Let’s say it like this: I didn’t test with iperf on v7.11.2 at all because the applications were running OK. There were some sporadic dropouts, but they were negligible.

But after the update to v7.13.4 none of the applications could even start, so I started looking into what could have changed. From your question I assume that Fasttrack was also enabled by default in v7.11.2, so maybe it worked for no particular reason but was always on the edge.

Anyway, thanks to everybody (the padawan and the knights/masters) participating here. The system runs reliably now - and maybe even better than before - and I learned something (I’m still trying to understand FastPath/FastTrack from the MikroTik docs).