farisr
just joined
Topic Author
Posts: 7
Joined: Wed Feb 03, 2021 1:39 pm

MTU oddness with WireGuard and maybe VLANs

Sat Dec 31, 2022 7:37 pm

I'm flummoxed by a problem that seems to be to do with MTUs but may not be.

When connecting to my hAP AC3 from a mobile device (cellular) using WireGuard (which gives me access to my local network and out to the internet too) I cannot access the GUI on port 80 of a specific Netgear switch on the local network unless I set the WireGuard MTU to 1500.

Using the default WireGuard MTU of 1420, or 1432, or indeed any MTU below 1500, I can get to the web interface of the hAP itself and also those of two other Netgear switches on the same local network, VLAN and IP range, just not the GUI on this specific Netgear switch. The browser says "Connection Refused".

Incidentally, when I say I set the WireGuard MTU to a specific value, I mean at both ends, not just on the client device or just on the hAP.

I can ping the problem Netgear no matter what MTU I use. It is only accessing the GUI on port 80 that's the problem.

The issue occurs no matter whether I connect my mobile device to the hAP via WireGuard over a cellular link to the public Internet or via WireGuard more directly through one of the hAP's WLAN interfaces.

When connected via one of the hAP's WLAN interfaces and when using a WireGuard MTU less than 1500, if I disable WireGuard then I immediately get access to the problem Netgear without any issues. So the problem isn't the mobile device or mobile device's browser.

The Netgear switch models are all different. The problem one is a JGS524Ev2 with up to date firmware. The other two are a GS105Ev2 and a (very disappointing IMHO) GS308E, again both with the latest firmware.

The hAP is connected to the Internet via PPPoE. I know just enough to know that there's an overhead when PPPoE is used, and I've experimented with setting the Ethernet port used for the WAN to 1504, but this made no noticeable difference to my single thread download speeds (the main reason I wanted to try it) and more specifically made no difference to the issue at hand with the Netgear switch GUI. In any case, as I mentioned, I can reproduce the problem when connecting to the hAP from the same mobile device directly over one of the hAP's WLAN interfaces with WireGuard enabled and MTU less than 1500, thus bypassing WAN PPPoE overheads.

Does anyone have any ideas about what might be going on? It is driving me nuts.
I'm afraid my networking knowledge contains some big gaps - some I know about, some I don't - so I hope I'm not asking about something that should be patently obvious to me.

You may be wondering why I care about any of this. It works, doesn't it? Well, firstly this is where my knowledge gaps come into play. I know enough to know that if you set the WireGuard MTU too high, you can have problems with packet fragmentation that can later come to bite you in the behind (although I admit that I do not understand why fragmentation is such a big problem), and this is in fact why the default MTU for WireGuard is 1420 to start with. Secondly, when something doesn't work as expected, I want to know why. This is how you learn stuff! I don't like bodging something and never learning why the bodge was needed.
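
For reference, the arithmetic behind the 1420 default and the way the value is set on the RouterOS side can be sketched as follows. The interface name wireguard1 is the one from the config further down; the overhead figures are the standard WireGuard header sizes, and the PPPoE line assumes the usual 8 bytes of PPPoE overhead on a 1500-byte Ethernet link.

```
# WireGuard adds 8 bytes (UDP) + 32 bytes (WireGuard header and auth tag)
# plus the outer IP header: 20 bytes for IPv4, 40 bytes for IPv6.
#   1500 - (40 + 8 + 32) = 1420   <- default, safe for IPv4 or IPv6 transport
#   1500 - (20 + 8 + 32) = 1440   <- IPv4-only transport over plain Ethernet
#   1492 - (20 + 8 + 32) = 1432   <- IPv4-only transport over PPPoE (MTU 1492)
/interface wireguard
set [ find name=wireguard1 ] mtu=1420
```

The client needs the same value in its own WireGuard settings for both ends to agree.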

Incidentally, I should add that neither the JGS524Ev2 nor any of the other Netgears I use allows you to change the internal/port MTU settings in any way; all they do is show that the max MTU is 9702 on each port.

In case it helps, I've posted what I think are the important parts of the hAP's configuration below. I'm really a newbie with these devices though, so treat with caution.

It has some complications that are irrelevant but may cause confusion. In particular, I have two WAN interfaces to two independent internet connections. You can safely ignore ether2-WAN2 and PPPoE-out2, and anything involved with that second connection and routing - it really does not come into the equation.

Some points of explanation may also help, but you can probably skip this bit as I fear I'm going overboard with details here:

---------[SKIP]------------
ether1-WAN1 > PPPoE > 1Gbit fibre Internet connection with Public IP that I connect to via WireGuard from the mobile device
ether4-LAN4 > "good" Netgear switch that I can access without problem > "bad" Netgear switch that I can ping with any WireGuard MTU but whose GUI I can only access with a WireGuard MTU of 1500
The second "good" Netgear switch is also connected to the first "good" Netgear switch.
wlan1 and wlan2 - I can connect my mobile device to either of these and it works fine unless I run WireGuard on the device and WireGuard's MTU is set to anything less than 1500.

There is a VLAN involved (vlan id 20). I don't think it has anything to do with anything, but maybe this is one of my knowledge gaps speaking, so I think I'd better outline what it does and how it is configured.

I make use of VLAN 20 for IoT devices and guest wifi access, plus some "potentially dangerous" physically connected devices (e.g. on ether3-LAN3 and ether5-LAN5).

All three Netgears have a default management vlan of 1 enabled when you turn on 802.1q on them, with all ports being members of this vlan and all ports untagged for that vlan id. The PVID of all ports is also set to 1.

On the hAP, ether4-LAN4 is technically set up as a vlan trunk port with vlan id 20 Tagged.
It does NOT have vlan id 1 Tagged. The hAP magically seems to know about vlan id 1 though, as this vlan id shows up as a Dynamic item in the Bridge > VLANs list, with all active ports shown in the Current Untagged list.
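
One way to see where that dynamic VLAN 1 entry comes from, as a sketch: with vlan-filtering=yes, every bridge port left at the default pvid=1 is added dynamically as an untagged member of VLAN 1. The commands below only print existing state and assume nothing beyond the names in the posted config.

```
# List static and dynamic VLAN table entries; dynamic ones carry the D flag.
/interface bridge vlan print
# Show each bridge port in detail, including its pvid. ether4-LAN4, wlan1 and
# wlan2 have no pvid in the export, so they use the default pvid=1.
/interface bridge port print detail
```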

The port on the "good" Netgear connected to ether4-LAN4 is set to Tag for vlan 20.
Similarly, the ports used to link the good and bad Netgears together have vlan 20 set to Tag.
Physically, everything branches off - there is no chance of any accidental network loops. None at all.

The hAP and Netgears are in the 192.168.1.x range.
My wireguard IPs are in the 192.168.66.x range.
Devices on VLAN 20 get 192.168.20.x IPs.

-------------[/SKIP]-----------------
/interface bridge
add admin-mac=[REDACTED] auto-mac=no \
  ingress-filtering=no name=bridge vlan-filtering=yes

/interface ethernet
set [ find default-name=ether1 ] name=ether1-WAN1
set [ find default-name=ether2 ] name=ether2-WAN2
set [ find default-name=ether3 ] name=ether3-LAN3
set [ find default-name=ether4 ] name=ether4-LAN4
set [ find default-name=ether5 ] name=ether5-LAN5

/interface wireless
set [ find default-name=wlan1 ] band=2ghz-b/g/n channel-width=20/40mhz-XX \
  disabled=no distance=indoors frequency=auto installation=indoor mode=\
  ap-bridge ssid=[REDACTED] wireless-protocol=802.11
set [ find default-name=wlan2 ] band=5ghz-a/n/ac channel-width=\
  20/40/80mhz-XXXX country="[REDACTED]" disabled=no distance=indoors \
  frequency=auto installation=indoor mode=ap-bridge ssid=[REDACTED] \
  wireless-protocol=802.11

/interface wireguard
add listen-port=[REDACTED] name=wireguard1

/interface vlan
add interface=bridge name=vlan20 vlan-id=20

/interface pppoe-client
add add-default-route=yes disabled=no interface=ether1-WAN1 name=pppoe-out1 \
  use-peer-dns=yes user=[REDACTED]
add disabled=no interface=ether2-WAN2 name=pppoe-out2 user=\
  [REDACTED]

/interface list
add name=WAN
add name=LAN


/interface wireless security-profiles
set [ find default=yes ] authentication-types=wpa2-psk mode=dynamic-keys \
  supplicant-identity=MikroTik
add authentication-types=wpa2-psk mode=dynamic-keys name=profile \
  supplicant-identity=MikroTik

/interface wireless
add disabled=no mac-address=[REDACTED] master-interface=wlan2 name=\
  wlan3 security-profile=profile ssid=[REDACTED]
add disabled=no mac-address=[REDACTED] master-interface=wlan1 name=\
  wlan4 security-profile=profile ssid=[REDACTED]


/ip pool
add name=dhcp ranges=192.168.1.10-192.168.1.50
add name=dhcp_pool_vlan20far ranges=192.168.20.20-192.168.20.200

/ip dhcp-server
add address-pool=dhcp interface=bridge name=defconf
add address-pool=dhcp_pool_vlan20far interface=vlan20 name=dhcp1

/routing table
add disabled=no fib name=useWAN2

/interface bridge filter
add action=drop chain=forward in-interface=wlan3
add action=drop chain=forward out-interface=wlan3
add action=drop chain=forward in-interface=wlan4
add action=drop chain=forward out-interface=wlan4

/interface bridge port
add bridge=bridge frame-types=\
  admit-only-untagged-and-priority-tagged interface=ether3-LAN3 pvid=20
add bridge=bridge ingress-filtering=no interface=ether4-LAN4
add bridge=bridge ingress-filtering=no interface=wlan1
add bridge=bridge ingress-filtering=no interface=wlan2
add bridge=bridge interface=wlan3 pvid=20
add bridge=bridge interface=wlan4 pvid=20
add bridge=bridge frame-types=\
  admit-only-untagged-and-priority-tagged ingress-filtering=no interface=\
  ether5-LAN5 pvid=20

/ip neighbor discovery-settings
set discover-interface-list=LAN

/ipv6 settings
set disable-ipv6=yes max-neighbor-entries=8192

/interface bridge vlan
add bridge=bridge tagged=bridge,ether4-LAN4 untagged=\
  ether3-LAN3,ether5-LAN5,wlan3,wlan4 vlan-ids=20

/interface list member
add interface=bridge list=LAN
add interface=ether1-WAN1 list=WAN
add interface=pppoe-out1 list=WAN
add interface=ether2-WAN2 list=WAN
add interface=pppoe-out2 list=WAN
add interface=wireguard1 list=LAN

/interface ovpn-server server
set auth=sha1,md5

/interface wireguard peers
add allowed-address=192.168.66.22/32 interface=wireguard1 public-key=\
  "[REDACTED]"

/ip address
add address=192.168.1.1/24 interface=bridge network=\
  192.168.1.0
add address=192.168.20.1/24 interface=vlan20 network=192.168.20.0
add address=192.168.66.1/24 interface=wireguard1 network=192.168.66.0

/ip dhcp-server lease
[REDACTED]

/ip dhcp-server network
add address=192.168.1.0/24 dns-server=8.8.8.8,1.1.1.1 \
  gateway=192.168.1.1 netmask=24
add address=192.168.20.0/24 dns-server=1.1.1.1,8.8.8.8 gateway=192.168.20.1

/ip dns
set allow-remote-requests=yes

/ip dns static
add address=192.168.1.1 name=router.lan

/ip firewall filter
add action=accept chain=input comment=\
  "defconf: accept established,related,untracked" connection-state=\
  established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=\
  invalid
add action=accept chain=input comment="defconf: accept ICMP" disabled=yes \
  protocol=icmp
add action=accept chain=input comment=Wireguard dst-port=[REDACTED] log=yes \
  protocol=udp
add action=accept chain=input comment=\
  "defconf: accept to local loopback (for CAPsMAN)" dst-address=127.0.0.1
add action=drop chain=input comment="defconf: drop all not coming from LAN" \
  in-interface-list=!LAN
add action=drop chain=forward comment="Prevent Inter-VLAN routing" \
  dst-address=192.168.1.0/24 src-address=192.168.20.0/24
add action=drop chain=forward dst-address=192.168.20.0/24 src-address=\
  192.168.1.0/24
add action=accept chain=forward comment="defconf: accept in ipsec policy" \
  ipsec-policy=in,ipsec
add action=accept chain=forward comment="defconf: accept out ipsec policy" \
  ipsec-policy=out,ipsec
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
  connection-state=established,related hw-offload=yes
add action=accept chain=forward comment=\
  "defconf: accept established,related, untracked" connection-state=\
  established,related,untracked
add action=drop chain=forward comment="defconf: drop invalid" \
  connection-state=invalid
add action=drop chain=forward comment=\
  "defconf: drop all from WAN not DSTNATed" connection-nat-state=!dstnat \
  connection-state=new in-interface-list=WAN

/ip firewall nat
add action=masquerade chain=srcnat comment="defconf: masquerade" \
  ipsec-policy=out,none out-interface-list=WAN

/ip route
add disabled=no distance=2 dst-address=0.0.0.0/0 gateway=pppoe-out2 pref-src=\
  "" routing-table=main scope=30 suppress-hw-offload=no target-scope=10 \
  vrf-interface=pppoe-out2
add disabled=no distance=2 dst-address=0.0.0.0/0 gateway=pppoe-out2 \
  routing-table=useWAN2 suppress-hw-offload=no

/routing rule
add action=lookup-only-in-table disabled=no dst-address=[REDACTED] \
  table=useWAN2
add action=lookup-only-in-table disabled=no \
  dst-address=[REDACTED] table=useWAN2
  
 
Amm0
Forum Guru
Posts: 3255
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: MTU oddness with WireGuard and maybe VLANs

Sun Jan 01, 2023 2:24 pm

MTU is complex. Clearly both WG and PPPoE have overhead. There are firewall rules that can change the TCP MSS (action=change-mss), which can help with MTU issues but get quite complex with VLANs involved. These aren't needed if the standard "Path MTU Discovery" (PMTUD) method works, but PMTUD depends on ICMP (the protocol ping uses) working end-to-end.

But in your firewall you have disabled ICMP (ping), which stops PMTUD from working and so blocks TCP from figuring out the MTU automatically. See disabled=yes in your config:
/ip firewall filter
add action=accept chain=input comment="defconf: accept ICMP" disabled=yes protocol=icmp
Re-enabling ICMP in the firewall rule above (disabled=no) would likely help or fix this, or at least be simple to try.
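
As a sketch, the rule can be switched back on without retyping it by matching on its comment. This assumes the comment text is still exactly "defconf: accept ICMP", as in the export above.

```
/ip firewall filter
set [ find comment="defconf: accept ICMP" ] disabled=no
```

Note that this rule sits in chain=input, so it only governs ICMP addressed to the router itself.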
 
sindy
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: MTU oddness with WireGuard and maybe VLANs

Sun Jan 01, 2023 3:28 pm

I do not understand why fragmentation is such a big problem
For two reasons. The first one is intrinsic - if a router receives a packet that doesn't fit into the MTU of the outgoing interface, it has to send it as two or more fragments. For all the subsequent routers on the path to the destination, handling a fragment requires the same effort as handling a complete packet. The fragments also have their own IP (and Ethernet where applicable) headers, which means extra bytes on the wire. Hence fragmentation increases network load by increasing both the packet rate and the raw data volume.

If the source already slices the payload data into packets whose size matches the lowest MTU on the whole path between the source and the destination, the packets need not be fragmented and the ratio between the overhead and the payload is the lowest possible.

The other reason is that some networks drop non-first fragments. Partially that's caused by firewalls that match on L4 headers (port numbers) that are only present in the first fragments and neither reassemble packets before handling them nor accept non-first fragments generally, and partially it "just happens" - I've seen networks where some non-first fragments did pass and some did not.
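
One way to check which packet size actually fits a path without fragmentation is to ping with the don't-fragment bit set, sketched here from the router towards the WireGuard peer. The peer address 192.168.66.22 comes from the allowed-address in the posted config, and the peer has to answer ICMP echo for this to tell you anything. That size counts the whole IP packet is an assumption worth checking against the RouterOS ping documentation.

```
# Should pass if the path MTU towards the peer is at least 1420 bytes.
/ping 192.168.66.22 size=1420 do-not-fragment count=3
# One byte more than the tunnel MTU; expected to fail if wireguard1 has mtu=1420.
/ping 192.168.66.22 size=1421 do-not-fragment count=3
```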

Since most devices in your LAN do not have a problem with MTU below 1500, there is nothing that breaks PMTUD on your router or in your phone, so I'd blame that Netgear device itself. As @Amm0 suggests, you may try to work around it by mangling the MSS value in TCP SYN packets sent to and from that device to 1380 (which corresponds to MTU 1420 over IPv4: 1420 minus 20 bytes of IP header and 20 bytes of TCP header), but if the TCP stack of that device is really broken, it may ignore even that.
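
A minimal sketch of the MSS workaround described above. The address 192.168.1.2 is a placeholder for the JGS524Ev2's management IP, which is not given in this thread, and the comments on the rules are made up for illustration.

```
/ip firewall mangle
# Clamp the MSS in SYNs going to the switch (placeholder address).
add action=change-mss chain=forward comment="clamp MSS to JGS524Ev2" \
  dst-address=192.168.1.2 new-mss=1380 passthrough=yes protocol=tcp \
  tcp-flags=syn tcp-mss=1381-65535
# Clamp the MSS in SYN/ACKs coming back from the switch.
add action=change-mss chain=forward comment="clamp MSS from JGS524Ev2" \
  new-mss=1380 passthrough=yes protocol=tcp src-address=192.168.1.2 \
  tcp-flags=syn tcp-mss=1381-65535
```

The tcp-mss=1381-65535 matcher leaves smaller MSS values untouched, so the rules only ever lower the MSS.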
 
farisr
just joined
Topic Author
Posts: 7
Joined: Wed Feb 03, 2021 1:39 pm

Re: MTU oddness with WireGuard and maybe VLANs

Fri Jan 13, 2023 9:26 pm

Thank you both so much for your input.

In a few words, you have opened my eyes to a lot of things.

I will try to report back if re-enabling pings works, otherwise yes I'm going to blame the Netgear. It is an odd beast. It does not allow naming/labelling of ports, unlike my really inexpensive 4-port Netgear smart switches. I get the feeling it was built on a very severe budget.
