RouterOS blatantly ignores pref-src. Can this really be a bug?

This is a follow-up to http://forum.mikrotik.com/t/any-advice-for-further-debugging-handshaking-failed-on-wireguard-roadwarrior-setup/180297/1, but I am posting a new topic because the original question is resolved.

Per request, I am starting with a network diagram:
[Image: network diagram, Screenshot 2024-11-27 at 16.27.59.png]
RouterOS runs a wireguard “server” for a road warrior setup. Sadly, it’s not possible to “bind” wireguard to a specific interface or address, so it “listens” on every IP address. The road warrior uses 192.0.2.210:51820 as its endpoint. Every reasonable person would expect the response packet to such a request to have source address 192.0.2.210. Sadly, RouterOS is always up for surprises and that’s not the case. I assume that, due to an obscure implementation of wireguard, the source address is left unset and hence the operating system needs to decide what the source address of the locally generated packet should be. For me this is 192.0.2.177. You can already imagine that this breaks everything: the road warrior client requested a connection to 192.0.2.210 but gets a response from 192.0.2.177. It thinks this is a MITM attack and discards the packet.

Now, “pref-src” exists exactly for this: it should determine the source address of a locally generated packet based on the route that is chosen. In my case the route comes from BGP and I could add a routing filter to set pref-src. But for simplicity (and because it didn’t work), I explicitly added a static route with a low distance and pref-src set to 192.0.2.210:

/ip/route/add dst-address=0.0.0.0/0 routing-table=default_myas pref-src=192.0.2.210 gateway=192.0.2.176 distance=1
/ip/route/print detail where routing-table=default_myas                      
Flags: D - dynamic; X - disabled, I - inactive, A - active; c - connect, s - static, r - rip, b - bgp, o - ospf, i - is-is, d - dhcp, v - vpn, m - modem, y - bgp-mpls-vpn; H - hw-offloaded; + - ecmp 
   D b   dst-address=0.0.0.0/0 routing-table=default_myas gateway=172.20.215.129 immediate-gw=192.0.2.185%wg-bg1-ftth distance=200 scope=40 target-scope=30 suppress-hw-offload=no 

   D b   dst-address=0.0.0.0/0 routing-table=default_myas gateway=172.20.215.130 immediate-gw=192.0.2.176%wg-bg2-ftth distance=200 scope=40 target-scope=30 suppress-hw-offload=no 

 1  As   dst-address=0.0.0.0/0 routing-table=default_myas pref-src=192.0.2.210 gateway=192.0.2.176 immediate-gw=192.0.2.176%wg-bg2-ftth distance=1 scope=30 target-scope=10 suppress-hw-offload=no

However, RouterOS blatantly ignores this to its fullest extent and it drives me NUTS. NUTS. NUTS! :

/tool/sniffer/export 
# 2024-11-27 16:24:00 by RouterOS 7.15.3
# software id = 
#
/tool sniffer
set filter-ip-protocol=udp filter-port=51820
/tool/sniffer/packet/print detail
18 time=59.738 num=19 direction=rx interface=wg-bg1-ftth src-address=*.*.5.88:63419 dst-address=192.0.2.210:51820 protocol=ip ip-protocol=udp size=176 cpu=2 ip-packet-size=176 ip-header-size=20 dscp=0 
   identification=64534 fragment-offset=0 ttl=49 

19 time=59.738 num=20 direction=tx interface=wg-bg2-ftth src-address=192.0.2.177:51820 dst-address=*.*.5.88:63419 protocol=ip ip-protocol=udp size=120 cpu=1 ip-packet-size=120 ip-header-size=20 dscp=34 
   identification=32146 fragment-offset=0 ttl=64

You can see here that the request to 192.0.2.210:51820 enters via wg-bg1-ftth and the response (through the perfectly valid interface wg-bg2-ftth) is sent with source 192.0.2.177:51820, despite explicitly asking RouterOS to use 192.0.2.210 via the pref-src in the default route.

While I don’t think it’s necessary, I re-post my entire config below.

Can anyone confirm that I am not totally crazy and this is a stupid bug in RouterOS?

If not, what the heck is wrong with this config?



# 2024-11-25 10:54:47 by RouterOS 7.15.3
# software id = 
#
/interface bridge
add name=br-main vlan-filtering=yes
add arp=disabled fast-forward=no name=dum0 protocol-mode=none
/interface ethernet
set [ find default-name=ether1 ] disable-running-check=no
/interface wireguard
add listen-port=13231 mtu=1420 name=wg-bg1-ftth
add listen-port=13232 mtu=1420 name=wg-bg2-ftth
add listen-port=51820 mtu=1420 name=wg-mobile
/interface vlan
add comment=ADM interface=br-main name=vlan1 vlan-id=1
add comment=FTTH interface=br-main name=vlan2 vlan-id=2
add comment=LAN interface=br-main name=vlan3 vlan-id=3
add comment=SRV interface=br-main name=vlan4 vlan-id=4
add comment=ADU interface=br-main name=vlan10 vlan-id=10
add comment=DEV interface=br-main name=vlan11 vlan-id=11
add comment=DEVP interface=br-main name=vlan12 vlan-id=12
add comment=DMZ44 interface=br-main name=vlan44 vlan-id=44
/interface vrrp
add interface=vlan1 name=vrrp1
add interface=vlan3 name=vrrp3 sync-connection-tracking=yes vrid=3
add interface=vlan4 name=vrrp4 vrid=4
add interface=vlan11 name=vrrp11 vrid=11
add interface=vlan12 name=vrrp12 vrid=12
add interface=vlan44 name=vrrp44 vrid=44
/interface list
add name=LAN
add comment="Admin Interfaces" name=ADM
add name=DMZ
add name=SRV
add name=DEV
add name=DEVP
add name=WAN
add include=ADM,DEV,DEVP,DMZ,LAN,SRV name=trusted
add name=peers
add name=WANMYAS
add name=ISP
add name=MACHINES
/routing ospf instance
add disabled=no name=ospf-instance-1 redistribute=static router-id=192.0.2.210
/routing ospf area
add disabled=no instance=ospf-instance-1 name=ospf-area-1
/routing table
add disabled=no fib name=default_myas
add disabled=no fib name=default_isp
add comment="Dummy table, acts only as a routing mark" disabled=no fib name=default_isp_only
/interface bridge port
add bridge=br-main interface=ether1
add bridge=br-main interface=veth1 pvid=3
/ip firewall connection tracking
set enabled=yes
/ip neighbor discovery-settings
set discover-interface-list=none lldp-med-net-policy-vlan=1
/interface bridge vlan
add bridge=br-main tagged=br-main,ether1 vlan-ids=2
add bridge=br-main tagged=br-main,ether1 vlan-ids=3
add bridge=br-main tagged=br-main,ether1 vlan-ids=4
add bridge=br-main tagged=br-main,ether1 vlan-ids=10
add bridge=br-main tagged=br-main,ether1 vlan-ids=11
add bridge=br-main tagged=br-main,ether1 vlan-ids=12
add bridge=br-main tagged=br-main,ether1 vlan-ids=44
add bridge=br-main tagged=br-main untagged=ether1 vlan-ids=1
add bridge=br-main tagged=br-main,ether1 vlan-ids=33
/interface detect-internet
set internet-interface-list=static lan-interface-list=static wan-interface-list=static
/interface list member
add interface=vlan3 list=LAN
add interface=vlan1 list=ADM
add interface=vlan2 list=WAN
add interface=vlan44 list=DMZ
add interface=wg-bg1-ftth list=WAN
add interface=wg-bg2-ftth list=WAN
add interface=vrrp3 list=LAN
add interface=vlan4 list=SRV
add interface=wg-bg1-ftth list=peers
add interface=wg-bg2-ftth list=peers
add interface=vlan11 list=DEV
add interface=vlan12 list=DEVP
add interface=wg-bg1-ftth list=WANMYAS
add interface=wg-bg2-ftth list=WANMYAS
add interface=vlan2 list=ISP
add interface=vlan33 list=ISP
add interface=vlan11 list=MACHINES
add interface=vlan12 list=MACHINES
add interface=vrrp44 list=DMZ
add interface=vrrp1 list=ADM
add interface=vrrp12 list=DEVP
add interface=vrrp4 list=SRV
add interface=vrrp11 list=DEV
add interface=vrrp11 list=MACHINES
add interface=vrrp12 list=MACHINES
/interface wireguard peers
add allowed-address=0.0.0.0/0,::/0 endpoint-address=*.*.*.* endpoint-port=51821 interface=wg-bg2-ftth name=bgate2-ftth persistent-keepalive=5s public-key="********************************************"
add allowed-address=0.0.0.0/0,::/0 endpoint-address=*.*.*.* endpoint-port=51821 interface=wg-bg1-ftth name=bgate1-ftth persistent-keepalive=5s public-key="********************************************"
add allowed-address=10.2.33.10/32 client-address=10.2.33.10/32 interface=wg-mobile name=miPhone public-key="********************************************"
/ip address
add address=10.2.1.2/24 interface=vlan1 network=10.2.1.0
add address=10.2.4.2/24 interface=vlan4 network=10.2.4.0
add address=10.2.79.2/24 interface=vlan3 network=10.2.79.0
add address=192.0.2.177/31 interface=wg-bg2-ftth network=192.0.2.176
add address=192.0.2.186/29 interface=wg-bg1-ftth network=192.0.2.184
add address=192.0.2.210/28 interface=vlan44 network=192.0.2.208
add address=172.20.215.132 interface=dum0 network=172.20.215.132
add address=10.2.80.2/24 interface=vlan10 network=10.2.80.0
add address=192.168.222.2/24 interface=vlan11 network=192.168.222.0
add address=192.168.223.2/24 interface=vlan12 network=192.168.223.0
add address=10.2.79.254 interface=vrrp3 network=10.2.79.254
add address=10.2.33.1/24 interface=wg-mobile network=10.2.33.0
add address=192.168.223.254 interface=vrrp12 network=192.168.223.254
add address=192.168.222.254 interface=vrrp11 network=192.168.222.254
add address=10.2.1.254 interface=vrrp1 network=10.2.1.254
add address=10.2.4.254 interface=vrrp4 network=10.2.4.254
add address=192.168.5.102/24 interface=vlan33 network=192.168.5.0
add address=192.0.2.222 interface=vrrp44 network=192.0.2.222
/ip dhcp-client
add add-default-route=no interface=vlan2 script=":if (\$bound=1) do={\
    \n    /ip route set [find where comment=\"default_isp\"] gateway=\$\"gateway-address\" disabled=no\
    \n    } else={\
    \n    /ip route set [find where comment=\"default_isp\"] disabled=yes\
    \n    }\r\
    \n\r\
    \n" use-peer-ntp=no
/ip firewall address-list
add address=10.2.0.0/16 list=own_hosts
add address=192.0.2.0/24 list=own_hosts
add address=*.*.*.* list=border_gates
add address=*.*.*.* list=border_gates
add address=*.*.*.* list=own_hosts
add address=*.*.*.* list=own_hosts
add address=192.168.222.0/24 list=own_hosts
add address=192.168.223.0/24 list=own_hosts
add address=172.20.215.128/27 list=own_hosts
add address=0.0.0.0/8 comment="defconf: RFC6890" list=no_forward_ipv4
add address=169.254.0.0/16 comment="defconf: RFC6890" list=no_forward_ipv4
add address=224.0.0.0/4 comment="defconf: multicast" list=no_forward_ipv4
add address=255.255.255.255 comment="defconf: RFC6890" list=no_forward_ipv4
add address=127.0.0.0/8 comment="defconf: RFC6890" list=bad_ipv4
add address=192.0.0.0/24 comment="defconf: RFC6890" list=bad_ipv4
add address=192.0.2.0/24 comment="defconf: RFC6890 documentation" list=bad_ipv4
add address=198.51.100.0/24 comment="defconf: RFC6890 documentation" list=bad_ipv4
add address=203.0.113.0/24 comment="defconf: RFC6890 documentation" list=bad_ipv4
add address=240.0.0.0/4 comment="defconf: RFC6890 reserved" list=bad_ipv4
add address=0.0.0.0/8 comment="defconf: RFC6890" list=not_global_ipv4
add address=10.0.0.0/8 comment="defconf: RFC6890" list=not_global_ipv4
add address=100.64.0.0/10 comment="defconf: RFC6890" list=not_global_ipv4
add address=169.254.0.0/16 comment="defconf: RFC6890" list=not_global_ipv4
add address=172.16.0.0/12 comment="defconf: RFC6890" list=not_global_ipv4
add address=192.0.0.0/29 comment="defconf: RFC6890" list=not_global_ipv4
add address=192.168.0.0/16 comment="defconf: RFC6890" list=not_global_ipv4
add address=198.18.0.0/15 comment="defconf: RFC6890 benchmark" list=not_global_ipv4
add address=255.255.255.255 comment="defconf: RFC6890" list=not_global_ipv4
add address=224.0.0.0/4 comment="defconf: multicast" list=bad_src_ipv4
add address=255.255.255.255 comment="defconf: RFC6890" list=bad_src_ipv4
add address=0.0.0.0/8 comment="defconf: RFC6890" list=bad_dst_ipv4
add address=224.0.0.0/4 comment="defconf: RFC6890" list=bad_dst_ipv4
add address=10.2.0.0/16 list=private_lans
add address=192.168.0.0/16 list=private_lans
add address=172.20.215.128/27 list=border_gates
add address=10.2.4.10 list=hairpin_dst
add address=10.2.4.20 list=hairpin_dst
add address=10.2.0.0/16 list=hairpin_src
add address=192.168.222.0/24 list=hairpin_src
add address=192.168.223.0/24 list=hairpin_src
/ip firewall filter
add action=accept chain=input comment="defconf: accept ICMP after RAW" protocol=icmp
add action=accept chain=input comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=accept chain=input comment="SSH access from everywhere" dst-port=1983 protocol=tcp
add action=accept chain=input comment=Wireguard dst-port=51820 protocol=udp
add action=accept chain=input dst-port=8291 protocol=tcp src-address-list=private_lans
add action=accept chain=input dst-port=3784,4784,3785 in-interface-list=peers protocol=udp src-address-list=own_hosts
add action=accept chain=input in-interface-list=peers protocol=ospf
add action=accept chain=input dst-port=179 in-interface-list=peers protocol=tcp src-address-list=own_hosts
add action=accept chain=input dst-port=53 protocol=udp src-address-list=own_hosts
add action=accept chain=input dst-port=53 protocol=tcp src-address-list=own_hosts
add action=drop chain=input comment="defconf: drop all not coming from trusted" in-interface-list=!trusted
add action=accept chain=forward comment="PortFW: For some reason, Fasttrack doesn't work for these." connection-mark=conn_portfw connection-state=established,related
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" connection-state=established,related hw-offload=yes
add action=accept chain=forward comment="defconf: accept established,related, untracked" connection-state=established,related,untracked
add action=jump chain=forward comment="Traffic rules for new connections" connection-state=new jump-target=traffic_rules
add action=drop chain=forward comment="defconf: drop invalid" connection-state=invalid log=yes log-prefix=INVALID
add action=drop chain=forward comment="defconf:  drop all from WAN not DSTNATed" connection-nat-state=!dstnat connection-state=new in-interface-list=WAN
add action=drop chain=forward comment="defconf: drop bad forward IPs" src-address-list=no_forward_ipv4
add action=drop chain=forward comment="defconf: drop bad forward IPs" dst-address-list=no_forward_ipv4
add action=accept chain=forward comment="Check if counters >0. If not, DROP here" log=yes log-prefix="[ACPT]"
add action=accept chain=traffic_rules comment="Allow all ICMP echo" icmp-options=8:0-255 protocol=icmp
add action=accept chain=traffic_rules comment="Allow from ISPs only when DNAT" connection-nat-state=dstnat in-interface-list=ISP
add action=accept chain=traffic_rules comment="Allow internet for public 44" in-interface-list=DMZ out-interface-list=WANMYAS
add action=log chain=traffic_rules comment="Log drops (not wan)" in-interface-list=!WANMYAS log=yes log-prefix="[BLOCKED]" out-interface-list=!WANMYAS src-address=!192.168.222.150
add action=drop chain=traffic_rules
/ip firewall mangle
add action=mark-routing chain=output comment="wg-tunnels: bgate1 over FTTH" dst-address=*.*.*.* dst-port=51821 new-routing-mark=default_isp_only passthrough=no protocol=udp
add action=mark-routing chain=output comment="wg-tunnels: bgate2 over FTTH" dst-address=*.*.*.* dst-port=51821 new-routing-mark=default_isp_only passthrough=no protocol=udp
/ip firewall nat
add action=masquerade chain=srcnat comment="wg-tunnels: FTTH" routing-mark=default_isp_only
add action=masquerade chain=srcnat comment="Hairpin all PortFWs" dst-address-list=hairpin_dst src-address-list=hairpin_src
add action=masquerade chain=srcnat comment="defconf: masquerade WAN interface" out-interface=vlan2
/ip firewall raw
add action=accept chain=prerouting comment="defconf: enable for transparent firewall"
add action=accept chain=prerouting comment="defconf: accept DHCP discover" dst-address=255.255.255.255 dst-port=67 in-interface-list=LAN protocol=udp src-address=0.0.0.0 src-port=68
add action=drop chain=prerouting comment="defconf: drop bogon IP's" src-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bogon IP's" dst-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bogon IP's" src-address-list=bad_src_ipv4
add action=drop chain=prerouting comment="defconf: drop bogon IP's" dst-address-list=bad_dst_ipv4
add action=drop chain=prerouting comment="defconf: drop non global from WAN" in-interface-list=WAN src-address-list=not_global_ipv4
add action=drop chain=prerouting comment="defconf: drop forward to local lan from WAN" dst-address-list=private_lans in-interface-list=WAN
add action=drop chain=prerouting comment="defconf: drop local if not from default IP range" in-interface-list=LAN src-address=!10.2.79.0/24
add action=drop chain=prerouting in-interface-list=DEV src-address=!192.168.222.0/24
add action=drop chain=prerouting in-interface-list=DEVP src-address=!192.168.223.0/24
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to ICMP chain" jump-target=icmp4 protocol=icmp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=trusted
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=drop chain=prerouting comment="defconf: drop the rest"
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp
add action=accept chain=icmp4 comment="defconf: echo reply" icmp-options=0:0 limit=5,10:packet protocol=icmp
add action=accept chain=icmp4 comment="defconf: net unreachable" icmp-options=3:0 protocol=icmp
add action=accept chain=icmp4 comment="defconf: host unreachable" icmp-options=3:1 protocol=icmp
add action=accept chain=icmp4 comment="defconf: protocol unreachable" icmp-options=3:2 protocol=icmp
add action=accept chain=icmp4 comment="defconf: port unreachable" icmp-options=3:3 protocol=icmp
add action=accept chain=icmp4 comment="defconf: fragmentation needed" icmp-options=3:4 protocol=icmp
add action=accept chain=icmp4 comment="defconf: echo" icmp-options=8:0 limit=5,10:packet protocol=icmp
add action=accept chain=icmp4 comment="defconf: time exceeded " icmp-options=11:0-255 protocol=icmp
add action=drop chain=icmp4 comment="defconf: drop other icmp" protocol=icmp
/ip route
add check-gateway=ping comment=default_isp disabled=no distance=1 dst-address=0.0.0.0/0 gateway=*.*.*.* routing-table=default_isp scope=30 suppress-hw-offload=no target-scope=10
/routing bgp connection
add as=64513 disabled=no local.address=172.20.215.132 .role=ibgp name=BorderGate1 remote.address=172.20.215.129/32 .as=64513 routing-table=default_myas
add as=64513 disabled=no local.address=172.20.215.132 .role=ibgp name=BorderGate2 remote.address=172.20.215.130/32 .as=64513 routing-table=default_myas
/routing ospf area range
add area=ospf-area-1 disabled=no prefix=172.20.215.128/27
add area=ospf-area-1 disabled=no prefix=192.0.2.0/24
/routing ospf interface-template
add area=ospf-area-1 auth=md5 auth-id=1 auth-key=**************** cost=100 dead-interval=10s disabled=no hello-interval=5s interfaces=wg-bg1-ftth,wg-bg2-ftth
add area=ospf-area-1 disabled=no interfaces=dum0 passive
add area=ospf-area-1 disabled=no interfaces=vlan44 passive
/routing rule
add action=lookup-only-in-table comment="Connections must never go through other links" disabled=no routing-mark=default_isp_only table=default_isp
add action=lookup comment="main table" disabled=no table=main
add action=lookup comment="default 44net" disabled=no src-address=192.0.2.0/24 table=default_myas
add action=lookup comment="default sonic" disabled=no table=default_isp
/tool sniffer
set filter-ip-address=192.0.2.210/32 filter-ip-protocol=udp filter-port=51820

It’s roughly the same as this issue: http://forum.mikrotik.com/t/wireguard-multi-wan-policy-routing/174145/1

Wireguard, for some unknown reason, is not treated the same as “locally generated traffic”. So I’m guessing pref-src= is a similar victim.

It’s known by MikroTik and they have recently updated the docs with a workaround:

https://help.mikrotik.com/docs/pages/diffpagesbyversion.action?pageId=69664792&selectedPageVersions=39&selectedPageVersions=37

Info
When you encounter issues with reply traffic having the wrong source address, using NAT to translate packet source addresses to your loopback interface is a common workaround. This approach helps ensure that the source address is consistent and correct when packets are routed back through the network.

Wow crazy. Thanks. I looked everywhere but did not find this. At least I’m not alone.
Ok, I really want to avoid NAT but I’m OK to use it as a workaround.
However, I am still not able to get it running. I am using the packet sniffer tool and just filter for UDP and the IP of my road warrior setup and often I just can’t see any traffic going out (“tx”).

From the Mikrotik link, it’s really not clear to me what “translate packet source address to your loopback interface” really means.

I tried all the variants using src-nat I could think of (i.e., source-natting the wrong source IP, 192.0.2.177, to the right one, 192.0.2.210; both with a single snat rule and with mangle+nat).
I also tried dst-nat but had no luck either (i.e., port forwarding from 192.0.2.210:41820 to 192.0.2.210:51820).


Would you mind showing an example of the right NAT rule(s)?

I haven’t tested with your particular config, but on my router, when I want to control the source address as well as the outgoing interface for WireGuard reply-packets, I use dstnat rules:

  • First create an address on the lo interface, let’s say 10.20.30.40/32
  • Add dstnat rule for destination 192.0.2.210/32, UDP dst-port 51820, with action=dst-nat to-addresses=10.20.30.40
  • (Optional) in case you also want to control the outgoing interface, add routing rules for src-address 10.20.30.40

For your case, you’ll probably only need the first two steps. Because incoming packets with destination to 192.0.2.210 get dstnat-ed to 10.20.30.40, their corresponding response packets will get un-NAT-ed accordingly and have their source address changed to 192.0.2.210, which is what you want. If it still doesn’t work (like I said, I haven’t tested with configurations similar to yours) you might need:

  • An additional srcnat rule for src-address 10.20.30.40, with action=src-nat to-addresses=192.0.2.210.

And this srcnat rule must be placed ABOVE your masquerade rules in the NAT table.
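
The steps above might look roughly like this on RouterOS 7. This is only a sketch and has not been tested against the configuration in this thread. 10.20.30.40 is the example address from the post, the lo interface is assumed to exist (recent RouterOS 7 releases), and the table name default_myas is taken from the original poster's export.

```routeros
# 1. Extra address on the loopback interface (example address from the post)
/ip address add address=10.20.30.40/32 interface=lo
# 2. Redirect WireGuard packets aimed at the public address to the lo address
/ip firewall nat add chain=dstnat action=dst-nat protocol=udp dst-address=192.0.2.210 dst-port=51820 to-addresses=10.20.30.40 comment="wg: dstnat to lo"
# 3. (Optional) choose the routing table for replies sourced from the lo address.
#    Routing rules are evaluated in order, so this rule must end up above any catch-all rule.
/routing rule add src-address=10.20.30.40/32 action=lookup table=default_myas comment="wg: replies from lo"
# 4. (Only if still needed) rewrite the reply source address.
#    place-before=0 assumes item 0 in "/ip firewall nat print" is the first masquerade rule.
/ip firewall nat add chain=srcnat action=src-nat protocol=udp src-address=10.20.30.40 to-addresses=192.0.2.210 place-before=0 comment="wg: srcnat replies"
```

With only steps 1 and 2 in place, connection tracking is expected to reverse the dst-nat on the reply, which is what would restore 192.0.2.210 as the source.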

@cgg,

First create an address on the lo interface, let’s say 10.20.30.40/32

in case you didn’t notice - that 210 is the bridge/loopback address which wg listens on. that is the exact same thing as you have proposed.

the @op scenario and problem was full routing on 2 wan interfaces and the wg bridge (he expected no nat to be needed as he runs full routing). hence the rpf check problem came in. he wonders why the wg source address is not 210 but rewritten to 177, which looks like mitm/spoofing to the client.

@normis,
please add some notes in the wiki - that this kind of scenario (wireguard listening on local bridge interface with multiple wan interfaces) currently won’t work in full routing except with nat. so that other mt users understand the situation.

No, it’s not the same. The most important thing is the dstnat rule. It doesn’t matter which of the router’s addresses the external user used as the endpoint address for the WG connection. What is needed is the additional DSTNAT operation (from one of the router’s IP addresses to another one; that’s why an additional IP address needs to be added to the router, and lo was chosen because that interface is always there).

Actually this is not a bug, but simply an effect of how wireguard and linux routing work. By the way, as far as I can tell, Mikrotik uses the stock linux implementation - and yes, the same thing happens on straight linux, and yes, this is a frequently asked question on linux forums as well.

To break it down:

  • wg sends out the packet and does not specify a source address (this is in line with the philosophy around wg - so I wouldn’t hold my breath until this is changed)
  • the routing decision is made and, as the packet has no source address, the pref-src of the selected route is used - however, at this point your packet has no routing mark attached and, quite correctly, the main table is consulted
  • later, even if you apply a routing mark to the packet, pref-src is disregarded in the routing adjustment phase of the packet flow (because that’s just how it works - at this point the packet has a source address and, regardless of how it was acquired, it’s assumed to be correct)

This behavior leads to a number of sometimes surprising effects. And yes, regardless of Mikrotik or non-Mikrotik, dnat is the universally accepted solution.
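
One way to check this sequence on a given router is to log the reply in the mangle output chain, which in the RouterOS packet flow sits after the first routing decision but before the routing adjustment. A minimal debugging sketch, assuming the WireGuard listen port 51820 from this thread; the log prefix and comment are made up for illustration:

```routeros
# Log locally generated WireGuard replies together with the source address
# that the first routing decision assigned to them
/ip firewall mangle add chain=output action=log protocol=udp src-port=51820 log-prefix="[WG-SRC]" comment="debug: wg reply source address"
# Inspect the logged packets, then remove the rule again when done
/log print where message~"WG-SRC"
```

If the logged source address is already the unwanted one here, a routing mark applied later in the same chain cannot be expected to change it.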

For other protocols this does not manifest so blatantly, because there is a clear request-response nature to them (e.g. ping: echo request - echo response; dns: again, request - response; and of course for all things TCP) and in these cases the source ip is set by the originator of the packet. Wg on the other hand does not differentiate meaningfully between client and server, and also supports roaming (changing ip addresses and ports) of each peer, and in line with this philosophy maintains that fw/routing should take care of the situation you are in. Thus every major contributor (as far as I can tell) maintains that dnat is not some workaround but an appropriate solution; of course everyone is entitled to their opinion. shrug

EDIT:
By the way, adding proper VRF support to wg would solve this. Although this was proposed, it never made it into the kernel (at least the last time I was aware of such an effort…)

Hi,

My observations:
Actually wireguard in mikrotik does attempt to use the IP address that the incoming packet was sent to.

Looking at the packet on the output chain is too late. It has already gone through the routing process,
and had its ip address changed, probably also natted.

If you use routing rules, you can change the routing table it uses based on the correct receiving IP address.
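
Such a rule could look like the sketch below, assuming the reply really does carry 192.0.2.210 as its source address when the routing rules are evaluated. The table name default_myas comes from the original poster's export, which already contains a broader rule of this kind for src-address=192.0.2.0/24.

```routeros
# Send traffic sourced from the WireGuard endpoint address to the default_myas table.
# Routing rules are evaluated in order, so check the position of the new rule afterwards.
/routing rule add src-address=192.0.2.210/32 action=lookup table=default_myas comment="wg replies from .210"
/routing rule print
```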

However, this case seems a little unusual in that you have packets coming in on interface A and replies leaving via interface B,
which might work ok if you had no firewall rules at all on your router.

The outbound packets won’t match the connection of the incoming packets, because they are on different interfaces.
So not being part of an existing connection, they will become a new connection started from inside the router and get Natted.

(In another thread MrZ mentioned that in wireguard the reply connection is a different connection to the incoming connection)

You could perhaps disable outbound NAT for packets with source of .210 ip address?
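
A sketch of such an exception, assuming the reply leaves the router with source 192.0.2.210 and would otherwise match one of the masquerade rules. An accept rule in the srcnat chain ends NAT processing for matching packets, so it has to sit above the masquerade rules; place-before=0 assumes item 0 in the NAT rule list is the first of them.

```routeros
# Exempt traffic sourced from the WireGuard endpoint address from source NAT
/ip firewall nat add chain=srcnat action=accept src-address=192.0.2.210 place-before=0 comment="no srcnat for traffic sourced from .210"
```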

@cgg,

It doesn’t matter which of the router’s addresses the external user used as endpoint address for the WG connection.

well, if @op would take the advice and let wg listen on interface address 177 (which is persistent in terms of path), then he wouldn’t have this headache resolving pref-src, nat or spoofing issues. please read his other thread as well so you get the picture. i did tell him to use nat, but he insisted on a full routing scenario.

I really think I am not that dumb but this wireguard is one of the biggest headaches I have come across to date. It exceeds all the OSPF and BFD trouble.

I did what you said and I agree this is how it should work. Let me start with what I did:

/interface bridge
add arp=disabled fast-forward=no name=dum1 protocol-mode=none
/ip address
add address=172.20.215.1 interface=dum1 network=172.20.215.1
/ip firewall filter
add action=accept chain=input comment=test dst-port=41820,51820 log=yes log-prefix="[***WG-IN]" protocol=udp
add action=accept chain=output log=yes log-prefix="[***WG-OUT]" port=51820,41820 protocol=udp
/ip firewall nat
add action=dst-nat chain=dstnat dst-address=192.0.2.210 dst-port=41820 log=yes log-prefix="[***WG-DNAT]" protocol=udp to-addresses=172.20.215.1 to-ports=51820



  • Created a dummy interface “dum1” with address 172.20.215.1
  • Created a DNAT rule to port forward from 192.0.2.210:41820 to 172.20.215.1:51820
  • Client now connects to 192.0.2.210:41820 which would then be “port forwarded” to internal 172.20.215.1:51820
  • I agree with you that “their corresponding response packets will get un-NAT-ed accordingly and have their source address changed to 192.0.2.210, which is what you want”
  • What RouterOS does just doesn’t make any sense whatsoever.

Here is the log output:

15:13:06 wireguard,debug wg-mobile: [miPhone] ****: Receiving handshake initiation from peer (200.95.5.88:38846) extra:0 (einval) 
15:13:06 wireguard,debug wg-mobile: [miPhone] ****: Sending handshake response to peer (200.95.5.88:38846) 
15:13:06 firewall,info [***WG-DNAT] dstnat: in:wg-bg1-ftth out:(unknown 0), connection-state:new proto UDP, 200.95.5.88:38846->192.0.2.210:41820, len 176 
15:13:06 firewall,info [***WG-IN] input: in:wg-bg1-ftth out:(unknown 0), connection-state:new,dnat proto UDP, 200.95.5.88:38846->172.20.215.1:51820, NAT 200.95.5.88:38846->(192.0.2.210:41820->172.20.215.1:51820), len 176 
15:13:06 firewall,info [***WG-OUT] output: in:(unknown 0) out:vlan2, connection-state:new proto UDP, 125.170.121.236:51820->200.95.5.88:38846, len 120

WTF? Why is the source address of the return packet now 125.170.121.236? (This is the address of vlan2!) Because of DNAT, the source of the return packet should be 192.0.2.210; that’s the whole point of doing it, and now again it picks some other random address.

Before we go into adding src-nat rules (in case they should be required) I would really like to understand what the heck is going on here. It defies any logic.

Do you have any further advice what’s wrong here and why my DNAT rule is not cutting it?

EDIT: I also added a connection mark to the incoming connection via mangle … but the output packet still shows “connection-state:new” and does not carry my connection mark either! It seems connection tracking is unable to track this wireguard connection. If this is really the case, how on earth are you able to use DNAT for this?


EDIT2: I have never been so lost as with this ~!#$*@ wireguard. I’ve spent again hours trying any conceivable combination of DNAT+SNAT but there just doesn’t seem to be any way:

SNAT does not work because:

  • Rewriting to any other (arbitrary) source address/port works; only 192.0.2.210:51820 does not!
  • 192.0.2.210:51820 does NOT work because wireguard already listens on it! -> The rewritten packet is discarded and never transmitted (it does not appear in the packet sniffer)

DNAT does not work:

  • The response packet is not identified as part of the existing connection (see above). It’s always connection-state:new
  • Without assigning dum1 to the default_myas table, the response gets its source IP from the main table (see above)
  • When assigning dum1 to default_myas, the source address is, again, 192.0.2.177

DNAT+SNAT does not work:

  • Idea: Use DNAT to move to a different port (51820 -> 41820)
  • DNAT 192.0.2.210:41820 -> 172.20.215.1:51820
  • SNAT 192.0.2.177:51820 -> 192.0.2.210:41820
  • And we run into the same issue as why SNAT does not work: 192.0.2.210:41820 is already taken by the DNAT

I am really going in circles and it seems everything about wireguard is buggy as crazy :(

It can’t be that this is literally impossible, can it?



EDIT3: Absolutely insane. The very reason that DNAT doesn’t work is, I believe, my original problem: pref-src is ignored. Of course, if the source address is again 192.0.2.177, the reply won’t be matched as part of the DNAT connection, which would expect 172.20.215.1. But 172.20.215.1 is never chosen because of the default route. I can confirm this by forcing a dummy routing table that only contains “0.0.0.0/0 interface=dum1”. Then the source address is properly taken as 172.20.215.1 and I can see that DNAT works. But then the packet is routed into dum1 and never arrives. I’m really going in circles here! I don’t think that DNAT is a solution to my question. The core question is really WHY RouterOS ignores my request for pref-src.

I already answered this in the other posting. I would love to do that. But wg can’t listen on a specific interface or address! It listens on all. If I’m missing something obvious, please do let me know.

Yes, I would prefer (and insist on) a full routing scenario, but I am starting to accept that the WG implementation is buggy. I am fine with using NAT (S and/or D) as a workaround, but even after many days of trying many rules I am still not able to make it work, even with NAT. See my previous answer just above. If you have any further advice on the NAT solution, I’d really appreciate it.

I’m considering the idea of moving all routing away from the WG endpoint, and instead having the work split across two (or more) devices: WG tunnel terminators at each end, and have them hand off EOIP/VXLAN/IPIP/whatever to the next router(s) up/down the line. This way the WG router doesn’t get confused with multiple routing tables, NAT, etc. I can still manage it with a secondary port plugged into the on-net routers, even move all management to a different VRF.

I completely disagree with the statement that it’s not a bug. Saying it’s not a bug is just an excuse for not fixing bad code. Yes, wireguard uses UDP and yes, UDP is stateless.

However, when the road warrior first sends a request to 192.0.2.210:51820, it creates a “virtual state” inside wireguard. It is then the user space application’s duty to fill in the source address (see https://blog.cloudflare.com/everything-you-ever-wanted-to-know-about-udp-sockets-but-were-afraid-to-ask-part-1/#sourcing-packets-from-a-wildcard-socket). I can’t think of any meaningful reason why this shouldn’t happen. There is no reason on earth why the response packet should ever come from a different source address than the one the request was sent to. There is no reason to leave this decision to the operating system.
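For anyone who hasn’t read the article, here is a minimal sketch of the technique it describes for a userspace UDP service on Linux: ask the kernel for IP_PKTINFO on a wildcard socket, read which local address each datagram arrived on, and send the reply from that same address. This is only an illustration of the idea. It is not WireGuard’s or RouterOS’s code, and the function names are made up for this example.

```python
import socket
import struct
from typing import Tuple

# Linux value of IP_PKTINFO; fall back to 8 where Python does not expose it.
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)


def build_pktinfo(src_ip: str) -> bytes:
    """Pack a struct in_pktinfo that asks the kernel to send from src_ip."""
    # Fields: ipi_ifindex (0 = let routing pick), ipi_spec_dst, ipi_addr.
    return struct.pack("=i4s4s", 0, socket.inet_aton(src_ip), b"\x00" * 4)


def parse_pktinfo(data: bytes) -> Tuple[int, str, str]:
    """Unpack a struct in_pktinfo into (ifindex, spec_dst, addr)."""
    ifindex, spec_dst, addr = struct.unpack("=i4s4s", data[:12])
    return ifindex, socket.inet_ntoa(spec_dst), socket.inet_ntoa(addr)


def serve(port: int) -> None:
    """Echo server on a wildcard socket that replies from the address it was reached on."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
    sock.bind(("0.0.0.0", port))
    while True:
        data, ancdata, _flags, peer = sock.recvmsg(2048, socket.CMSG_SPACE(12))
        local_ip = None
        for level, ctype, cdata in ancdata:
            if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
                # ipi_addr is the destination address from the request header.
                local_ip = parse_pktinfo(cdata)[2]
        if local_ip is None:
            # No ancillary data: fall back to letting the OS choose the source.
            sock.sendto(data, peer)
            continue
        cmsg = (socket.IPPROTO_IP, IP_PKTINFO, build_pktinfo(local_ip))
        sock.sendmsg([data], [cmsg], 0, peer)
```

With something like this, a request sent to 192.0.2.210 would be answered from 192.0.2.210 regardless of which address the routing table would otherwise prefer.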

Hmm… Are you sure that’s true? If the main table had been consulted, the response packet wouldn’t even find its way out, because the default route for the main table is vlan2 (see my posted config) and it would be MASQ’ed and go out there.

I am fairly certain that the packet does use the right table (“default_myas”) and right route because of this in my posted config:

/routing rule
add action=lookup-only-in-table comment="Connections must never go through other links" disabled=no routing-mark=default_isp_only table=default_isp
add action=lookup comment="main table" disabled=no table=main
add action=lookup disabled=no src-address=192.0.2.0/24 table=default_myas

Yet still, pref-src of that default route in table default_myas is ignored.

Again, if that route were not used, not even the packet with source .177 would make its way out.

If you disagree, can you elaborate?

But I’m not even applying any route marks. I just use “ip rule” that selects table “default_myas” whenever source address is 192.0.2.0/24. See my config.

I understand now what your argument is. Yet I still don’t see what it would break if the response packet of a new (=unseen) connection came from the right IP address.

Anyway, I am now OK to use DNAT but as I wrote, still no working setup for me in sight :frowning:

100%

The lack of VRF isn’t only a pain for this example but for my complete setup: I establish wireguard tunnels for uplinks over different ISPs with dynamic IP. Right now I have to use dirty hacks with route marks (see my posted config). But the correct solution would be to put the respective wireguard connections into the VRFs of the respective ISPs … and the resulting wg interfaces into my VRF “myas”.

It’s a major missing feature :frowning:

I have written to Mikrotik and they answered that it’s on their list and they will implement it some time, but of course, who knows how long that will take.

I agree with the Cloudflare sentiment, and as someone who writes network services professionally, neglecting this irritates me to no end. (Btw. you can feel the writer’s frustration dripping between the lines :slight_smile: )

The problem is that wg itself is connection-less, i.e. there is no “new” connection, just another handshake.

Let’s assume that wg did it the way you suggest. I am using my phone and in communication with a wg “server”. I’m using mobile data. I walk into my favorite coffee shop. My phone connects to their wifi. My phone sends out wg packets with the src address appropriate for my previously used LTE connection and not with the address acquired over wifi. Connection fails.

Clearly the packet should go out on the default route and pick up its src address from the pref address there.

This is what I tried to explain. The two decisions:

  • don’t differentiate between peers as “server” and “client”
  • support roaming IPs

don’t allow for setting the src address.

It’s not lazy coding, but a result of a philosophy regarding the protocol. One which obviously quite a few people don’t agree with… but not exactly a bug.

I do disagree. Refer to the packet flow compiled by the nice Mikrotik guys at: https://help.mikrotik.com/docs/spaces/ROS/pages/328227/Packet+Flow+in+RouterOS

  • your packet enters the diagram titled “routing” at K
  • routing decision is made - please bear in mind that your packet has no src address yet, and so your third rule does not match your packet - main is selected
  • src addr is set based on pref src in main
  • packet goes to output
  • (if snat happens, it happens here - but now this is of no interest for us)
  • at the end of output “routing adjustment” occurs - here the src address is already set and your third rule now matches: the packet is rerouted according to the *myas routing table; however in the routing adjustment phase, pref addr - as I wrote - is not consulted

Hence your packet goes out on the correct interface with the src address not set as you expected.
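To make the sequence above concrete, here is a toy model of it in Python. It is only a sketch of the two-pass behaviour as I described it, not RouterOS’s implementation. The rule set is simplified to one source-based rule plus a catch-all for main, and the gateway name for main and the pref-src values are assumptions chosen to mirror the addresses in this thread.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional, Tuple


@dataclass
class Route:
    gateway: str
    pref_src: str


@dataclass
class Rule:
    table: str
    src_net: Optional[str] = None  # None means the rule matches any packet


# Hypothetical default routes per table (values are illustrative only).
ROUTES = {
    "main": Route(gateway="gw-main", pref_src="192.0.2.177"),
    "default_myas": Route(gateway="192.0.2.176", pref_src="192.0.2.210"),
}

# Simplified rules: a source-based rule, then a catch-all lookup in main.
RULES = [
    Rule(table="default_myas", src_net="192.0.2.0/24"),
    Rule(table="main"),
]


def lookup(src: Optional[str]) -> str:
    """Return the name of the first table whose rule matches the packet."""
    for rule in RULES:
        if rule.src_net is None:
            return rule.table
        # A source-based rule cannot match a packet with no source address yet.
        if src is not None and ip_address(src) in ip_network(rule.src_net):
            return rule.table
    raise LookupError("no matching rule")


def send_local_packet() -> Tuple[str, str, str]:
    """Return (source address, final table, gateway) for a locally generated packet."""
    # Routing decision: the source is still unset, so only main can match.
    first_table = lookup(None)
    src = ROUTES[first_table].pref_src
    # Routing adjustment: the source is now set and the rules are re-evaluated,
    # but pref_src of the newly selected route is not consulted.
    final_table = lookup(src)
    return src, final_table, ROUTES[final_table].gateway


print(send_local_packet())
```

In this model the packet leaves via the default_myas gateway while still carrying the source address picked from main, which is the symptom reported in this thread.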

Please feel free to say so if you disagree on how/why this is happening…

Explained above. In the “routing decision” phase your rule cannot and does not match the packet, only in the “routing adjustment” does it pick up the route it eventually takes.

Let me be a bit clearer on this: someone actually wrote the code and it was not included. I know not what the future holds :slight_smile: but if I were in a position to suggest what Mikrotik should do (which I am clearly not in,) I would probably wait until it’s settled in the kernel.

I think VRF support would be more elegant, however - and more practically - if the underlying behavior is clearly understood, dnat is in 99% of cases totally fine.

Thank you for your unusually thoughtful reply. Hope this clears things up at least somewhat.

EDIT:
Messed up snat/dnat naming. Corrected.

Thank you, you cleared up some important points for me!

Ok, I get now what you are saying.
But please hear me out until the end of my response where I believe I can show that it might be impossible to use wireguard (in my described setup). This is really inconceivable to me…


Thank you for walking me through this. I think it makes sense now.

Now I think we can proclaim that pref-src is, by definition, irrelevant in any table that’s referenced from a rule that matches on source address.
In other words, it doesn’t work at all for policy-based routing.
That’s a HUGE bummer.

I think it would be more than important to add this as a remark in the documentation for pref-src.



Now here I am again claiming that DNAT or SNAT does actually not seem to be a solution at all. To avoid repeating things, would you mind checking my EDIT1-EDIT3 in this answer? http://forum.mikrotik.com/t/routeros-blatantly-ignores-pref-src-can-this-really-be-a-bug/180360/11

TLDR: the crux is that pref-src would be required to make DNAT work (but then I wouldn’t need DNAT). And the issue with SNAT is that the address/port is already in use.

I am afraid I have hit a big road block … is there any other hope?

Well… assigning a source address based on source address does seem to be somewhat circular…

It does work for PBR, in fact VRFs are internally based on machinery that was originally put in place for PBR. It’s just that the packet has to originate in the appropriate VRF (namespace.)

Maybe… I’m sure that if you propose suitable language, Mikrotik will be happy to add it. :slight_smile:

That’s just because you’re not trying hard enough :slight_smile:

Example:
(192.168.80.1 would be the normal address of the server, 192.168.80.8 is the client, 192.168.84.1 is the address we want the server to use.)

First we identify the traffic and assign it to the correct routing table. (WG on linux supports the fw-mark attribute, but Mikrotik doesn’t seem to…)

/ip/firewall/mangle
add chain=output protocol=udp src-port=13231 action=mark-routing new-routing-mark=wg passthrough=yes

The actual address translation is triggered in dst-nat:

/ip/firewall/nat
add chain=dstnat dst-address=192.168.84.1 protocol=udp dst-port=13231 action=dst-nat to-addresses=192.168.80.1

This results in outgoing packets using the correct routing table, with the source address set as well. Logged as:

connection-state:established,dnat proto UDP, 192.168.80.1:13231->192.168.80.8:13231, NAT (192.168.80.1:13231->192.168.84.1:13231)->192.168.80.8:13231

Tested it, and it works flawlessly.

EDIT:
Syntax.
EDIT 2:
Initial solution was wrong.
EDIT 3:
Don’t use this, it doesn’t handle a specific case correctly. See next post.

I’ve played around a bit more and this is the nicest version:

/ip/firewall/mangle
add chain=output action=mark-packet new-packet-mark=wg passthrough=yes protocol=udp src-port=13231
add chain=output action=mark-routing new-routing-mark=wg passthrough=yes packet-mark=wg

/ip/firewall/nat
add chain=dstnat action=dst-nat to-addresses=192.168.80.1 protocol=udp dst-address=192.168.84.1 dst-port=13231
add chain=srcnat action=src-nat to-addresses=192.168.84.1 packet-mark=wg

@lurker,

ok. let us go back to the first problematic ip from the @op - which is 210. and not 177.

how about if we use this output nat just to force the output ip using 210 - because the @op said he doesn’t have any problem reaching that ip 210 from the internet (full routing)??

iptables -t nat -A OUTPUT -s 210 -o wan1 -j SNAT --to 210 ??

can’t that simple line work?

210 is local vlan bridge ip. the same effect as loopback bridge - which is difficult to put some marking.

That’s a good article, which likely explains roughly what’s going on. But the issue is UDP “app” is WG and lives in the kernel. Still 100% agree this is a “bug” — WG’s logic shouldn’t supersede the rest of RouterOS’s routing logic. But again mucking with WG’s code/logic may have risks & different side-effects… So I can see why this isn’t a quick fix.


I think @divB hit the nail on the head that pref-src= should actually be respected, since that’s where WG should be getting the address from IMO…