Connection tracking table not cleared completely after WAN IP address change

Hello friends,

TL;DR: After the public IP address of my RB5009 changes, WireGuard tunnels behind the router stop working because they are still source-NATed to the old public IP address.

I’m running an RB5009 with RouterOS 7.18.1 at home, connected to an ISP that offers a public IPv4 address via DHCP. One of my Proxmox VE hosts is connected directly to the router on ether8. This PVE host runs an OPNsense VM and a CHR VM; both VMs have their WAN interfaces attached to vmbr0, a bridge connected directly to the ethernet interface of the PVE host itself. Both VMs get a private IP address from the RB5009 via DHCP, which is also their default gateway. Each VM has a WireGuard interface with a persistent keepalive of 10s that connects to a VM in a Hetzner datacenter. While this setup works just fine, whenever I get a new public IP address from my ISP, the WireGuard tunnels (and only those) stop working.
Starting the packet sniffer on the WAN interface of the RB5009 shows that the connections for the WireGuard tunnels are still source-NATed to the old public IP address. As soon as I reset the corresponding connections in the tracking table, or set the udp-stream-timeout lower than 10s (i.e. lower than the persistent keepalive of the WireGuard tunnels), the tunnels come up again and everything works fine. Since I’m using a masquerade NAT rule for outgoing connections on the WAN interface, a change of the public IP should reset the connection tracking entries, but this does not seem to happen here.
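
For reference, these are the two workarounds that get the tunnels back up (a sketch from memory; the :21337 endpoint port is the one my WireGuard peers use, adjust as needed):

    # manually clear the stale WireGuard entries from the tracking table
    /ip firewall connection remove [find where dst-address~":21337"]

    # or shorten the UDP stream timeout below the 10s keepalive
    /ip firewall connection tracking set udp-stream-timeout=5s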

Any ideas?

On dhcp-client script:

/ip firewall connection
:foreach idc in=[find where timeout>1m] do={
    remove $idc
}

Thank you for the script, however it doesn’t really answer my question. I guess I should be more specific: Is this expected behaviour or a possible bug?

It works as designed. To elaborate a bit, there is a difference between NAT action=masquerade and action=src-nat. In case masquerade is used, the conntrack entries are purged automatically. For src-nat they are not. In this case you can clear them using a script as suggested. This doesn’t affect connections behaving in a “more usual” manner, where some sort of keepalive and ephemeral ports are used, because when the source port changes, a new (and now correct) conntrack entry is created.

If you go with the script approach, I would probably filter on the ip/protocol/port that is used specifically for your wg connections. Also, a variable $bound is passed to the lease script that informs you whether there is a new binding or a new IP address. Using this as condition in your script, you won’t purge the connections unless they have to be.
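
A sketch of such a lease script (the UDP endpoint port 21337 is only an example here; substitute whatever your wg peers actually use):

    # dhcp-client lease script: purge only the WG-related entries,
    # and only when a new binding is reported via $bound
    :if ($bound = 1) do={
        /ip firewall connection remove [find where protocol=udp and dst-address~":21337"]
    }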

To elaborate a bit, there is a difference between NAT action=masquerade and action=src-nat. In case masquerade is used, the conntrack entries are purged automatically. For src-nat they are not.

Well, exactly that’s the reason I think it’s a bug, because there is no src-nat rule in place. I’m using masquerade only:

/ip/firewall/nat/print
Flags: X - disabled, I - invalid; D - dynamic
 0    ;;; masquerade
      chain=srcnat action=masquerade out-interface-list=wan log=no log-prefix=""

It’s not a bug, it works as designed. There you have NAT/masquerade.

In my experience the masquerade part of conntrack works correctly and purges the appropriate entries.

I’m not trying to be dismissive, but I would assume that there is something going on other than (or besides) what you are describing. Maybe the wg tunnels are not actually initiated on the VMs? How exactly are you getting a new address - is it during renew? The list could go on, but I’m only guessing. Please provide more information.

No. See the documentation: NAT - RouterOS - MikroTik Documentation

I’m not trying to be dismissive, but I would assume that there is something going on other than (or besides) what you are describing. Maybe the wg tunnels are not actually initiated on the VMs? How exactly are you getting a new address - is it during renew? The list could go on, but I’m only guessing. Please provide more information.

Since there is no way to initiate the wg connection from the outside (there are no DNAT rules configured), the VMs initiate the connection. A new IP address is usually assigned during maintenance windows of my ISP, so the renew fails and the router gets a new lease with a new IP address from a different DHCP server.
If I can provide any more information or config, please let me know.

Exactly! Do so. A full config would be the first step. (Redact private stuff as needed.)

There you go. I left as much as possible in there, but removed some sections, namely WiFi (for CAPsMAN), DHCP static leases and firewall filter rules. The router itself has a wg interface as well, which connects to the exact same peer as the VMs. This tunnel works without issues after a new public IP address is assigned, which is another indication that the problem is related to the SNAT connection tracking table.


# 2025-03-27 15:21:43 by RouterOS 7.18.1
# software id = M4MX-WM6K
#
# model = RB5009UPr+S+
# serial number = XXXXXXXXX
/interface bridge
add name=bridge port-cost-mode=short vlan-filtering=yes
/interface ethernet
set [ find default-name=ether1 ] comment=trunk poe-out=off
set [ find default-name=ether2 ] comment=ap02
set [ find default-name=ether3 ] poe-out=off
set [ find default-name=ether4 ] poe-out=off
set [ find default-name=ether5 ] poe-out=off
set [ find default-name=ether6 ] poe-out=off
set [ find default-name=ether7 ] poe-out=off
set [ find default-name=ether8 ] comment=oob poe-out=off
set [ find default-name=sfp-sfpplus1 ] auto-negotiation=no speed=\
    1G-baseT-full
/interface wireguard
add listen-port=13231 mtu=1420 name=wg_ruth
/interface vlan
add comment=empera interface=sfp-sfpplus1 name=vl-empera vlan-id=751
add comment=lan interface=bridge name=vl-lan vlan-id=245
add comment=mgmt interface=bridge name=vl-mgmt vlan-id=240
add comment=srv interface=bridge name=vl-srv vlan-id=248
add comment=tr_gw interface=bridge name=vl-tr_gw vlan-id=1000
add comment=util interface=bridge name=vl-util vlan-id=241
add comment=wlan interface=bridge name=vl-wlan vlan-id=246
/interface list
add name=wan
add name=oob
add name=tr_gw
add name=wlan
add name=mgmt
add name=lan
add name=util
add name=srv
/ip dhcp-server
add interface=vl-mgmt name=mgmt
add interface=vl-util name=util
/ip pool
add name=dhcp-oob ranges=192.168.0.10-192.168.0.200
add name=dhcp-wlan ranges=10.24.6.20-10.24.6.50
add name=dhcp-lan ranges=10.24.5.20-10.24.5.50
/ip dhcp-server
add address-pool=dhcp-oob interface=ether8 name=oob
add address-pool=dhcp-wlan interface=vl-wlan name=wlan
add address-pool=dhcp-lan interface=vl-lan name=lan
/queue type
add kind=cake name=cake
/queue tree
add disabled=yes max-limit=118M name=queue-upload packet-mark=no-mark parent=\
    vl-empera queue=cake
add disabled=yes max-limit=298M name=queue-download packet-mark=no-mark \
    parent=bridge queue=cake
/system logging action
add name=log01 remote=10.24.1.31 syslog-time-format=iso8601 target=remote
/interface bridge port
add bridge=bridge interface=ether2 internal-path-cost=10 path-cost=10 pvid=\
    240
add bridge=bridge frame-types=admit-only-untagged-and-priority-tagged \
    interface=ether3 internal-path-cost=10 path-cost=10 pvid=245
add bridge=bridge frame-types=admit-only-untagged-and-priority-tagged \
    interface=ether4 internal-path-cost=10 path-cost=10 pvid=245
add bridge=bridge frame-types=admit-only-untagged-and-priority-tagged \
    interface=ether5 internal-path-cost=10 path-cost=10 pvid=245
add bridge=bridge frame-types=admit-only-untagged-and-priority-tagged \
    interface=ether6 internal-path-cost=10 path-cost=10 pvid=245
add bridge=bridge frame-types=admit-only-untagged-and-priority-tagged \
    interface=ether7 internal-path-cost=10 path-cost=10 pvid=245
add bridge=bridge frame-types=admit-only-vlan-tagged interface=ether1 \
    internal-path-cost=10 path-cost=10
/ip neighbor discovery-settings
set discover-interface-list=oob
/interface bridge vlan
add bridge=bridge comment=tr_gw tagged=bridge,ether1 vlan-ids=1000
add bridge=bridge comment=lan tagged=bridge,ether1 vlan-ids=245
add bridge=bridge comment=mgmt tagged=bridge,ether1 vlan-ids=240
add bridge=bridge comment=wlan tagged=bridge,ether1,ether2 vlan-ids=246
add bridge=bridge comment=util tagged=bridge,ether1 vlan-ids=241
add bridge=bridge comment=srv tagged=bridge,ether1 vlan-ids=248
/interface list member
add interface=vl-empera list=wan
add interface=ether8 list=oob
add interface=vl-tr_gw list=tr_gw
add interface=vl-wlan list=wlan
add interface=vl-mgmt list=mgmt
add interface=vl-lan list=lan
add interface=vl-util list=util
add interface=vl-srv list=srv
/interface wireguard peers
add allowed-address=10.1.1.0/24,10.25.0.0/16 endpoint-address=X.X.X.X \
    endpoint-port=21337 interface=wg_ruth name=ruth persistent-keepalive=10s \
    public-key="XXXXXXXXX"
/ip address
add address=192.168.0.1/24 comment=oob interface=ether8 network=192.168.0.0
add address=10.24.255.1/28 interface=vl-tr_gw network=10.24.255.0
add address=10.1.1.2/24 interface=wg_ruth network=10.1.1.0
add address=10.24.6.1/24 interface=vl-wlan network=10.24.6.0
add address=10.24.0.1/24 interface=vl-mgmt network=10.24.0.0
add address=10.24.5.1/24 interface=vl-lan network=10.24.5.0
add address=10.24.1.1/24 interface=vl-util network=10.24.1.0
add address=10.24.8.1/24 interface=vl-srv network=10.24.8.0
/ip dhcp-client
add !dhcp-options interface=vl-empera
/ip dhcp-server network
add address=10.24.0.0/24 comment=mgmt dns-server=10.24.1.101,10.24.1.102 \
    domain=lodex.io gateway=10.24.0.1 ntp-server=10.24.0.1
add address=10.24.1.0/24 comment=util dns-server=10.24.1.101,10.24.1.102 \
    domain=lodex.io gateway=10.24.1.1 ntp-server=10.24.1.1
add address=10.24.5.0/24 comment=lan dns-server=10.24.1.101,10.24.1.102 \
    domain=lodex.io gateway=10.24.5.1 ntp-server=10.24.5.1
add address=10.24.6.0/24 comment=wlan dns-server=10.24.1.101,10.24.1.102 \
    gateway=10.24.6.1 ntp-server=10.24.6.1
add address=192.168.0.0/24 comment=oob dns-server=192.168.0.1 gateway=\
    192.168.0.1 ntp-server=192.168.0.1
/ip dns
set allow-remote-requests=yes mdns-repeat-ifaces=vl-util,vl-lan,vl-wlan
/ip firewall nat
add action=masquerade chain=srcnat comment=masquerade out-interface-list=wan
/ip ipsec profile
set [ find default=yes ] dpd-interval=2m dpd-maximum-failures=5
/ip route
add comment="route to ph.lodex.io" disabled=no dst-address=10.25.0.0/16 \
    gateway=10.1.1.4
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set api disabled=yes
set api-ssl disabled=yes
/ip ssh
set always-allow-password-login=yes strong-crypto=yes
/system clock
set time-zone-name=Europe/Berlin
/system identity
set name=gw.lodex.io
/system leds settings
set all-leds-off=after-1h
/system logging
add action=log01 topics=info
add action=log01 topics=wireless,debug
add action=log01 topics=dhcp,debug
/system note
set show-at-login=no
/system ntp client
set enabled=yes
/system ntp server
set enabled=yes use-local-clock=yes
/system ntp client servers
add address=pool.ntp.org
add address=0.de.pool.ntp.org
add address=1.de.pool.ntp.org
add address=2.de.pool.ntp.org
add address=3.de.pool.ntp.org
/system routerboard settings
set auto-upgrade=yes
/tool bandwidth-server
set enabled=no
/tool mac-server
set allowed-interface-list=oob
/tool mac-server mac-winbox
set allowed-interface-list=oob

Could you also share the sniffed packets as received and transmitted by the rb5009, both on the external and internal interface?

You don’t have any firewall rules?

Looked it up amongst the kernel patches. For some time period the behavior was indeed to purge the connections if the address changed (pr added/removed). This was reverted, and it is not the behavior any more. The purging of the entries only happens on link down. (At least for the 5.6 that Mikrotik bases v7 on. There were some other helper-related fixes later on, such as in 5.14, but even if those were backported, they would not really help you.)

Through a quick Google search I’ve found several pieces of software impacted by this, and the foolproof solution is to purge the entries on triggers from the DHCP client (dhcpcd and the like), so basically on-lease scripts.

So: works as intended.

I do, I just removed them from the output.


Wow, looks like you dug really deep, thank you very much! I will open a support ticket to get an official confirmation, so the incorrect statement can be removed from the documentation.

Since you’re seeing something that seems to completely contradict the documentation, I actually prototyped the issue fully. (I drank too much coffee and have too much time on my hands.)

In this case of course I control the DHCP server.

When I force the client to renew (and of course it’s NAK’d, and instead gets a different address), the conntrack entry is not cleared, and its reply-dst is unchanged. Of course it can’t usefully forward traffic, however as long as it’s receiving packets from the side behind the NAT, it keeps renewing the timeout.

If I force the output interface down, the conntrack entry is purged. (Regardless of whether there is any IP change later when it’s brought up again.)
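
You can watch this directly in the tracking table: the reply-dst-address of the entry keeps pointing at the old public IP. A sketch, again filtering on an assumed WG endpoint port of 21337:

    # inspect the stale entry; reply-dst-address still shows the old public IP
    /ip firewall connection print detail where dst-address~":21337"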

Could a firewall rule in the forward chain dropping invalid traffic help?

I have such a rule in place:

[itadm@gw.lodex.io] /ip/firewall/filter> export where chain=forward
# 2025-03-27 23:47:25 by RouterOS 7.18.1
# software id = M4MX-WM6K
#
# model = RB5009UPr+S+
# serial number = XXXXXXX
/ip firewall filter
add action=fasttrack-connection chain=forward comment=fasttrack connection-state=established,related hw-offload=yes
add action=drop chain=forward comment="drop invalid" connection-state=invalid
add action=accept chain=forward comment="accept established,related, untracked" connection-state=established,related,untracked
add action=accept chain=forward comment="any -> any | ICMP" protocol=icmp
add action=jump chain=forward comment="jump to 'wan-forward'" in-interface-list=wan jump-target=wan-forward
add action=jump chain=forward comment="jump to 'oob-forward'" in-interface-list=oob jump-target=oob-forward
add action=jump chain=forward comment="jump to 'mgmt-forward'" in-interface-list=mgmt jump-target=mgmt-forward
add action=jump chain=forward comment="jump to 'lan-forward'" in-interface-list=lan jump-target=lan-forward
add action=jump chain=forward comment="jump to 'wlan-forward'" in-interface-list=wlan jump-target=wlan-forward
add action=jump chain=forward comment="jump to 'util-forward'" in-interface-list=util jump-target=util-forward
add action=jump chain=forward comment="jump to 'srv-forward'" in-interface-list=srv jump-target=srv-forward
add action=jump chain=forward comment="jump to 'tr_gw-forward'" in-interface-list=tr_gw jump-target=tr_gw-forward
add action=jump chain=forward comment="jump to 'wg_ruth-forward'" in-interface=wg_ruth jump-target=wg_ruth-forward
add action=drop chain=forward comment="drop everything else" log=yes log-prefix="[FORWARD-DEFAULT-DROP]"

But just out of curiosity: How would the firewall identify the traffic to be invalid in such a case?

It’s considered “established”.

Next up will be to disable fasttrack. Sounds good, doesn’t work.

src-nat is applied in postrouting, i.e. after the firewall filter, so dropping the traffic with filter rules won’t work. The raw table only has “prerouting” and “output” chains, but something in “postrouting” would be needed.

Quick Update: Opened a support ticket and waiting for a reply.