Router Leaking Packets (ICMP) Marked for Wireguard Tunnel

blacksnow · June 24, 2023, 7:37pm

It seems that in RouterOS 7.10 stable, in certain situations ICMP packets (maybe others, but I didn’t see any) are being leaked and exit through the default WAN route instead of going through the Wireguard tunnel or being dropped. To see the issue you need a sniffer monitoring the output chain of the router: either put something upstream of the router, or add a rule that monitors the router’s output chain for protocol icmp.
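A minimal sketch of such a monitoring rule, assuming a plain firewall log rule is enough to watch the router's own ICMP. The log prefix is only a label chosen for illustration, and the rule has to sit above any rule that would drop the packets first:

```
/ip firewall filter
# Log every ICMP packet the router itself generates (output chain).
add action=log chain=output protocol=icmp log-prefix="icmp out"
```

Each log entry includes the out-interface, which shows whether the packet left via the WAN or via the tunnel.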

Scenario:
*Tested on CCR2216

Set up a WAN connection via the DHCP client on RouterOS.
Set up a wireguard tunnel to another location (a VPN service, another site, another router, whatever).
Masquerade that tunnel interface so multiple LAN clients can connect through and use it.
Have a LAN client connect to some target on the other end and start a download or some other long-running process that you can interrupt. (In my case I have some wifi clients connecting to the facebook app or downloading google play store updates.) I used policy routing rules to route the entire LAN subnet (where my wifi clients connect) into the wireguard tunnel, and added a 0.0.0.0/0 default route in a separate routing table.
Force disconnect the LAN clients (I switch the phone to airplane mode to drop it from the wifi).

Result:
Since the client has been forcefully disconnected, any in-transit packets that require a response are still flowing or trying to return. The router tries to respond with ICMP type 3 code 3 (port unreachable) or sometimes type 3 code 7 (destination host unknown), which makes sense. However, it sends the response over the default-route WAN connection instead of back through the wireguard tunnel to the intended target.

Outcome:
In some situations your privacy or anonymity can be at risk if these packets leak into the default route and then reach the target server. I don’t think this is what is expected to occur. All packets generated for, or as a result of, a client that was originally sending its traffic through the wireguard tunnel should go through that tunnel, or simply be dropped. In the meantime I have set up a block rule on the output chain of my router to prevent it from sending ICMP. This does not affect forwarding of ICMP from other clients on my network, which is good.

Notes:

Tested with Fasttrack both disabled and enabled, this didn’t make a difference.
Tested with L3HW off and on, didn’t make a difference. (I didn’t expect it to because the packets are going to CPU anyway for a wireguard tunnel).
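A sketch of the kind of block rule described above, not necessarily the exact rule in use. It assumes an interface list named WAN exists (as in the default configuration) and that only the WAN side needs blocking:

```
/ip firewall filter
# Drop ICMP generated by the router itself when it would leave via a WAN interface.
# Forwarded ICMP from LAN clients passes through the forward chain and is not affected.
add action=drop chain=output protocol=icmp out-interface-list=WAN comment="block router-generated ICMP on WAN"
```

Note that a rule this broad also stops the router's own pings out of the WAN.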

msatter · June 24, 2023, 10:16pm

Very interesting, could you test this in Mangle if the output counter increases?

/ip/firewall/mangle
add action=mark-packet chain=input in-interface-list=WireGuard new-packet-mark=encrypted passthrough=yes
add action=passthrough chain=output out-interface-list=!WireGuard packet-mark=encrypted passthrough=yes

I have an interface list named WireGuard here; you most likely have a single WireGuard interface, so use that in in-interface/out-interface instead.

I think that ICMP response by the router is not packet marked so this may not work.

Update: I could not reproduce your problem, and maybe that is because all my WireGuard traffic is connection-marked and routing-marked. The packet mark is extra.

Update 2: if your clients create their own tunnels, then the ICMP packet is not leaking when the tunnel is terminated, because the tunnel outside is.

blacksnow · June 24, 2023, 11:56pm

I also suspected that the policy rule (shown below) probably isn’t doing the same thing as connection marking through mangle but that seems odd. Wouldn’t they work in the same way? I mean to steer all traffic RouterOS likely is attaching an internal connection-mark to packets coming from the specified interface and then it knows to route them through the particular target interface. Based on some additional testing, I agree, I think the packet that is generated from the router must be some other kind of process that doesn’t get marked by connection-marking. In the output logs, the packet is marked as “related” when it is sent by the router, so clearly it knows it was generated in response to some other packet that originally came from inside the wireguard tunnel. I suspect at this point there is some code missing to route it back through the same route it came from.

/routing rule
add action=lookup-only-in-table disabled=no interface="General Wifi" table=mullvad_general

anav · June 25, 2023, 12:41am

What do you mean by interface? What are you trying to send through wireguard, a subnet?

msatter · June 25, 2023, 12:42am

I have the same rule, and yours routes traffic coming in through WiFi. Output is local, so you have to match on the connection-mark you set. In Mangle output you can route it or kill it by setting TTL to 0.

blacksnow · June 25, 2023, 12:59am

I have a single wireguard address, so I’m masquerading that interface “WG1” with a src-nat rule. I have X clients on my LAN network that connect through wifi, and I want to route all their traffic through the WG1 interface. The policy rule simply looks up and routes all traffic according to the mullvad_general table, which holds just a default route 0.0.0.0/0 with the WG1 interface as its next hop (gateway). Everything here works as expected. The issue arises when the router responds to traffic after a client has forcefully disconnected or there is an interruption: the ICMP it sends back to the target goes out through the default route in the main table instead of through the expected WG1 interface.
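The setup described above can be sketched roughly as follows in RouterOS 7 syntax. The names WG1, "General Wifi" and mullvad_general come from this thread; the rest is an assumed minimal layout, not a copy of the actual configuration:

```
/routing table
# Separate routing table used only for the tunnel.
add name=mullvad_general fib

/ip route
# The only route in that table: everything goes to the WireGuard interface.
add dst-address=0.0.0.0/0 gateway=WG1 routing-table=mullvad_general

/routing rule
# Traffic arriving on the wifi interface is looked up only in the tunnel table.
add action=lookup-only-in-table interface="General Wifi" table=mullvad_general

/ip firewall nat
# All clients share the single WireGuard address.
add action=masquerade chain=srcnat out-interface=WG1
```

The routing rule matches on the incoming interface, so it applies to forwarded client traffic. Packets the router generates itself have no incoming interface and are looked up in the main table.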

blacksnow · June 25, 2023, 1:08am

With regard to your approach, are you saying I need to do something like the below?

Keep policy routing rule.
Add mangle rule that marks incoming packets coming into the input chain of the router that originated from the WG tunnel with an action of mark-packet.
Add a mangle rule on the output chain that routes marked packets back into the WG tunnel?

Your suggestion makes sense and I will try it.

Regarding the current functionality of the router, I guess it makes sense that this is happening: technically the router only uses the main table for its own routes, and when it receives an action like “respond with ICMP” it is no longer subject to the policy routing rule that routes traffic from the wifi interface. For these particular situations I would prefer to either have an option for the router not to respond with ICMP for packets that originate from a wireguard connection, or have it automatically send the response back through the wireguard tunnel it is acting on behalf of (best approach).

blacksnow · June 25, 2023, 1:48am

So after adding some more logging, the issue cannot be fixed by mangle rules, other than by just dropping ICMP that originates from the router. What happens is that before the packet reaches the invalid state (at which point it is dropped by the appropriate rules) it remains in a related state, and this triggers the router to respond via ICMP. To be clear, I originally thought the router was responding to an ICMP packet, but the actual situation is that a flow of TCP packets that cannot be delivered makes the router generate an ICMP packet back to the source, letting it know the packet cannot be delivered. So the original issue remains: the router tries to respond appropriately (this is the expected process) to a packet that has been natted and cannot be delivered, with variations of ICMP type 3. As seen below, it tries to respond to the source IP (52.50.114.126) and even seems to know that it needs to pass through the WG tunnel (10.65.230.150), but ends up sending it through the default WAN interface (WAN 4000).

icmp out output: in:(unknown 0) out:WAN (4000), connection-state:related,snat proto ICMP (type 3, code 1), WAN-IP->52.50.114.126, NAT (10.0.8.4->10.65.230.150)->52.50.114.126, len 80

wiseroute · June 25, 2023, 2:56am

hello blacksnow,

To be clear, I originally thought the router was responding to an ICMP packet, but the actual situation is that a flow of TCP packets that cannot be delivered makes the router generate an ICMP packet back to the source, letting it know the packet cannot be delivered.

it (the icmp 3 generation) depends on remote server requirements - the server needs to know the status of the remote device accessing its service/ports (e.g. keep-alives during a file transfer, or the tcp flags from remote clients) in order to close its open connection. so when the server senses it isn’t getting the ack or fin flags, it raises the question to the offending ip - the remote device - with an icmp message (which unfortunately is your router’s masquerade ip). the router doesn’t have the logic to return it back to the source if conntrack is disabled (since the router still has that phone’s macaddr in the table). so the router looks like it is blatantly forwarding unknown traffic everywhere.

the key : conntrack, and remote service requirements (keep alive, status flags etc)

hope this helps.

anav · June 25, 2023, 10:23am

Mixing users’ routing within a subnet is never clean.
Suggest reserving one SSID (one vlan) for those wishing to go out via mullvad,
instead of identifying X out of the total Y users on a vlan (SSID) by firewall address list and mangling them to go out via mullvad.

msatter · June 25, 2023, 1:18pm

So the returning packet is flagged related, and you can combine that with the fact that it is generated by the router itself on behalf of the disconnected client. Then you can kill that packet before it returns anywhere. Routing in Mangle is not available on output so that avenue is closed.

/ip/firewall/mangle
add action=change-ttl chain=output comment="kill related out ICMP packets" icmp-options=3:0-255 new-ttl=set:0 passthrough=no protocol=icmp

Please test this line if it really stops those ICMP packets going out the WAN.

Optionally, narrow the rule by adding a check on the connection mark, or even on whether the packet is marked encrypted. You can also narrow the icmp-options to 3:7.

ps. filtering on related is not needed because it is always related when the router answers on behalf.

Normal output: https://help.mikrotik.com/docs/display/ROS/Packet+Flow+in+RouterOS#PacketFlowinRouterOS-Output

Encrypted output starting at K and loops one time. To me, your logging does not show that the packet is encrypted.

Update: not knowing whether the traffic is encrypted or not, I made the rule kill only those packets that are not going out through the VPN. You can also narrow it by reversing it to only filter WAN. [Post-routing] traffic is encrypted, blue lines, so the rule underneath is obsolete.

add action=change-ttl chain=postrouting comment="Kill related ICMP OUT" icmp-options=3:0-255 log=yes log-prefix=KillRelatedOUT new-ttl=set:0 \
    out-interface=!wireguard passthrough=no protocol=icmp

wiseroute · June 25, 2023, 1:50pm

@ msatter

ps. filtering on related is not needed because it is always related when the router answers on behalf.

agreed. related means conntrack should be enabled. otherwise we will see huge amount of alien Traffic.

but. why do we bother to stop the end result of a stale connection (outgoing to any other link than the one they were coming from) - if we can filter it using -m state --state INVALID,UNTRACKED at inbound traffic on the tunnel?

msatter · June 25, 2023, 2:42pm

@wiseroute thanks, and then invalid will catch that. Untracked is traffic that is not present in connection tracking.

add action=change-ttl chain=prerouting connection-state=invalid,untracked in-interface=wireguard log=yes log-prefix=KillInvalidInWG new-ttl=set:0 passthrough=no

invalid - a packet that does not have a determined state in connection tracking (usually - severe out-of-order packets, packets with wrong sequence/ack number, or in case of a resource over usage on the router), for this reason, an invalid packet will not participate in NAT (as only connection-state=new packets do), and will still contain original source IP address when routed. We strongly suggest dropping all connection-state=invalid packets in firewall filter forward and input chains

Reading this I think this is not the silver bullet, because the incoming traffic can be valid and have an active connection-tracking entry. Only when the router wants to push the traffic to the client does the client say: I won’t accept it, because I don’t have an active memory of that connection.

So the invalid + untracked limits but does not account for all traffic. This is my thought on that.

ps. the default configuration already has a drop rule for invalid packets in the filter input chain.

wiseroute · June 25, 2023, 3:22pm

@ msatter,

Reading this I think this is not the silver bullet, because the incoming traffic can be valid and have an active connection-tracking entry. Only when the router wants to push the traffic to the client does the client say: I won’t accept it, because I don’t have an active memory of that connection.

absolutely

but this,

So the invalid + untracked limits but does not account for all traffic. This is my thought on that.

so, afaik, and what I always did to make the work easier… i always taught myself there are 2 modes of firewalling:

passive firewall, which is the easiest for me to calculate the results, example:

-m state ! --state RELATED,ESTABLISHED -j DROP

drop is the key in passive Firewall. no hassle. no back info. just drop. efficient.

active/aggressive firewall, i don’t like it - it’s cpu and bandwidth intensive. example:

-m state ! --state RELATED,ESTABLISHED -j REJECT

that reject will send reply to remote server - which could be anything from host unreachable, tcp reset etc.

so, in this very topic - i think it is obvious that either the @OP has no firewall at all on the tunnel - or… the @OP set the firewall to reject mode, hence the router activity for stale connections.

just a thought. nice discussion btw👍🏻

blacksnow · June 25, 2023, 4:26pm

So I do have a firewall on all interfaces, and specifically these WG tunnels are treated as WAN connections in terms of my rules because they can reach the outside internet. I have invalid packets dropped at the very top of my rule list (rule 1 & 2) for both the input chain as well as the forward chain. This works because I see the counters increasing and I’ve logged/tested. But with regards to the WG tunnel, the connection is not marked as invalid right away because remember when you force disconnect you do not send a FIN packet so the NAT entry is not closed. Only when the server tries to continue sending TCP packets that don’t get ACK in return then it says “hey are you there at all”. This is when the connection gets closed by the server end and the result is the NAT entries are deleted, then the packets are marked as invalid and dropped as expected. So the issue is really in between the connection closing and the NAT entry being deleted, the router responds with ICMP.

With regards to the solutions presented, they are in one form or another just the same as adding a firewall filter rule that blocks ICMP from the router (so output chain) going out the WAN interface. You can do this a variety of ways but the result is the same the packets are dropped. My question and this thread revolves around should this be the behavior of the router, should there be an option to not respond to ICMP on behalf of a client that was traversing through a WG tunnel or should there be the capability to route the ICMP automatically back through the associated WG tunnel.

wiseroute · June 25, 2023, 4:57pm

hello blacksnow,

So the issue is really in between the connection closing and the NAT entry being deleted, the router responds with ICMP.

yes. but this chicken-and-egg thing could overlap: who will close the door first, the server or the router?

suppose the remote server’s https keep-alive is 5 seconds before it closes the connection; that may not be enough for the router to detect the absence of your phone’s macaddr in the table.

My question and this thread revolves around should this be the behavior of the router, should there be an option to not respond to ICMP on behalf of a client that was traversing through a WG tunnel or should there be the capability to route the ICMP automatically back through the associated WG tunnel.

as i said in my previous post, your router doesn’t have the logic to send back that alien traffic unless you create the routing and connection marks for the wg tunnel.

mikrotik wiki has that very nice pcc how to example

if that pcc doesn’t work for your icmp 3,

you can create 3 layers of firewall in drop mode - not reject mode. because action=reject will still process any stale connection.

(put your wg accept rules before these lines)

you already did masquerade on the wg tunnel - so let’s put one filter on this natted interface:

prerouting -m state ! --state RELATED,ESTABLISHED -j DROP

extra layer on wg input

input -i wg -m state ! --state RELATED,ESTABLISHED -j DROP

extra layer on wg forward

forward -i wg -o any -m state ! --state RELATED,ESTABLISHED -j DROP

let us see how it works

good luck
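For readers on RouterOS, the input and forward layers above could be sketched like this. It is a rough translation, not a tested configuration: WG1 is the tunnel interface name used elsewhere in this thread, and the accept-then-drop pair stands in for the negated state match:

```
/ip firewall filter
# Input layer: accept only established/related traffic arriving from the tunnel, drop the rest.
add action=accept chain=input connection-state=established,related in-interface=WG1
add action=drop chain=input in-interface=WG1
# Forward layer: same idea for traffic passing through the router.
add action=accept chain=forward connection-state=established,related in-interface=WG1
add action=drop chain=forward in-interface=WG1
```

The prerouting layer has no direct equivalent in this sketch, because the raw prerouting chain runs before connection tracking and the mangle table has no drop action. These rules also act only on input and forward, not on packets the router generates in the output chain.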

blacksnow · June 25, 2023, 5:36pm

So I tested adding a connection-mark, similar to the load-balancing setups and to what you all have been suggesting, and you are correct: with connection-marks set up, the router routes accordingly, assuming you place the appropriate rules to route specific connection-marks coming off the router’s output chain into the appropriate tunnel. I’m somewhat annoyed that I have to set up connection-marks in addition to policy rules when I want to steer all traffic coming from a particular interface, including any related traffic generated on behalf of those clients. I still think this should be handled automatically if one chooses to use policy rules rather than the mangle approach. Both should essentially do the same thing: attach a connection-mark and use that to steer the packets into the right place. I even verified that connection-marks remain on the connection when the router is responding via ICMP on behalf of a client, so it only makes sense that the policy rules should be able to route that traffic as well.

output: in:(unknown 0) out:WAN (4000), connection-mark:wg_mark connection-state:related,snat proto ICMP (type 3, code 1), WAN-IP->142.250.27.188, NAT (10.0.8.254->10.65.230.150)->142.250.27.188, len 576
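A minimal sketch of the connection-mark approach, assuming the names used in this thread (wg_mark, "General Wifi", mullvad_general) and that the policy routing rule stays in place for forwarded traffic. The exact rules used in the test are not shown in the thread, so treat this as one plausible layout:

```
/ip firewall mangle
# Mark new connections that enter from the wifi clients.
add action=mark-connection chain=prerouting in-interface="General Wifi" connection-state=new new-connection-mark=wg_mark passthrough=yes
# Packets the router generates itself for such a connection (for example ICMP type 3)
# carry the same connection mark, so send them to the tunnel routing table as well.
add action=mark-routing chain=output connection-mark=wg_mark new-routing-mark=mullvad_general passthrough=no
```

The routing mark only works if the mullvad_general table exists and contains a default route via the tunnel interface.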

Summary:
Simply using policy routing rules will steer traffic in 99.9% of cases. However, the router will respond with ICMP on behalf of a client (due to disconnection or otherwise), and unless the packets have been marked specifically using mangle rules (connection marking), the router will not know how to route the ICMP packets it generates back through the appropriate tunnel. The solution is either to set up connection marks and appropriate routes, or to drop all ICMP traffic coming off the router’s output chain. For more granularity, you can use connection marks and drop only ICMP traffic that carries the WG mark coming off the output chain, if you choose not to route it. This still seems like a bug/missing functionality to me, and I’m hoping a Mikrotik engineer can take a look and weigh in.
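The more granular drop option might look like the sketch below, assuming the connection mark is called wg_mark and an interface list named WAN exists:

```
/ip firewall filter
# Drop only router-generated ICMP that belongs to a tunnel-marked connection
# and is about to leave through a WAN interface.
add action=drop chain=output protocol=icmp connection-mark=wg_mark out-interface-list=WAN
```

Unlike a blanket ICMP drop on the output chain, this leaves the router's other ICMP traffic on the WAN untouched.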

blacksnow · June 26, 2023, 12:20am

Just had a situation where TCP and UDP packets were leaking. I’m going to switch my setup to use mangle rather than policy routing, I would recommend the same for anyone else unless you want leaked packets.

blacksnow · June 26, 2023, 1:15am

My addition to completely resolve the issue and maintain other desired functionality. Added at the very top of my filter list.

/ip firewall filter
add action=drop chain=output comment="Drop outgoing related router packets." connection-state=related out-interface-list=WAN

msatter · June 26, 2023, 8:53am

Until now I have not had any related traffic going out, so I cannot test it here. Can you put this log rule in and check whether it detects the same packets? It sits in postrouting, which allows any last-minute routing adjustment to be done first. It only looks at traffic coming from your router itself (10.0.8.254 ?).

/ip firewall mangle
add action=passthrough chain=postrouting connection-state=related log=yes log-prefix=DetectRelated out-interface-list=WAN src-address=10.0.8.254

Traffic that is correctly rerouted at the last minute is not detected, but traffic that wants to exit through the WAN is detected and can, after changing the rule, also be halted.

But your choice to halt all answers on behalf is also good.