NAT masquerade skipping some packets?

We have a RouterOS 6.40.4 [CHR] router that appears to masquerade the vast majority of outbound packets but misses some now and then, for reasons that are unclear. We detected this because the router in question connects to our AS edge router, which has firewall rules to ensure that all packets heading out of the AS have a permissible source IP address (i.e. an IP within our allocated prefixes).

Does anyone have some pointers as to why this might be occurring?

Do you have an out-interface assigned in your masquerade config? If connections are not lost when you bypass the firewall, then the problem is in forwarding these packets back from the firewall device.

It is my experience as well! The cause appears to be down in the Linux kernel, as other devices running Linux tend to do the same thing.
The most common case occurs at the end of a TCP session. When the TCP session is closed (FIN/FIN ACK), the NAT connection tracking entry is immediately deleted.
However, when the other side has not received the FIN ACK and retransmits the FIN, the upper layer answers with an RST, and that RST is not translated because it does not match the “new” filter.

This exhibits itself as 2 different problems:

  • untranslated traffic sometimes being routed
  • when you have a strict connection tracking firewall that allows all established/related and some new traffic, then drops or rejects everything else with logging, you will get TCP FIN and RST packets logged in the firewall that are completely innocent (they belong to successfully accepted connections)

To fix this, TCP connection tracking entries should get an extra timer to keep them lingering after connection close, maybe like 10 or 30 seconds, to catch any FIN re-transmits and RST responses.
However, that probably would have to go via the Linux developers to get it in the main kernel sources. It could be that this already has happened and a new kernel version would fix it, but I have no indications of that.
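
To make the idea concrete, here is a toy Python model of the behaviour described above. It is only a sketch under my own assumptions, not kernel or RouterOS code: the `ToyNat` class, its `linger` knob and the example addresses are all made up for illustration. With `linger=0` the late RST leaves with its private source address; with a 30 second linger it is still translated.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ToyNat:
    """Toy source-NAT with connection tracking (illustrative only)."""
    public_ip: str
    linger: float = 0.0  # hypothetical: seconds an entry survives after a FIN
    table: dict = field(default_factory=dict)  # flow -> expiry time

    def send(self, flow, flags, now):
        """Return the source IP the packet leaves with."""
        # Remove the entry once its post-close linger time has run out.
        if flow in self.table and now >= self.table[flow]:
            del self.table[flow]
        # Only a SYN may create a new entry (the "new" state).
        if flow not in self.table:
            if flags != {"SYN"}:
                return flow[0]  # untracked packet: not translated
            self.table[flow] = math.inf
        # Closing the session starts the linger timer.
        if "FIN" in flags:
            self.table[flow] = now + self.linger
        return self.public_ip


def replay(linger):
    """Replay SYN, FIN/ACK and a late RST through the toy NAT."""
    nat = ToyNat(public_ip="203.0.113.1", linger=linger)
    flow = ("10.0.0.5", "198.51.100.7")  # (private source, destination)
    packets = [(0.0, {"SYN"}), (1.0, {"FIN", "ACK"}), (1.5, {"RST"})]
    return [nat.send(flow, flags, t) for t, flags in packets]


print(replay(0.0))   # entry removed at close: the RST keeps its private source
print(replay(30.0))  # entry lingers: the RST is still translated
```

The model ignores ports, replies and real TCP state; it only shows why a short linger after close would catch the stray FIN retransmits and RST responses.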

Thanks - that’s exactly what I suspected and was hoping to get confirmation of. It does appear there is a NAT table issue caused by the premature removal of TCP sessions, specifically affecting ACK FIN, ACK FIN PSH and ACK RST packets… Although the end user impact is minimal, it is still a little annoying, because our AS border router detects that these packets do not have a valid source IP address within the AS, which feeds into Graylog counters so we can see abnormalities and resolve issues proactively. I might need to design a filter in Graylog to ignore these packets, but I hope that in the future this won’t be necessary.
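
For what it's worth, the decision logic of such a filter could look roughly like the Python sketch below. This is not Graylog syntax, just the idea: the prefixes are placeholder documentation ranges standing in for our real allocations, and `classify` and its labels are names I made up.

```python
import ipaddress

# Placeholder prefixes (documentation ranges), standing in for the AS allocations.
ALLOWED_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]
# Flags seen on the stray packets at the end of a TCP session.
TEARDOWN_FLAGS = {"FIN", "RST"}


def classify(src_ip, tcp_flags):
    """Return 'ok', 'teardown-leak' or 'alert' for one logged packet."""
    addr = ipaddress.ip_address(src_ip)
    if any(addr in net for net in ALLOWED_PREFIXES):
        return "ok"  # source is inside our prefixes
    if TEARDOWN_FLAGS & set(tcp_flags):
        return "teardown-leak"  # probably an untranslated FIN/RST
    return "alert"  # anything else still deserves a look


print(classify("192.0.2.10", ["SYN"]))
print(classify("10.0.0.5", ["ACK", "FIN"]))
print(classify("10.0.0.5", ["SYN"]))
```

Only the `teardown-leak` class would be excluded from the counters, so a genuinely spoofed or untranslated SYN would still show up.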

We are on 6.40.4 too. Running Torch on our border router, I can sporadically see untranslated IP addresses from neighboring routers. I cannot see much, because we use strict rp-filter.