IPSec drops and requires reboot

Hi all,

I have an RB750 with GRE/IPSec tunnels to two other Routerboards, using the default IPsec configuration.

Most of the time this works fine, and it always works after a clean boot. But if the uplink goes down and then comes back, about 25% of the time both IPSec connections die and nothing I do brings them back until I reboot the Routerboard, at which point everything returns to normal. Everything else keeps working, but the IPSec connections stick at “Ready to send” and will not establish a phase 1 link even if I flush the active connections, disable and re-enable the links, or delete and re-create them. Rebooting fixes it 100% of the time.

Strangely I have the same setup on a number of other Routerboards and they do not experience the same issue.

I am on software version v6.48.2.

Any tips?.. our customer is not pleased…


The connections are dynamic so the only relevant config is:

/ip ipsec profile
set [ find default=yes ] dh-group=modp2048 dpd-interval=10s dpd-maximum-failures=3 enc-algorithm=aes-256 hash-algorithm=sha256
/ip ipsec proposal
set [ find default=yes ] auth-algorithms=sha256 enc-algorithms=aes-256-cbc pfs-group=modp2048

You auto-diagnose yourself.
End of help.

Well, that’s the output of /ip ipsec export

Is there further config you think could be causing the issue?..

When you say you flushed the connection, may I assume you mean the “Installed SAs” under IPSec? Have you tried flushing the connections in “Connection Tracking” under firewall (both ends)?

That was the Installed SAs and the Active Connection list.
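For anyone following along in the terminal rather than WinBox, those two flushes correspond roughly to the commands below. This is only a sketch for RouterOS v6, and it assumes that the “Active Connection list” means the IPsec Active Peers list; check the menu paths against your own version before relying on it.

```
# Remove all installed SAs so new ones have to be negotiated
/ip ipsec installed-sa flush

# Drop the active peer sessions so phase 1 starts from scratch
/ip ipsec active-peers kill-connections
```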

Today the same thing happened but on a different router (different location, different IPSec endpoints, different physical routerboard) - which had been running fine for 2-3 months until this morning.

I lost 5 different GRE/IPSec tunnels and a manually configured set of IPSec connections to different places, all at once, after the WAN link dropped and reconnected. Several hours later it was still not reconnecting, and nothing I tried brought it back (including removing the connections under connection tracking this time!), so I had to resort to rebooting the Routerboard.

I saw lots of “[remote IP] retransmitted packet” showing it was receiving IPSec traffic from the various peers, but somehow it had got itself into a state whereby it would not reconnect.


This is now getting us in real trouble with the client - has anybody else seen this happen!?

It might be a shot in the dark, but I’ve experienced similar issues with ipsec until today - dropping ipsec connections (active peer state message 2 sent).

RB3011 / 6.48.3 / 3 peers (2 by IP address and 1 by IP Cloud DNS name).

I noticed an active peer entry showing the IP address of peer_a but carrying the identity comment of peer_b, along with “[ip peer_a] parsing packet failed, possible cause: wrong password” messages in syslog.

Fixed it by using ip address instead of ip cloud address (mynetname) in peer_a.


rextended,

if you have nothing useful to add or are not willing to help at all, please refrain from replying.

I asked about the connection tracking, because I used to have similar mystery problems until I removed the IPsec protocol and the IPsec specific UDP ports (500 and 4500) from connection tracking in the Raw firewall tab (prerouting and output chains). It stabilized our tunnels tremendously.

Is your MikroTik router directly on an internet connection with the public IP address appearing on the MikroTik?
Or is it behind some ISP-provided router with or without “port forwarding”, “setting as DMZ”, etc?
Such configurations often cause problems with IPsec.

If you are directly connected, then it could indeed be a firewall issue: there is a bug in GRE tracking.

Do any of these help?

To provide some answers to posters:

  1. I am using IP addresses and not cloud DNS for the peers, so I don’t believe that is the issue

  2. All the routers on the network have real IP addresses and are not behind NAT.

I’m interested in the “removing from connection tracking” thing - how did you achieve this?
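
The raw firewall approach described earlier in the thread (taking IPsec and UDP ports 500 and 4500 out of connection tracking in the prerouting and output chains) could look something like the sketch below. It is not the poster's actual config: it assumes that “the IPsec protocol” means ESP, and the comments are only illustrative.

```
/ip firewall raw
# Traffic arriving at the router: IKE / NAT-T and ESP
add chain=prerouting action=notrack protocol=udp port=500,4500 comment="notrack IKE/NAT-T in"
add chain=prerouting action=notrack protocol=ipsec-esp comment="notrack ESP in"

# Traffic generated by the router itself
add chain=output action=notrack protocol=udp port=500,4500 comment="notrack IKE/NAT-T out"
add chain=output action=notrack protocol=ipsec-esp comment="notrack ESP out"
```

Untracked packets no longer match connection-state=established or related, so the filter rules need to accept them explicitly, for example with connection-state=untracked or with plain protocol/port accept rules in the input chain.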