Hi,
I set up an IPsec/IKEv2 tunnel some months ago between two office branches. It works well, so PCs at one branch can connect to network shares and servers at the other.
Nevertheless, from time to time the tunnel drops the connection and I have to restart it (rebooting both routers usually resolves it; I have not tried other ways).
My question is: why does this happen and how can I avoid it? Is there any keep-alive mechanism for this kind of tunnel that restarts it by itself even if one of the sites goes down for a while? I have been searching for solutions online, but most of them are not clear to me (I am quite new to Mikrotik and only use the graphical interface, not scripting).
Are there other kinds of tunnels or VPNs that are more geared towards an always-on or fault-tolerant scenario? Perhaps an OpenVPN site-to-site?
There must be some root cause behind both the failures of the tunnel and its inability to re-establish autonomously. Most of the devices I’m running IKEv2 tunnels between restart quite frequently due to regional specifics and the fact that none of them is on a UPS, and all my tunnels re-establish automatically if any of the devices restarts. The longest uptimes I’ve found right now were above 2 weeks.
By default, RouterOS checks every 2 minutes that the peer is alive and automatically re-establishes the tunnel if it is not; the re-establishment attempts continue forever. So it seems that something in the network between the two devices prevents the tunnel from re-establishing. One model situation is when multiple local peers of the same remote peer sit behind a NATing device and that device restarts. In this case the external ports at the NAT device may get swapped, causing the NAT at the remote end to mix the streams together, so the responses arrive at the wrong peer. This may or may not be your case depending on the network topology (individual clients at one site connecting to the remote site could cause this).
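For reference, the peer liveness check mentioned above is the dead peer detection (DPD) setting of the IPsec profile. A minimal sketch of how to inspect and change it from the terminal, assuming RouterOS 6.44 or later (where these settings are under /ip ipsec profile) and assuming the peer uses the profile named default; the 30s value is only an illustrative choice, not a recommendation:

```
# Show the DPD settings (dpd-interval, dpd-maximum-failures) of all IPsec profiles
/ip ipsec profile print
# Assumption: the peer uses the profile called "default".
# Probe the remote peer every 30 seconds instead of the default 2 minutes.
/ip ipsec profile set [find name=default] dpd-interval=30s dpd-maximum-failures=5
```

A shorter interval only makes a dead peer get noticed sooner; it does not help if the re-establishment attempts themselves fail because of something in the network between the two ends.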
So when analysing, you have to take into account the overall topology of the network, not just the two Mikrotik devices.
Other types of VPN will likely suffer from the same effects and root causes. OpenVPN in particular is feature-limited in RouterOS 6: it can only use TCP.
If I restart one of the sides of the tunnel, the tunnel restarts. The problem is that sometimes, for no apparent reason, the tunnel drops the connection and is not re-established automatically, so I must restart the Mikrotiks by hand.
I am not sure I understand the possible reason you are suggesting. I have my mikrotiks DMZed and all PCs connect through mikrotiks… Should I observe something else?
The DMZ functionality may be implemented in many ways in the LAN->WAN direction. In the WAN->LAN direction, a DMZ is always a 1:1 dst-nat that doesn’t care about ports and protocols. But if there is more than one device on the LAN, the src-nat functionality must still permit N:1 translation so that all the devices can access the internet, and src-nat implementations use different strategies for allocating ports at the public (outside) address. So if the Mikrotik at the local end initiates a new connection towards the Mikrotik at the remote end, the remote end may not see it coming from source port 4500. To check that, use ip firewall connection print detail where dst-address~":4500" while the IPsec connection is up. If the assumption above is correct, the end where the IP address part of the dst-address is the local one (i.e. the responder) will show a different port part of the src-address, example from the real world (the public addresses are not the real ones I use):
To see something useful in the log as @erlinden suggests, you have to do the following once the tunnel gets stuck, ideally at both ends simultaneously:
/system logging add topics=ipsec,!packet
to activate detailed logging of IPsec;
/log print follow-only file=ipsec-struggling where topics~"ipsec"
to start logging the IPsec messages into a file, as the standard log buffers are quite small.
After about two minutes, you should have enough data, so you can press Ctrl-C to break the logging, download the files and start analysing them.
Sorry, I missed the last two replies… I am attaching the output files here, both conf and log. I have already forwarded UDP ports 500 and 4500 to my Mikrotiks (there are no other elements between the company router and the Mikrotik).
Hope this helps; I cannot see anything in the logs (not trained yet).
Another thing that puzzles me… should I set default gateway on LAN devices to the mikrotik or to the company router? In either case, what could be the implications (if any) to the NAT mechanism?