L2TP over IPSEC disconnecting repeatedly

Hi, I have two sites: one with a MikroTik as L2TP server and one with a MikroTik as L2TP client. My L2TP client keeps disconnecting after a few minutes, and I have to add the routes again every time. How can I solve this?

Maybe you have set up a default route via the L2TP link that becomes active when your link has been established?
In that case you should also set a specific route for the L2TP server itself in the client router (pointing to the ISP).
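A minimal sketch of such a host route on the client router. Both addresses are placeholders chosen for illustration: 203.0.113.10 stands for the public IP of the L2TP server and 198.51.100.1 for the ISP gateway of the client router.

```
# Assumption: 203.0.113.10 = public IP of the L2TP server (placeholder)
# Assumption: 198.51.100.1 = ISP gateway of this client router (placeholder)
# The /32 route keeps the tunnel's own packets on the ISP link,
# even when a default route via the L2TP interface is active.
/ip route add dst-address=203.0.113.10/32 gateway=198.51.100.1 comment="L2TP transport via ISP"
```

Because the /32 is more specific than the default route, the encapsulated L2TP/IPsec packets should keep leaving via the ISP while everything else follows the tunnel.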

Yes, I have already set up all of the routes. I have just found a workaround, but I don’t know whether it is a recommended solution: I disabled the keepalive timeout in both the L2TP server and the client. Is that recommended? I ask because I think that if at some moment there are no packets, the link will be disconnected automatically.

By the way, this is a site-to-site connection. The L2TP client is the Mikrotik in the branch office, and the L2TP server is the Mikrotik in the central office.

No it is not a recommended solution. It is recommended to find the root cause of the problem.
I use L2TP/IPsec with keepalive for extended periods of time without any problem. So there has to be some issue.
Are there more L2TP connections than this one? E.g. from users at the branch office? Or other end-users that have an account at the same ISP as the branch office?

There is only one L2TP connection in the branch office, which connects to the central office as L2TP client.

In the central office, I have two L2TP connections: one as a server and one as a client to connect to the other central office.

Will that be a problem?

There is a problem when you run 2 L2TP/IPsec connections over the same NAT.
Not sure if this is happening here. When your central office is on a static IP with the MikroTik directly on that external IP (which is not in one of the private ranges) and not another router between the MikroTik and internet, it should just work.
When it does not, try to debug what is going wrong.

Ok. I think I have found the solution. The actual problem is that every time the VPN connection is lost, I have to add the routes manually again. I just have to add dynamic routes to the VPN profile, so that every time it disconnects and reconnects, the routes are recreated automatically.
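One possible way to get routes that come back by themselves, sketched under assumptions because the post does not show the actual configuration: the account name `branch`, the interface name `l2tp-out1` and all subnets and addresses below are hypothetical.

```
# On the L2TP server: attach the branch subnet to the client's PPP secret.
# Format of "routes" is "dst-address gateway metric".
# Assumptions: secret name "branch", branch LAN 192.168.20.0/24,
# client tunnel address 10.255.0.2 (all placeholders).
/ppp secret set [find name="branch"] routes="192.168.20.0/24 10.255.0.2 1"

# On the L2TP client: point the route at the tunnel interface by name,
# so it is only active while the interface is up.
# Assumptions: interface "l2tp-out1", central LAN 192.168.10.0/24.
/ip route add dst-address=192.168.10.0/24 gateway=l2tp-out1
```

This only removes the manual re-routing step; it does not explain why the tunnel drops in the first place.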

I think the problem is that the L2TP server disconnects when no packet is received within the keepalive timeout. Is that right?

Is it OK to run an L2TP server and an L2TP client at the same time on one Mikrotik?

That’s a good question, because both the server and the client use UDP port 1701. I’d assume this to be handled automagically by a common stack being used for both the server and client role. Unfortunately I don’t have enough boxes handy to check at the moment.

Issues definitely exist when two L2TP connections encrypted using IPsec and terminated on the same device pass through the same NAT on the remote side like this:


                                                                           _______________
                                                                          |               |
                                                                    |-----| L2TP endpoint |
 _______________               ____               ______________    |     |_______________|
|               | public IP A (    ) public IP B |              |   |
| L2TP endpoint |-------------(    )-------------| WAN(NAT) LAN |---|
|_______________|             (____)             |______________|   |      _______________
                                                                    |     |               |
                                                                    |-----| L2TP endpoint |
                                                                          |_______________|

In this case, the traffic selector of the IPsec security policies at the L2TP endpoint running at public IP A cannot distinguish the two remote endpoints’ UDP sockets from each other as they have the same IP address (public IP B) and the same port (1701). And it doesn’t matter whether the endpoint at public IP A acts as L2TP client or L2TP server or both. Details and solution can be found here.

For that I always just use BGP.
Set up BGP at each end (give one of them a different AS number, e.g. one higher, configure the peer and the local subnets) and it will automatically exchange the routes.
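A rough sketch of what this could look like on one end, written in RouterOS v6 syntax (RouterOS v7 reorganised the BGP menus, so it would need adapting there). The AS numbers, tunnel addresses and subnet are assumptions for illustration only.

```
# Central office side.
# Assumptions: tunnel addresses 10.255.0.1 (local) and 10.255.0.2 (branch),
# private AS numbers 65001 and 65002, local LAN 192.168.10.0/24.
/routing bgp instance set default as=65001 router-id=10.255.0.1
/routing bgp peer add name=to-branch remote-address=10.255.0.2 remote-as=65002
/routing bgp network add network=192.168.10.0/24
```

The branch side would mirror this with the AS numbers and addresses swapped and its own LAN under `/routing bgp network`. When the tunnel drops, the BGP session goes down and the learned routes are withdrawn; they should return once the session is re-established.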

I think the problem is that the L2TP server disconnects when no packet is received within the keepalive timeout. Is that right?

No. A keepalive is a regular packet sent in each direction even when there is no traffic. When several keepalives in a row are not received, the connection is closed.
So a keepalive timeout points to a real problem, not just “there was no traffic”.
A common mistake is to add a default route over the VPN without having either a /32 route to the VPN server or policy routing (multiple routing tables).
When you add the default route, not only the user traffic but also the VPN traffic itself is routed via the VPN; this causes a loop and the VPN fails.

I forgot to mention earlier that I do have a Mikrotik that is set up both as L2TP client, to connect to the L2TP server at the head office, and as L2TP server, so that Android / iOS / other external devices can connect via VPN. These two use different IP addresses, so I think there should be no problem with the same Mikrotik acting as L2TP server and L2TP client. Right?

I’m beginning to suspect that it is an ISP problem.

Well, there is nothing in the parameters of /interface l2tp-server server nor of /interface l2tp-client which would allow you to bind them to only a particular local IP address, so the server listens on all of them and the client chooses one for connections to the remote server using normal routing procedures. So separating the L2TP server role from the L2TP client role by using different local IP addresses doesn’t seem possible to me; however, as said before, I assume the same process handles both the L2TP server and the L2TP client functionality, so it can deal with both roles simultaneously on the same socket address (ip:port).

But my picture above would apply e.g. to a situation where you would use your Android phone, connected to a WiFi in the Head Office, to connect to your home Mikrotik acting as L2TP client of the HO and as an L2TP server for the Android phone, because in this case, packets from both the phone and the HO’s L2TP server would arrive at your Mikrotik from the same public IP.

So which connection experiences interruptions? The one to HO (and if so, does it also happen when you disable the server at your end and keep only the client active)? Or the one from an external device?

Ok. To make it clear, let me explain my complete L2TP setup.

I have 3 Mikrotiks in total, distributed as follows: 1 Mikrotik located in the Head Office, 1 Mikrotik located in the Regional Office, and 1 Mikrotik located in the Branch Office.

The L2TP setup is as follows:

  1. In the Regional Office, I set up an L2TP server and an L2TP client. The L2TP server accepts connections from external devices (Android, PC, etc.). The L2TP client connects to the L2TP server located in the Head Office.
  2. In the Branch Office, I set up an L2TP client to connect to the L2TP server located in the Regional Office.

The problem is that the L2TP connection between the Regional Office and the Branch Office is often disconnected after a certain time, on average 30 minutes to 1 hour.

What is wrong here?

Are both the regional and the branch office routers directly on an external globally routed IP address?
Or is there some NAT in between, in your local setup or at the ISP?
(visible by having an external address like 100.64.x.x on your internet line)
Is it possible that your L2TP link fails at the moment when some VPN user decides to connect to the regional office?
Try using a different technology (like GRE) instead of L2TP for your statically configured link, leaving L2TP for the users only.
See if there is a difference.
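A minimal sketch of such a GRE link on the RO router. It assumes both routers have static public addresses; every address and the interface name below are placeholders, not values from this thread.

```
# On the RO router.
# Assumptions: 203.0.113.10 = RO public IP, 198.51.100.20 = BO public IP,
# 10.255.1.0/30 = unused transfer network for the tunnel (all placeholders).
/interface gre add name=gre-to-bo local-address=203.0.113.10 remote-address=198.51.100.20
/ip address add address=10.255.1.1/30 interface=gre-to-bo
```

The BO router would get the mirrored configuration (local and remote addresses swapped, 10.255.1.2/30 on its GRE interface). Note that plain GRE as sketched here is not encrypted, so for a fair comparison with L2TP/IPsec it would still need IPsec protection on top.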

Okay, that is finally enough to make a drawing and understand the topology.


So the other connection (RO client to HO server) works all the time, and so does the BO client to RO server, at least for the first 20 minutes. That proves that the client and the server at RO can coexist without problems, and you may stop thinking about that as a potential cause of the drops.

The IPsec and L2TP logs at both ends (RO, BO) should tell you what is going on. The most likely explanation in usual cases is a second client behind the same public IP, which is why I started from there. So any PC/Android connecting from the HO or BO network to RO’s server could cause the conflict if there was one.
If there is none, and the time is 30 minutes or 1 hour, it might be related to the lifetime of the IPsec security associations. Or it may be an ISP problem as you suspect. You need to read the logs to tell.

So switch the logging on (at both machines):
/system logging
add topics=ipsec,!packet
add topics=l2tp

Start logging into a file (several possibilities exist, this is just one of them) at both machines:
/log print follow-only file=l2tp-reconnect where topics~"ipsec|l2tp"

In another window, start pinging the address of the RO from the BO.

Then, wait for the reconnection to happen, then Ctrl-C the commands above.

If the ping shows more than 1% of packets lost, you can blame the internet path between BO and RO. If it doesn’t, download the files, make yourself a coffee and start analysing them.



may/12 06:01:04 l2tp,debug tunnel 61 received no replies, disconnecting 
may/12 06:01:04 l2tp,debug tunnel 61 entering state: dead 
may/12 06:01:04 l2tp,debug session 1 entering state: dead 
may/12 06:01:04 l2tp,ppp,debug <103.83.173.11>: LCP lowerdown 
may/12 06:01:04 l2tp,ppp,debug <103.83.173.11>: LCP closed 
may/12 06:01:04 l2tp,ppp,debug <103.83.173.11>: CCP lowerdown 
may/12 06:01:04 l2tp,ppp,debug <103.83.173.11>: BCP lowerdown 
may/12 06:01:04 l2tp,ppp,debug <103.83.173.11>: BCP down event in starting state

This is the log from the RO router. It says “tunnel 61 received no replies, disconnecting”. What does that mean?

By the way, this log was recorded when I connected to the RO router through the L2TP VPN from my home network. Can that cause the L2TP connection from BO to RO to be lost?

L2TP is permanently sending keepalive messages of several types, LCP EchoReq once every 30 seconds and HELLO once every minute, and if it doesn’t get a response to several retransmissions, it considers the connection broken and drops it. You should see these attempts a few lines higher in the log:

09:26:50 l2tp,debug,packet sent control message to re.mo.te.ip:1701 from lo.c.al.ip:1701
09:26:50 l2tp,debug,packet tunnel-id=5, session-id=0, ns=5, nr=7
09:26:50 l2tp,debug,packet (M) Message-Type=HELLO
09:26:50 l2tp,ppp,debug,packet <re.mo.te.ip>: sent LCP EchoReq id=0x7
09:26:50 l2tp,ppp,debug,packet <magic 0x3ce3903e>
09:26:51 l2tp,debug,packet sent control message to re.mo.te.ip:1701 from lo.c.al.ip:1701
09:26:51 l2tp,debug,packet tunnel-id=5, session-id=0, ns=5, nr=7
09:26:51 l2tp,debug,packet (M) Message-Type=HELLO
09:26:51 l2tp,ppp,debug <re.mo.te.ip>: LCP missed echo reply
09:26:51 l2tp,ppp,debug,packet <re.mo.te.ip>: sent LCP EchoReq id=0x8
09:26:51 l2tp,ppp,debug,packet <magic 0x3ce3903e>
09:26:52 l2tp,debug,packet sent control message to re.mo.te.ip:1701 from lo.c.al.ip:1701
09:26:52 l2tp,debug,packet tunnel-id=5, session-id=0, ns=5, nr=7
09:26:52 l2tp,debug,packet (M) Message-Type=HELLO
09:26:52 l2tp,ppp,debug <re.mo.te.ip>: LCP missed echo reply
09:26:52 l2tp,ppp,debug,packet <re.mo.te.ip>: sent LCP EchoReq id=0x9
09:26:52 l2tp,ppp,debug,packet <magic 0x3ce3903e>
09:26:53 l2tp,ppp,debug <re.mo.te.ip>: LCP missed echo reply
09:26:53 l2tp,ppp,debug,packet <re.mo.te.ip>: sent LCP EchoReq id=0xa
09:26:53 l2tp,ppp,debug,packet <magic 0x3ce3903e>
09:26:54 l2tp,debug,packet sent control message to re.mo.te.ip:1701 from lo.c.al.ip:1701
09:26:54 l2tp,debug,packet tunnel-id=5, session-id=0, ns=5, nr=7
09:26:54 l2tp,debug,packet (M) Message-Type=HELLO
09:26:54 l2tp,ppp,debug <re.mo.te.ip>: LCP missed echo reply
09:26:54 l2tp,ppp,debug,packet <re.mo.te.ip>: sent LCP EchoReq id=0xb
09:26:54 l2tp,ppp,debug,packet <magic 0x3ce3903e>
09:26:55 l2tp,ppp,debug <re.mo.te.ip>: LCP missed echo reply
09:26:55 l2tp,ppp,debug <re.mo.te.ip>: LCP lowerdown
09:26:55 l2tp,ppp,debug <re.mo.te.ip>: LCP closed
09:26:55 l2tp,ppp,debug <re.mo.te.ip>: CCP lowerdown
09:26:55 l2tp,ppp,debug <re.mo.te.ip>: BCP lowerdown
09:26:55 l2tp,ppp,debug <re.mo.te.ip>: BCP down event in starting state

However, there are no lines with “ipsec” topic in your log, so it is not clear whether the actual connectivity was broken or whether the IPsec layer had a problem.

Also, it is possible that transmission in only one direction was broken, so you have to check logs from both sides (on one side you would see requests being received and responses sent, on the other side only requests sent but no responses received).

IPsec uses “Dead Peer Detection” (DPD) to check whether the remote peer is up. If you haven’t changed the settings in the default peer profile, an R-U-THERE message is sent every 2 minutes, and if 5 consecutive messages go unanswered, the IPsec stack concludes that the peer is dead and tears down the connection. So if there is a hyperactive firewall somewhere between the RO and BO, you may want to reduce the dpd-interval in /ip ipsec profile or /ip ipsec peer profile (depending on your software version) from 2m to 30s. Most firewalls keep UDP and similar connections open for 2 minutes 30 seconds after the last packet seen in either direction, so if one DPD exchange is lost elsewhere, the pinhole closes and doesn’t let the subsequent DPD through.
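As a sketch, the DPD change described above could be applied to the default profile like this. Only one of the two commands will exist on a given router, depending on the software version; the value 5 for dpd-maximum-failures simply restates the default mentioned above.

```
# Versions that have the /ip ipsec profile menu:
/ip ipsec profile set [find default=yes] dpd-interval=30s dpd-maximum-failures=5

# Versions that still use the /ip ipsec peer profile menu:
/ip ipsec peer profile set [find default=yes] dpd-interval=30s dpd-maximum-failures=5
```

If the peers use a non-default profile, the `[find default=yes]` selector would have to be replaced by a match on that profile's name.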


As for whether your connection from home can break the BO link: that would be possible in two cases:

  • if your home and the BO shared the same internet connectivity with a single public IP address
  • if the Mikrotik in the BO and your L2TP VPN client at home shared the same PPP account at the RO
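To rule out the second case, each client can be given its own PPP account on the RO. A minimal sketch, with hypothetical account names and placeholder passwords:

```
# On the RO router: one secret per client (names and passwords are placeholders)
/ppp secret add name=branch-office service=l2tp password="CHANGE-ME-1"
/ppp secret add name=home-user service=l2tp password="CHANGE-ME-2"

# While both are connected, each should show up as its own entry
/ppp active print
```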

I had a similar problem, and in my case the cause was DHCP: it happened when the IP address lease expired.