Problem with l2tp over LTE

Hi,

I cannot figure out what I am doing wrong. The L2TP over IPsec VPN connects over the ISP link, but not after failover to LTE.

Topology:
Site A:
hAP ax2 (RouterOS v7) => dual-WAN failover, ISP and LTE with different route distances. When the ISP is down it fails over to LTE.

  • ISP side => dhcp-client
  • LTE side => wAP ac LTE6 (RouterOS v6), connected to the hAP ax2 over a /30 subnet.
  • Has 2x l2tp-clients, one for each CCR on Site B
  • A single UTP cable goes to a switch where the hAP ax2, ISP and LTE are all connected, so they are separated with VLANs.

Site B:
2x CCR1016 (RouterOS v6) =>

  • 2 different ISPs
  • Each CCR connects to 1 ISP and has a static public IP assigned from a separate /29 subnet.
  • Each CCR acts as an l2tp server


Problem Description:
When Site A has internet over ISP, both l2tp-clients connect without problem. But when I unplug the ISP so that it fails over to LTE, one l2tp-client works and the other doesn’t connect.
On the CCR side of the failing l2tp-client I get:
l2tp,info “first L2TP UDP packet received from XX.XX.XX.XX”
l2tp,error “L2TP connection rejected no IPsec encryption while it was required”

Since I cannot share the full export, only specific parts of it, could you please give me an idea of how to troubleshoot this?
I can only assume that this has to do with an MTU problem.

Generally I haven’t noticed any other problem; clients on Site A have internet connectivity in both cases.

Let me know what info you need me to provide.

Thank you.

Unless the log message is incorrect (which is not very likely), it clearly states that the initial L2TP packet has arrived directly in plaintext rather than encrypted using IPsec. So the IPsec configuration at the client side seems to be unable to deal with the dual WAN failover.

Something like this cannot be debugged remotely without seeing the complete configuration, but if you obfuscate all the public subnets, usernames and any kind of passwords and secrets in the configuration export, it should be safe to post it. Just do the obfuscation in such a way that addresses in the same subnet stay in the same subnet, so if you have e.g. 197.36.25.42/24 and gateway 197.36.25.1, obfuscate that as a.a.a.42/24 and a.a.a.1.

How do you configure the IPsec layer, just by specifying the ipsec-secret and setting use-ipsec to yes in the L2TP client configuration? If so, bear in mind that the IPsec configuration is created dynamically, taking into account the route to the server used at the moment of creation, which also determines the source address that will be used for the L2TP transport packets, and the local-address of the dynamically created IPsec peer is set accordingly. So if the routing changes, the L2TP transport packets stop matching the traffic selector of the IPsec transport policy linked to the previous local-address and leak unencrypted.
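A minimal sketch of how one could check this on the client router, assuming RouterOS v7 console syntax (the commands only read state and change nothing). The idea is to compare the local address of the dynamically created peer and policy with the source address the currently active default route would give:

```text
# Dynamic entries created by use-ipsec=yes are flagged D in the output.
# Look at local-address on the peer and src-address/dst-address on the policy.
/ip ipsec peer print detail
/ip ipsec policy print detail
/ip ipsec active-peers print detail

# Which default route is active right now (flag A), and via which gateway
/ip route print detail where dst-address=0.0.0.0/0
```

If the policy still carries the address of the previous uplink after a failover, the L2TP packets sent from the new uplink's address would not match it.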

Upon further investigation yesterday, I realized a few more things.

  1. The l2tp-client doesn't connect after a failover in either direction (ISP->LTE or LTE->ISP).
  2. If left untouched after a failover, the l2tp-client keeps failing no matter how long I wait. The moment I manually clear the connections with the l2tp-server as dst-address (in reality only traffic for ports 500, 1701 and 4500), it connects successfully.
  3. The connection I notice when the problem happens is a src-natted connection to the l2tp-server on port 1701.
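For reference, the manual clearing mentioned in point 2 can be done from the console roughly like this. AA.AA.AA.99 is the obfuscated server address used in this thread, so it has to be replaced with the real one:

```text
# List tracked connections towards the l2tp-server
/ip firewall connection print detail where dst-address~"AA.AA.AA.99"

# Remove them, so the next packets create fresh connection-tracking entries
/ip firewall connection remove [find where dst-address~"AA.AA.AA.99"]
```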

Notes: The LTE router is a wAP. Its internal address in this example is 10.12.98.1, which appears in the static route. The ISP gateway in this example is 10.1.1.10/24.

Export of hAP ax2 where the l2tp-clients reside (part of it).

```text
/interface bridge
add arp=proxy-arp name=br.LAN protocol-mode=none vlan-filtering=yes
/interface ethernet
set [ find default-name=ether1 ] advertise=1G-baseT-half,1G-baseT-full \
    comment=WAN
set [ find default-name=ether2 ] comment="trunk to switch"
/interface l2tp-client
add connect-to=AA.AA.AA.99 disabled=no max-mru=1500 max-mtu=1500 \
    name=l2tp_AAAA use-ipsec=yes user=REDACTED
add connect-to=BB.BB.BB.138 disabled=no max-mru=1500 max-mtu=1500 \
    name=l2tp_BBBB use-ipsec=yes user=REDACTED
/interface vrrp
add group-authority=self interface=br.LAN interval=3s name=vrrp.LAN \
    on-backup="REDACTED" on-master="REDACTED" priority=127 vrid=98
/interface vlan
add interface=br.LAN name=vlan11.ISP vlan-id=11
add interface=br.LAN name=vlan12.LTE vlan-id=12
/interface bonding
add mode=active-backup name=bond.LAN primary=ether3 slaves=ether2,ether3
/interface list
add name=intVPN
add name=WANs
/ip ipsec proposal
set [ find default=yes ] enc-algorithms=aes-128-cbc
/ppp profile
set *FFFFFFFE use-encryption=required
/interface bridge port
add bridge=br.LAN interface=bond.LAN
add bridge=br.LAN interface=ether1 pvid=11
/ip firewall connection tracking
set udp-timeout=10s
/interface bridge vlan
add bridge=br.LAN comment=ISP tagged=br.LAN untagged=ether1 vlan-ids=11
add bridge=br.LAN comment=LTE tagged=br.LAN,ether1 vlan-ids=12
add bridge=br.LAN comment=MAIN untagged=bond.LAN,br.LAN vlan-ids=1
/interface list member
add interface=l2tp_AAAA list=intVPN
add interface=l2tp_BBBB list=intVPN
add interface=vlan11.ISP list=WANs
add interface=vlan12.LTE list=WANs
/ip address
add address=10.0.98.252/24 interface=br.LAN network=10.0.98.0
add address=10.0.98.254 interface=vrrp.LAN network=10.0.98.254
add address=10.12.98.2/29 interface=vlan12.LTE network=10.12.98.0
/ip dhcp-client
add add-default-route=no comment=test interface=vlan11.ISP
/ip firewall nat
add action=masquerade chain=srcnat out-interface-list=WANs
/ip route
add comment="backup route LTE" disabled=no distance=5 dst-address=0.0.0.0/0 \
    gateway=10.12.98.1 routing-table=main scope=30 suppress-hw-offload=no \
    target-scope=10
add comment="main route ISP" disabled=no distance=1 dst-address=0.0.0.0/0 \
    gateway=10.1.1.10 routing-table=main scope=30 suppress-hw-offload=no \
    target-scope=10 vrf-interface=vlan11.ISP
```

Script that runs every 30 seconds to fail over (it changes the distance of the main ISP route from 1 to 10 and vice versa).
As you can see, I have also experimented with deleting the connections programmatically during the script, but somehow that src-natted connection on port 1701 still shows up sometimes.

```text
:global MainRouteEnabled;
:global isVRRPmaster [/interface/vrrp/get [find where name~"LAN"] master]
:local l2tpAddress1 [/interface l2tp-client get 0 connect-to ]
:local l2tpAddress2 [/interface l2tp-client get 1 connect-to ]

:if ( [ /ping 8.8.8.8 interface=vlan11.ISP count=5  ] = 0 ) do={
  :set MainRouteEnabled "false";
  :if ( [ /ip route get [ find where comment~"ISP" ] distance ] = 1 ) do={
    /interface l2tp-client disable [find];
    :delay 2s;
    /ip route set [ find where comment~"ISP" ] distance=10;
    :log error "MainRouteEnabled changed to FALSE";
    :log error "PINGs FAILED";
    :if ($isVRRPmaster) do={
      :delay 8s;
#      /ip firewall connection remove [find where dst-address~[:get $l2tpAddress1]]
#      /ip firewall connection remove [find where dst-address~[:get $l2tpAddress2]]
      :delay 10s;
      /interface l2tp-client enable [find];
    }
  }
} else={
  :set MainRouteEnabled "true";
  :if ( [ /ip route get [ find where comment~"ISP" ] distance ] = 10 ) do={
    /interface l2tp-client disable [find];
    :delay 2s;
    /ip route set [ find where comment~"ISP" ] distance=1;
    :log error "MainRouteEnabled changed to TRUE";
    :log info "PINGs OK";
    :if ($isVRRPmaster) do={
      :delay 8s;
#      /ip firewall connection remove [find where dst-address~[:get $l2tpAddress1]]
#      /ip firewall connection remove [find where dst-address~[:get $l2tpAddress2]]
      :delay 10s;
      /interface l2tp-client enable [find];
    }
  }
}
```

Print connections on various states.

```text
#############################################
# WITH PROBLEM - BEFORE CLEARING CONNECTION #
#############################################
[admin@Mikrotik_M1] > interface/l2tp-client/print
Flags: X - disabled; R - running
 0    name="l2tp_AAAA" max-mtu=1500 max-mru=1500 mrru=disabled connect-to=AA.AA.AA.99 user="REDACTED" password="REDACTED" profile=default-encryption keepalive-timeout=60
      use-peer-dns=no use-ipsec=yes ipsec-secret="REDACTED" allow-fast-path=no add-default-route=no dial-on-demand=no allow=pap,chap,mschap1,mschap2 l2tp-proto-version=l2tpv2
      l2tpv3-digest-hash=md5

 1  R name="l2tp_BBBB" max-mtu=1500 max-mru=1500 mrru=disabled connect-to=BB.BB.BB.138 user="REDACTED" password="REDACTED" profile=default-encryption keepalive-timeout=60
      use-peer-dns=no use-ipsec=yes ipsec-secret="REDACTED" allow-fast-path=no add-default-route=no dial-on-demand=no allow=pap,chap,mschap1,mschap2 l2tp-proto-version=l2tpv2
      l2tpv3-digest-hash=md5

[admin@Mikrotik_M1] > ip firewall/connection/print detail where dst-address~"AA.AA.AA.99"
Flags: E - expected; S - seen-reply; A - assured; C - confirmed; D - dying; F - fasttrack; H - hw-offload; s - srcnat; d - dstnat
 0  SAC      protocol=udp src-address=10.12.98.2:500 dst-address=AA.AA.AA.99:500 reply-src-address=AA.AA.AA.99:500 reply-dst-address=10.12.98.2:500 timeout=2m55s orig-packets=150
             orig-bytes=66 000 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=151 repl-bytes=42 992 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps
             repl-rate=0bps

 1  SAC      protocol=udp src-address=10.12.98.2:4500 dst-address=AA.AA.AA.99:4500 reply-src-address=AA.AA.AA.99:4500 reply-dst-address=10.12.98.2:4500 timeout=2m58s orig-packets=392
             orig-bytes=61 037 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=162 repl-bytes=35 748 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=4.5kbps
             repl-rate=0bps

 2  SAC s    protocol=udp src-address=10.12.98.2:1701 dst-address=AA.AA.AA.99:1701 reply-src-address=AA.AA.AA.99:1701 reply-dst-address=10.12.98.2:55619 timeout=2m57s orig-packets=692
             orig-bytes=58 610 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=459 repl-bytes=28 736 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps
             repl-rate=0bps
```

```text
#############################
# AFTER CLEARING CONNECTION #
#############################
[admin@Mikrotik_M1] > interface/l2tp-client/print 
Flags: X - disabled; R - running 
 0  R name="l2tp_AAAA" max-mtu=1500 max-mru=1500 mrru=disabled connect-to=AA.AA.AA.99 user="**REDACTED**" password="**REDACTED**" profile=default-encryption keepalive-timeout=60 
      use-peer-dns=no use-ipsec=yes ipsec-secret="**REDACTED**" allow-fast-path=no add-default-route=no dial-on-demand=no allow=pap,chap,mschap1,mschap2 l2tp-proto-version=l2tpv2 
      l2tpv3-digest-hash=md5 

 1  R name="l2tp_BBBB" max-mtu=1500 max-mru=1500 mrru=disabled connect-to=BB.BB.BB.138 user="**REDACTED**" password="**REDACTED**" profile=default-encryption keepalive-timeout=60 
      use-peer-dns=no use-ipsec=yes ipsec-secret="**REDACTED**" allow-fast-path=no add-default-route=no dial-on-demand=no allow=pap,chap,mschap1,mschap2 l2tp-proto-version=l2tpv2 
      l2tpv3-digest-hash=md
	  

[admin@Mikrotik_M1] > ip firewall/connection/print detail where dst-address~"AA.AA.AA.99"
Flags: E - expected; S - seen-reply; A - assured; C - confirmed; D - dying; F - fasttrack; H - hw-offload; s - srcnat; d - dstnat 
 3  SAC      protocol=udp src-address=10.12.98.2:4500 dst-address=AA.AA.AA.99:4500 reply-src-address=AA.AA.AA.99:4500 reply-dst-address=10.12.98.2:4500 timeout=2m58s orig-packets=82 
             orig-bytes=14 984 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=81 repl-bytes=14 796 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps 
             repl-rate=0bps 

 4  SAC      protocol=udp src-address=10.12.98.2:1701 dst-address=AA.AA.AA.99:1701 reply-src-address=AA.AA.AA.99:1701 reply-dst-address=10.12.98.2:1701 timeout=2m58s orig-packets=75 
             orig-bytes=10 362 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=75 repl-bytes=10 255 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps 
             repl-rate=0bps
```

```text
#####################
# FAILOVER ISP->LTE #
#####################

Before Failover

[admin@Mikrotik_M1] > ip firewall/connection/print detail where dst-address~"AA.AA.AA.99"
Flags: E - expected; S - seen-reply; A - assured; C - confirmed; D - dying; F - fasttrack; H - hw-offload; s - srcnat; d - dstnat
 6  SAC      protocol=udp src-address=10.1.1.200:4500 dst-address=AA.AA.AA.99:4500 reply-src-address=AA.AA.AA.99:4500 reply-dst-address=10.1.1.200:4500 timeout=2m59s orig-packets=90
             orig-bytes=15 521 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=78 repl-bytes=13 737 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=1280bps
             repl-rate=0bps

 7  SAC      protocol=udp src-address=10.1.1.200:1701 dst-address=AA.AA.AA.99:1701 reply-src-address=AA.AA.AA.99:1701 reply-dst-address=10.1.1.200:1701 timeout=2m59s orig-packets=81
             orig-bytes=10 448 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=71 repl-bytes=9 362 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=872bps
             repl-rate=0bps

During Failover - Performing disable/enable l2tp-client Interface => These are connections while l2tp DISABLED

[admin@Mikrotik_M1] > ip firewall/connection/print detail where dst-address~"AA.AA.AA.99"
Flags: E - expected; S - seen-reply; A - assured; C - confirmed; D - dying; F - fasttrack; H - hw-offload; s - srcnat; d - dstnat
 6  C s      protocol=udp src-address=10.1.1.200:4500 dst-address=AA.AA.AA.99:4500 reply-src-address=AA.AA.AA.99:4500 reply-dst-address=10.12.98.2:57297 timeout=8s orig-packets=1
             orig-bytes=124 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=0 repl-bytes=0 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps

 8  C s      protocol=udp src-address=10.1.1.200:1701 dst-address=AA.AA.AA.99:1701 reply-src-address=AA.AA.AA.99:1701 reply-dst-address=10.12.98.2:1701 timeout=9s orig-packets=1
             orig-bytes=64 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=0 repl-bytes=0 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps

During Failover - Performing disable/enable l2tp-client Interface => These are connections while l2tp DISABLED after some seconds

[admin@Mikrotik_M1] > ip firewall/connection/print detail where dst-address~"AA.AA.AA.99"
Flags: E - expected; S - seen-reply; A - assured; C - confirmed; D - dying; F - fasttrack; H - hw-offload; s - srcnat; d - dstnat
 8  C s      protocol=udp src-address=10.1.1.200:1701 dst-address=AA.AA.AA.99:1701 reply-src-address=AA.AA.AA.99:1701 reply-dst-address=10.12.98.2:1701 timeout=9s orig-packets=4
             orig-bytes=256 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=0 repl-bytes=0 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=512bps repl-rate=0bps

During Failover - Performing disable/enable l2tp-client Interface => These are connections while l2tp ENABLED

Notice the src-natted traffic for port 1701

[admin@Mikrotik_M1] > ip firewall/connection/print detail where dst-address~"AA.AA.AA.99"
Flags: E - expected; S - seen-reply; A - assured; C - confirmed; D - dying; F - fasttrack; H - hw-offload; s - srcnat; d - dstnat
 8  C s      protocol=udp src-address=10.1.1.200:1701 dst-address=AA.AA.AA.99:1701 reply-src-address=AA.AA.AA.99:1701 reply-dst-address=10.12.98.2:1701 timeout=4s orig-packets=4
             orig-bytes=256 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=0 repl-bytes=0 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps

 9  SAC      protocol=udp src-address=10.12.98.2:500 dst-address=AA.AA.AA.99:500 reply-src-address=AA.AA.AA.99:500 reply-dst-address=10.12.98.2:500 timeout=6s orig-packets=2
             orig-bytes=880 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=2 repl-bytes=568 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps

10  SAC      protocol=udp src-address=10.12.98.2:4500 dst-address=AA.AA.AA.99:4500 reply-src-address=AA.AA.AA.99:4500 reply-dst-address=10.12.98.2:4500 timeout=7s orig-packets=4
             orig-bytes=688 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=2 repl-bytes=472 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps

11  SAC s    protocol=udp src-address=10.12.98.2:1701 dst-address=AA.AA.AA.99:1701 reply-src-address=AA.AA.AA.99:1701 reply-dst-address=10.12.98.2:57535 timeout=9s orig-packets=25
             orig-bytes=1 694 orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=21 repl-bytes=1 288 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=9.6kbps
             repl-rate=7.2kbps
```

You rely on too many things that happen automatically but are not prepared for your way of handling the WAN failover.

When you enable the L2TP interface with an attached IPsec configuration, it takes the connect-to address and finds the currently active route to it and the corresponding out-interface and the address attached to it. Then it creates an IPsec peer with a transport mode policy between the address found this way as the source one and the connect-to address from the L2TP configuration as the destination one for the peer and the policy. And apparently when you change the priorities of the routes, that change does not trigger a reset of those choices. So on top of removing the connections, your script should also disable and re-enable the L2TP connections, to make them go through that process of selection of local address to use and creating the IPsec configuration for it again. Disabling and re-enabling the L2TP will, however, not remove the connections immediately, and they may interfere, so you have to do both, preferably in the “disable the L2TP, remove connections, re-enable L2TP” order. But you would not need to remove the connections if you exempted them from getting masqueraded.
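The “disable the L2TP, remove connections, re-enable L2TP” order could be sketched like this, to run once the route distance has been changed. This is only an outline under the assumption that every l2tp-client on the router should be cycled; the two addresses are the obfuscated ones from this thread and must be replaced with the real server addresses:

```text
# 1. stop the tunnels, so the IPsec configuration gets rebuilt on re-enable
/interface l2tp-client disable [find]

# 2. drop the stale connection-tracking entries towards both servers
/ip firewall connection remove [find where dst-address~"AA.AA.AA.99"]
/ip firewall connection remove [find where dst-address~"BB.BB.BB.138"]

# 3. bring the tunnels back up; the local address is now picked
#    from the route that is active after the failover
/interface l2tp-client enable [find]
```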

But unless you are really counting every single byte that passes through the LTE, if I were you, I would create all 4 possible L2TP over IPsec tunnels permanently, and set the routes through all of them with the required priorities (distances) respecting your order of preference of use of the uplinks at the client side (where clearly the “wired” one is preferred) and at the server side. Since the L2TP client interface goes down once keepalive stops getting through, the route through that interface becomes inactive and the one with next lowest distance comes into use. That way, only the maintenance traffic would flow for each tunnel not in use.
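As a rough illustration of the routing part only: the interface names and the remote subnet below are hypothetical (they do not appear in the posted configuration), and the sketch does not show how each tunnel is pinned to a particular uplink.

```text
# Hypothetical names: one permanent l2tp-client per uplink/server combination.
# 192.168.200.0/24 stands in for whatever lies behind Site B.
/ip route
add dst-address=192.168.200.0/24 gateway=l2tp_AAAA_isp distance=1
add dst-address=192.168.200.0/24 gateway=l2tp_BBBB_isp distance=2
add dst-address=192.168.200.0/24 gateway=l2tp_AAAA_lte distance=3
add dst-address=192.168.200.0/24 gateway=l2tp_BBBB_lte distance=4
```

When a tunnel interface goes down, its route becomes inactive and the one with the next lowest distance takes over, without any script.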

@nsarant; This is OT and I’m not trying to hijack this thread. The suggestion below doesn’t really fix your current issue with your own script-based failover solution; it is rather an alternative way to solve it:

Set up a separate tunnel (of any type) for each WAN connection like Sindy explained and then let OSPF handle the routing instead of the scripts. If you also enable BFD with OSPF you can achieve fast failover in just a few milliseconds (adjustable). OSPF with BFD is very easy to set up between just two sites.

Thank you both for the answers.

After testing, it did work once I prevented the traffic from the l2tp-client towards the l2tp-server’s public address from being masqueraded, just like @sindy mentioned!
Thank you for pointing me in the right direction!
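For anyone finding this later, one way such an exemption could look is sketched below. The exact rule used here was not posted, so treat the address-list name and the match conditions as assumptions; AA.AA.AA.99 and BB.BB.BB.138 are the obfuscated addresses from this thread and have to be replaced with the real ones.

```text
/ip firewall address-list
add list=l2tp-servers address=AA.AA.AA.99
add list=l2tp-servers address=BB.BB.BB.138

/ip firewall nat
# action=accept in srcnat means "leave the source address alone".
# The rule has to sit ABOVE the masquerade rule, otherwise it never matches.
add chain=srcnat action=accept src-address-type=local protocol=udp \
    dst-port=500,1701,4500 dst-address-list=l2tp-servers \
    comment="no NAT for the router's own L2TP/IPsec traffic"
```

The src-address-type=local condition is meant to limit the exemption to traffic sent by the router itself, so forwarded LAN traffic is still masqueraded as before.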

I preferred this option (even though failover will be slower) because in reality, on the “server-side” we have 2x CCR, each connected to 2 different ISPs. This means that if I followed the path of creating a distinct l2tp interface for each combination, I would have to create 8 in total, and as many again on the server side (4 on each CCR). In our case it doesn’t matter if it takes 1-2 minutes to fail over from ISP to LTE, as long as it actually happens.