IKEv2/IPsec high latency

Allow me to ask for help with the following latency problem. I have a network with a strongSwan server on one side and, on the other side, terminals connected through DSL, cable or Wi-Fi. When I test latency through the IKEv2/IPsec tunnels in both directions (from these terminals to the server and from the server to the terminals), with datagram lengths from 40 to 8000 bytes and ping intervals from 5 s to 1 min, the RTT is from 25 to 50 ms, so all is O.K.

Now I had to connect to this network some terminals which use LTE; I have built these LTE terminals from an RBM33g board plus a Huawei ME909s-120 module. When I test latency through the tunnels between these terminals and the server in the direction from the terminal to the server, with datagram lengths from 40 to 1500 bytes, the RTT values are in the interval 30-60 ms (on average approx. 45 ms), which is O.K. If the datagram is larger than 1500 bytes, the packet does not reach the server at all (?). The problem is the latency in the opposite direction, from the server to the LTE terminal. With datagram lengths from 40 to 1500 bytes the RTT varies between 40 and 700 ms, and sometimes between 200 and 1000 ms. With these latencies the terminals are unusable for reliable bidirectional transfer of any data. Datagrams larger than 1500 bytes also do not pass from the server to the terminal (?).
When I insert into the LTE terminal a SIM card with a public IP address and then test the latency between the server and the terminal connected directly, without the IPsec tunnel, the RTT values are around 50 ms in both directions (O.K.). The MTU and other parameters on the Debian strongSwan server currently have their default values. I am not experienced in this field, and I hope and believe that with your help the mentioned problem will be successfully solved :sunglasses:

Thank you in advance for any help.

There are actually two points:

  • the datagram size limitation
  • the fact that the round-trip time differs depending on from which side you ping, did I get you right? I.e. the path is the same, but if you ping from the server and the terminal responds, the latency is much worse than when you ping from the terminal and the server responds, correct?

As for the datagram size, I would simply assume that the other-than-LTE paths support an MTU larger than 1500, so larger datagrams can get through. Do you really need non-TCP datagrams larger than 1500 bytes, or did you provide that information just to make the picture more complete?

As for the round-trip time difference, I can imagine the working principles of the mobile network being the reason, although I’m no expert on this. But it sounds reasonable that if there is a period of silence, the modem goes into a kind of sleep mode where it only “listens” on a service channel (with much less power spent), so if a packet comes from the network side, the modem first has to get a wake-up message from the network, which takes some time to be delivered and reacted to. So I’d do two types of test:

  • compare the behaviour of the tunneled traffic while using the “private-IP” SIM in the modem and while using the “public-IP” SIM in the same modem, to exclude any impact of different settings for different contracts,
  • with the private-IP SIM, look at the RTT if you ping from the server to the terminal at a higher rate, such as every 1 s or even 300 ms. If my speculation is correct, the first ping after a long period of silence should get a late response, whereas the subsequent ones should be answered within the usual 50 ms.
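If useful, that test can be evaluated mechanically. Below is a minimal Python sketch that extracts RTTs from Linux-style `ping` output and flags a run where only the first reply after a period of silence is slow; the sample output and the 10.0.0.2 address are invented for illustration, and the baseline/factor thresholds are arbitrary assumptions:

```python
import re

def rtt_values(ping_output):
    """Extract the time=... values (in ms) from ping output, in order."""
    return [float(m) for m in re.findall(r"time=([\d.]+)\s*ms", ping_output)]

def looks_like_wakeup_delay(rtts, baseline_ms=50.0, factor=3.0):
    """True when only the first reply is much slower than the baseline."""
    if len(rtts) < 2:
        return False
    return rtts[0] >= factor * baseline_ms and all(
        r < factor * baseline_ms for r in rtts[1:]
    )

# Invented sample: first reply slow, subsequent ones normal.
sample = """\
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=712 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=52.1 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=48.9 ms
"""
print(looks_like_wakeup_delay(rtt_values(sample)))  # → True
```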

To the first point: we can leave the datagram size limitation aside, because 1500 bytes is sufficient for my purposes.
To the second point: yes, you are right.

I have now done the latency test in both directions with the same SIM card (which has a public IP), both through the IKEv2/IPsec tunnel and directly between the strongSwan server’s public IP and the LTE terminal’s public IP. When I ping between the public IP addresses, from the server to the terminal and from the terminal to the server, the RTTs are around 50 ms in both cases. When I do the same through the tunnel, the RTT is around 60 ms (some time is probably required for encryption/decryption) when pinging from the terminal to the server. When I ping from the server to the terminal, the RTTs are from 60 to 900 ms.

In my opinion that implies the problem is not in the LTE network but in the traffic in the IPsec tunnel in the direction from the server to the terminal, where the MTU is strictly limited to 1500 on the terminal side because that is the limit of the LTE path.

There is still a small chance that the mobile network is somehow responsible for the difference, because from the point of view of the network, the direct ping to the public address is an ICMP packet whereas a ping through the tunnel is a UDP packet (inside which the ICMP one is encapsulated).

In the receiving direction, /tool sniffer will show you both the transport packet (UDP-encapsulated ESP) and the payload packet (ICMP) decrypted from it. In the sending direction, only the transport ones are sniffed. So it might make sense to sniff at both ends into .pcap files and then use Wireshark to see where the delay occurs.

While doing this, you may also want to look at how the 1500-byte packets are transported; you should see two transport packets per payload packet, because the IPsec headers occupy some extra bytes, so the payload doesn’t fit completely into a single transport packet.
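As a rough illustration of why a 1500-byte payload needs two transport packets, here is a Python sketch; the overhead figure is an estimate for a UDP-encapsulated ESP tunnel (AES-CBC/HMAC-SHA1 style SA), not a value measured on this particular tunnel:

```python
import math

MTU = 1500
# Assumed encapsulation overhead: outer IPv4 + NAT-T UDP + ESP header
# + IV/padding + ICV, roughly 64 bytes in total.
OVERHEAD = 20 + 8 + 8 + 16 + 12

def transport_fragments(payload_len, mtu=MTU, overhead=OVERHEAD):
    """How many IP fragments carry one encapsulated payload packet."""
    transport_len = payload_len + overhead
    if transport_len <= mtu:
        return 1
    # Every fragment but the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - 20) // 8 * 8
    return math.ceil((transport_len - 20) / per_fragment)

print(transport_fragments(200))   # → 1 (no fragmentation needed)
print(transport_fragments(1500))  # → 2 (payload + overhead exceeds the MTU)
```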

But still it sounds weird, because the ICMP echo request and the ICMP echo reply to it are equal in size and both have to pass through the tunnel, so the difference in behaviour between the two cases (echo request sent from the terminal and response coming from the server vs. echo request sent from the server and response coming from the terminal) makes little sense.

Thank you for your answer, sindy. I am now wondering whether it could be caused by bad MTU or MSS settings on one side of the tunnel.

Sindy, since you are experienced in the field of IKEv2/IPsec tunnels, allow me to ask you the following question (I don’t want to create a new topic for it). I need to create a reliable L2-layer tunnel through NAT between a MikroTik and a Linux server. So far I have been using EoIP over IKEv2/IPsec, but that solution is not reliable in some situations, because the EoIP protocol is not supported in the Linux kernel; only some implementations exist, as Linux userspace daemons. Unfortunately, a GRETAP tunnel is not implemented in ROS. Would it be possible to solve the reliability problem with MPLS/VPLS over IKEv2/IPsec?

Forget about MSS; that’s a TCP-specific value which is related to MTU (roughly, it is the MTU minus the IP header length minus the TCP header length), so if you observe issues already with ping, MSS is not involved.
I cannot see a relationship between MTU and delay either. If everything works as it should, the worst thing that can happen is that a plaintext payload packet is fragmented into two transport packets, which cannot make the RTT more than 4 times longer (assuming two packets to be transmitted in each direction and the same speed in both directions; in fact the download at the LTE side is faster than the upload, and the second fragment is not very long, so the actual additional delay will be much less than 3x). If something doesn’t work properly, though, I could speculate that the network stack waits some time for the next fragment to come and releases the reassembled packet only after that timer expires, but that would in the first place require a bug in fragmentation or reassembly (the last fragment of each packet is marked as the last one, so either fragmentation would have to fail to set the “last fragment” marker, or the reassembly would have to ignore it).
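The MSS relation mentioned above is simple arithmetic (assuming IPv4 and no IP or TCP options):

```python
# Roughly: MSS = MTU - IP header - TCP header (no options assumed).
IP_HEADER = 20
TCP_HEADER = 20

def mss_for_mtu(mtu):
    return mtu - IP_HEADER - TCP_HEADER

print(mss_for_mtu(1500))  # → 1460
```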


Bad news: no. IPsec transports only IP payload, so anything else has to be encapsulated into IP first. MPLS/VPLS is another type of Ethernet payload, so you can tunnel Ethernet frames carrying MPLS through EoIP or GRETAP, but encapsulating Ethernet into MPLS won’t help you tunnel it using IPsec.

Good news: if the EoIP userspace daemon doesn’t fit your needs, you might want to patch the GRETAP kernel code to do EoIP. This should not be a big deal, in fact, because as far as I can see, the biggest difference to deal with is that EoIP places the tunnel-id into the last two bytes of the 32-bit GRE key field (gretap adds this field only if you specify the key value when adding the link, as an additional parameter following the local and remote IP addresses) and misuses the first two bytes of the key field as a payload size field (which is redundant, as the IP packet length in the IP header gives the same information). The rest are static modifications: EoIP sets the GRE header version field to 1 (enhanced GRE) and the payload protocol type code to 0x6400, whereas GRETAP uses GRE version 0 and payload protocol type code 0x6558. So you can either patch the kernel, which will definitely be faster, or ask Mikrotik to add a gretap parameter to EoIP, where setting it to yes would modify the two headers in question in the Rx and Tx directions, set the “payload length” field to 0 on transmission and determine the payload length by other means on reception.
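For illustration, the header difference can be sketched in a few lines of Python; the field values follow the description above (EoIP: GRE version 1, protocol type 0x6400, key field split into payload length and tunnel-id; gretap: version 0, protocol type 0x6558, plain 32-bit key), and this is only a sketch of the two headers, not a complete encapsulation:

```python
import struct

def eoip_header(payload_len, tunnel_id):
    flags_ver = 0x2001  # key present, GRE version 1 ("enhanced GRE")
    # Protocol 0x6400; key field misused as payload length + tunnel-id.
    return struct.pack("!HHHH", flags_ver, 0x6400, payload_len, tunnel_id)

def gretap_header(key):
    flags_ver = 0x2000  # key present, GRE version 0
    # Protocol 0x6558 (transparent Ethernet bridging); plain 32-bit key.
    return struct.pack("!HHI", flags_ver, 0x6558, key)

print(eoip_header(1400, 42).hex())   # → 200164000578002a
print(gretap_header(42).hex())       # → 200065580000002a
```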

There is also BCP (RFC 3518); however, the last kernel patch regarding its support in the PPP suite that I have found is from 2010, and it requires kernel patching as well, since there is a kernel module implementing L2TPv3 which seems to make everyone happy enough. On the other hand, Mikrotik uses BCP as it can be used not only along with L2TP but also with SSTP, PPTP… but does not plan to support L2TPv3.

I had a closer look at the way the reduction of interface MTU caused by IPsec encapsulation works, and there’s nothing IPsec-specific about its handling. If the payload, augmented with the IPsec transport protocol’s overhead, doesn’t fit into the available MTU of the interface, the resulting transport packet is fragmented on transmission and reassembled on reception. So the flags in the IP header of the first IPsec transport packet say:
Flags: 0x2000, More fragments
0... .... .... .... = Reserved bit: Not set
.0.. .... .... .... = Don’t fragment: Not set
..1. .... .... .... = More fragments: Set
...0 0000 0000 0000 = Fragment offset: 0

In the second one, it’s
Flags: 0x00b9
0... .... .... .... = Reserved bit: Not set
.0.. .... .... .... = Don’t fragment: Not set
..0. .... .... .... = More fragments: Not set
...0 0101 1100 1000 = Fragment offset: 1480
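The two flag words can be decoded with a few lines of Python (the 13-bit fragment-offset field counts 8-byte units, which is why the word 0x00b9, offset field 185, corresponds to 185 × 8 = 1480 bytes):

```python
# Values taken from the sniff above.
def decode_frag_word(word):
    dont_fragment = bool(word & 0x4000)
    more_fragments = bool(word & 0x2000)
    offset_bytes = (word & 0x1FFF) * 8  # field counts 8-byte units
    return dont_fragment, more_fragments, offset_bytes

print(decode_frag_word(0x2000))  # → (False, True, 0): first fragment
print(decode_frag_word(0x00b9))  # → (False, False, 1480): second fragment
```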

In your OP, you wrote:

It actually indicates two issues which I deem unrelated.

Did you test the RTT with the don’t-fragment flag forced to true? Because if you don’t explicitly prohibit fragmentation while pinging (we’re talking about IPv4 here), larger packets should get through normally, just fragmented into smaller ones (which ping doesn’t show, as the networking stack reassembles the received responses before handing them over to the application process). If they don’t, it means that some device in the path reports a larger MTU than it actually supports, so instead of getting fragmented, the larger packets get dropped. To confirm this, and also as a temporary workaround until you resolve it with the actual culprit, you have to reduce the MTU value on an interface closer to the source which you can control. With a lower MTU set there, the packets will get fragmented already while being sent out that interface and will not be dropped further along in the network. It is only a workaround because, unlike the MSS modification, it cannot be done selectively for just the part of outgoing traffic that is routed via the problematic path.
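For reference, the largest ICMP payload that fits a given MTU without fragmentation is the MTU minus the IPv4 and ICMP headers:

```python
# With IPv4 and no options, an ICMP echo of payload N occupies
# N + 8 (ICMP header) + 20 (IP header) bytes on the wire.
IP_HEADER = 20
ICMP_HEADER = 8

def max_ping_payload(mtu):
    """Largest `ping -s` value that still fits the given MTU unfragmented."""
    return mtu - IP_HEADER - ICMP_HEADER

print(max_ping_payload(1500))  # → 1472
```

So on Linux, for example, `ping -M do -s 1472 <host>` should still pass over a clean 1500-byte path, while `-s 1473` should fail once fragmentation is prohibited.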

For smaller packets (up to, say, 200 bytes) there should be no reason anywhere in the network to fragment them, so I can see no relationship between the possible MTU issues and those delays. So really only sniffing on the LAN and WAN interfaces simultaneously (i.e. into a single file) can show which part of the packet path causes those delays. Assuming that the Mikrotiks at both ends act as routers, i.e. that the traffic of the actual client and the actual server is forwarded through them, the simultaneous sniffing on the LAN and WAN interfaces should show how long it took the Mikrotik to encapsulate the packet received from the LAN and send the resulting transport packet out the WAN, etc.

Dear sindy, thank you very much for your response, and because I feel you are so kind and would like to help me, allow me to inform you about the background of the mentioned latency problem. At present I am working together with Telit firmware developers on adapting the generic firmware of the LM960A18 module to the LTE carrier-aggregation schemas of Czech mobile operators. During testing of these firmware adaptations I occasionally observed the mentioned latency problem in one direction. At present I don’t know exactly what is causing it: the problem can be caused either by some incompatibility of the beta firmware with the LTE network, or by some problem with the IKEv2/IPsec tunnel, or by something on the strongSwan server which is used for testing. All these variants are possible, and next week I would like to eliminate some of them. So I will do detailed latency tests of the IPsec tunnel with some other LTE module on the RBM33g board (probably the Huawei ME909s-120), with small (46-byte) and large (1300-1400-byte) datagrams and ping intervals from 10 s to 1 min, in both directions. If all is O.K., it implies the problem is in our beta firmware for the LM module; if not, the problem is in the server or the tunnel.

If you are interested, I can send you the results of these tests. Also, if you would like to help us find the solution to the latency problem, I could send you (via PM) the configuration parameters of the tunnel between the tested router and the strongSwan server which I am using for testing, and I could create a RADIUS user/password for a W10, Mac or some other client of this server for you. It would be useful because you would be testing on a different LTE network, so I would be sure that the problem is not caused by the LTE network here. But it is only an idea and I wouldn’t like to bother you :slight_smile:

What’s your opinion on this article?
https://www.zeitgeist.se/2013/11/26/mtu-woes-in-ipsec-tunnels-how-to-fix/

By the way, the latency problem could also be caused by the ECM driver and its version in RouterOS 6. This driver is not optimal for Qualcomm chips, which are optimized for RmNet drivers. My plan is to do the latency tests also with the ECM, MBIM and RmNet drivers on a Linux PC and compare the differences. As I was informed, MBIM drivers for LTE modules should be added in ROS 7; RmNet with QMI, unfortunately, will not.

Regarding GRETAP, I have found the following Linux kernel-mode EoIP tunnel implementation on GitHub:
https://github.com/bbonev/eoip

Unfortunately, my knowledge of the Linux kernel is not sufficient to adapt this implementation to the newer kernels (4.9.0-8 or higher) which I am using on our servers :frowning:

PM doesn’t work here. So the only private way seems to be for you to post here your public key (created for the occasion), so that I could use it to encrypt the contact information and send it to you, and you could then decipher it using your private key (the inverse application as compared to certificates). See details here, the first method for short files (pass).

Or you may send me your contact information instead, here is my public key:
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAnlTJBm7Rvt6oJXn9baFc
RhR3nsx2p47fWarA5h/EdWpgnBmZNkk3IKF3KSxTB5ur3pItyAdpKri8UNITU1ud
09nTa1QSKlZiJKysBsU/bq59shHlzrDwQisLFSBBLDcL03FbRAiYGoGRIjuRoWu+
QTs6zYzVX5eSaB+wH7V5+X6HnawkWa125If0ZqOtTC0bOR96GGRJn6WNJ1e291DH
kP53jeH7hEMtjkIscI8m/sGPnbCLyRr+iWdsmS5D0bAQDL+JwUnwmQk9f12gyzBD
jPOBy4pXGB1gJrod616pO0UZOCsRNhWfVnlinAgCv8DnNyKXHGqZw5ODZm6JPP39
DQIDAQAB
-----END PUBLIC KEY-----
Once you encrypt a short text file with the contact info of your choice, post a hexdump of it here.


The article is one of the nice summaries of known problems. But once more: latency cannot have anything to do with MTU until fragmentation kicks in, and even with fragmentation, it would take a bug for it to start causing latency.


Neither is mine. But it’s a good start, and it should not be a big deal to find someone for whom it will be something like 2 hours of work.

I don’t know if it is a good idea, but in my opinion a simpler method exists. My web site is unfortunately in Czech only, but that shouldn’t be a huge problem, and you could leave your contact information here:

https://www.securelink.cz/kontakt/

Allow me one note: if your email has some connection with Google mail servers, all emails from me which contain my full name in the address are usually moved to the spam folder (nobody knows why).

Regarding GRETAP, I could try to contact Mr. Bonev here: https://github.com/bbonev

I called there; the guy who picked up the phone told me you’ll call me back.