Community discussions

 
Petrw
just joined
Topic Author
Posts: 6
Joined: Sun Feb 04, 2018 1:17 pm

IKEv2/IPsec high latency

Sun Nov 10, 2019 7:19 pm

Allow me to ask for help with the following latency problem. I have a network with a strongSwan server on one side and, on the opposite side, terminals connected through DSL, cable or Wi-Fi. When I test latency through the IKEv2/IPsec tunnels in both directions (from these terminals to the server and from the server to the terminals), with datagram lengths from 40 to 8000 bytes and ping intervals from 5 s to 1 min, the RTT is 25 to 50 ms, so everything is fine.

Now I had to connect some terminals through LTE; I built these LTE terminals from an RBM33g plus a Huawei ME909s-120 module. When I test latency through the tunnels between these terminals and the server in the direction from the terminal to the server, with datagram lengths from 40 to 1500 bytes, the RTT values are between 30 and 60 ms (about 45 ms on average), which is fine. If the datagram is longer than 1500 bytes, the packet does not reach the server at all (?). The problem is the latency in the opposite direction, from the server to the LTE terminal: with datagram lengths from 40 to 1500 bytes the RTT varies between 40 and 700 ms, and sometimes between 200 and 1000 ms. With such latencies the terminals are unusable for reliable bidirectional data transfer. Datagrams larger than 1500 bytes also do not pass from the server to the terminal (?).
When I insert a SIM card with a public IP address into the LTE terminal and test the latency between the server and the terminal connected directly, without the IPsec tunnel, the RTT values are around 50 ms in both directions (fine). The MTU and other parameters on the Debian strongSwan server currently have their default values. I am not experienced in this field, and I hope and believe that with your help the problem will be solved 8)

For any help thank you in advance.
 
sindy
Forum Guru
Posts: 3959
Joined: Mon Dec 04, 2017 9:19 pm

Re: IKEv2/IPsec high latency

Sun Nov 10, 2019 9:15 pm

There are actually two points:
  • the datagram size limitation
  • the fact that the round-trip time differs depending on which side you ping from, did I get you right? I.e. the path is the same, but if you ping from the server and the terminal responds, the latency is much worse than when you ping from the terminal and the server responds, correct?

As for the datagram size, I would simply assume that the other-than-LTE paths support an MTU larger than 1500, so larger datagrams can get through. Do you really need non-TCP datagrams larger than 1500 bytes, or did you provide that information just to make the picture more complete?

As for the round-trip time difference, I can imagine the way the mobile network works being the reason, although I'm no expert on this. It sounds plausible that after a period of silence the modem goes into a kind of sleep mode where it only "listens" on a service channel (spending much less power), so when a packet arrives from the network side, the modem first has to receive a wake-up message from the network, which takes some time to be delivered and reacted to. So I'd run two types of tests:
  • compare the behaviour of the tunneled traffic while using the "private-IP" SIM in the modem and while using the "public-IP" SIM in the same modem, to exclude any impact of different settings for different contracts,
  • with the private-IP SIM, look at the RTT when you ping from the server to the terminal at a higher rate, such as every 1 s or even every 300 ms (a sketch of such a test is below). If my speculation is correct, the first ping after a long period of silence should get a late response, whereas the subsequent ones should be answered within the usual 50 ms.
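A minimal sketch of that second test, run from the Debian strongSwan server (the terminal's tunnel address 10.1.1.2 and the exact interval are just placeholders for illustration):

ping -i 0.3 -c 30 10.1.1.2   # 30 echo requests, 300 ms apart, through the tunnel
ping -i 1 -c 30 10.1.1.2     # the same test at a 1 s rate, for comparison

If only the first reply after a long quiet period is slow and the following ones come back in the usual ~50 ms, that would support the sleep-mode theory.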
Instead of writing novels, post /export hide-sensitive. Use find&replace in your favourite text editor to systematically replace all occurrences of each public IP address potentially identifying you by a distinctive pattern such as my.public.ip.1.
 
Petrw
just joined
Topic Author
Posts: 6
Joined: Sun Feb 04, 2018 1:17 pm

Re: IKEv2/IPsec high latency

Mon Nov 11, 2019 1:47 am

To the first point: we can leave the datagram size limitation aside, because 1500 bytes is sufficient for my purposes.
To the second point: yes, you are right.

I have now run the latency test in both directions with the same SIM card (the one with the public IP), both through the IKEv2/IPsec tunnel and directly between the strongSwan server's public IP and the LTE terminal's public IP. When I ping between the public IP addresses, from the server to the terminal and from the terminal to the server, the RTT is around 50 ms in both cases. When I do the same through the tunnel, the RTT is around 60 ms (some time is probably needed for encryption/decryption) when pinging from the terminal to the server. When pinging from the server to the terminal, the RTT ranges from 60 to 900 ms.

In my opinion this implies that the problem is not in the LTE network but in the traffic inside the IPsec tunnel in the direction from the server to the terminal, where the MTU is strictly limited to 1500 on the terminal side because that is the limit of the LTE path.
 
sindy
Forum Guru
Posts: 3959
Joined: Mon Dec 04, 2017 9:19 pm

Re: IKEv2/IPsec high latency

Mon Nov 11, 2019 3:03 pm

When I ping between the public IP addresses, from the server to the terminal and from the terminal to the server, the RTT is around 50 ms in both cases.

When I do the same through the tunnel, the RTT is around 60 ms (some time is probably needed for encryption/decryption) when pinging from the terminal to the server. When pinging from the server to the terminal, the RTT ranges from 60 to 900 ms.

In my opinion this implies that the problem is not in the LTE network but in the traffic inside the IPsec tunnel in the direction from the server to the terminal, where the MTU is strictly limited to 1500 on the terminal side because that is the limit of the LTE path.
There is still a small chance that the mobile network is somehow responsible for the difference, because from the point of view of the network, the direct ping to the public address is an ICMP packet whereas a ping through the tunnel is a UDP packet (inside which the ICMP one is encapsulated).

In the receiving direction, /tool sniffer will show you both the transport packet (UDP-encapsulated ESP) and the payload packet (ICMP) decrypted from it; in the sending direction, only the transport packets are sniffed. So it might make sense to sniff at both ends into a .pcap file and then look in Wireshark at where the delay occurs.
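A minimal sketch of such a capture on the terminal's RouterOS side (the file name and the interface name lte1 are just examples; adjust them to your setup):

/tool sniffer set file-name=lte-delay filter-interface=lte1
/tool sniffer start
# reproduce the slow pings from the server, then:
/tool sniffer stop

Download the resulting capture from Files and open it in Wireshark; a matching capture on the server side (e.g. tcpdump -w server.pcap) lets you compare the timestamps of the same packets at both ends.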

While doing this, you may also want to look at how the 1500-byte packets are transported: you should see two transport packets per payload packet, because the IPsec headers occupy some extra bytes, so the payload doesn't fit completely into a single transport packet.

But it still sounds weird, because an ICMP echo request and the ICMP echo reply to it are equal in size and both have to pass through the tunnel, so the difference in behaviour between the two cases (echo request sent from the terminal with the response coming from the server vs. echo request sent from the server with the response coming from the terminal) makes little sense.
 
Petrw
just joined
Topic Author
Posts: 6
Joined: Sun Feb 04, 2018 1:17 pm

Re: IKEv2/IPsec high latency

Fri Nov 15, 2019 9:53 pm

Thank you for your answer, sindy. I am now wondering whether it could be caused by bad MTU or MSS settings on one side of the tunnel.

Sindy, since you are experienced in the field of IKEv2/IPsec tunnels, allow me to ask you the following question (I don't want to create a new topic for it). I need to create a reliable L2 tunnel through NAT between a MikroTik and a Linux server. So far I have been using EoIP over IKEv2/IPsec, but that solution is not reliable in some situations because the EoIP protocol is not supported in the Linux kernel and only exists in some implementations as Linux userspace daemons. Unfortunately, a GRETAP tunnel is not implemented in ROS. Would it be possible to solve the reliability problem with MPLS/VPLS over IKEv2/IPsec?
 
sindy
Forum Guru
Posts: 3959
Joined: Mon Dec 04, 2017 9:19 pm

Re: IKEv2/IPsec high latency

Sat Nov 16, 2019 12:43 am

I am now wondering whether it could be caused by bad MTU or MSS settings on one side of the tunnel.
Forget about MSS; that is a TCP-specific value derived from the MTU (roughly, MTU minus IP header length minus TCP header length), so if you already see the problem with ping, MSS is not involved.
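Just to put numbers on that relationship (assuming plain IPv4 and TCP headers without options): with a 1500-byte MTU, the advertised MSS would typically be 1500 − 20 (IP header) − 20 (TCP header) = 1460 bytes.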
I cannot see a relationship between MTU and delay either. If everything works as it should, the worst thing that can happen is that a plaintext payload packet gets fragmented into two transport packets, which cannot make the RTT more than 4 times longer (assuming two packets transmitted in each direction and the same speed in both directions; in fact the download at the LTE side is faster than the upload, and the second fragment is not very long, so the actual additional delay will be much less than 3 times the original RTT). If something doesn't work properly, though, I could speculate that the network stack waits some time for the next fragment to arrive and releases the reassembled packet only after that time expires, but that would in the first place require a bug in fragmentation or reassembly (the last fragment of each packet is marked as the last one, so either fragmentation would have to fail to set the "last fragment" marker, or reassembly would have to ignore it).

So far I have been using EoIP over IKEv2/IPsec, but that solution is not reliable in some situations because the EoIP protocol is not supported in the Linux kernel and only exists in some implementations as Linux userspace daemons. Unfortunately, a GRETAP tunnel is not implemented in ROS. Would it be possible to solve the reliability problem with MPLS/VPLS over IKEv2/IPsec?
Bad news: no. IPsec transports only IP payload, so anything else has to be encapsulated into IP first. MPLS/VPLS is another type of Ethernet payload, so you can tunnel Ethernet frames carrying MPLS through EoIP or GRETAP, but encapsulating Ethernet into MPLS won't help you tunnel it using IPsec.

Good news: if the EoIP userspace daemon doesn't fit your needs, you might want to patch the GRETAP kernel code to do EoIP. This should not actually be a big deal, because as far as I can see, the biggest difference to deal with is that EoIP places the tunnel ID into the last two bytes of the 32-bit GRE key field (gretap adds this field only if you specify the key value as an additional parameter, following the local and remote IP addresses, when adding the link) and misuses the first two bytes of the key field as a payload size field (which is redundant, as the packet length in the IP header gives the same information). The rest are static modifications: EoIP sets the GRE header version field to 1 (enhanced GRE) and the payload protocol type code to 0x6400, whereas GRETAP uses GRE version 0 and payload protocol type code 0x6558. So you can either patch the kernel, which will definitely be faster, or ask MikroTik to add a gretap parameter to EoIP, where yes would modify the two headers in question in the Rx and Tx directions, set the "payload length" field to 0 on transmission, and determine the payload length by other means on reception.
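For comparison, this is how a plain (unpatched) gretap link with a GRE key is created with iproute2 on the Linux side; the addresses, key and bridge name are placeholders, and this of course produces standard GRETAP framing, not the EoIP variant described above:

ip link add name gretap1 type gretap local 192.0.2.1 remote 198.51.100.2 key 42
ip link set gretap1 up
# optionally make it part of an existing bridge so it carries the LAN segment:
ip link set gretap1 master br0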

There is also BCP (RFC 3518); however, the last kernel patch regarding its support in the PPP suite that I have found is from 2010, and it requires kernel patching as well, as there is a kernel module implementing L2TPv3 which seems to make everyone happy enough. On the other hand, MikroTik uses BCP because it can be used not only with L2TP but also with SSTP, PPTP..., yet it does not plan to support L2TPv3.
 
sindy
Forum Guru
Posts: 3959
Joined: Mon Dec 04, 2017 9:19 pm

Re: IKEv2/IPsec high latency

Sat Nov 16, 2019 3:53 pm

I had a closer look at the way the reduction of interface MTU caused by the IPsec encapsulation works, and there's nothing IPsec-specific about the handling. If the payload, augmented with the IPsec transport protocol's overhead, doesn't fit into the available MTU of the interface, the resulting transport packet is fragmented on transmission and reassembled on reception. So the flags in the IP header of the first IPsec transport packet say:
Flags: 0x2000, More fragments
0... .... .... .... = Reserved bit: Not set
.0.. .... .... .... = Don't fragment: Not set
..1. .... .... .... = More fragments: Set
...0 0000 0000 0000 = Fragment offset: 0


In the second one, it's
Flags: 0x00b9
0... .... .... .... = Reserved bit: Not set
.0.. .... .... .... = Don't fragment: Not set
..0. .... .... .... = More fragments: Not set
...0 0101 1100 1000 = Fragment offset: 1480


In your OP, you wrote:
If the datagram is longer than 1500 bytes, the packet does not reach the server at all (?). The problem is the latency in the opposite direction, from the server to the LTE terminal: with datagram lengths from 40 to 1500 bytes the RTT varies between 40 and 700 ms, and sometimes between 200 and 1000 ms. With such latencies the terminals are unusable for reliable bidirectional data transfer. Datagrams larger than 1500 bytes also do not pass from the server to the terminal (?).
It actually indicates two issues which I deem unrelated.

Did you test the RTT with the don't-fragment flag forced on? Because if you don't explicitly prohibit fragmentation while pinging (we're talking about IPv4 here), larger packets should get through normally, just fragmented into smaller ones (which ping doesn't show, as the networking stack reassembles the received responses before handing them over to the application). If they don't get through, it means that some device in the path reports a larger MTU than it actually supports, so instead of getting fragmented, the larger packets get dropped. To confirm this, and also as a temporary workaround until you resolve it with the actual culprit, you can reduce the MTU value on an interface closer to the source which you control: with a lower MTU set there, the packets get fragmented already while being sent out of that interface and are not dropped further in the network. It is only a workaround because, unlike the MSS modification, it cannot be done selectively only for the part of outgoing traffic that is routed via the problematic path.
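A quick way to check this from the Debian server with standard Linux ping (the terminal's address 10.1.1.2 is just a placeholder):

ping -M do -s 1472 10.1.1.2   # 1472 B payload + 8 B ICMP + 20 B IP = 1500 B, DF set
ping -M do -s 1600 10.1.1.2   # deliberately above 1500 B; expect a local "message too long" or an ICMP "fragmentation needed" from the path

If the larger probes are silently dropped without any ICMP error coming back, some device on the path advertises a bigger MTU than it actually carries.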

For smaller packets (up to, say, 200 bytes) there should be no reason anywhere in the network to fragment them, so I can see no relationship between the possible MTU issues and those delays. So really only sniffing on the LAN and WAN interfaces simultaneously (i.e. into a single file) can show which part of the packet path causes the delays. Assuming that the Mikrotiks at both ends act as routers, i.e. that the traffic of the actual client and the actual server is forwarded through them, the simultaneous sniffing on the LAN and WAN interfaces should show how long the Mikrotik took to encapsulate a packet received from the LAN and send the resulting transport packet out the WAN, and so on.
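As a sketch of that simultaneous capture on the Mikrotik side (the interface names ether1 and lte1 are placeholders; as far as I remember, filter-interface accepts a list, so both interfaces end up in one file):

/tool sniffer set file-name=both-sides filter-interface=ether1,lte1
/tool sniffer start
# reproduce the slow pings, then:
/tool sniffer stop

Opening the resulting file in Wireshark then shows the plaintext packet on the LAN side and the corresponding transport packet on the WAN side with their timestamps, so the time spent inside the router is directly visible.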
