Slow speed through gre+ipsec tunnel

Hello,

CHR 6.44.1, 2 vCPUs, Xeon Gold
CCR1009, 6.44.1
WAN with 45 ms latency

[CHR]—wan(tunnel gre+ipsec)wan—[CCR1009]

aes128cbc/sha1, Actual MTU = 1426 (Auto)
OR
aes128ctr/sha1, Actual MTU = 1446 (Auto)
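A rough sketch of where those "Actual MTU" values can come from, assuming IPv4 outer headers, transport-mode ESP applied to the GRE packet, and HMAC-SHA1-96; RouterOS's exact padding accounting may differ slightly:

```python
# Plausible accounting for the "Actual MTU" shown on a GRE+IPsec tunnel.
# Assumes IPv4 outer headers and transport-mode ESP; not RouterOS's
# authoritative formula, just the standard header arithmetic.

ETH_MTU   = 1500
IPV4_HDR  = 20   # outer IPv4 header carrying the GRE packet
GRE_HDR   = 4    # basic GRE header, no key/checksum options
ESP_HDR   = 8    # SPI (4 bytes) + sequence number (4 bytes)
ESP_TRAIL = 2    # pad length (1 byte) + next header (1 byte)
ICV_SHA1  = 12   # HMAC-SHA1-96 integrity check value

def gre_mtu():
    """Plain GRE over IPv4."""
    return ETH_MTU - IPV4_HDR - GRE_HDR

def gre_ipsec_mtu(iv_len):
    """GRE MTU minus transport-mode ESP overhead; stream ciphers
    like aes-ctr (iv_len=8) need no block padding."""
    return gre_mtu() - (ESP_HDR + iv_len + ESP_TRAIL + ICV_SHA1)

print(gre_mtu())          # 1476
print(gre_ipsec_mtu(8))   # 1446 -- matches the aes128ctr/sha1 figure
```

With aes-cbc the IV is 16 bytes and the ESP payload must additionally be padded to a 16-byte block boundary, which pushes the usable MTU further down, toward the 1426 figure shown above.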

Bandwidth Test from CHR to CCR (TCP, receive, 1 connection):
between public IPs = up to 300 Mbps
between private IPs (over the tunnel) = up to 120 Mbps, and unstable
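One thing worth checking with a single-connection TCP test over a 45 ms path is the bandwidth-delay product. A rough, illustrative calculation (the window sizes below are hypothetical, not read from the routers):

```python
# Bandwidth-delay product: how much TCP window a single connection
# needs in flight to sustain a given rate over a given RTT.
# Illustrative numbers only.

def bdp_bytes(rate_bit_s, rtt_s):
    """Bytes that must be in flight to fill the pipe."""
    return rate_bit_s * rtt_s / 8

# To sustain 300 Mbit/s over a 45 ms path:
print(bdp_bytes(300e6, 0.045))        # 1,687,500 bytes (~1.6 MB of window)

# Conversely, a (hypothetical) 128 KB window caps a single flow at:
print(128 * 1024 * 8 / 0.045 / 1e6)   # ~23 Mbit/s
```

If the tunnel adds queuing delay or reordering on top of the 45 ms base RTT, the effective window needed grows further, which is one reason a single connection can look unstable there while the public-IP test does not.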

Why is the speed through the tunnel so slow?
Image_gre_ipsec.png

Hello,

The same problem here - two CCR1009s over 1 Gbit/s fiber links attached to the same switch. This is the result of a bandwidth test performed on a weekend with no other load on the network:

Receive test across GRE/IPsec tunnel (with CPU load):
GRE send Bandwidth test.png
Send test across GRE/IPsec tunnel (with CPU load):
GRE receive bandwith test.png
Current setup:
RouterOS 6.44.6
Fiber WAN Link, latency below 1 msec
CCR1009-7G-1C—GRE/IPsec (aes256cbc/sha256, actual MTU = 1422 (Auto))—CCR1009-7G-1C

Believe me, I have spent days tinkering with encryption algorithms and other advice from the forum. I can only conclude that this has to do with the single-core performance of the Tile-Gx CPU. Barring a configuration error (I am by no means a network expert), I believe this will not be resolved until there is more multithreading and/or better hardware encryption support.

Test using iperf3 from a client behind each of your routers,
not from the routers themselves.

No problem, here you go, not that it makes a huge difference:

iperf3 -s -f M


Server listening on 5201

Accepted connection from 192.168.55.55, port 61613
[ 5] local 192.168.1.22 port 5201 connected to 192.168.55.55 port 61614
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 15.4 MBytes 15.4 MBytes/sec
[ 5] 1.00-2.00 sec 16.9 MBytes 16.9 MBytes/sec
[ 5] 2.00-3.00 sec 16.8 MBytes 16.8 MBytes/sec
[ 5] 3.00-4.00 sec 17.0 MBytes 17.0 MBytes/sec
[ 5] 4.00-5.00 sec 17.0 MBytes 17.0 MBytes/sec
[ 5] 5.00-6.00 sec 17.0 MBytes 17.0 MBytes/sec
[ 5] 6.00-7.00 sec 17.0 MBytes 17.0 MBytes/sec
[ 5] 7.00-8.00 sec 16.3 MBytes 16.3 MBytes/sec
[ 5] 8.00-9.00 sec 16.4 MBytes 16.4 MBytes/sec
[ 5] 9.00-10.00 sec 16.7 MBytes 16.7 MBytes/sec
[ 5] 10.00-10.01 sec 169 KBytes 18.7 MBytes/sec


[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.01 sec 0.00 Bytes 0.00 MBytes/sec sender
[ 5] 0.00-10.01 sec 167 MBytes 16.7 MBytes/sec receiver
Unfortunately, this is still extremely slow compared to the available bandwidth.

IPsec test results for the CCR1009, available on the product spec pages, state that with small packets the achievable throughput is around 130 Mbps (that's around 16 MBytes per second).
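A quick unit-conversion sanity check (nothing router-specific, just bytes vs bits) showing the iperf3 result lands almost exactly on that spec figure:

```python
# iperf3 above was run with -f M, so it reports MBytes/s;
# the spec sheet reports Mbit/s. 1 byte = 8 bits.

def mbytes_to_mbit(mbytes_per_s):
    return mbytes_per_s * 8

print(mbytes_to_mbit(16.7))   # 133.6 Mbit/s -- right at the ~130 Mbps spec figure
print(130 / 8)                # 16.25 MBytes/s
```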

The problem with test results published for MikroTik products is that they only show the absolute maximum … in real life, one has to take a considerably smaller number as the relevant one. For example, the test result for a single IPsec tunnel using large packets shows achievable throughput exceeding 400 Mbps (even more than 1 Gbps). But the experience of many forum members with Ethernet test results is that the relevant number for real-life cases is the one with medium packet sizes and a larger number of processing rules (IP filter rules). IPsec performance doesn't follow exactly the same rules, but generally it does.

BTW, does the CHR show the IPsec tunnel as HW offloaded? Either way, what's the CPU load there? It could well be that the CHR is the bottleneck here.

And in the case where two CCRs are used: what does the CPU profile show - which processes use the most CPU cycles?

I am not a network appliance test engineer - I am just a consultant setting up network equipment. If after more than a day of tinkering according to advice from more knowledgeable people I cannot achieve decent speed, I simply have to give up for economic reasons.

While it may be true that under some theoretical circumstances near-spec speed can be achieved, this is not practical for me if it cannot be done without specialized knowledge.

  1. In IPsec, are you using single DES or 3DES? If it is 3DES, change it to single DES.

You must be joking. This is 2019. Nobody should have to use DES anymore if AES is available in hardware.

My question is therefore: has ANYBODY ever achieved more than maybe 17 MBytes/second across a GRE/IPsec tunnel, and if yes, please tell the forum how it is done so others can learn.

Have the same issue (support ticket SUP-3459) with IPsec between a CCR1036 (RouterOS 6.44.6) and StrongSwan on CentOS 7, connected to 1 Gbit/s links with 300 Mbit/s ISP throughput (download/upload). Latency between the sides is about 18.0 ms.

Tested by iperf3:

  • if Hardware AEAD is activated (enc-algorithms aes128-cbc-aes256-cbc/aes128-ctr-aes256-ctr with auth-algorithms md5/sha1-sha256), TCP upload speed (CCR → CentOS) is unstable and jumps from 3 to 45 Mbit/s; UDP upload speed locks at 57 Mbit/s with a high amount of out-of-order packets and high packet loss (about 80% at a 300M target bandwidth). Download speed (CCR ← CentOS) with the same settings is between 220-280 Mbit/s.


  • if software encryption is activated (enc-algorithms aes128-cbc-aes256-cbc/aes128-ctr-aes256-ctr/aes128-gcm-aes256-gcm with auth-algorithms null/sha512), I get about 180 Mbit/s TCP/UDP download/upload speed (UDP still has out-of-order packets and packet loss of about 61%).

Tested on:

  • CCR1036-12G-4S with RouterOS 6.44.6 ↔ StrongSwan/LibreSwan (kernels from 3.10 to 5.4) - issue present;

  • RB1100AHx2 with RouterOS 6.44.6 ↔ StrongSwan/LibreSwan (kernels from 3.10 to 5.4) - issue present;

  • RB4011 with RouterOS 6.46 ↔ StrongSwan/LibreSwan (kernels from 3.10 to 5.4) - issue present, but average speed up to 80-100 Mbit/s;

  • CHR 6.46 ↔ StrongSwan/LibreSwan (kernels from 3.10 to 5.4) - issue not present, everything works perfectly;

  • StrongSwan ↔ StrongSwan/LibreSwan (kernels from 3.10 to 5.4) - issue not present, everything works perfectly.

Also, this problem doesn't appear if the latency is 0-3 ms.
CCR from CentOS - hwaead_download_rate.png
CCR from CentOS - soft_enc_download_rate.png
CCR to CentOS - hwaead_upload_rate.png
CCR to CentOS - soft_enc_upload_rate.png
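One possible reading of the latency dependence: a Reno-style TCP sender treats reordered packets as losses, and loss-limited throughput falls with RTT. A sketch using the simplified Mathis model (the MSS and loss values below are illustrative assumptions, not measurements from these routers):

```python
import math

# Simplified Mathis model: loss-limited TCP throughput ~ MSS / (RTT * sqrt(p)).
# Illustrative values only -- a ~1400-byte tunnel MSS and 1% effective loss
# (out-of-order delivery counts as loss for a Reno-style sender).

def mathis_bps(mss_bytes, rtt_s, loss):
    """Approximate steady-state TCP throughput in bits per second."""
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss))

# Same 1% effective loss, 18 ms WAN path vs a 2 ms LAN path:
print(mathis_bps(1400, 0.018, 0.01) / 1e6)   # ~6.2 Mbit/s
print(mathis_bps(1400, 0.002, 0.01) / 1e6)   # ~56 Mbit/s
```

Under this model the same reordering rate that is barely noticeable at 0-3 ms of latency can collapse throughput by an order of magnitude at 18 ms, which is consistent with the behaviour reported above.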

The same behaviour is observed with 6.46.1 between a CCR1032 and a 4011: on a speed test, TCP download sometimes fails, there are huge CPU peaks on the 4011 (>70%) that are not reflected in the speed test, and obvious packet loss.

EoIP+IPsec doesn't display this behaviour; tcp_download starts and finishes smoothly.

Measured speeds are really close - I'd say 0.5-1% less throughput with EoIP.

We moved to CHR 6.46.2. TCP and UDP work perfectly, and tcp_window_size on Windows hosts now increases correctly.

The same behaviour is observed on a CCR1072 with a few dozen IPsec tunnels in a road-warrior (client-to-site) configuration, SHA-1 / AES-CBC-256, Hardware AEAD.
CPU core 15 sits at almost 99% most of the time when traffic on the WAN interface exceeds 100 Mbps (autonegotiated at 1000 Mbps).
The rest of the 72 cores stay below 8-10%, some even at 0%.

Your case is apparently different. The original problem reported here was about the GRE+IPsec combination (and it was even mentioned later that EoIP+IPsec is unaffected). Yours is a road-warrior case, and so very likely just IPsec, without GRE. So… please start a new thread. And before posting, please use /tool profile to check whether it is IPsec (encryption) or something else that eats CPU.

Issue is still observed on 6.47.1:
image_bwtest_tcp_ccr_6471.png
first graph - test from ccr to chr public ip
second graph - test from ccr to chr private ip (via tunnel)

Issue still not fixed on 6.48:
image_bwtest_tcp_ccr_648.png
MikroTik technical support is silent…

KENYx120
Have the same issue (support ticket SUP-3459) with IPsec between a CCR1036 (RouterOS 6.44.6) and StrongSwan on CentOS 7, connected to 1 Gbit/s links with 300 Mbit/s ISP throughput (download/upload). Latency between the sides is about 18.0 ms.

did you get a response from technical support?
or did they refuse to solve the problem?

Issue still not fixed in 6.49 :(

Almost the same problem on the CCR2000/CHR series, v7.13.
Any news/updates?

PC - CCR1036 - wire - CCR1036 - PC
RouterOS 7.15.3
TCP bandwidth test and iperf3 from PC to PC:
EoIP+IPsec: 238 Mbit/s - 1 core at 82%, top process: networking
IPIP+IPsec: 260 Mbit/s - 1 core at 95%, top process: networking
GRE+IPsec: 240 Mbit/s - 1 core at 90%
WireGuard: 550 Mbit/s