Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

CCR 1009 - IPSEC throughput

Mon May 18, 2020 11:55 am

Hello,
[the TL;DR is at the bottom...]

I'm trying to figure out what I might be doing wrong with my IPSEC setup, as I get much lower throughput than "advertised" and than the uplink is capable of.

scenario 1)
For test purposes I have:

left side: file server / iperf3 server -> mikrotik ccr 1009-7g-1s-1s+ (LAN gateway + firewall) -> mikrotik ccr 1009-8port (WAN / IPSEC tunnel end - I'm using on this model ports 6-7, so direct-to-cpu, not the atheros-switch chipped)
right side: windows client -> MIKROTIK ccr 1009-7g-1s-1s+ (WAN / IPSEC tunnel other end).

I don't have issues with basic connectivity (everything connects and routes as it should), but the throughput is much lower than expected based on the official data sheet and on posts on this and other forums.

All 1009ccr run latest stable ROS (6.46.6 right now). I know that earlier there were issues with CCR packet reordering, but I think they were fixed many ROS versions ago.

All IPSEC tests done with IKEV2 and sha1/ aes128 or sha1/ aes256 with different variants:
* GRE
* IPIP
* just ipsec (tunnel mode)

With both mikrotiks connected directly to each other via 1 Gb/s ports [port 7 on both devices], the maximum throughput I could achieve was roughly 230 Mb/s encrypted (roughly 20 kpps), both with iperf3 and SMB traffic.
When I did the test over the actual WAN, with a 300/300 uplink and a 600/30 downlink over different ISPs, I had even worse results, with 150-ish Mb/s (150 upload - 6 download) (12-15 kpps).
In LAN I can achieve a full 1 Gb/s unencrypted with mikrotik 1 doing routing.

Both ccr1009 doing encryption use hardware acceleration (at least they say so in installed SA's). On one of them (the 8 port "older" version) CPU load is split across 2 cores from what I've seen, and the load is around 90%, classified roughly equally as "networking" and "firewall".
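Both observations above (hardware-accelerated SAs and per-core load) can be re-checked from the RouterOS terminal. The commands below are a minimal sketch using standard RouterOS v6 menus. The exact columns and flag letters vary between versions, so read the comments as hints rather than a reference.

```routeros
# List installed SAs; hardware-accelerated entries carry a hardware flag
# (see the flag legend printed at the top of the output)
/ip ipsec installed-sa print

# Show the current load of every CPU core
/system resource cpu print

# Break the load down per core and per task class (networking, firewall, ...)
# while a throughput test is running
/tool profile cpu=all
```

Running the profiler on both tunnel endpoints during the same test shows whether one core is saturated on either side.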

As for firewall I used RAW no-track for local subnets that are being involved in the tests and first rule in filter is just ACCEPT for the interfaces.
scenario 2)
windows 10 client using ikev2 -> mikrotik ccr 1009 as VPN server -> mikrotik 1009 as router/firewall -> file server/iperf3

client has 600/30 and server 300/300
With various available speedtests I'm getting around 200 Mb/s, with iperf 150-ish and actual file download over SMB is around 14-17 MB/s.

It's roughly 60-65% of the max WAN upload (300 Mb/s), so there's still plenty of room left. In LAN (so client - mikrotik 1009 doing just firewall/routing - file server) I'm getting a perfect 1 Gb/s.

Now my general questions:

Should I be vetting my config for a possible flaw/misconfig, or is that just it - the CCR 1009 is unable to do over 200 Mb/s in a single IPSEC tunnel? I don't want to run a fool's errand and look for issues where there might be none.

What I would love to hear from someone with ccr 1009 experience is whether I should technically be getting better throughput or not. The only thing that makes me wonder is that the CPU (specific cores) on the 'tiks never hits 100%, so it looks as if there is still some potential. The highest I've seen was around 70% on a single core (35%-50% as "networking" and the rest as firewall).

TL;DR

What is the expected (in a real production environment) single tunnel Mikrotik CCR 1009 IPSEC throughput (IKEV2 plain and with tunnel protocols like IPIP or GRE), and is my ~220-ish Mb/s close to or way, way behind what I should be getting?

Regards
 
kos
Frequent Visitor
Posts: 63
Joined: Mon Oct 31, 2016 11:51 am

Re: CCR 1009 - IPSEC throughput

Mon May 18, 2020 1:07 pm

I have asked similar thing once. No answer! ;)

viewtopic.php?f=2&t=150484
 
hike
just joined
Posts: 12
Joined: Mon May 18, 2020 1:44 pm

Re: CCR 1009 - IPSEC throughput

Mon May 18, 2020 3:16 pm

You don't mention the transport protocol but I assume you tried TCP.
Try multiple TCP connections or try UDP and you'll probably approach 1gbps.

AFAIR, encryption of IP packets containing a TCP payload will be bound to one core in order to "enforce" ordered output.
I remember getting ~350mbps with CCR1036-8G-2S+ and one single TCP stream using nuttcp.
 
osc86
Member Candidate
Posts: 197
Joined: Wed Aug 09, 2017 1:15 pm

Re: CCR 1009 - IPSEC throughput

Tue May 19, 2020 12:09 am

I spent days in testing different configurations, the best I could achieve was ~140Mb/s using IPSec over GREv6/EoIPv6 using an MTU of 1390, ~25ms rtt between the routers.
The other device was a CHR with 1 Gb/s WAN connection and plenty of resources.
Neither of the devices cpu cores were fully utilized during the speed test, I tested through the router not from / to it.
It also made no difference if tcp or udp was used inside the tunnel.
I think the problem is MT's ipsec implementation, and we probably won't see any improvement until v7, if at all.
The new generation of CCRs are based on ARM64, which means TILE is officially abandoned, so don't expect much.
 
Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

Re: CCR 1009 - IPSEC throughput

Tue May 19, 2020 10:35 am

My tests are also through the router and so far I can't achieve anything above 220-ish Mb/s. For a single connection (one client PC using the tunnel for iperf and/or file transfers from a server behind the other tunnel end) I see 2 cores being used, at around 70-80% max. I've been doing the tests with a direct connection between the 2 CCRs (so full 1 Gb/s and 1 ms latency).
obrazek1.png
I tried some MTU tweaking on both sides in /ip firewall mangle but the difference wasn't noticeable if at all.
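For readers wondering what such a mangle tweak usually looks like: a common approach is to clamp the TCP MSS so that packets still fit after the IPsec overhead is added. The rule below only illustrates that idea. It is an assumption about the kind of rule meant here, not the rule actually used in these tests.

```routeros
/ip firewall mangle
# Illustrative only: clamp the MSS of forwarded TCP SYN packets to the path MTU
add chain=forward protocol=tcp tcp-flags=syn action=change-mss \
    new-mss=clamp-to-pmtu passthrough=yes
```

A fixed value (for example new-mss=1360) can be used instead of clamp-to-pmtu when the tunnel overhead is known.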

I had similar results both with simple file transfer and iperf3. UDP (using iperf) could push much more with increasing packet loss. iperf3 with multiple streams made no difference, since everything goes through single tunnel I suppose (?).
Thanks everyone for input. That doesn't sound very hopeful at all.

Do you know of other mikrotiks that can do (in practice, not in theory) ~350 Mb/s ipsec?
 
hike
just joined
Posts: 12
Joined: Mon May 18, 2020 1:44 pm

Re: CCR 1009 - IPSEC throughput

Tue May 19, 2020 12:06 pm

Found the test.

Windows client using a AES-128-CBC/SHA256 tunnel
Linux client not using a tunnel
nuttcp using TCP in both directions:

More than 300mbps was indeed only possible in one direction (where Windows did all the encryption):
# win -> lin
PS C:\nuttcp-8.1.4.win64> .\nuttcp-8.1.4.exe -w500k -T30 -i -v 172.X.X.X
nuttcp-t: v8.1.4: socket
nuttcp-t: buflen=65536, nstream=1, port=5101 tcp -> 172.X.X.X
nuttcp-t: time limit = 30.00 seconds
nuttcp-t: connect to 172.X.X.X with RTT=1.218 ms af=inet
nuttcp-t: send window size = 512000, receive window size = 212992
nuttcp-r: Warning: receive window size 212992 < requested window size 512000
nuttcp-r: v8.1.4: socket
nuttcp-r: buflen=65536, nstream=1, port=5101 tcp
nuttcp-r: interval reporting every 1.00 second
nuttcp-r: accept from 10.X.X.X with af=inet
nuttcp-r: send window size = 46080, receive window size = 425984
nuttcp-r: available send window = 23040, available receive window = 212992
   36.5625 MB /   1.00 sec =  306.6950 Mbps
   38.4375 MB /   1.00 sec =  322.4497 Mbps
   39.2500 MB /   1.00 sec =  329.2525 Mbps
   ...
nuttcp-t: 1132.6250 MB in 30.01 real seconds = 38647.59 KB/sec = 316.6010 Mbps
nuttcp-t: 18122 I/O calls, msec/call = 1.70, calls/sec = 603.87
nuttcp-t: 0.0user 0.1sys 0:30real 0% 0i+0d 4050maxrss 17+0pf 0+0csw

nuttcp-r: 1132.6250 MB in 30.03 real seconds = 38619.85 KB/sec = 316.3738 Mbps
nuttcp-r: 222506 I/O calls, msec/call = 0.14, calls/sec = 7409.11
nuttcp-r: 0.4user 3.2sys 0:30real 12% 0i+0d 92maxrss 0+15pf 202530+11csw


# lin -> win
PS C:\nuttcp-8.1.4.win64> .\nuttcp-8.1.4.exe -r -F -w500k -T30 -i -v 172.X.X.X
nuttcp-r: v8.1.4: socket
nuttcp-r: buflen=65536, nstream=1, port=5101 tcp
nuttcp-r: interval reporting every 1.00 second
nuttcp-r: connect to 172.X.X.X with RTT=1.392 ms af=inet
nuttcp-r: send window size = 212992, receive window size = 512000
   18.8125 MB /   1.00 sec =  157.7378 Mbps   158 retrans    215 KB-cwnd
   20.5625 MB /   1.00 sec =  171.9762 Mbps     0 retrans    272 KB-cwnd
   18.8750 MB /   1.01 sec =  157.3750 Mbps   259 retrans    170 KB-cwnd
   ...
nuttcp-r: 612.2550 MB in 30.06 real seconds = 20858.24 KB/sec = 170.8707 Mbps
nuttcp-r: 263829 I/O calls, msec/call = 0.12, calls/sec = 8777.44
nuttcp-r: 5.8user 6.0sys 0:30real 39% 0i+0d 3604maxrss 21+0pf 0+0csw

nuttcp-t: Warning: send window size 212992 < requested window size 512000
nuttcp-t: v8.1.4: socket
nuttcp-t: buflen=65536, nstream=1, port=5101 tcp -> 10.X.X.X
nuttcp-t: time limit = 30.00 seconds
nuttcp-t: accept from 10.X.X.X with mss=1360, af=inet
nuttcp-t: send window size = 425984, receive window size = 131072
nuttcp-t: available send window = 212992, available receive window = 65536
nuttcp-t: initial congestion window = 13 KB (10 packets)
nuttcp-t: 612.2550 MB in 30.00 real seconds = 20898.22 KB/sec = 171.1982 Mbps
nuttcp-t: retrans = 1668 cwnd = 219 KB
nuttcp-t: 9797 I/O calls, msec/call = 3.14, calls/sec = 326.57
nuttcp-t: 0.0user 0.9sys 0:30real 3% 0i+0d 92maxrss 0+0pf 2910+5csw
 
pe1chl
Forum Guru
Posts: 10219
Joined: Mon Jun 08, 2015 12:09 pm

Re: CCR 1009 - IPSEC throughput

Tue May 19, 2020 12:29 pm

osc86 wrote: The new generation of CCRs are based on ARM64, which means TILE is officially abandoned, so don't expect much.
I expect the issue is not as much that it is TILE vs ARM64 but more that a multicore architecture cannot be used for accelerating IPsec in parallel on multiple cores (they tried, and failed due to the reordering problem).
A solution for that (e.g. some form of output queue with sequencing) would benefit any architecture that has multiple cores.

As in any architecture with a largish number of cores, it is not suitable for any random problem you throw at it.
This is what would make me hesitate to buy a "router with 72 cores". With 9 cores I can see some possibilities to sort-of use them, but with 72 cores probably not.
And even with 9 cores as in the CCR1009, you see that you cannot simply apply the best-case performance figures from a datasheet to your own case-at-hand.
It would probably work fine when you used it as a central router with several branches connected to it, and all of them generating parallel traffic. The total of that traffic would likely be fine according to the specifications.
 
Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

Re: CCR 1009 - IPSEC throughput

Wed May 20, 2020 10:07 am

I knew that for a single tunnel only 1 core will be used; what I don't fully understand is why my "numbers" differ so much from the datasheet for a single tunnel and why my CPU won't even max out.

If I run "just" an IPIP tunnel (no IPSEC encryption on top of it), using iperf3 TCP with multiple streams (e.g. iperf3 -c <host> -P 5) I can push 500-ish Mb/s and more (I don't know whether that's "ok" or slow for the 1009 CCR) and max out the cores.
Once I add IPSEC on top of it, it drops down to 200 Mb/s and adding multiple streams does nothing (multiple clients or multiple iperf3 streams don't increase overall throughput, they just divide the max 200 Mb/s between the streams/clients, so one client gets 200 Mb/s, 2 clients get 100 Mb/s each (roughly) and so on). And core usage actually drops.

I guess with multiple IP addresses I could run multiple tunnels between sites and "load balance" to increase overall throughput?

Also, is there a Mikrotik that is known to do at least 300-350 Mb/s in such scenario (IPSEC over single tunnel)?
 
pe1chl
Forum Guru
Posts: 10219
Joined: Mon Jun 08, 2015 12:09 pm

Re: CCR 1009 - IPSEC throughput

Wed May 20, 2020 11:15 am

Lemahasta wrote: I knew that for a single tunnel only 1 core will be used; what I don't fully understand is why my "numbers" differ so much from the datasheet for a single tunnel and why my CPU won't even max out.
I don't know that, I never studied it in that detail. However, I think it is not impossible that the datasheet figures were determined when the software still used all cores in parallel for IPsec encryption, and that they were not adjusted after the software was fixed.
(the typical throughput was much worse when using bad TCP stacks that did not handle the reordering well, so they were kind of forced to change their design to use only a single core)
 
Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

Re: CCR 1009 - IPSEC throughput

Thu May 21, 2020 1:04 am

I've done some more testing/firewall tweaking and I've got some better results, but I also stumbled upon an issue which someone might maybe shed some light on?

1) I managed to reach up to 350 Mb/s in "pure" IPSEC (tunnel mode, no tunneling protocols like IPIP/EOIP) in a single TCP stream (both with SMB file transfer and in iperf)

2) I realised that there is a behaviour I don't understand: from side A (as sender) to side B (receiver) I can get much higher throughput with parallel TCP streams (like file transfers from 2 different hosts, or iperf3 with 2/3/4/5 parallel streams). BUT from side B (sender) to side A (receiver) I'm getting ONLY the max single TCP stream throughput (if the max is 350, 2 streams will be roughly 160ish each, etc.). So while from side A with 4 streams I'm getting over 500 Mb/s throughput (with 550ish being the absolute max with 5+ streams), from side B I'm getting a total of 350 Mb/s - more connections (streams) just decrease the throughput of each stream.
This behaviour is directly tied to IPSEC - with IPSEC disabled (all policies etc. OFF on both sides) both sides of the tunnel behave in the same way (I'm getting up to 1 Gb/s throughput with every stream increasing total throughput).

As soon as IPSEC is enabled, side A gets all the benefits of multiple connections (overall throughput increase), while side B is capped at "whatever is the highest for a single stream".

I'm too much of a "newbie" (even though I have mtcna, mtcre, mtctre) to know what is the expected behaviour (is side A correct and side B is doing something wrong or the opposite) and I can't for the life of me find the cause. I've sniffed packets between both tunnel ends with wireshark and both sides (side A with "higher" throughput and side B with much lower) are encrypting exactly as they should (I could see only ESP packets with "public" addresses of the tunnel ends).

Example IPERF results:
1) NO IPSEC SIDE A is sending:
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 248 MBytes 208 Mbits/sec sender
[ 4] 0.00-10.00 sec 248 MBytes 208 Mbits/sec receiver
[ 6] 0.00-10.00 sec 224 MBytes 188 Mbits/sec sender
[ 6] 0.00-10.00 sec 224 MBytes 188 Mbits/sec receiver
[ 8] 0.00-10.00 sec 233 MBytes 195 Mbits/sec sender
[ 8] 0.00-10.00 sec 233 MBytes 195 Mbits/sec receiver
[ 10] 0.00-10.00 sec 243 MBytes 204 Mbits/sec sender
[ 10] 0.00-10.00 sec 243 MBytes 204 Mbits/sec receiver
[SUM] 0.00-10.00 sec 948 MBytes 795 Mbits/sec sender
[SUM] 0.00-10.00 sec 948 MBytes 795 Mbits/sec receiver

2) side A is sending WITH IPSEC:
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 143 MBytes 120 Mbits/sec sender
[ 4] 0.00-10.00 sec 143 MBytes 120 Mbits/sec receiver
[ 6] 0.00-10.00 sec 155 MBytes 130 Mbits/sec sender
[ 6] 0.00-10.00 sec 155 MBytes 130 Mbits/sec receiver
[ 8] 0.00-10.00 sec 147 MBytes 124 Mbits/sec sender
[ 8] 0.00-10.00 sec 147 MBytes 124 Mbits/sec receiver
[ 10] 0.00-10.00 sec 160 MBytes 134 Mbits/sec sender
[ 10] 0.00-10.00 sec 160 MBytes 134 Mbits/sec receiver
[SUM] 0.00-10.00 sec 606 MBytes 508 Mbits/sec sender
[SUM] 0.00-10.00 sec 606 MBytes 508 Mbits/sec receiver


side B - NO IPSEC:
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 269 MBytes 225 Mbits/sec sender
[ 4] 0.00-10.00 sec 269 MBytes 225 Mbits/sec receiver
[ 6] 0.00-10.00 sec 273 MBytes 229 Mbits/sec sender
[ 6] 0.00-10.00 sec 273 MBytes 229 Mbits/sec receiver
[ 8] 0.00-10.00 sec 269 MBytes 226 Mbits/sec sender
[ 8] 0.00-10.00 sec 269 MBytes 226 Mbits/sec receiver
[ 10] 0.00-10.00 sec 278 MBytes 234 Mbits/sec sender
[ 10] 0.00-10.00 sec 278 MBytes 234 Mbits/sec receiver
[SUM] 0.00-10.00 sec 1.06 GBytes 914 Mbits/sec sender
[SUM] 0.00-10.00 sec 1.06 GBytes 913 Mbits/sec receiver

side B - WITH IPSEC [here is what feels off]
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 93.9 MBytes 78.7 Mbits/sec sender
[ 4] 0.00-10.00 sec 93.7 MBytes 78.6 Mbits/sec receiver
[ 6] 0.00-10.00 sec 88.8 MBytes 74.4 Mbits/sec sender
[ 6] 0.00-10.00 sec 88.5 MBytes 74.2 Mbits/sec receiver
[ 8] 0.00-10.00 sec 79.1 MBytes 66.3 Mbits/sec sender
[ 8] 0.00-10.00 sec 79.0 MBytes 66.2 Mbits/sec receiver
[ 10] 0.00-10.00 sec 98.2 MBytes 82.4 Mbits/sec sender
[ 10] 0.00-10.00 sec 98.1 MBytes 82.2 Mbits/sec receiver
[SUM] 0.00-10.00 sec 360 MBytes 302 Mbits/sec sender
[SUM] 0.00-10.00 sec 359 MBytes 301 Mbits/sec receiver


I have similar results with SMB file transfers (side A -> side B) copying from different file servers gives TOTAL higher throughput, file transfers from B to A seem to "share" single stream.

Same thing happens with IPIP/EOIP with IPSEC on/off.

I don't expect anyone here to "fix" my problem, but what I really need is a simple answer from anyone who has set up tunnels previously:

Which behaviour is correct (with IPSEC) - SIDE A or SIDE B? (Not knowing which behaviour is wrong doesn't help at all with troubleshooting.)

One thing that may or may not come into play is that the real "path" from server to client is as such:
side A: servers (iperf and file servers) -> CCR 1009 (lan gateway, firewall, router) -> CCR 1009 (IPSEC tunnel endpoint)
side B: clients -> CCR 1009 (IPSEC tunnel endpoint)

So there is one more "hop" between servers -> tunnel endpoint than from the side B.

As for the configuration I've tried keeping it as simple as possible, so in RAW there is no-track + accept rules (so I don't have to deal with NAT), in mangle there is ACCEPT for no-track as first in prerouting for the LAN subnets and in FILTER i've got as first rule accept for no-track in forward and in input (with tunnel-end addresses).
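The firewall layout described above can be sketched roughly as follows. This is one reading of the description, not the exported rules: the LAN subnets and the peer address are taken from the IPsec export posted later in the thread, and the rules are assumed to sit first in their chains.

```routeros
/ip firewall raw
# Skip connection tracking for traffic between the two LAN subnets
add chain=prerouting action=notrack src-address=192.168.1.0/24 \
    dst-address=192.168.11.0/24
add chain=prerouting action=notrack src-address=192.168.11.0/24 \
    dst-address=192.168.1.0/24

/ip firewall mangle
# Accept untracked packets early so no later mangle rule touches them
add chain=prerouting action=accept connection-state=untracked

/ip firewall filter
# Forward the untracked LAN-to-LAN traffic without further checks
add chain=forward action=accept connection-state=untracked
# Accept traffic from the remote tunnel endpoint (address assumed: side B peer)
add chain=input action=accept src-address=10.111.111.2
```

With connection tracking bypassed, NAT no longer applies to these packets, which matches the stated goal of not having to deal with NAT.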


Any help will be much appreciated.
 
pe1chl
Forum Guru
Posts: 10219
Joined: Mon Jun 08, 2015 12:09 pm

Re: CCR 1009 - IPSEC throughput

Thu May 21, 2020 1:31 pm

I have no idea how the CCR distributes streams over cores in such cases (if it does that at all).
My experience with CCR routers is only in "realistic" networks with hundreds of sessions operating in parallel, not such "benchmarking" cases.
(e.g. in our main office at work all user systems are on 100Mbit ports so a single user can never draw more than 100Mbit/s, but there are like 200 users)
Maybe someone else is in a better position to answer that.
 
Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

Re: CCR 1009 - IPSEC throughput

Thu May 21, 2020 7:11 pm

My issue - which I realised after my initial post - is that from one side of the tunnel the number of connections doesn't matter - throughput is capped at that of a single stream.

I've done some more tests and I still have no idea why and what is the actual expected behaviour.

As I said, both sides of the tunnel are equal (ccr1009, ROS version, firmware etc.).

It's not just a "benchmark" test but a "real life scenario" with file download/upload with IPIP/IPSEC: from one side 1 file caps at ~200 Mb/s, 2 files (from different file servers as well) cap at around 150 Mb/s EACH (so 300 Mb/s total throughput)
(obrazek1.jpg)

Upload (to the same 2 file servers) caps at 200 Mb/s for 1 file, 2 files cap at 100 Mb/s, so total throughput is exactly the same as for single file. Removing IPSEC from the tunnel changes behaviour so both sides of the tunnel work the same.
(obrazek2.jpg)

With IPERF please notice the difference in the PARALLEL test (test run with JUST IPSEC, no IPIP or other protocol, as I found out that it doesn't make any difference for the sake of my issue):
SIDE A - single TCP stream
single-stream-sideA.jpg
SIDE B - single TCP stream
single-stream-side-B.jpg
both are basically equal, CPU load on one of the cores is around 40 up to 50% (from /tools profiling)

SIDE A - 6 parallel streams:
6-par-stream-side-A.jpg
SIDE B - 6 parallel streams:
6-par-stream-side-B.jpg
With SIDE A as sender, CPU load (on one of the cores) goes up to 70-80% (which looks fine since it's actually doing more work), while with SIDE B as sender no CPU core ever goes beyond 40-50% (exactly as with the single stream test).

I would understand it if I were using different hardware/ROS, but with the exact same devices the only difference in behaviour should come from the config, right? I mean, if I get totally different numbers just by changing sender/receiver, it has to be in the configuration? I know that "ipsec" is responsible, but what exactly forces this behaviour is beyond me.

For reference ipsec config on side A:
# may/21/2020 17:39:04 by RouterOS 6.46.6
#
# model = CCR1009-8G-1S-1S+

/ip ipsec policy group
add name=group1
/ip ipsec profile
add dh-group=modp1536,modp1024 enc-algorithm=aes-128 lifetime=8h name=\
    profile_1 nat-traversal=no
add dh-group=modp1024 enc-algorithm=3des hash-algorithm=md5 lifetime=8h name=\
    profile_3 nat-traversal=no
add dpd-interval=1m enc-algorithm=aes-256,aes-128 name=profile_4
add dh-group=modp1024 enc-algorithm=aes-128 hash-algorithm=sha256 lifetime=8h \
    name=profile_5 nat-traversal=no proposal-check=strict
add dpd-interval=1h enc-algorithm=aes-256,aes-128 name=profile1
/ip ipsec peer
add address=10.111.111.2/32 disabled=yes exchange-mode=ike2 name=peer2 \
    passive=yes profile=profile_1 send-initial-contact=no
add address=10.111.111.2/32 exchange-mode=ike2 name=peer4 passive=yes \
    profile=profile_1 send-initial-contact=no
/ip ipsec proposal
set [ find default=yes ] auth-algorithms=sha256,sha1 lifetime=8h

add auth-algorithms=sha256,sha1 enc-algorithms=aes-256-cbc,aes-128-cbc name=\
    proposal1
/ip ipsec identity

add peer=peer4 policy-template-group=group1 secret="mysupersecret"

/ip ipsec policy
add dst-address=192.168.11.0/24 peer=peer4 proposal=proposal1 sa-dst-address=\
    10.111.111.2 sa-src-address=10.111.111.1 src-address=192.168.1.0/24 \
    tunnel=yes
side B has same config but in reverse source-destination:
# may/21/2020 18:00:41 by RouterOS 6.46.6
#
# model = CCR1009-7G-1C-1S+
/ip ipsec peer
add address=10.111.111.1/32 exchange-mode=ike2 name=peer1
/ip ipsec policy group
add name=group1
/ip ipsec profile
set [ find default=yes ] enc-algorithm=aes-256,aes-192,aes-128 nat-traversal=\
    no
add enc-algorithm=aes-256,aes-192,aes-128 name=profile1 nat-traversal=no
/ip ipsec proposal
set [ find default=yes ] enc-algorithms=aes-128-cbc,aes-128-ctr,aes-128-gcm
add enc-algorithms=aes-128-cbc name=proposal1
/ip ipsec identity
add notrack-chain=prerouting peer=peer1 secret="mysupersecret"
add peer=peer2 secret="mysupersecret"
/ip ipsec policy
add dst-address=192.168.1.0/24 peer=peer1 proposal=proposal1 sa-dst-address=\
    10.111.111.1 sa-src-address=0.0.0.0 src-address=192.168.11.0/24 tunnel=\
    yes
add disabled=yes dst-address=10.111.111.1/32 peer=peer1 protocol=ipencap \
    src-address=10.111.111.2/32
add disabled=yes dst-address=10.111.111.1/32 peer=peer1 protocol=ipsec-esp \
    src-address=10.111.111.2/32
Both devices are connected directly to each other and use the 10.111.111.0/30 network to connect to each other.
I kind of derailed my own thread, but I still think it's relevant, as there is a noticeable difference in total throughput (250 vs 500 Mb/s). Might any mikrotik GURU know where the error might be?
If the hardware is the same and the software version is the same, I guess there must be an issue with the configuration - since turning IPSEC on/off drastically changes the behaviour.

I might get my hands on some other mikrotiks (not ccr1009) and see if I can get different results (not the throughput but single vs multiple stream relation).
 
pe1chl
Forum Guru
Posts: 10219
Joined: Mon Jun 08, 2015 12:09 pm

Re: CCR 1009 - IPSEC throughput

Thu May 21, 2020 8:13 pm

As I said I have no idea how (if at all) it would split the load for multiple sessions in the current version.
In the original software for the CCR it just put all cores to work on packets as they arrived and cores were available, which of course resulted in high speeds but also in packet reordering.
The Windows users cried wolf because their TCP performance went down the drain (for Linux systems it did not matter much because they handle this correctly, it would merely mean that the window did not go up as large as it would when everything is received in sequence, possibly affecting long-RTT links).
So this "bug" was fixed and now all packets are handled in sequence, but I do not know what is being grouped together for such sequencing (merely IPsec peers, or maybe identified streams within those peer relations, e.g. by source- or destination IP address)

I myself never use these direct IPsec tunnels, I use GRE tunnels over IPsec transport. But for this particular purpose that probably only makes things worse.
 
Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

Re: CCR 1009 - IPSEC throughput

Thu May 21, 2020 9:17 pm

I tried different variants:
IPIP with IPSEC
GRE with IPSEC
EOIP with IPSEC
pure IPSEC and in every case I have very similar results (only difference being "max single connection throughput" but that's not an issue right here).

without IPSEC I'm getting parallel streams spread across cores in both directions
with IPSEC I'm getting parallel streams spread across cores from one direction (with full utilization of one of the cores), the other side is using single core.

What I don't understand is why with the same hardware/software I'm getting different results.

I tried today adding fourth mikrotik so I would get:

side A: servers (iperf, file servers etc.) -> mikrotik CCR (LAN, Gateway) - unencrypted data -> mikrotik CCR (IPSEC tunnel encryption/decryption)
side B: mikrotik CCR (IPSEC tunnel encryption/decryption) <-- unencrypted data <- mikrotik 2011 (LAN, Gateway) <-- client/iperf3 server

but I got the same results, i.e. one side was able to push almost twice as much in parallel, fully using its cores, while the other side was doing 50% of side A's throughput and used max 40% of a single core.


I changed my setup also to:
side A: server (iperf only) -> mikrotik CCR (IPSEC tunnel encryption/decryption)
side B: mikrotik CCR (IPSEC tunnel encryption/decryption) <- server (iperf3)

In this setup I had the same behaviour as "side B" previously, so max throughput was limited to whatever a single TCP stream could carry. As soon as server A was moved back to the previous setup (with one more CCR in between), throughput scales with parallel streams and goes up by 100%.

I'm so puzzled (probably due to my inexperience in the matter) that I don't know what is "expected" behaviour and what is "unexpected", I mean in general for VPN site-to-site tunnels - should clients behind one side of a tunnel benefit from multiple connections going through a single tunnel (like side A in my setup) or not (like side B) and be capped?

Core load (shown in profile and resources) seems to reflect the throughput values: when pushing more data (in parallel streams) from side A, the busiest single core is at about 80% on both sides of the tunnel, while when the other side (side B) is sending, even with parallel streams, it maxes a single core at around 50% (and the receiving router also tops out at 50% on a single core with the rest being basically idle).

If only I had access to different gear, I'd test different scenarios myself to see whether it's specifically the CCR's fault or VPN tunnels just work like this and in my setup one side of the tunnel is using some hidden/random tricks to work better.
 
Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

Re: CCR 1009 - IPSEC throughput

Sun May 24, 2020 3:54 pm

After some searching I found this topic on these forums, which explains a bit more how IPSEC should be handled by the CCR with - probably - some sort of ip/port src-dst based on peer.
viewtopic.php?t=140855

This would certainly explain why from side B to side A in my tests I'm always getting max throughput equal to max single tcp stream, as there is always one tunnel so there is only one ip src-dst pair.

What I need to know is: what on earth have I done so that from side A to side B I'm getting multiple streams with an aggregate throughput higher than the max single stream?

What I've done already is:
- done a "real life" windows client test with SMB file downloads over the tunnel, and the windows client had no issues with the data, i.e. there was no reordering issue or packet loss; in fact I'd much rather have the same throughput on both sides of the tunnel
- to be sure, I ran wireshark in between the 2 mikrotik tunnel endpoints to sniff the packets flowing through (switch with port mirror to the traffic analyzer tx/rx) and all packets were encrypted (just ESP payload) as expected, so I ruled out the possibility that encryption isn't happening from side A to side B (which I initially took as a possible cause)

I posted my IPSEC config earlier, it's pretty basic. I ruled out the hosts doing the tests on both sides of the tunnel (I used different physical PCs and VMs), always getting the same results.

The only thing I can't do is reset all config, as one of the routers (the initial one from side A which is just routing to the other CCR that's the tunnel entry point) is used in production, so all I can do on this one is set some additional firewall/routing to allow traffic to originate.

My first thought was that the cause is that traffic is encrypted only at the second "hop" from side A to side B, but I added an additional "hop" on side B as well, so the connection goes like:
side B -> CCR gateway-for-clients-sideB --(no encryption)---> CCR IPSEC (encrypt/decrypt) -->ENCRYPTED ---> CCR IPSEC (encrypt/decrypt) --(no encryption)---> CCR gateway-for-clients-side-A --> side A

this type of scenario didn't make any change, there is still traffic asymmetry.

At this point I'm looking for anything, any pointers on where to look for misconfiguration, because according to the topic I linked I should be getting the same results on both sides of the tunnel as from side B to side A (that is, "total throughput for tunnel (all clients using tunnel) = max TCP single stream throughput"), which in my case is between 180 Mb/s (IPIP+ipsec) and 300 Mb/s (IPSEC in tunnel mode).

What I managed to test is:

When I run the tunnel between:
CLIENT -> CCR 1009 IPSEC tunnel -> CCR 1009 IPSEC tunnel -> CLIENT
I'm getting only single stream throughput as max (increasing connections doesn't increase total throughput, probably setting up more tunnels would...) - so it must be something to do with using an additional router after the tunnel, once packets are decrypted

Have I misconfigured routing, ipsec or firewall so that I get increased throughput (which I shouldn't get, I think)?
 
Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

Re: CCR 1009 - IPSEC throughput

Mon May 25, 2020 9:28 pm

I asked around some people and also on reddit (the text below is copy/pasted from there), yet I still haven't found the reason for this behaviour. Today I made another test (I got my hands on a new rb4011).

I took one CCR1009 and one RB4011 - reset config (no default) - both upgraded to the latest stable (6.46.6), firmware as well

On ccr I setup:

192.168.11.1/24 on eth 1

10.112.112.1/30 on eth2


on rb4011:

192.168.60.1/24 on eth7

10.112.112.2/30 on eth1


both routers connected ccr-eth2 - rb4011-eth1

on ccr:

route: 192.168.60.0/24 - gateway: 10.112.112.2


on rb4011:

route 192.168.11.0/24 - gateway: 10.112.112.1


to ccr eth1 - one laptop with 192.168.11.2/24 and 192.168.11.1 gateway

to rb4011 eth7 one laptop with 192.168.60.2/24 and 192.168.60.1 gateway


I setup IPSEC in tunnel between peers (10.112.112.1 - 10.112.112.2 with source/destination swapped between 2 routers) with policy

192.168.11.0/24 - 192.168.60.0/24 in TUNNEL (and swapped between 2 routers)

IPSEC with sha1-aes128 - ikev2 (tunnel established no problem obviously)

In firewall single rule in forward and input ACCEPT to see packets and to see connections in conn-track. (with no firewall and no conn-track I had around 800 Mb/s single TCP IPSEC throughput so it was impossible to see if adding more simultaneous TCP connections through single tunnel increases total throughput...)
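Written out as RouterOS commands, the CCR side of this bench would look roughly like the sketch below. It assumes the default interface names ether1/ether2 for "eth 1"/"eth2", aes-128-cbc for "aes128", and a placeholder pre-shared key. The names bench and rb4011 are made up for illustration. The RB4011 side mirrors it with the addresses swapped.

```routeros
/ip address
add address=192.168.11.1/24 interface=ether1
add address=10.112.112.1/30 interface=ether2

/ip route
add dst-address=192.168.60.0/24 gateway=10.112.112.2

/ip ipsec profile
add name=bench enc-algorithm=aes-128 hash-algorithm=sha1
/ip ipsec proposal
add name=bench auth-algorithms=sha1 enc-algorithms=aes-128-cbc
/ip ipsec peer
add name=rb4011 address=10.112.112.2/32 exchange-mode=ike2 profile=bench
/ip ipsec identity
# Placeholder secret, replace with a real pre-shared key
add peer=rb4011 secret="example-secret"
/ip ipsec policy
add peer=rb4011 proposal=bench tunnel=yes \
    src-address=192.168.11.0/24 dst-address=192.168.60.0/24 \
    sa-src-address=10.112.112.1 sa-dst-address=10.112.112.2

/ip firewall filter
# Single accept rule per chain, only there to get packet counters and conntrack entries
add chain=forward action=accept
add chain=input action=accept
```

Keeping the bench this small makes it easier to attribute any asymmetry to the platform rather than to leftover configuration.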


With this setup I ran iperf3 between laptops (both windows 10 same dell-whatever model - no firewall both sides).


side A (CCR encrypts -> RB4011 decrypts): with parallel connections I managed to get higher total TCP throughput (450-ish Mb/s with a single connection, up to 800-ish Mb/s with 4 parallel connections); with parallel connections more cores are used, at up to 80% per core according to tools profiling


side B (RB4011 encrypts -> CCR decrypts): with parallel connections I got the SAME total TCP throughput (380ish Mb/s with a single connection, 380 Mb/s with 4 parallel connections); one core is used at roughly 60% max according to tools profiling

I've done multiple reboots between changing configuration just to be sure nothing was "cached".

I tried swapping IPSEC initiator-responder between CCR and RB4011, swapped the iperf3 server/client roles between the laptops, and sent with the reverse flag. The scenario was 100% repeatable: one side was able to use parallel streams to increase total throughput beyond a single TCP connection, the other side is limited to "single TCP connection throughput".

So in a much simpler scenario I basically reproduced the behaviour I'm having (one side is able to send more in total with more simultaneous connections - the other side with simultaneous connections still sends at whatever is the max for a SINGLE connection).

Tomorrow I'll be able to test it out with another CCR1009 that I'll be able to reset config on as well, and I'll see if it changes anything, but my initial tests were with CCR's, so I'm afraid I'll get a repeat of today's scenario (one side for whatever mysterious reason sends more data)

I'm not an expert (although I have MTCNA, MTCRE, MTCTE certificates) and there just has to be something I'm missing. It's not that throughput is low/high or whatever - it's that the way data is sent is completely different depending on which side is sending data (encrypting) and which side is receiving (decrypting).
 
Lemahasta
just joined
Topic Author
Posts: 14
Joined: Wed Dec 30, 2015 9:52 am

Re: CCR 1009 - IPSEC throughput

Thu Jun 04, 2020 11:40 am

I had a chance to do some more comparisons and tests, and it seems that it's a Tile-specific issue. With 2 rb4011 (arm32) with the same config I had consistent throughput with parallel streams in the 550 Mb/s range with GRE over IPSEC. Replacing one rb4011 with a ccr 1009, same config - one side dropped to 300 Mb/s with basically no gain from parallel streams.

The CCR 1009 works OK with no firewall, no conntrack and no tunneling protocol. With pure ipsec over a direct eth connection I could get into the 800 Mb/s range; with a single firewall rule and conntrack it drops by half, adding a single prerouting rule (even just ACCEPT) is another 50 Mb/s loss, and IPIP or GRE is another 50-100 Mb/s loss.

I ran a fool's errand looking for config issues, when it seems that it's just a ccr/tile issue. Kind of disappointing; at least rb4011/ah1100x4 are cheaper than any ccr, so there's that.
