Do you run with default settings (aes128-cbc and sha1) or did you select other options (that may be slower or not HW accelerated)?
BTW, the abovementioned "issue with IPsec hardware acceleration" has been fixed.
Yes, my settings are the defaults: aes128-cbc and sha1. I noticed on the CCR1009 that only one CPU core is used during the speed test. The link between my routers goes through my ISP's infrastructure; could that affect it? When I run the test without IPsec encryption, I get full speed.
Previously all the cores were used in parallel, but the problem was that individual packets were encrypted asynchronously by different cores and could leave the router in a different order than they entered it.
This in itself is entirely within the IP spec and should work fine, but in practice a lot of users with Windows applications complained that the speed of their TCP connections was very low.
Apparently Windows does not handle this re-ordering correctly. Linux was not affected by this problem, at least not as much as Windows.
So it was fixed, and I think the fix was to serialize everything for a single connection through a single core. But I don't know what method is used to determine the connection-to-core mapping, i.e. whether it is based on network traffic (e.g. different TCP sessions go to different cores) or only on the IPsec policy or even the peer.
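To illustrate the difference, here is a rough Python sketch of the two dispatch strategies. It is only a toy model, not how RouterOS actually does it: the CRC32 hash over the 5-tuple, the `core_delay` values and all function names are my own assumptions. Each packet arrives one tick after the previous one, and the cores take different amounts of time per packet to mimic uneven load.

```python
import zlib

def core_for_flow(flow, num_cores):
    """Map a flow 5-tuple to a core with a stable hash (hypothetical scheme)."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_cores

def simulate(packets, num_cores, per_flow, core_delay):
    """Return the packets in the order they leave the simulated router.

    packets    -- list of (flow, seq) in arrival order, one per time tick
    per_flow   -- True: pin each flow to one core; False: round-robin per packet
    core_delay -- ticks each core needs per packet (uneven values model busy cores)
    """
    core_free_at = [0] * num_cores
    finished = []
    for tick, (flow, seq) in enumerate(packets):
        core = core_for_flow(flow, num_cores) if per_flow else tick % num_cores
        start = max(tick, core_free_at[core])   # wait until the core is idle
        done = start + core_delay[core]
        core_free_at[core] = done
        finished.append((done, tick, flow, seq))
    finished.sort()                              # order of completion
    return [(flow, seq) for _, _, flow, seq in finished]

def is_in_order(output, flow):
    """True if the given flow's sequence numbers leave in ascending order."""
    seqs = [seq for f, seq in output if f == flow]
    return seqs == sorted(seqs)

# One TCP flow (addresses from the documentation ranges), eight packets.
flow = ("192.0.2.1", 40000, "198.51.100.1", 443, "tcp")
packets = [(flow, seq) for seq in range(8)]
delays = [5, 1, 1, 1]                            # core 0 is slower than the rest

print("round-robin in order:", is_in_order(simulate(packets, 4, False, delays), flow))
print("per-flow    in order:", is_in_order(simulate(packets, 4, True, delays), flow))
```

With per-packet round-robin the slow core delays packets 0 and 4, so they leave after packets that arrived later. With per-flow pinning the flow stays in order, but it can never use more than one core, which would match what you see in the speed test.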
You can test whether performance improves when you run two different TCP connections in parallel. If it does not, you may still get better performance with two peer systems using the network at the same time. Of course, that would not help when you have only two systems that you want to connect together.
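If you have no test tool at hand, a minimal Python sketch like the one below can compare one stream against two. Everything in it is an assumption on my part (port 5001, the chunk size, the function names), and it counts bytes handed to the sender's socket, so treat the numbers as approximate. At high rates a dedicated traffic generator will be more accurate than Python.

```python
import socket
import threading
import time

CHUNK = b"\0" * 65536  # payload sent repeatedly on each stream

def start_sink(host="0.0.0.0", port=5001):
    """Receiver side: accept connections and discard whatever arrives."""
    srv = socket.create_server((host, port))

    def drain(conn):
        with conn:
            while conn.recv(65536):
                pass

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:          # listening socket was closed
                return
            threading.Thread(target=drain, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv

def _send_stream(host, port, seconds, totals, idx):
    """Push data over one TCP connection until the deadline passes."""
    with socket.create_connection((host, port)) as sock:
        deadline = time.monotonic() + seconds
        sent = 0
        while time.monotonic() < deadline:
            sock.sendall(CHUNK)
            sent += len(CHUNK)
        totals[idx] = sent

def measure(host, port, streams, seconds=10.0):
    """Run `streams` TCP connections in parallel; return aggregate Mbit/s."""
    totals = [0] * streams
    threads = [
        threading.Thread(target=_send_stream, args=(host, port, seconds, totals, i))
        for i in range(streams)
    ]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.monotonic() - start
    return sum(totals) * 8 / elapsed / 1e6
```

On the machine behind the far router, call `start_sink()` and keep the process running. On the near side, compare `measure(far_host, 5001, 1)` with `measure(far_host, 5001, 2)`, where `far_host` is the address of that machine. If the second figure is clearly higher, the mapping is probably per connection; if it is about the same, it is more likely per policy or per peer.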
In general it can be said that testing performance has to be done carefully. There is often no relation whatsoever between what you measure in a simple "speed test program" and what you achieve when using the router for realistic traffic (many sessions in parallel). This can differ both in positive and in negative sense.