Amazon AWS VPN -- A Working Configuration Example and Bug

A year later, this post is still of great value, thank you very much! Most of it is straightforward, but there is one thing I have struggled to understand for 20 hours straight. I have searched high and low and still can’t get my head around it.

First, something minor.

0   ;;; critically important to AWS connectivity that this rule be ahead of "masquerade".
     chain=srcnat action=src-nat to-addresses=192.168.88.0/24 dst-address=172.31.0.0/16

I was puzzled by this notation. As I understand it, the idea is to avoid masquerading traffic destined for the BGP-routed network. However, the way it is accomplished is rather strange: you change the source address to… itself? Worse, if the packet arrives from another network that this local router serves, say 192.168.89.0/24, the rule will src-nat it into the 88 network, which may introduce conflicts (as I suspect). Instead, I used the return action at this point to simply leave the source address as is. I also had to add another rule to allow the BGP traffic:

 2    chain=srcnat action=return dst-address=172.31.0.0/16 log=no log-prefix="" 
 3    chain=srcnat action=return dst-address=169.254.0.0/16 log=yes log-prefix=""
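
To make the difference between the two approaches concrete, here is a minimal Python sketch of first-match evaluation in the srcnat chain. It is only a model, not RouterOS behaviour: the function name apply_srcnat is made up, and src-nat to a /24 is simplified to "keep the host bits, swap the network", whereas the real router picks an address from the range by its own logic.

```python
from ipaddress import ip_address, ip_network

# Each rule: (action, dst-address selector, to-addresses or None).
ORIGINAL_RULES = [
    ("src-nat", ip_network("172.31.0.0/16"), ip_network("192.168.88.0/24")),
]
RETURN_RULES = [
    ("return", ip_network("172.31.0.0/16"), None),
    ("return", ip_network("169.254.0.0/16"), None),
]

def apply_srcnat(rules, src, dst):
    """Return the source address after the first matching rule.

    None means no rule matched, so later rules (e.g. masquerade) would apply.
    """
    src_ip, dst_ip = ip_address(src), ip_address(dst)
    for action, dst_net, to_net in rules:
        if dst_ip not in dst_net:
            continue
        if action == "return":
            # Stop processing the chain; the source address is untouched.
            return src_ip
        # ASSUMPTION: model src-nat as "same host bits inside to-addresses".
        host_bits = int(src_ip) & int(to_net.hostmask)
        return ip_address(int(to_net.network_address) | host_bits)
    return None

# A host on the 88 network is "translated" to itself by the original rule.
print(apply_srcnat(ORIGINAL_RULES, "192.168.88.10", "172.31.5.5"))
# A host on the 89 network lands inside the 88 network in this model.
print(apply_srcnat(ORIGINAL_RULES, "192.168.89.10", "172.31.5.5"))
# With the return rules the 89 host keeps its own address.
print(apply_srcnat(RETURN_RULES, "192.168.89.10", "172.31.5.5"))
```

In this simplified model the 89 host and the 88 host end up with the same source address under the original rule, which is the kind of conflict I suspect; the return rules avoid it.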

Next, my biggest unresolved question:
Why do we need to create mirror IPsec policies for the BGP link network, while creating a mirror of the main site-to-site policy breaks everything?
I mean the following snippet, where policies 1-3 and 2-4 mirror each other. When I try to create a mirror policy for 0, even with explicit addresses, it just fails, and the packets silently die during the Routing Decision phase (see http://wiki.mikrotik.com/wiki/Manual:Packet_Flow). I see the packet in mangle prerouting and nat dstnat, and then it disappears. If I add an explicit route that marks the destination as unreachable, that works and returns the corresponding ICMP message to the sender. This makes it even harder for me to understand, because it means that changes to the IPsec policy implicitly influence how the packet is routed, although this is not shown in the packet flow diagram.

0    ;;; AWS Tunnels
      src-address=0.0.0.0/0 src-port=any dst-address=172.31.0.0/16 dst-port=any protocol=all action=encrypt level=require 
      ipsec-protocols=esp tunnel=yes sa-src-address=x.x.x.x sa-dst-address=205.251.233.120 proposal=default priority=0 

1    src-address=169.254.249.26/32 src-port=any dst-address=169.254.249.25/32 dst-port=any protocol=all action=encrypt level=require 
      ipsec-protocols=esp tunnel=yes sa-src-address=x.x.x.x sa-dst-address=205.251.233.119 proposal=default priority=0 

2    src-address=169.254.249.30/32 src-port=any dst-address=169.254.249.29/32 dst-port=any protocol=all action=encrypt level=require 
      ipsec-protocols=esp tunnel=yes sa-src-address=x.x.x.x sa-dst-address=205.251.233.120 proposal=default priority=0 

3    src-address=169.254.249.25/32 src-port=any dst-address=169.254.249.26/32 dst-port=any protocol=all action=encrypt level=require 
      ipsec-protocols=esp tunnel=yes sa-src-address=205.251.233.119 sa-dst-address=x.x.x.x proposal=default priority=0 

4    src-address=169.254.249.29/32 src-port=any dst-address=169.254.249.30/32 dst-port=any protocol=all action=encrypt level=require 
      ipsec-protocols=esp tunnel=yes sa-src-address=205.251.233.120 sa-dst-address=x.x.x.x proposal=default priority=0

I suspect there is something I don’t know about the basics of the IPsec implementation in RouterOS, but I could not find any information on how it works.
My understanding is that one policy should be enough for both encoding and decoding traffic, and that for decoding it is applied in “reverse”, i.e. a policy matches a received packet when the policy’s src-address matches the packet’s destination address. That seems a valid assumption, since the examples add only one rule for a site-to-site tunnel, with a single mirrored rule on the other side of the tunnel (a different router). But then this should work the same way for the BGP networks! I tried removing rules 3 and 4 above, and everything worked just fine. However, the fact that mirror policies for the BGP networks work fine, while a mirror policy for the main network breaks routing, is something that doesn’t let me sleep.
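
Here is my mental model written out as a small Python sketch, using the selectors of policies 0 to 4 above. It encodes only my assumption about "reverse" matching, not documented RouterOS behaviour, and the function names are made up for illustration.

```python
from ipaddress import ip_address, ip_network

# (src-address, dst-address) selectors copied from policies 0-4 above.
POLICIES = {
    0: ("0.0.0.0/0", "172.31.0.0/16"),
    1: ("169.254.249.26/32", "169.254.249.25/32"),
    2: ("169.254.249.30/32", "169.254.249.29/32"),
    3: ("169.254.249.25/32", "169.254.249.26/32"),
    4: ("169.254.249.29/32", "169.254.249.30/32"),
}

def matches(policy_id, src, dst, direction):
    """ASSUMED model: outbound packets match as written, inbound in reverse."""
    p_src, p_dst = (ip_network(net) for net in POLICIES[policy_id])
    src_ip, dst_ip = ip_address(src), ip_address(dst)
    if direction == "out":
        return src_ip in p_src and dst_ip in p_dst
    # Inbound: the policy's src-address is compared with the packet's destination.
    return dst_ip in p_src and src_ip in p_dst

def matching_policies(src, dst, direction, enabled):
    """List the enabled policies that would claim the packet in this model."""
    return [p for p in sorted(enabled) if matches(p, src, dst, direction)]

without_mirrors = {0, 1, 2}
# BGP packet leaving the router towards the first AWS neighbour.
print(matching_policies("169.254.249.26", "169.254.249.25", "out", without_mirrors))
# The neighbour's reply coming back in.
print(matching_policies("169.254.249.25", "169.254.249.26", "in", without_mirrors))
```

In this model policy 1 alone covers both directions of the first BGP session, so policies 3 and 4 add nothing, which agrees with what I saw after removing them. The model says nothing about why a mirror of policy 0 breaks routing.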

Now, let’s see one big issue with the proposed solution.

The author sets up both VPN tunnels and both BGP instances, but only one site-to-site policy is installed (active) because of the RouterOS limitation described. That results in a very bad situation, which I have observed.

Since the Amazon side sees two working tunnels, its routing at some point decides to use the other tunnel, not the one selected as active on our side. As long as the IPsec policy that would decode the response is inactive (NB: “I” stands for inactive, not invalid) or simply missing (as the author suggested), the ping goes unanswered. The other side does receive our ping and even answers it, but the reply returns over the tunnel that we can’t decode. I detect this by monitoring the installed-sa current-bytes property, which increases with every pair of ICMP messages and shows the response going over the other channel.

The only way I have found to fix this is to shut down one of the BGP instances and the corresponding IPsec policy. A script could probably be developed to switch channels.
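
The decision part of such a switching script could look like the Python sketch below. It covers only the "which BGP peer goes with which tunnel" logic, using the addresses from policies 1 and 2 above; plan_switch is a made-up name, and actually reading the active policy and disabling the peer on the router is not shown.

```python
# Tunnel data taken from policies 1 and 2 above:
# IPsec endpoint (sa-dst-address) -> BGP neighbour reached through that tunnel.
TUNNELS = {
    "205.251.233.119": "169.254.249.25",
    "205.251.233.120": "169.254.249.29",
}

def plan_switch(active_sa_dst):
    """Given the endpoint of the active site-to-site policy, decide which
    BGP neighbour to keep and which ones to shut down (hypothetical helper)."""
    if active_sa_dst not in TUNNELS:
        raise ValueError(f"unknown tunnel endpoint: {active_sa_dst}")
    keep = TUNNELS[active_sa_dst]
    disable = sorted(
        neighbour
        for endpoint, neighbour in TUNNELS.items()
        if endpoint != active_sa_dst
    )
    return {"keep_bgp_peer": keep, "disable_bgp_peers": disable}

# Policy 0 above uses sa-dst-address=205.251.233.120, so that tunnel is active.
print(plan_switch("205.251.233.120"))
```

With only one BGP session up, Amazon has a single path to announce over, so the reply should come back through the tunnel we can decode.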

I am currently investigating ways to influence the AWS BGP route selection, so that I can steer it according to the IPsec policy currently active on my side.