Two GRE tunnels over IPsec between CO and CPE

Hello everyone, I need help with GRE over IPsec.

I apologize for my English - I use Google Translate.

There is a CHR in the cloud with a public IP address and a domain name. The remote offices have MikroTik routers. To connect the remote offices, I use GRE tunnels that are matched by IPsec policies. The GRE tunnels terminate on the CHR, which handles routing (I use OSPF). In other words, the CHR is the central router that connects all clients. Certificates are used for authentication.

One notable detail: I configure two GRE tunnels on each client. The interfaces of these two tunnels are in different VRFs. This way I route and isolate two logical networks: MGMT (for network devices) and WORK (for client devices in the office).

The problem is that sometimes one of the two tunnels stops working, although it worked fine before. I suspect the problems start after the router in the remote office switches to a "backup" provider.

Here is a typical remote office config (current version RouterOS 7.20.4):

/ip ipsec mode-config add name=ike2_conf responder=no
/ip ipsec policy group add name=ike2_policies
/ip ipsec profile add dh-group=modp2048 enc-algorithm=aes-128 hash-algorithm=sha256 name=ike_sa
/ip ipsec peer add address=ipsec.blabla.com exchange-mode=ike2 name=ike2_peer_core profile=ike_sa
/ip ipsec proposal add auth-algorithms=sha256 enc-algorithms=aes-256-cbc name=ike_auth pfs-group=none
/ip ipsec identity add auth-method=digital-signature certificate=ipsec.client.blabla.com_ec.crt_0 generate-policy=port-strict mode-config=ike2_conf peer=ike2_peer_core policy-template-group=ike2_policies
/ip ipsec policy add group=ike2_policies proposal=ike_auth template=yes

/interface gre add allow-fast-path=no local-address=192.168.70.1 mtu=1420 name=gre_core_mgmt remote-address=192.168.70.254
/interface gre add allow-fast-path=no local-address=192.168.70.1 mtu=1420 name=gre_core_work remote-address=192.168.70.253

/ip firewall address-list add address=192.168.7.1 list=hosts_ipsec_local
/ip firewall address-list add address=192.168.7.254 list=hosts_ipsec_remote
/ip firewall address-list add address=192.168.7.253 list=hosts_ipsec_remote
/ip firewall raw add action=notrack chain=prerouting dst-address-list=hosts_ipsec_local in-interface-list=WANs ipsec-policy=in,ipsec protocol=gre src-address-list=hosts_ipsec_remote
/ip firewall raw add action=accept chain=prerouting dst-address-list=hosts_ipsec_local in-interface-list=WANs ipsec-policy=in,ipsec protocol=gre src-address-list=hosts_ipsec_remote
/ip firewall filter add action=accept chain=output ipsec-policy=out,ipsec out-interface-list=WANs protocol=gre
/ip firewall filter add action=drop chain=output out-interface-list=WANs protocol=gre

Here is a config of the central CHR (current version RouterOS 7.20.5):

/interface bridge add fast-forward=no name=lo_ipsec_main port-cost-mode=short protocol-mode=none
/interface bridge add fast-forward=no name=lo_ipsec_work port-cost-mode=short protocol-mode=none
/ip address add address=192.168.70.254 interface=lo_ipsec_main network=192.168.70.254
/ip address add address=192.168.70.253 interface=lo_ipsec_work network=192.168.70.253
/interface gre add allow-fast-path=no local-address=192.168.70.254 mtu=1420 name=gre_client1_mgmt remote-address=192.168.70.1
/interface gre add allow-fast-path=no local-address=192.168.70.253 mtu=1420 name=gre_client1_work remote-address=192.168.70.1

/ip ipsec mode-config add address=192.168.70.1 name=ike2_conf_client split-include=192.168.70.254/32,192.168.70.253/32 system-dns=no
/ip ipsec policy group add name=ike2_policies
/ip ipsec profile set [ find default=yes ] dpd-interval=2m dpd-maximum-failures=5
/ip ipsec profile add dh-group=modp2048 dpd-interval=2m dpd-maximum-failures=5 enc-algorithm=aes-128 hash-algorithm=sha256 name=ike_sa
/ip ipsec peer add exchange-mode=ike2 name=ike2_peer passive=yes profile=ike_sa
/ip ipsec proposal add auth-algorithms=sha256 enc-algorithms=aes-256-cbc name=ike_auth pfs-group=none
/ip ipsec identity add auth-method=digital-signature certificate=ipsec.blabla.com_ec.crt_0 generate-policy=port-strict match-by=certificate mode-config=ike2_conf_client peer=ike2_peer policy-template-group=ike2_policies remote-certificate=ipsec.client.blabla.com_ec.crt_0
/ip ipsec policy add dst-address=192.168.70.0/24 group=ike2_policies proposal=ike_auth src-address=192.168.70.254/32 template=yes
/ip ipsec policy add dst-address=192.168.70.0/24 group=ike2_policies proposal=ike_auth src-address=192.168.70.253/32 template=yes

/ip firewall address-list add address=192.168.70.254 list=hosts_ipsec_local
/ip firewall address-list add address=192.168.70.253 list=hosts_ipsec_local
/ip firewall address-list add address=192.168.70.0/24 list=hosts_ipsec_remote
/ip firewall raw add action=notrack chain=prerouting dst-address-list=hosts_ipsec_local in-interface-list=WANs ipsec-policy=in,ipsec protocol=gre src-address-list=hosts_ipsec_remote
/ip firewall raw add action=accept chain=prerouting dst-address-list=hosts_ipsec_local in-interface-list=WANs ipsec-policy=in,ipsec protocol=gre src-address-list=hosts_ipsec_remote
/ip firewall filter add action=accept chain=output ipsec-policy=out,ipsec out-interface-list=WANs protocol=gre
/ip firewall filter add action=drop chain=output out-interface-list=WANs protocol=gre

This is what it looks like from the central CHR side

I hope I have described the situation clearly.

Unfortunately this cannot be done. In theory it could work with GRE because there is a “tunnel ID” field in the header, but RouterOS does not allow setting it.

You will need to make two different types of tunnel to be able to achieve this.

I rarely disagree with @pe1chl but this time I do. While RouterOS indeed ignores the existence of the optional Key parameter of the GRE header, the use of a different address for each of the two tunnels at one of the devices (which is what you do) is normally sufficient to make both of them work.

There are several points in your configuration that may complicate things.

First, you dynamically assign 192.168.70.1 to the initiator using mode-config (which is OK by itself), but you set 192.168.70.1 as local-address for the tunnels, so while the IPsec is down, that address is not available on the initiator device. I am not saying it necessarily causes the issue, but if nothing else is found, this would be worth investigating too.

Next, you have chosen two adjacent /32 addresses (.253 and .254) for the tunnels on the central element. They do not share a /31, which means that two IPsec policies are necessary. I’m not sure whether RouterOS creates them from the template with level set to unique; if it doesn’t, it may also cause trouble. So using .252 instead of .254 and using .252/31 for split-include would halve the number of Phase 2 installed SAs and might also address your issue.
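A rough sketch of that renumbering, reusing the names from the configs posted above. It assumes .252 is free and that nothing else refers to .254; the two existing /32 policy templates on the CHR would have to be removed first, and the address lists on both routers adjusted to match.

```
# CHR: move the MGMT endpoint from .254 to .252 so both endpoints fit in 192.168.70.252/31
/ip address set [find interface=lo_ipsec_main] address=192.168.70.252 network=192.168.70.252
/interface gre set [find name=gre_client1_mgmt] local-address=192.168.70.252
/ip ipsec mode-config set [find name=ike2_conf_client] split-include=192.168.70.252/31
# a single template in place of the two /32 templates
/ip ipsec policy add dst-address=192.168.70.0/24 group=ike2_policies proposal=ike_auth src-address=192.168.70.252/31 template=yes

# Office router: point the MGMT tunnel at the new endpoint
/interface gre set [find name=gre_core_mgmt] remote-address=192.168.70.252
```

With this, the initiator should receive a single traffic selector covering both tunnel endpoints instead of two separate ones.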

However, the most likely cause seems to be the way you handle connection tracking. To keep the GRE transport packets (192.168.70.1 <-> 192.168.70.253) from getting NATed, it is indeed possible to mark them as untracked in raw as you do, but you only apply this treatment to the ones that arrive through WAN, not those that your router sends out via WAN. So the ones being sent can hit the src-nat or masquerade rule and get source-NATed. To fix that, you should add action=notrack rules matching on the corresponding address lists (or maybe just on protocol=gre) also to chain output in raw.
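As a sketch, the missing rule could mirror your existing prerouting rule with the address lists swapped. It assumes hosts_ipsec_local and hosts_ipsec_remote really contain the tunnel endpoint addresses on the router where it is added.

```
# Exempt locally originated GRE transport packets from connection tracking,
# so that src-nat / masquerade rules never see them
/ip firewall raw add action=notrack chain=output protocol=gre src-address-list=hosts_ipsec_local dst-address-list=hosts_ipsec_remote
```

The same rule would go on both the office router and the CHR, each with its own address lists.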

Another thing related to connection tracking: the Phase 2 SA connections that were established via the primary ISP may be NATed too, so when the office router switches over to a backup ISP, it may keep src-nating the outgoing packets with the public address of the primary WAN interface even though they leave through the secondary one.
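One possible workaround, only a sketch: flush the stale tracked connections on the office router right after the failover so they are re-created via the new WAN. It assumes the IKE/IPsec transport runs over UDP 500/4500; if bare ESP is used instead, the match would have to be different. How the command gets triggered (Netwatch, a script, etc.) is not shown.

```
# Remove tracked IKE / NAT-T connections so they get re-established via the active WAN
/ip firewall connection remove [find dst-address~":4500\$"]
/ip firewall connection remove [find dst-address~":500\$"]
```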

There also used to be an issue with incoming GRE packets being marked with connection-state=invalid. This started happening even for connection-tracked GRE connections after some “security update” in the past, and in your case you are even further from the standard paths through the software stack, because incoming GRE packets extracted from the ESP ones have connection-state=untracked thanks to your raw rule.
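If your filter rules (not posted) drop invalid packets in chain input, an explicit accept for the decrypted GRE placed above that drop may be worth trying. This is an assumption about the part of the firewall that was not shown, not a confirmed fix.

```
# Accept GRE that arrived inside IPsec; move this above any "drop invalid" rule in chain=input
/ip firewall filter add action=accept chain=input in-interface-list=WANs ipsec-policy=in,ipsec protocol=gre src-address-list=hosts_ipsec_remote dst-address-list=hosts_ipsec_local
```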

I cannot clearly identify any of the above points as the definite cause of your trouble, especially as you say that only one of the two tunnels is affected, but the behavior of connection tracking is the typical cause of this kind of issue when transitional effects (like temporary loss of WAN address) are involved.

This is another reason why posting only the “relevant” part of the configuration is usually insufficient: the issue is typically caused by settings one doesn’t expect to be related.

So the first step should be to post the complete exports (anonymized, of course), and if the suggested changes do not help, packet sniffing while the issue exists would be the next one.
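For reference, both steps could look roughly like this; the file name is just a placeholder.

```
# RouterOS 7 leaves sensitive values out of exports by default;
# still review the file and anonymize public addresses and names before posting
/export file=office-export

# While one tunnel is down, watch the GRE packets the router sees
/tool sniffer quick ip-protocol=gre
```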


OK, I did not notice that his endpoint addresses are different. Indeed they are, so that should be sufficient to separate the tunnels, at least when they are not behind some NAT router that translates them to the same address.

Thanks for the help. All comments were relevant, so I applied everything in production right away. At the moment everything works as it should.

By the way, I looked at the logs and saw that the CHR detects the change of provider on the clients perfectly well and forcibly creates new connections.