Routing between two cloud providers through VPN

OK, I have a very good subject to discuss. I will start from the beginning.

We have two providers, DigitalOcean and UpCloud, and the idea was to connect their networks. UpCloud is much easier to manage when it comes to routing, because all regions (Frankfurt, Amsterdam, London and so on) can 'talk' to each other through a private VLAN, which is awesome. Unfortunately DigitalOcean doesn't offer that: only servers within the same region can communicate with each other. So I decided to deploy two servers and turn them into cloud routers using CHR (Cloud Hosted Router). To keep latency low, I put the two Cloud Hosted Routers in the same region at both providers: I chose the Amsterdam region in DigitalOcean and in UpCloud. The config I used is taken from: https://firstdigest.com/2014/12/mikrotik-ipsec-vpn/

And so we have
Amsterdam DigitalOcean network

10.129.0.0/16

Amsterdam UpCloud network

10.5.0.0/22

The two network ranges do not overlap.
In this post I will hide the public addresses, masking them with x characters.
The DigitalOcean MikroTik is called domik for short, IP: 10.129.9.234
The UpCloud MikroTik is called upmik for short, IP: 10.5.0.120


Configuration of IPsec
IP > Firewall > Filter
domik

/ip firewall filter print
Flags: X - disabled, I - invalid, D - dynamic
0 chain=input action=accept protocol=ipsec-ah
1 chain=input action=accept protocol=ipsec-esp
2 chain=input action=accept protocol=udp port=500
3 chain=input action=accept protocol=udp port=4500

upmik

There is no need to paste it here because it looks exactly the same.

IP > Firewall > NAT
domik

/ip firewall nat pr
Flags: X - disabled, I - invalid, D - dynamic
0 ;;; [DO AMS#2 - UP AMS#1]
chain=srcnat action=accept src-address=10.129.0.0/16 dst-address=10.5.0.0/22

upmik

/ip firewall nat pr
Flags: X - disabled, I - invalid, D - dynamic
0 ;;; [UP AMS#1 - DO AMS#2]
chain=srcnat action=accept src-address=10.5.0.0/22 dst-address=10.129.0.0/16

IP > IPsec > Proposals
domik

1 name="DO-UP" auth-algorithms=sha1 enc-algorithms=aes-256-cbc lifetime=30m pfs-group=none

upmik

This is also the same as above.

IP > IPsec > Policies
domik

1 A src-address=10.129.0.0/16 src-port=any dst-address=10.5.0.0/22 dst-port=any protocol=all action=encrypt level=require ipsec-protocols=esp tunnel=yes sa-src-address=domik_external_IP sa-dst-address=upmik_external_IP proposal=DO-UP
ph2-count=1

upmik

1 A src-address=10.5.0.0/22 src-port=any dst-address=10.129.0.0/16 dst-port=any protocol=all action=encrypt level=require ipsec-protocols=esp tunnel=yes sa-src-address=upmik_external_IP sa-dst-address=domik_external_IP proposal=DO-UP
ph2-count=1

IP > IPsec > Peers
domik

0 address=upmik_external_IP/32 auth-method=pre-shared-key secret="xxxxxx" generate-policy=no policy-template-group=default exchange-mode=main send-initial-contact=yes nat-traversal=yes proposal-check=obey hash-algorithm=sha1
enc-algorithm=aes-128,3des dh-group=modp1024 lifetime=1d dpd-interval=2m dpd-maximum-failures=5

upmik

0 address=domik_external_IP/32 auth-method=pre-shared-key secret="xxxxxx" generate-policy=no policy-template-group=default exchange-mode=main send-initial-contact=yes nat-traversal=yes proposal-check=obey hash-algorithm=sha1
enc-algorithm=aes-128,3des dh-group=modp1024 lifetime=1d dpd-interval=2m dpd-maximum-failures=5

I can see that the connection is established.
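For reference, a minimal sketch of how the tunnel state can be checked from the RouterOS terminal on either router. Both are standard print commands, but what they show depends on the RouterOS version, so the comments describe what to expect rather than guaranteed output.

```
# List the security associations currently installed by IPsec.
# With a working tunnel there should be one SA per direction
# between domik_external_IP and upmik_external_IP.
/ip ipsec installed-sa print

# Show the policies; the "A" flag marks a policy as active.
/ip ipsec policy print
```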
But I can't ping from the domik router to the upmik router, or the other way around.
domik

ping 10.5.0.120
SEQ HOST SIZE TTL TIME STATUS
0 10.5.0.120 timeout
1 10.5.0.120 timeout
2 10.5.0.120 timeout
3 10.5.0.120 timeout
sent=4 received=0 packet-loss=100%

and the same on upmik

ping 10.129.9.234
SEQ HOST SIZE TTL TIME STATUS
0 10.129.9.234 timeout
1 10.129.9.234 timeout
2 10.129.9.234 timeout
3 10.129.9.234 timeout
sent=4 received=0 packet-loss=100%

The routing tables are shown below.
domik

0 A S 0.0.0.0/0 domik_gateway 1
1 S 10.5.0.0/22 10.5.0.120 1
2 ADC 10.129.0.0/16 10.129.9.234 ether2 0
3 ADC domik_network/24 domik_IP ether1 0

upmik

0 A S 0.0.0.0/0 upmik_gateway 1
1 ADC 10.5.0.0/22 10.5.0.120 ether2 0
2 S 10.129.0.0/16 10.129.9.234 1
3 ADC upmik_network/22 upmik_IP ether1 0

In the first scenario I tried to ping:
servers [ 10.5.0.0/22 ] → DigitalOcean router [ 10.129.9.234 ]
From an UpCloud server I can ping the domik router:

ping 10.129.9.234
PING 10.129.9.234 (10.129.9.234) 56(84) bytes of data.
64 bytes from 10.129.9.234: icmp_seq=1 ttl=63 time=2.50 ms
64 bytes from 10.129.9.234: icmp_seq=2 ttl=63 time=1.26 ms
64 bytes from 10.129.9.234: icmp_seq=3 ttl=63 time=1.27 ms
64 bytes from 10.129.9.234: icmp_seq=4 ttl=63 time=1.29 ms
--- 10.129.9.234 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3002ms
rtt min/avg/max/mdev = 1.265/1.584/2.500/0.530 ms

The route list on that UpCloud server looks like this:

Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 xx.xx.xx.xx 0.0.0.0 UG 0 0 0 eth0
10.0.0.0 10.5.0.1 255.0.0.0 UG 0 0 0 eth1
10.5.0.0 0.0.0.0 255.255.252.0 U 0 0 0 eth1
10.129.0.0 10.5.0.120 255.255.0.0 UG 0 0 0 eth1
xx.xx.xx.xx 0.0.0.0 255.255.252.0 U 0 0 0 eth0

But I have a problem when I try to ping from the other side (DigitalOcean server → upmik router):

ping 10.5.0.120
PING 10.5.0.120 (10.5.0.120) 56(84) bytes of data.
--- 10.5.0.120 ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5039ms

The route list on that DigitalOcean server looks like this:

Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 xx.xx.xx.xx 0.0.0.0 UG 0 0 0 eth0
10.5.0.0 10.129.9.234 255.255.252.0 UG 0 0 0 eth1
10.14.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
10.129.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth1
xx.xx.xx.xx 0.0.0.0 255.255.255.0 U 0 0 0 eth0

Of course, a ping from that affected server to domik (10.129.9.234) works:

ping 10.129.9.234
PING 10.129.9.234 (10.129.9.234) 56(84) bytes of data.
64 bytes from 10.129.9.234: icmp_seq=1 ttl=64 time=1.08 ms
64 bytes from 10.129.9.234: icmp_seq=2 ttl=64 time=0.420 ms
64 bytes from 10.129.9.234: icmp_seq=3 ttl=64 time=0.377 ms
64 bytes from 10.129.9.234: icmp_seq=4 ttl=64 time=0.393 ms
--- 10.129.9.234 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.377/0.569/1.086/0.298 ms

A traceroute from the UpCloud server to the DigitalOcean router looks like this:

traceroute 10.129.9.234
traceroute to 10.129.9.234 (10.129.9.234), 30 hops max, 60 byte packets
1 10.5.0.120 (10.5.0.120) 1.908 ms 1.907 ms 1.923 ms
2 10.129.9.234 (10.129.9.234) 5.112 ms 5.151 ms 5.166 ms

But the same traceroute from the DigitalOcean server to the UpCloud router looks like this:

traceroute to 10.5.0.120 (10.5.0.120), 30 hops max, 60 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * * *
6 * * *
7 * * *
8 * * *
9 * * *
10 * * *

30 * * *

Which is strange, because the first hop should be the DigitalOcean router, 10.129.9.234.

I may have missed something. I would appreciate any help with this.

You may need to specify the source address for your ping. It has to be forced to the 10.x address.
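As a sketch of what that means: when the router itself originates the ping, it may pick its public address as the source, and such a packet does not match the IPsec policy between 10.129.0.0/16 and 10.5.0.0/22. Assuming the addresses from this thread, the source can be forced with the src-address parameter of ping:

```
# On domik: ping upmik's private address from domik's private address
/ping 10.5.0.120 src-address=10.129.9.234 count=4

# On upmik: the same test in the opposite direction
/ping 10.129.9.234 src-address=10.5.0.120 count=4
```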
However, I recommend that you do not use direct IPsec policies, and instead configure a GRE/IPsec tunnel with routes on each side.
That way you can avoid such painful debugging sessions: it always works the first try.
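A minimal sketch of such a GRE/IPsec setup on domik, not a tested configuration. The interface name gre-upmik and the 172.16.0.0/30 tunnel addresses are made up for illustration; domik_external_IP and upmik_external_IP are the masked public addresses from the first post. It assumes a RouterOS version where the GRE interface supports the ipsec-secret option, and that the existing static route, manual IPsec peer and policy towards upmik are removed first so they do not conflict.

```
# GRE tunnel between the public addresses; with ipsec-secret set,
# RouterOS creates the IPsec peer and policy for the tunnel dynamically.
# Fast path has to be disabled when ipsec-secret is used.
/interface gre add name=gre-upmik local-address=domik_external_IP \
    remote-address=upmik_external_IP ipsec-secret="xxxxxx" allow-fast-path=no

# Point-to-point addressing on the tunnel (hypothetical /30)
/ip address add address=172.16.0.1/30 interface=gre-upmik

# Reach the UpCloud network through the far end of the tunnel
/ip route add dst-address=10.5.0.0/22 gateway=172.16.0.2
```

upmik would mirror this with the local and remote addresses swapped, 172.16.0.2/30 on its tunnel interface, and a route to 10.129.0.0/16 via 172.16.0.1.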

What kind of source address? I see that both sides have the same routes defined for each other, and one side works while the other (DigitalOcean) doesn't.

I don't mind using GRE/IPsec, but is there any tutorial for that?

As I thought, it could be a problem with the DigitalOcean infrastructure, because IPsec is definitely configured correctly.

That said, looking at your threads and this ticket, I see something very troubling. I might be wrong, but it looks like you’re trying to bridge our internal network in AMS2 to your network on upcloud, and I will tell you right now, that’s not going to work. Our “private” networks in each datacenter are not VPCs at this time, they’re open internal networks where any user in the datacenter can ping any other user with eth1 enabled in that datacenter. For this reason, you can’t set up your own virtual router to send traffic to another provider. Our platform would block that sort of activity, which is the issue you’re running into.

So the upshot is that I don't recommend DigitalOcean, whose private network is useless for this kind of setup.