DaviV
just joined
Topic Author
Posts: 10
Joined: Thu Apr 26, 2018 1:33 pm

IPSEC peer FQDN as failover/LB

Tue Oct 18, 2022 6:27 pm

Hello community,

I would like to ask for clarification about the implementation of IPSEC peers with an FQDN.
In our environment we have a few hundred spokes of multiple brands (the majority are Cisco ASAs, Mikrotiks and Fortigates), and we use a CNAME (2 hubs) as the peer FQDN to achieve load balancing and failover.

Currently with Mikrotik we fail to achieve failover, on version 6.48 as well as on 7.x.
We test by killing one of the hubs. On the Mikrotik the peer just times out and phase 1 disappears after about 1-2 minutes, but nothing more happens.
The only solution is to manually disable the peer and enable it again; then it seems to resolve the name again, try both resolved IPs, and reach the peer that is alive.

This doesn't look like the desired functionality, or am I missing something?


Thanks

David
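
The failover behaviour expected in the post above can be sketched in a few lines. This is only an illustration of the expected logic, not how RouterOS or any other vendor implements it; `pick_live_peer`, `resolve`, `probe`, the hostname and the addresses are all hypothetical names chosen for the example.

```python
from typing import Callable, Iterable, Optional


def pick_live_peer(
    resolve: Callable[[str], Iterable[str]],
    probe: Callable[[str], bool],
    fqdn: str,
    failed: Optional[str] = None,
) -> Optional[str]:
    """Re-resolve fqdn and return the first address that answers the probe.

    The address that just timed out (failed) is tried last rather than
    skipped, in case it is the only one left in DNS.
    """
    addresses = list(resolve(fqdn))
    # Stable sort: addresses equal to `failed` (key True) move to the end.
    ordered = sorted(addresses, key=lambda address: address == failed)
    for address in ordered:
        if probe(address):
            return address
    return None


# Hypothetical data: one name resolving to two hubs, the first hub is dead.
hubs = {"hub.example.net": ["192.0.2.1", "192.0.2.2"]}
alive = {"192.0.2.2"}
print(pick_live_peer(hubs.__getitem__, alive.__contains__,
                     "hub.example.net", failed="192.0.2.1"))  # prints 192.0.2.2
```

The resolver and the liveness probe are passed in as functions so that the selection logic can be checked without touching the network.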
 
Sob
Forum Guru
Posts: 9119
Joined: Mon Apr 20, 2009 9:11 pm

Re: IPSEC peer FQDN as failover/LB

Tue Oct 18, 2022 9:47 pm

I'm not sure if I correctly understand what exactly you do, but I can confirm that something with peers using hostnames is weird and doesn't work as expected (tested with 6.49.7). Even if you do something slightly different, it's probably the same problem.

As a test, I have two servers (passive, for peers/clients that can be behind NAT) with A records, and hostname used by client is CNAME pointing to one of them:
server1.test.tld.  3600  A      1.2.3.4
server2.test.tld.  3600  A      4.3.2.1
server.test.tld.     10  CNAME  server1.test.tld.
When client first connects to server.test.tld, it gets 1.2.3.4. When this server goes down, I change CNAME to point to second one and expect that when client notices failed connection (using DPD), it will resolve server.test.tld again and get 4.3.2.1. It sort of works, but not as expected. I have CNAME with short TTL, so when client tries to reconnect, original answer already expired. But it seems that client doesn't care about that, instead it takes TTL from A record and assumes that this address of server is good for that long. But it isn't. I think it should use TTL from CNAME record.

It's better when I skip CNAME and just use A directly:
server.test.tld.     10  A      1.2.3.4
But even then it's not perfect, there's still some caching or something going on, because reconnection attempts still use the old address a few times (I saw 1-3), even when the new address is already in the DNS cache.
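
For the CNAME case described above, here is a minimal sketch of the lifetime a client could assign to a resolved address: the answer is only trustworthy while every record in the chain is still valid, so the shortest TTL along the chain wins. The zone data is copied from the records in this post; `resolve_with_ttl` is a hypothetical helper and says nothing about how RouterOS resolves names internally.

```python
from typing import Dict, Tuple

# owner name -> (ttl_seconds, record_type, value), copied from the post above
ZONE: Dict[str, Tuple[int, str, str]] = {
    "server1.test.tld.": (3600, "A", "1.2.3.4"),
    "server2.test.tld.": (3600, "A", "4.3.2.1"),
    "server.test.tld.": (10, "CNAME", "server1.test.tld."),
}


def resolve_with_ttl(name: str, zone: Dict[str, Tuple[int, str, str]]) -> Tuple[str, int]:
    """Follow CNAMEs to an A record; return (address, smallest TTL in the chain)."""
    effective_ttl = None
    seen = set()
    while name not in seen:
        seen.add(name)
        ttl, rtype, value = zone[name]
        effective_ttl = ttl if effective_ttl is None else min(effective_ttl, ttl)
        if rtype == "A":
            return value, effective_ttl
        name = value  # CNAME: continue with the target name
    raise ValueError("CNAME loop detected")


print(resolve_with_ttl("server.test.tld.", ZONE))   # prints ('1.2.3.4', 10)
print(resolve_with_ttl("server1.test.tld.", ZONE))  # prints ('1.2.3.4', 3600)
```

With this rule the client would look up server.test.tld again after 10 seconds, not after 3600.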
 
DaviV
just joined
Topic Author
Posts: 10
Joined: Thu Apr 26, 2018 1:33 pm

Re: IPSEC peer FQDN as failover/LB

Wed Oct 19, 2022 7:03 am

Yes, this is exactly the issue I am having. It seems to me that the resolve is done only the first time, when the peer is connecting. Then it sticks to that address no matter what is going on (with or without DPD). The only solution is to disable the peer and enable it again to actually connect to the second available peer. The TTL is set to 5 minutes.


David
 
Sob
Forum Guru
Posts: 9119
Joined: Mon Apr 20, 2009 9:11 pm

Re: IPSEC peer FQDN as failover/LB

Wed Oct 19, 2022 1:38 pm

I don't know what exactly is different, but it's not that bad here. With CNAME it does stick to TTL from wrong record and bare A is doing something slightly weird too, but it does re-resolve hostname and switch to new address eventually.
 
anav
Forum Guru
Posts: 18958
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: IPSEC peer FQDN as failover/LB

Wed Oct 19, 2022 4:39 pm

Are you saying that other vendors' equipment functions as expected but MT's version of the functionality is (and this is my least favourite term for factually describing issues) WONKY???
(PS: it is no coincidence that wonky rhymes with donkey, which some would call an ass, and Sob is in this thread - feel the luv)
 
DaviV
just joined
Topic Author
Posts: 10
Joined: Thu Apr 26, 2018 1:33 pm

Re: IPSEC peer FQDN as failover/LB

Thu Oct 20, 2022 8:06 am

Well, it's not that straightforward, but they "do".

Cisco ASA doesn't support FQDN peers at all, but it has supported multiple peers since 9.14, and it moves to the other peer within a few minutes.
Fortigate supports DDNS as a peer and swaps peers within a few seconds to a few minutes.
The Tik stays stuck on the dead peer for eternity. (I gave up after 20 minutes.)

I just want to avoid using scripting for this as it should be supported.

David
 
Sob
Forum Guru
Posts: 9119
Joined: Mon Apr 20, 2009 9:11 pm

Re: IPSEC peer FQDN as failover/LB

Thu Oct 20, 2022 3:24 pm

RouterOS also allows specifying multiple peers for a policy. I didn't try it before, but I assume it's meant as failover. When I test it now, it doesn't seem to work at all. Initially there's an active phase 1 for both peers, phase 2 for one of them, and the tunnel works. When that peer dies, I'd expect phase 2 to go to the second peer. But nope, it just keeps trying to re-establish phase 1 with the first peer and ignores the second one, even though it's there and ready.
 
DaviV
just joined
Topic Author
Posts: 10
Joined: Thu Apr 26, 2018 1:33 pm

Re: IPSEC peer FQDN as failover/LB

Tue Oct 25, 2022 4:32 pm

Hi, I have just tested that. It looked nice until the moment when both peers tried to be active: each had 1 ph2 out of 4, and it very quickly became a disaster.

David
 
DaviV
just joined
Topic Author
Posts: 10
Joined: Thu Apr 26, 2018 1:33 pm

Re: IPSEC peer FQDN as failover/LB

Tue Nov 08, 2022 8:19 am

Hello there,
I wonder if opening a ticket makes sense.

At this point I am 100% sure that when a peer is specified as an FQDN and it dies, nothing happens; the IP lookup happens only when the peer is disabled and enabled again.

David
 
sindy
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: IPSEC peer FQDN as failover/LB

Tue Nov 08, 2022 9:48 am

Definitely do open a ticket, with a supout.rif taken after the DNS change (when the peer keeps hammering the old address) and with a reference to this forum topic.
 
DaviV
just joined
Topic Author
Posts: 10
Joined: Thu Apr 26, 2018 1:33 pm

Re: IPSEC peer FQDN as failover/LB

Fri Nov 25, 2022 2:32 pm

Hello,

Just a small update. Support confirmed today that RouterOS ignores the CNAME TTL and uses the A record TTL instead.

In the meantime I have written a script which checks the active peer count and the established PH2 count. If there is no active peer, or the number of established PH2s is lower than 2 (we use 4), it disables and re-enables the peer.
/system scheduler
    add interval=20s name=VPN-check on-event=":local PH2 [/ip ipsec policy print c\
    ount-only where ph2-state=established ]\r\
    \n#\r\
    \n:local PC [/ip ipsec active-peers print count-only ]\r\
    \n#\r\
    \n:if (\$PC<1||\$PH2<2)  do={\r\
    \n  /ip ipsec peer disable [find name=YOUR_PEER_NAME]\r\
    \n  :log info \"IPSEC-PEER is dead disabling-enabling peer!\"\r\
    \n  :delay 3\r\
    \n  /ip ipsec peer enable [find name=YOUR_PEER_NAME]\r\
    \n} else={\r\
    \n}" policy=read,write,policy,test start-date=nov/20/2022 start-time=\
    23:15:00
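
The decision rule in the script above is small enough to check on its own. The following is a sketch in Python rather than RouterOS script, only to make the threshold logic easy to test; `needs_bounce` and `min_ph2` are hypothetical names, and the default of 2 mirrors the value used in the scheduler entry.

```python
def needs_bounce(active_peers: int, established_ph2: int, min_ph2: int = 2) -> bool:
    """Return True when the peer should be disabled and re-enabled.

    Mirrors the condition ($PC<1 || $PH2<2) from the scheduler script.
    """
    return active_peers < 1 or established_ph2 < min_ph2


# (active peers, established ph2) pairs for a site that normally has 4 ph2 tunnels
for peers, ph2 in [(1, 4), (1, 2), (1, 1), (0, 0)]:
    print(peers, ph2, needs_bounce(peers, ph2))
```

Note that a site which legitimately runs fewer than 2 phase 2 tunnels would need a lower `min_ph2`, otherwise the peer would be bounced every 20 seconds.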
