IKEv2 IPsec: random peer connections after reboots

Hi,
This is my network:
net.png
Until a few days ago everything was fine, but since a reboot of the responder not all initiators establish the VPN. After each reboot a different set of initiators connects, sometimes 4, sometimes 6, and the rest remain stuck with no phase 2. I can see these messages in the initiators' log:

may/04 00:14:50 ipsec ike2 starting for: xxx.xxx.xxx.xxx 
may/04 00:14:53 ipsec adding notify: IKEV2_FRAGMENTATION_SUPPORTED 
may/04 00:14:53 ipsec adding notify: NAT_DETECTION_DESTINATION_IP 
may/04 00:14:53 ipsec adding notify: NAT_DETECTION_SOURCE_IP 
may/04 00:14:53 ipsec adding payload: NONCE 
may/04 00:14:53 ipsec adding payload: KE 
may/04 00:14:53 ipsec adding payload: SA 
may/04 00:14:53 ipsec <- ike2 request, exchange: SA_INIT:0 xxx.xxx.xxx.xxx[4500] 25deef602c73f383:0000000000000000 
may/04 00:14:53 ipsec -> ike2 reply, exchange: SA_INIT:0 xxx.xxx.xxx.xxx[4500] 25deef602c73f383:0000000000000000 
may/04 00:14:53 ipsec payload seen: NOTIFY 
may/04 00:14:53 ipsec first payload is NOTIFY 
may/04 00:14:53 ipsec processing payloads: NOTIFY 
may/04 00:14:53 ipsec   notify: COOKIE 
may/04 00:14:53 ipsec adding notify: COOKIE 
may/04 00:14:53 ipsec adding notify: IKEV2_FRAGMENTATION_SUPPORTED 
may/04 00:14:53 ipsec adding notify: NAT_DETECTION_DESTINATION_IP 
may/04 00:14:53 ipsec adding notify: NAT_DETECTION_SOURCE_IP 
may/04 00:14:53 ipsec adding payload: NONCE 
may/04 00:14:53 ipsec adding payload: KE 
may/04 00:14:53 ipsec adding payload: SA 
may/04 00:14:54 firewall,info input: in:ether1-WAN out:(unknown 0), src-mac e4:ab:89:a6:6d:d9, proto UDP, xxx.xxx.xxx.xxx:4500->192.168.1.254:4500, prio 1->0, len 88 
may/04 00:14:58 ipsec <- ike2 init retransmit request, exchange: SA_INIT:0 xxx.xxx.xxx.xxx[4500] 25deef602c73f383:0000000000000000 
may/04 00:15:03 ipsec <- ike2 init retransmit request, exchange: SA_INIT:0 xxx.xxx.xxx.xxx[4500] 25deef602c73f383:0000000000000000 
may/04 00:15:08 ipsec <- ike2 init retransmit request, exchange: SA_INIT:0 xxx.xxx.xxx.xxx[4500] 25deef602c73f383:0000000000000000 
may/04 00:15:13 ipsec ike2 init timeout request, exchange: SA_INIT:0 xxx.xxx.xxx.xxx[4500] 25deef602c73f383:0000000000000000

xxx.xxx.xxx.xxx = responder public IP
Here you can see the responder log
responder log.txt (54.2 KB)
This is the responder config:

/ip ipsec peer
add exchange-mode=ike2 name=IKE2-peers passive=yes profile=IKE2-profile send-initial-contact=no

/ip ipsec profile
add dh-group=modp2048 enc-algorithm=aes-256 hash-algorithm=sha256 name=IKE2-profile

/ip ipsec identity
add auth-method=digital-signature certificate=VDC generate-policy=port-strict match-by=certificate mode-config=IKE2-configs peer=IKE2-peers policy-template-group=IKE2-group remote-certificate=Offices

/ip ipsec mode-config
add address-pool=IKE2-pool name=IKE2-configs system-dns=no

/ip ipsec policy group
add name=IKE2-group

/ip ipsec policy
add group=IKE2-group proposal=IKE2-proposal template=yes

/ip ipsec proposal
add auth-algorithms=sha256 enc-algorithms=aes-256-cbc lifetime=1d name=IKE2-proposal pfs-group=modp2048

The configuration on the initiators is fine: before the reboot all of them connected without problems, and I have checked them again anyway. I have also restored a backup on the responder, taken before the problem occurred, with no luck.

I don’t know what else to look at or do

Please help

I don’t know why, but after doing a thousand tests it suddenly started working normally again with the initial configuration. Now the problem is back again :(


In Torch I can see the connection attempts of the initiators that fail to establish the connection.

By modifying the IPsec log configuration I can see the following information:

11:08:15 ipsec ipsec_: → ike2 request, exchange: SA_INIT:0 XXX.XXX.XXX.XXX[4500] 8b3d60c068e5dfc7:0000000000000000
11:08:15 ipsec ipsec_: ike2 respond
11:08:15 ipsec ipsec_: payload seen: NOTIFY
11:08:15 ipsec ipsec_: payload seen: NOTIFY
11:08:15 ipsec ipsec_: payload seen: NOTIFY
11:08:15 ipsec ipsec_: payload seen: NOTIFY
11:08:15 ipsec ipsec_: payload seen: NONCE
11:08:15 ipsec ipsec_: payload seen: KE
11:08:15 ipsec ipsec_: payload seen: SA
11:08:15 ipsec ipsec_: IKE_SA_INIT limit reached, dropping packet
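
The post does not say which logging rule was changed to get this output. As a rough sketch only, a commonly used way to get more detailed IPsec messages on RouterOS is a rule like the one below; the topics and the action are an assumption, not necessarily what was used here.

```
# Assumption: log IPsec debug messages to memory, excluding raw packet dumps.
/system logging
add topics=ipsec,!packet action=memory
```

The resulting entries can then be read with /log print on the router where the rule was added.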

Please help

For those who may have a similar problem or are interested, setting the PFS Group to none in the proposal solved the problem.
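
For reference, a minimal sketch of that change on the responder, assuming the proposal is still named IKE2-proposal as in the config posted above. Whether the initiators' proposals need the same change is not stated in this thread, so treat that as something to check.

```
# Disable PFS on the existing Phase 2 proposal (name taken from the config above).
/ip ipsec proposal
set [find name=IKE2-proposal] pfs-group=none
```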

It seems like an unrelated issue to me. A mismatch in PFS settings affects the connection at the first Phase 2 rekey, so typically about half an hour after the communication is initially established. If it worked for days before the reboot of the responder, it cannot be the reason.

Most issues like the one you have described are related to NAT: one connection has been established in one direction, and then another one arrives in the opposite direction and hits a dst-nat rule. The reply-src-address assigned by the dst-nat operation would then be the src-address of the other connection, and the reply-src-address of that other connection is the same as the src-address of the new one, so the two clash.

So I would assume it is exactly this, if not for the fact that the device that rebooted was the responder, which should just sit and wait until something comes from the initiator.

If you had said it was the other router at the responder side that rebooted, I would have no doubt that the above is the explanation: the responder would keep sending keepalive packets, and if one of them hit that other router first after it came back up, a packet from the initiator arriving later could fall into the trap described above.

But I am nevertheless concerned by the DMZ label next to the initiators. For me, DMZ is an alias for a 1:1, protocol- and port-agnostic dst-nat, whereas connections initiated in the opposite direction may allocate ports on the public address even if they do not come from the internal destination of the DMZ, so conflicts like the one described above are possible. And there is no DMZ label at the responder side, which is even more surprising: does the other router not do any NAT?
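
If any of the NAT devices in the path is a RouterOS box, one way to look for the kind of clash described above is to inspect its connection tracking table for UDP 4500 entries. This is only a sketch: it assumes connection tracking is enabled on that device and that filtering on the port suffix of dst-address suits your setup.

```
# List tracked UDP connections towards port 4500 (IPsec NAT-T), with reply addresses.
/ip firewall connection
print detail where protocol=udp dst-address~":4500"

# Optionally remove the stale entries so the next packet from the initiator
# creates a fresh connection (use with care on a production router).
remove [find protocol=udp dst-address~":4500"]
```

Comparing the src-address and reply-src-address of the listed entries should show whether two connections are competing for the same address and port pair.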