I’ve got two RB1100AHx4s connected via an IPsec tunnel; both were previously running 6.48.1.
I was finally on site with one of the routers, so I upgraded it to 7.22.1 (the other will get upgraded next time I’m in the same building as it, and I’ll be moving the inter-site VPN over to WireGuard). The connection was working fine, but while I was on site I also added a backup/OOB WAN link (which is NATted). The route for that connection was set with a distance of 255 (the primary WAN link’s route is set to distance 1).
When the IPsec tunnel re-initialised, it switched its “local address” to the address on that additional WAN link (i.e. the 172.x.x.x address was showing under Installed SAs). The tunnel came up but no data traversed it, presumably because that local address didn’t match the Policy.
I figured updating the “local address” on the Peer would resolve the issue, so I set it to the public IP address of the “correct” WAN link, and the tunnel then failed to come up: the ROS7 side reported “no phase1” and the ROS6 side reported “no phase2”. The only thing I can think of is that the “src address” on the Policy is set to IP/32 whereas the local address on the Peer is JUST the IP, so it’s failing to match the Policy with the Peer?
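To sanity-check that theory in plain IP terms, here is a minimal Python sketch showing that a bare address and the same address written as a /32 describe the same single host. The address 203.0.113.10 is a documentation placeholder, not my real public IP, and the variable names are made up for illustration. This says nothing about how RouterOS compares the two fields internally; it only shows that the notations are equivalent on paper.

```python
import ipaddress

# Placeholder values (assumption): stand-ins for the Peer's local address
# and the Policy's src address, using a documentation-range IP.
peer_local = ipaddress.ip_address("203.0.113.10")
policy_src = ipaddress.ip_network("203.0.113.10/32")

# A /32 network contains exactly one address, so the two match
# if and only if that one address is the peer's local address.
same_host = policy_src.num_addresses == 1 and peer_local in policy_src

print(same_host)  # True
```

So if the /32 really is the cause, it would have to be down to how the two fields are matched by the router rather than the addresses actually differing.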
Disabling the route out of that additional link has resolved the issue. It seems very odd to me that the IPsec stack would pick the route with the highest distance to use as its local address in the first place…
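For reference, this is the behaviour I was expecting, written as a minimal Python sketch of conventional distance-based route selection between two routes to the same destination. It is a simplified model under my own assumptions, not RouterOS’s actual implementation; `Route`, `pick_route`, and the route labels are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    name: str        # hypothetical label, not a RouterOS field
    distance: int    # administrative distance; lower is preferred
    enabled: bool = True


def pick_route(routes: list[Route]) -> Optional[Route]:
    """Return the enabled route with the lowest distance, or None."""
    candidates = [r for r in routes if r.enabled]
    return min(candidates, key=lambda r: r.distance, default=None)


routes = [
    Route("primary-wan", distance=1),
    Route("backup-oob-wan", distance=255),
]

print(pick_route(routes).name)  # primary-wan
```

Under that model the backup link would only ever be chosen if the primary route were disabled or missing, which is why the local address ending up on the distance-255 link surprised me.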
Also, weirdly, although everything is now working, the ROS7 end is STILL showing that 172.x.x.x address under Installed SAs and Active Peers (the ROS6 side is showing what I expect)…