"It's not what you don't know that kills you, it's what you know for sure that ain't true." - Mark Twain
I was sure about what I wrote below (now in gray), especially because I've run into the same situation the OP describes: everything ran smoothly when the tunnels were added one by one, but after a reboot there were always issues, and adding dedicated routes towards the remote GRE peers with pref-src was the solution. (That rules out the influence of src-nat rules as suggested by @Sob below, because such hypothetical NAT rules would have overridden the pref-src settings.) Worse than that, I've even come across a manual page which gave this explanation (which I'm obviously unable to google up right now).
However, a quick test on a lab device has now shown me that, at least on 6.42.6, the local-address in both /interface gre and /ip ipsec peer is used to set the source address of the outgoing transport packets. A slightly slower test has shown that after a reboot, at least from the moment the /tool sniffer can be started, the packets also leave with the local-address as their source address.
So I'm at my wit's end, because it is a fact that what I suggest below was necessary to resolve that same problem on another pair of systems, one running 6.42.4 and the other 6.42.6. Besides 6.42.4 running on one of the ends, there is one more difference in the production case, though: the GRE tunnels run inside IPsec tunnels, so after the reboot it takes some time until the SAs come up and the policies start stealing the packets from their normal routes. But that should not change anything about the packets' source addresses. For completeness, the src-address parameters of the policies are /32 addresses, i.e. they only match the local-address parameters of the corresponding /interface gre entries.
My shot in the dark is that you have set local-address in /interface gre and assume that this address is automatically used as the source address of outgoing GRE transport packets. Unfortunately that is not true; the local-address is only used to identify the tunnel to which received GRE transport packets belong.
The source address of GRE transport packets is chosen according to the output interface selected in the initial routing step, using the routing table. If routing is adjusted later, by policy routing and the like, so that the actual output interface differs from the initially chosen one, the source address of the packet doesn't change.
So a firewall somewhere on the way may fail to match packets in the public -> private direction to an existing connection, because they come from a different address than their counterparts in the private -> public direction; it drops them and the tunnel doesn't come up.
To override the default choice of source address, use the pref-src parameter of a route. So if you need a pair of GRE tunnels between two devices, with each tunnel using a different WAN interface, configure it as follows:
Site 1:
/interface gre
add name=gre-1.1-2.1 local-address=site.1.wan.1 remote-address=site.2.wan.1
add name=gre-1.2-2.2 local-address=site.1.wan.2 remote-address=site.2.wan.2
/ip route
add dst-address=site.2.wan.1/32 gateway=site.1.gw.1 pref-src=site.1.wan.1
add dst-address=site.2.wan.2/32 gateway=site.1.gw.2 pref-src=site.1.wan.2
Site 2:
/interface gre
add name=gre-2.1-1.1 local-address=site.2.wan.1 remote-address=site.1.wan.1
add name=gre-2.2-1.2 local-address=site.2.wan.2 remote-address=site.1.wan.2
/ip route
add dst-address=site.1.wan.1/32 gateway=site.2.gw.1 pref-src=site.2.wan.1
add dst-address=site.1.wan.2/32 gateway=site.2.gw.2 pref-src=site.2.wan.2
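To illustrate why a dedicated /32 route with pref-src would win over the default route, here is a toy model in Python of the source selection as I described it above. It is only a sketch of that description, not of what RouterOS really does internally (my own lab test makes that doubtful), and the Route class, its fields and pick_source are names I made up for the illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    dst: str                        # destination prefix, e.g. "0.0.0.0/0"
    iface_addr: str                 # address on the outgoing interface (hypothetical field)
    pref_src: Optional[str] = None  # models the pref-src parameter of a route


def pick_source(routes: List[Route], destination: str) -> str:
    """Return the source address for a locally originated packet.

    Assumption: the most specific matching route wins, and its pref_src,
    if set, overrides the address of the outgoing interface.
    """
    dest = ipaddress.ip_address(destination)
    matching = [r for r in routes if dest in ipaddress.ip_network(r.dst)]
    if not matching:
        raise LookupError(f"no route to {destination}")
    best = max(matching, key=lambda r: ipaddress.ip_network(r.dst).prefixlen)
    return best.pref_src or best.iface_addr


# Example addresses are from the documentation ranges, not real peers.
routes = [
    Route(dst="0.0.0.0/0", iface_addr="198.51.100.2"),
    Route(dst="203.0.113.9/32", iface_addr="198.51.100.2", pref_src="192.0.2.2"),
]
print(pick_source(routes, "203.0.113.9"))   # dedicated /32 route applies
print(pick_source(routes, "203.0.113.10"))  # falls back to the default route
```

In this model the remote GRE peer gets the pref-src of its own /32 route, while any other destination keeps the address of the default route's interface.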
Instead of writing novels, post /export hide-sensitive. Use find&replace in your favourite text editor to systematically replace all occurrences of each public IP address potentially identifying you by a distinctive pattern such as my.public.ip.1.
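If you prefer not to do the find&replace by hand, it can be scripted. Below is a minimal Python sketch, assuming the export only contains IPv4 addresses worth hiding; the function name anonymize_export and the default placeholder prefix are my own choices, and private, loopback and other non-global addresses are left untouched on purpose.

```python
import ipaddress
import re

# Candidate dotted quads; validity is checked with the ipaddress module below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def anonymize_export(text, prefix="my.public.ip."):
    """Replace each distinct public IPv4 address with a stable placeholder.

    Returns the rewritten text and the address -> placeholder mapping,
    so the same address always gets the same placeholder.
    """
    mapping = {}

    def substitute(match):
        candidate = match.group(0)
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            return candidate  # not a valid address, leave as is
        if not addr.is_global:
            return candidate  # keep private, loopback, link-local etc.
        if candidate not in mapping:
            mapping[candidate] = f"{prefix}{len(mapping) + 1}"
        return mapping[candidate]

    return IPV4_RE.sub(substitute, text), mapping


export = (
    "add address=8.8.8.8/32\n"
    "add gateway=192.168.88.1 pref-src=8.8.8.8\n"
    "add dst-address=1.1.1.1\n"
)
cleaned, mapping = anonymize_export(export)
print(cleaned)
```

Keep the returned mapping to yourself; it is what lets you translate the answers back to your real addresses.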