# Site 1 router
/interface gre
add name=gre-1.1-2.1 local-address=site.1.wan.1 remote-address=site.2.wan.1
add name=gre-1.2-2.2 local-address=site.1.wan.2 remote-address=site.2.wan.2
/ip route
add dst-address=site.2.wan.1/32 gateway=site.1.gw.1 pref-src=site.1.wan.1
add dst-address=site.2.wan.2/32 gateway=site.1.gw.2 pref-src=site.1.wan.2
# Site 2 router
/interface gre
add name=gre-2.1-1.1 local-address=site.2.wan.1 remote-address=site.1.wan.1
add name=gre-2.2-1.2 local-address=site.2.wan.2 remote-address=site.1.wan.2
/ip route
add dst-address=site.1.wan.1/32 gateway=site.2.gw.1 pref-src=site.2.wan.1
add dst-address=site.1.wan.2/32 gateway=site.2.gw.2 pref-src=site.2.wan.2
That's not what I see. Are you sure that it's not just your default masquerade/srcnat changing the address? My GRE tunnel uses whatever I put in local-address as source.
My shot in the dark is that you have set local-address in /interface gre and assume that this address is automatically used as the source address for outgoing GRE transport packets, which is unfortunately not true, as the local-address is only used to identify the tunnel to which received GRE transport packets belong.
/ip route rule
add src-address=<wan1address> table=<wan1table>
add src-address=<wan2address> table=<wan2table>
Great to hear that, but now there is a bigger problem: how is it possible that it helps, given that the source address of the GRE transport packets is determined by the local-address parameter?
You are right, this is the problem. Now both tunnels are working.
The problem is that while Damián has not told us anything about his routing configuration before applying my voodoo (yet), in my production case the individual routes were there from the very start, except that they didn't have pref-src set. And having them there was not sufficient to make the second GRE tunnel come up after reboot.
It is working, because it solves the routing problem.
/ip route rule
add src-address=<wan1address> table=<wan1table>
add src-address=<wan2address> table=<wan2table>
/ip route
add dst-address=site.2.wan.1/32 gateway=site.1.gw.1 pref-src=site.1.wan.1
add dst-address=site.2.wan.2/32 gateway=site.1.gw.2 pref-src=site.1.wan.2
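The route rules above only select a routing table by source address; for them to have any effect, each table also needs a route of its own. A minimal sketch of what that could look like, assuming RouterOS v6 syntax (where a route is placed into a table with routing-mark) and reusing the thread's placeholders. The names <wan1gateway> and <wan2gateway> are hypothetical and do not appear in the original posts:

```routeros
/ip route
# One default route per WAN table; the gateway placeholders are assumptions.
add dst-address=0.0.0.0/0 gateway=<wan1gateway> routing-mark=<wan1table>
add dst-address=0.0.0.0/0 gateway=<wan2gateway> routing-mark=<wan2table>
```

With something like this in place, a GRE transport packet sourced from <wan1address> should be looked up in <wan1table> and leave through WAN1, regardless of which default route is active in the main table. On RouterOS v7 the tables are declared separately and the route parameter is named differently, so treat this as an illustration rather than a drop-in configuration.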
+---------+                                    +---------+
|         | WAN1 gre1 -->>------<<-- gre1 WAN1 |         |
|         |      gre2 -->>-\  /-<<-- gre2      |         |
| Router1 |                 \/                 | Router2 |
|         |                 /\                 |         |
|         | WAN2 --<<------/  \------>>-- WAN2 |         |
+---------+                                    +---------+
/ip route
add dst-address=<remote subnet> gateway=<ip on the other end of tunnel 1> check-gateway=ping
add dst-address=<remote subnet> gateway=<ip on the other end of tunnel 2> check-gateway=ping
So today I've come across that manual page again, and it turns out I either remembered it incorrectly or misinterpreted it when reading it for the first time: it's not that the source address would not be set to local-address; it is really only a routing issue, i.e. the WAN is not chosen automatically based on the local-address.
I've even come across a manual page which gave this explanation (which I'm obviously unable to google up right now).
/interface gre
add disabled=yes local-address=S1W2Address name=S1W2-S4W2 remote-address=S4W2Address
/ip address
# (I changed the IP)
add address=1.1.1.1/29 interface="S1W2-S4W2" network=1.1.1.0
/ip route
add check-gateway=ping comment="S1W2-S4W2 goes out through Wan2" distance=1 dst-address=S4W2Address gateway=S1W2Gateway pref-src=S1W2Address
add check-gateway=ping comment="Site 4 network" distance=20 dst-address=10.10.10.0/24 gateway=1.1.1.2
Flags: E - expected, S - seen-reply, A - assured, C - confirmed, D - dying, F - fasttrack, s - srcnat, d - dstnat
0 S C protocol=gre src-address=S1W1Address dst-address=S3W1Address reply-src-address=S3W1Address reply-dst-address=S1W1Address gre-key=0 timeout=9m56s connection-mark="Wan1_conn" orig-packets=877 690 427 orig-bytes=1 001 257 292 449
orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=2 229 962 repl-bytes=185 661 182 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps
1 S C protocol=gre src-address=S1W2Address dst-address=S4W2Address reply-src-address=S4W2Address reply-dst-address=S1W2Address gre-key=0 timeout=9m58s connection-mark="Wan2_conn" orig-packets=649 999 orig-bytes=78 577 614
orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=2 058 repl-bytes=135 970 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=384bps repl-rate=0bps
2 S C protocol=gre src-address=S1W2Address dst-address=S2W2Address reply-src-address=S2W2Address reply-dst-address=S1W2Address gre-key=0 timeout=9m59s connection-mark="Wan2_conn" orig-packets=1 020 222 441 orig-bytes=1 259 894 333 183
orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=2 066 857 repl-bytes=337 386 397 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=4.3Mbps repl-rate=0bps
3 S C protocol=gre src-address=S1W1Address dst-address=S4W1Address reply-src-address=S4W1Address reply-dst-address=S1W1Address gre-key=0 timeout=9m59s connection-mark="Wan1_conn" orig-packets=248 474 961 orig-bytes=109 747 382 886
orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=2 017 335 repl-bytes=805 042 103 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=411.5kbps repl-rate=0bps
4 S C protocol=gre src-address=S1W2Address dst-address=S3W2Address reply-src-address=S3W2Address reply-dst-address=S1W2Address gre-key=0 timeout=9m59s connection-mark="Wan2_conn" orig-packets=206 618 062 orig-bytes=249 947 475 728
orig-fasttrack-packets=0 orig-fasttrack-bytes=0 repl-packets=207 788 repl-bytes=17 175 583 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=5.7Mbps repl-rate=0bps
Flags: E - expected, S - seen-reply, A - assured, C - confirmed, D - dying, F - fasttrack, s - srcnat, d - dstnat
0 C s protocol=gre src-address=S4W2Address dst-address=S1W2Address reply-src-address=S1W2Address reply-dst-address=S4W1Address gre-key=0 timeout=9m52s orig-packets=191 344 orig-bytes=10 770 687 orig-fasttrack-packets=0
orig-fasttrack-bytes=0 repl-packets=0 repl-bytes=0 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps
1 S C protocol=gre src-address=S4W1Address dst-address=S2W1Address reply-src-address=S2W1Address reply-dst-address=S4W1Address gre-key=0 timeout=9m52s orig-packets=148 083 orig-bytes=11 880 582 orig-fasttrack-packets=0
orig-fasttrack-bytes=0 repl-packets=3 137 repl-bytes=276 218 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps
2 C s protocol=gre src-address=S4W2Address dst-address=S2W2Address reply-src-address=S2W2Address reply-dst-address=S4W1Address gre-key=0 timeout=9m57s orig-packets=191 725 orig-bytes=10 770 044 orig-fasttrack-packets=0
orig-fasttrack-bytes=0 repl-packets=0 repl-bytes=0 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=0bps repl-rate=0bps
3 S C protocol=gre src-address=S4W1Address dst-address=S1W1Address reply-src-address=S1W1Address reply-dst-address=S4W1Address gre-key=0 timeout=9m59s orig-packets=87 494 788 orig-bytes=78 139 897 288 orig-fasttrack-packets=0
orig-fasttrack-bytes=0 repl-packets=987 024 repl-bytes=310 273 340 repl-fasttrack-packets=0 repl-fasttrack-bytes=0 orig-rate=2.5Mbps repl-rate=0bps
/ip firewall nat
add action=masquerade chain=srcnat comment="Masquerade General" out-interface-list=WAN
In that case, use
I just tried to avoid creating the blackhole route, just in case someone at Site 4 is using S1W2Address without the VPN (accessing it directly through a dst-nat on Site 1).
The firewall's connection tracker remembers connections other than TCP and ICMP for 3 minutes by default. So after disabling the gre interfaces at both sides, re-enable them only after a 5-minute "cooldown". Alternatively, you can manually remove the connections at both ends using /ip firewall connection remove [find protocol=gre srcnat] and re-enable the gre interfaces immediately.
I disabled the GRE interface on both sides, re-enabled it, and watched the connection: always the same result.
If I restart the tunnel while WAN2 is working, I think it should use the route I added to S1W2. Am I wrong?
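The manual cleanup described above can be sketched as a short sequence to run on each router. It reuses the tunnel name S1W2-S4W2 from the configuration posted earlier and the removal command exactly as given in the thread; the far end would use its own interface name, which is not shown here, so adjust accordingly:

```routeros
# 1. Take the tunnel down so no new GRE transport packets are generated.
/interface gre disable [find name="S1W2-S4W2"]
# 2. Drop the stale src-nated GRE entries instead of waiting for them to expire.
/ip firewall connection remove [find protocol=gre srcnat]
# 3. Bring the tunnel back up; the next packet creates a fresh connection entry.
/interface gre enable [find name="S1W2-S4W2"]
```

Running step 2 on both ends before re-enabling either side matters, because a single keepalive from the other router could otherwise recreate the unwanted entry.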
Thank you sindy, this small paragraph gave me the idea I needed to solve a problem that had bothered me for quite a while, where GRE/IPSEC would re-establish just fine after router reboot, but would not pass traffic properly. As it turned out, precisely for the reason you mentioned, the first traffic from impatient clients hits the GRE tunnel interface before the IPSEC policy is activated, gets routed through the default WAN route (essentially blackholing it, because the tunnel IPs are dummies), and src-nat claims that connection and never releases it (since there's always some activity that keeps it from expiring).
What may actually happen is that when WAN2 on Site 4 goes down, the individual route with dst-address=S1W2Address becomes inactive, so the packets for S1W2Address take the default route via WAN1, and a masquerade rule says "src-nat whatever has out-interface=WAN1 to its IP address". So to prevent this from happening, you need to add a type=blackhole route towards S1W2Address, with a higher value of the distance parameter than the real route, which will make sure that if WAN2 on Site 4 is down, the packets for S1W2Address which should have left through WAN2 won't leave at all.
/ip firewall nat add chain=srcnat action=masquerade dst-address-list=!IPSEC-DUMMY-DST out-interface=bridge-uplink ipsec-policy=out,none
/ip firewall filter add chain=output action=drop dst-address-list=IPSEC-DUMMY-DST out-interface=bridge-uplink ipsec-policy=out,none
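For comparison, the blackhole-route approach suggested earlier in the thread could look roughly like this on the Site 4 router. This is a sketch only: S4W2Gateway is a hypothetical placeholder for the WAN2 gateway at Site 4, the distance values are illustrative, and the type=blackhole form follows the wording used in the thread:

```routeros
/ip route
# Real route to the remote tunnel endpoint via WAN2 (S4W2Gateway is an assumed placeholder).
add dst-address=S1W2Address gateway=S4W2Gateway pref-src=S4W2Address distance=1 check-gateway=ping
# Fallback with a higher distance: if the route above goes inactive, packets for
# S1W2Address are discarded instead of following the default route out of WAN1.
add dst-address=S1W2Address type=blackhole distance=2
```

Because the packets never leave through WAN1, the masquerade rule has nothing to src-nat, and no stale connection entry with the wrong source address is created.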