If you were starting from scratch:
- bare IPsec has the least overhead but differs the most from normal routing
- IPsec-encrypted IPIP tunnels let you use normal routing with dynamic routing protocols, at the cost of additional overhead, albeit a few bytes smaller than with GRE
However, you’re not starting from scratch, and even worse, you’ve got no control over the HQ side.
- Since 6.47.something, RouterOS permits a policy to be associated with two peers; only a single pair of SAs for that policy is established at a time, to one of the peers. So you could set up the following arrangement:
A     B
|\   /|
| \ / |
|  X  |
| / \ |
|/   \|
1-----2
 \   /
  \ /
   3
   |
   |
   C
C is the host in the branch office that needs to talk to hosts A and B in the headquarters’ network
3 is the branch office router with two IPsec peers, 1 and 2
One component of the overall IPsec security concept is that a packet which reverse-matches the traffic selector of an existing policy, even an inactive one, must be dropped unless it arrived via a security association linked to that policy. It is therefore essential that the policies at 1 and 2 are generated from a template rather than configured statically; otherwise, these two routers could not hand over traffic from the remote peer directly to one another, as it would reach the other one the “wrong” way. This is how RouterOS behaves; whether the same holds for the IPsec router used by the Business is up to you to find out.
Another reason for dynamic generation of policies from a template is that both 1 and 2 can then have a static regular route to 3’s subnet via the other one; this regular route is overridden by the dynamically created policy.
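To make the reverse-match rule above more tangible, here is a minimal Python sketch of the drop decision. It is only a model of the behaviour described, not RouterOS code; the `Policy` class, the `accept_inbound` function, the policy name and the subnets 10.0.0.0/24 (headquarters) and 192.168.3.0/24 (branch) are all made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Policy:
    name: str  # hypothetical label, also used to identify the SA linked to the policy
    src: str   # local side of the traffic selector
    dst: str   # remote side of the traffic selector


def accept_inbound(pkt_src: str, pkt_dst: str,
                   arrived_via: Optional[str],
                   policies: List[Policy]) -> bool:
    """Return True if an inbound packet is accepted, False if it is dropped.

    arrived_via is the name of the policy whose SA delivered the packet,
    or None if the packet came in as plain (unencrypted) traffic.
    """
    for p in policies:
        # Reverse match: the packet's source lies in the policy's remote
        # selector and its destination lies in the policy's local selector.
        if (ip_address(pkt_src) in ip_network(p.dst)
                and ip_address(pkt_dst) in ip_network(p.src)):
            return arrived_via == p.name
    # No policy concerns this packet; ordinary routing and firewall apply.
    return True


# Router 2 with a statically configured (currently inactive) policy towards 3.
static_policy = Policy("to-3", src="10.0.0.0/24", dst="192.168.3.0/24")

# C -> A packet decrypted at router 1 and handed over to router 2 in plain text.
dropped_with_static = not accept_inbound("192.168.3.10", "10.0.0.5", None, [static_policy])

# With template-generated policies, router 2 has no policy while the SA lives on 1.
accepted_with_template = accept_inbound("192.168.3.10", "10.0.0.5", None, [])

print(dropped_with_static, accepted_with_template)
```

In this model, the static policy makes router 2 drop the packet that router 1 hands over, whereas an empty policy table (the template case while the SA is on 1) lets it through.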
If 1 and 2 should act as stateful firewalls, you’d need them to run VRRP synchronized with the IPsec failover rather than both being active, because a stateful firewall doesn’t handle asymmetric routing well. For example, if a SYN packet from A to C goes via 1 and 2, but the SYN,ACK response from C to A takes a shortcut from 2 directly to A, the stateful firewall at 1 never sees the SYN,ACK, so it won’t let subsequent packets from A to C through. Hence A must send packets for C directly to 2 whenever the SA is active between 2 and 3.
So if you can agree with the Business to arrange their topology accordingly, the above is one possible way to go. It should also be possible to set up a separate policy for communication between 3 and your management router, which acts as the backup for the Business’s router.
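The asymmetric-routing problem can be illustrated with a deliberately simplified connection tracker in Python. This is a toy model under my own assumptions (the `StatefulFirewall` class and its three-state logic are invented for the example); real connection trackers are more nuanced, but the effect is the same in principle.

```python
class StatefulFirewall:
    """Toy TCP connection tracker: forwards ACKs only for connections
    whose complete SYN / SYN,ACK exchange it has seen itself."""

    def __init__(self):
        # (client, server) -> "syn-sent" or "established"
        self.state = {}

    def forward(self, src, dst, flags):
        key, rev = (src, dst), (dst, src)
        if flags == "SYN":
            self.state[key] = "syn-sent"
            return True
        if flags == "SYN,ACK":
            if self.state.get(rev) == "syn-sent":
                self.state[rev] = "established"
                return True
            return False
        if flags == "ACK":
            return (self.state.get(key) == "established"
                    or self.state.get(rev) == "established")
        return False


# Asymmetric case: router 1 sees the SYN from A to C, but the SYN,ACK
# goes from 2 straight to A and never passes through 1.
fw1 = StatefulFirewall()
fw1.forward("A", "C", "SYN")
asymmetric_ack_passes = fw1.forward("A", "C", "ACK")

# Symmetric case: A sends directly to 2, so 2 sees both directions.
fw2 = StatefulFirewall()
fw2.forward("A", "C", "SYN")
fw2.forward("C", "A", "SYN,ACK")
symmetric_ack_passes = fw2.forward("A", "C", "ACK")

print(asymmetric_ack_passes, symmetric_ack_passes)
```

The firewall that missed the SYN,ACK refuses the follow-up ACK, while the one that saw both directions forwards it, which is why the traffic from A has to enter the router that currently holds the SA.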
- To migrate to a setup based on IPsec-encrypted IPIP tunnels, you’d also have to work in tight cooperation with the administrators of the Business. They would have to change the setup for the branch offices from bare IPsec to IPsec-encrypted tunnels one by one to minimize the outage; for them, such an arrangement is much more complicated, as they need to configure one tunnel per branch office, whereas with bare IPsec the policies may get created dynamically (at least in RouterOS).
- A combination of bare IPsec towards their 1 and IPsec-encrypted IPIP tunnels towards your 2 is the most complex one for you to handle at 3, as you’d need to activate an action=none policy to shadow the (currently inactive) one towards 1 each time the IPsec session to 1 goes down, and I can see no advantage in such a mixed approach.
So all in all, my private opinion is that dynamic routing protocols pay off in mesh-type networks. Since each 3 has only two paths to the rest of the network, and since dynamic generation of IPsec policies substitutes for a dynamic routing protocol in the sense that it “installs a route to each 3” at 1 or 2, I’d stay with bare IPsec in this particular topology.
Things to consider are whether the default failover time of 100 seconds (DPD messages are sent every 10 seconds and 10 of them must go unanswered before the peer is declared down) is sufficient for you or whether you’d change it to something faster, and whether you want to force the security association back to 1 after it recovers, as RouterOS keeps the SA on 2 until 2 itself fails. To force it back to 1, you have to disable and re-enable the peer representing 2, so you need some kind of monitoring script to watch for the recovery of 1 and take action. It is also good practice not to rush back to the primary path if the switchover is not hitless (which is our case: the SA takes some time to establish, so a few packets may get lost). And the IPsec code currently doesn’t provide any way to run a script on a state change, similar to what DHCP or PPP offer, so you’d have to schedule a periodic run of the fallback script.
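As a rough sketch of the logic such a periodically scheduled fallback script could follow, here is a Python model of the two points above: the DPD detection time and a hold-down before moving back to 1. It is not RouterOS script; `dpd_detection_time`, `FallbackDecider` and the 300-second hold-down are hypothetical names and values chosen for illustration.

```python
def dpd_detection_time(interval_s: int, max_failures: int) -> int:
    """Worst-case time before a dead peer is declared down."""
    return interval_s * max_failures


class FallbackDecider:
    """Tell the caller when to force the SA back to the primary peer (1).

    The primary must have been continuously reachable for hold_down_s
    seconds, so that a flapping primary does not cause repeated,
    non-hitless switchovers.
    """

    def __init__(self, hold_down_s: int):
        self.hold_down_s = hold_down_s
        self.up_since = None  # time at which the primary was first seen up again

    def poll(self, now_s: int, primary_reachable: bool, sa_on_backup: bool) -> bool:
        if not primary_reachable:
            self.up_since = None  # any failure restarts the hold-down
            return False
        if self.up_since is None:
            self.up_since = now_s
        return sa_on_backup and (now_s - self.up_since) >= self.hold_down_s


detection = dpd_detection_time(10, 10)  # the values quoted above

decider = FallbackDecider(hold_down_s=300)
decisions = [
    decider.poll(0, True, True),     # primary just came back
    decider.poll(60, True, True),    # still inside the hold-down
    decider.poll(120, False, True),  # primary flapped, timer resets
    decider.poll(180, True, True),   # hold-down starts again
    decider.poll(480, True, True),   # stable for 300 s: move back
]
print(detection, decisions)
```

When `poll` returns true, the real script would disable and re-enable the peer representing 2 as described above; how the reachability of 1 is probed (a ping, for instance) is left open here.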
Also, things become complex responsibility-wise once the 2 under your administration becomes a backup for the 1 under their administration, as in that case you’ve got access to an element which may play a firewall role in their network (policing which parts of the headquarters network C is allowed to reach).