We have a monitoring center with 2 critical IP addresses (one primary and one secondary), each sitting on its own independent CCR 1036 router acting as firewall etc.
What I need help with: we want to make these IP addresses resilient across multiple, totally separate third-party connections. Each IP has a main supplier who currently delivers it to the firewall over an L2TP tunnel. I want that tunnel to move across to the backup connections in the event of a failure, with as little downtime as possible.
We are fine with running scripts if necessary.
For example, our primary firewall has 3 WAN links with internet access, and this IP is running on the L2TP interface. The main connection is 1Gb FTTP, the second connection is 1Gb FTTP and the third connection is Starlink.
What is your advice on the best way to make this work with no downtime?
Not a very clear story.
So you have ISP1, which delivers 1 public IP using L2TP terminated on a CCR.
You have ISP2, which also delivers 1 public IP using L2TP terminated on a CCR.
So is ISP1 providing you with the primary FTTH link and ISP2 with a similar FTTH link?
If the above assumptions are correct, you will not be able to accomplish this easily.
Each ISP has its own IP space and you cannot make public IP “1” reachable across ISP2 or Starlink.
You could go for 2 independent connections with the same ISP1, bring the lines into your building through different routes and then have that same public IP made reachable across 2 different physical paths for resilience.
Of course, if ISP1 has a major f*ckup you might experience issues too.
Did you look at certain SaaS options, maybe some voodoo with CloudFlare or Azure or something, to have some “public IPs” always available?
If you have low-latency paths, you could use public IPs from a hosting provider. Establish a tunnel from ISP1 to the hosting provider. Create another tunnel from ISP2 to the hosting provider. Assign yourself a public IP from the hosting provider through the tunnels; you could use OSPF to route it dynamically over the best path.
I am not sure I properly understand what you want to achieve. From what you wrote I gather that those addresses are assigned to your provider by the authority responsible for address allocation in your area, i.e. you do not have a public prefix and a public AS number assigned directly to you, so the only portion of the path between a device somewhere in the internet and your network that you can influence is the one from your “public address provider” ISP to your location.
If this understanding is correct, the best would be to discuss with your IP address provider what connection redundancy model they can routinely provide, because any solution that does not involve cooperation on their side will be too prone to false positives, as it boils down to establishing the L2TP tunnel with the same user credentials from another endpoint at your end when you detect an outage of the currently active one. From the perspective of the “IP address provider” gear, this will be seen as an attempt to establish a new connection before terminating the old one, and may or may not be successful depending on their implementation (and maybe configuration).
Regarding “no downtime”: in the implementation phase, you’d have to debug the solution with another address first, and then replace the test address with the production one. In the production phase, the failover time may be sub-second if you use BGP or OSPF together with BFD.
Also, have you considered making a DNS name the immortal identifier rather than a numeric IP address?
If you were to switch the method (IP->FQDN) or the IP address provider and thus the addresses (if the current provider offers no redundancy support on their end), how much work would it be to reconfigure the remote gear to use the new identifier?
We have a supplier who is providing us a public IP address over an L2TP tunnel. This is the critical IP address. We can authenticate this tunnel over any of the WAN connections that we have.
Our primary firewall has multiple (3) independent connections. Currently, if the primary FTTP connection fails, our tunnel simply authenticates on the next WAN in order of priority. But this results in too much downtime in the event of a failure.
My question is very simply: can we do this in a more optimal way?
We do not have our own Data Centre infrastructure and therefore we rely on what our ISPs are willing to support us with.
The idea of multiple tunnels and using OSPF to supply this IP over the best path sounds interesting but may complicate our NAT configuration due to more interfaces.
Essentially what we want to achieve is ultra high uptime for this IP address while using as many third party internet wan connections as we like to provide more and more redundancy.
To minimize the total time of the outage, you must minimize both the detection time and the time needed for the actual failover. The faster you conclude that the current connection is down, the sooner you can initiate the actual failover, but the higher the probability that you react too soon because only a few packets actually got lost, so the resulting two failovers (away and back) will cause more packets to get lost than just passively waiting a little longer. On the actual failover side, the L2TP authentication is a multi-packet exchange which also takes some time.
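To illustrate that trade-off, here is a minimal Python sketch (not RouterOS script) of a "declare the link down only after N consecutive failed probe intervals" rule. The function name, the sample sequences and the thresholds are all made up for illustration; one list element stands for one probe interval (e.g. 1 s).

```python
def detection_delay(samples, down_after):
    """Return the number of probe intervals elapsed when the link is
    declared down, or None if it is never declared down.

    samples    -- list of booleans, True = probe interval succeeded
    down_after -- consecutive failed intervals required (hypothetical knob)
    """
    consecutive_failures = 0
    for index, ok in enumerate(samples):
        consecutive_failures = 0 if ok else consecutive_failures + 1
        if consecutive_failures >= down_after:
            return index + 1
    return None


# A short blip: one bad interval, then the link recovers by itself.
blip = [True, False, True, True, True, True]
# A real outage: the link stays down.
outage = [True, False, False, False, False, False]

for threshold in (1, 3):
    print(threshold, detection_delay(blip, threshold), detection_delay(outage, threshold))
```

With a threshold of 1 the blip already triggers a failover (and later a fail-back), while a threshold of 3 ignores the blip but needs two more intervals to react to the real outage.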
So you can continuously ping the L2TP server from each WAN, and if the L2TP server stops responding via the preferred WAN, disable the L2TP client on that WAN and enable it on the other one. It needs scripts spawned by the scheduler or by netwatch; the minimum interval for both the scheduler and netwatch is 1s, so you can make conclusions once per second at the fastest; during that second, you can send multiple ping probes and count the responses. Since you plan on using more than one backup uplink, you would need the on-up and on-down scripts of the netwatch to set and evaluate global variables to decide which L2TP client to enable and which ones to disable, which is prone to timing hazards; if you use a scheduled script instead, you can spawn a separate thread to ping via each WAN within the same run of the script and evaluate the summary results still within that run, so there is always only one thread that makes the decision and thus no room for timing hazards.
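The single-decision-maker idea can be sketched as follows. This is plain Python rather than RouterOS script, purely to show the logic: the probes run in parallel, but only the calling thread looks at the combined results and picks a WAN. The WAN names, the probe functions and the thresholds are hypothetical placeholders; a real probe would send one ping to the L2TP server via the given WAN.

```python
from concurrent.futures import ThreadPoolExecutor


def count_replies(probe, attempts):
    """Call probe() `attempts` times and count how many calls succeeded."""
    return sum(1 for _ in range(attempts) if probe())


def choose_wan(probes, attempts=5, min_replies=3):
    """Pick the first usable WAN.

    probes -- list of (name, probe_fn) in priority order; probe_fn() returns
              True if one probe via that WAN was answered (assumed interface).
    Returns the name of the highest-priority WAN with at least `min_replies`
    answers, or None if no WAN qualifies.
    """
    if not probes:
        return None
    # Probe all WANs in parallel so one cycle takes as long as the slowest WAN.
    with ThreadPoolExecutor(max_workers=len(probes)) as pool:
        futures = [(name, pool.submit(count_replies, fn, attempts))
                   for name, fn in probes]
        results = [(name, future.result()) for name, future in futures]
    # Only this thread evaluates the results, so there is one decision per cycle.
    for name, replies in results:
        if replies >= min_replies:
            return name
    return None


# Stub probes standing in for real pings: the primary FTTP link is down.
example_probes = [
    ("fttp1", lambda: False),
    ("fttp2", lambda: True),
    ("starlink", lambda: True),
]
print(choose_wan(example_probes))
```

The caller would then enable the L2TP client bound to the returned WAN and disable the others, once per cycle.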
The advantage of using multiple L2TP tunnels established permanently and using OSPF to choose between them on the fly is that you can use BFD to detect the outage of a path; BFD works in the sub-second domain but you cannot use it standalone, only in combination with OSPF or BGP. The public address would not be attached to any of those tunnels; instead, each tunnel would use a different private or CGNAT address at your end and the provider would route the traffic for the public one through any of the tunnels as per the OSPF decision. The NAT rules would match on an interface list that would contain all the tunnel interfaces, so it would not matter through which tunnel the initial packet of each connection would leave. Of course the action would have to be src-nat, not masquerade, because masquerade would pick the address of the outgoing tunnel interface rather than the public one.
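To put a rough number on "sub-second": in BFD asynchronous mode, the time after which a session is declared down is the peer's detect multiplier times the agreed packet interval, where the agreed interval is the larger of what the local side is willing to receive and what the remote side wants to send. A small Python sketch of that arithmetic; the timer values are example figures only, not anything taken from your setup or from RouterOS defaults.

```python
def bfd_detection_time_ms(local_required_min_rx_ms,
                          remote_desired_min_tx_ms,
                          remote_detect_mult):
    """Detection time at the local end of a BFD session (asynchronous mode)."""
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval_ms


# Example values only: both sides at 100 ms with multiplier 3.
print(bfd_detection_time_ms(100, 100, 3))
# The slower side dictates the interval: 200 ms with multiplier 3.
print(bfd_detection_time_ms(50, 200, 3))
```

Compare that with the scripted approach, where the decision interval alone is at least 1 s before the L2TP re-authentication even starts.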
I’m afraid you can’t do that without your upstreams being involved. It isn’t about those layer 2 links; those public IPs are tied to their respective BGP ASNs.
You can peer with your BGP upstreams, but it won’t be an easy job to make your router a transit network, not to mention that you would be doing prohibited route leaking.