Hi!
I’m trying to find some info about hardware failover/redundancy. Everything I have found about failover is about WAN failover, but that is not what I’m looking for.
I have an RB450Gx4 which does a lot of stuff, for example port forwarding/NAT for specific services between some subnets, acting as an NTP server and monitoring the connection to some hosts. We discovered that as more and more smart “functions” are added to this router, the impact of a hardware malfunction keeps growing, making the router hardware a single point of failure.
I was looking for a way to install a second router with the same config but keep it inactive as long as the main device is working. I can’t have both online because that would create lots of IP conflicts between the two routers. Is it possible to set up a “hot standby” redundancy in RouterOS?
Ideally the standby router should copy its config from the primary as long as both are working fine, so it is always current, but it could also be something I have to sync manually; the most important part is the failover functionality. Any tips on where to start reading?
Ah, thanks! Sounds very similar to what I’m looking for. Can VRRP also distribute config changes from master to backup nodes? Looks like that is left for the user to handle?
It’s tricky: trying to implement hardware failover can introduce more points of failure.
For example, if the LAN is configured to use VRRP for failover between the two routers, how would you connect the two routers to your network? If you use a switch, that then becomes a single point of failure. Similarly, how would you share the WAN connection between the active and standby router?
You may want to have a look at this. It includes configuration synchronisation from active to standby.
What is missing so far is synchronisation of connection tracking, but the RouterOS 7 beta supports that too (the functionality is linked to VRRP there).
VRRP (and HSRP) are intended to preserve the first hop, so that hosts can always reach their default gateway. The routers sharing the virtual address could be completely different hardware and configuration.
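As a rough illustration of how the routers sharing a virtual address decide which one answers for it, here is a minimal Python sketch of the election rule and the backup timer described in RFC 5798 (VRRPv3). The class and function names and the addresses are made up for the example. It is not RouterOS code and it only models the arithmetic, not the actual packet exchange.

```python
from dataclasses import dataclass


@dataclass
class VrrpRouter:
    name: str           # hypothetical label for the example
    priority: int       # 1..254 for backups, 255 for the address owner
    primary_ip: str     # tie-breaker when priorities are equal
    alive: bool = True


def ip_key(ip: str) -> tuple:
    """Turn a dotted IPv4 string into a tuple that compares numerically."""
    return tuple(int(part) for part in ip.split("."))


def elect_master(routers):
    """Return the live router with the highest priority (higher IP wins ties)."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.priority, ip_key(r.primary_ip)))


def master_down_interval(priority: int, advert_interval: float = 1.0) -> float:
    """Seconds a backup waits without advertisements before taking over."""
    skew_time = (256 - priority) * advert_interval / 256
    return 3 * advert_interval + skew_time


primary = VrrpRouter("primary", 200, "192.168.88.2")
standby = VrrpRouter("standby", 100, "192.168.88.3")

print(elect_master([primary, standby]).name)   # primary
primary.alive = False                          # simulate a hardware failure
print(elect_master([primary, standby]).name)   # standby
print(master_down_interval(standby.priority))  # 3.609375
```

With a 1 second advertisement interval, a backup with priority 100 would wait a little over 3.6 seconds of silence before claiming the virtual address, so the trigger is simply missing advertisements from the master, whatever the cause.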
Thank you for your comment! I have redundant switches too. Not an active type of redundancy, but a backup switch below the primary with the same config, so I just have to move all the cables to the same ports on the other switch in case of a malfunction. This is also something I could do with the RB devices, but I think it is harder to keep the backup device’s config in sync with the primary, since the config is more complex than what the switches use. Especially since the backup device can’t be connected all the time, as it would cause IP conflicts if I duplicate the config straight off…
Ah, thanks! Certainly looks like an interesting read. I think I could live without connection tracking sync; all the connections would then just have to reconnect, I guess? That would probably happen rather quickly at the user application level, so users would probably not even notice a quick drop, I guess…
I understand the concept, yes, but I’m not completely sure it is right for me. I have so many IPs and interfaces on it… It even acts as a standby for specific hosts on the layer 2 network whose connections must never go down (because of switch or cable failures): it monitors their IPs, and if they become unreachable it takes over the IP, enables it on itself and handles that traffic in a completely different way, using NAT and layer 3 routes. The hosts on the first network hardly notice that there has been a connection break and that the L2 traffic is rerouted over L3 links.
So it is not a “standard” default GW on the network… I use it more like a network “helper” to make other super important connections run smoothly without any disruptions.
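The monitor-and-take-over behaviour described here could be modelled roughly as below. This is only a Python sketch of the decision logic under assumptions of my own: the post does not say how the monitoring is actually implemented, and the class name, the failure threshold of 3 and the address 10.0.0.10 are all hypothetical. The probe is passed in as a function so the logic can be tried without touching a real network.

```python
from typing import Callable, Dict, Set


class TakeoverMonitor:
    """Decides when a monitored host counts as down and its IP should be taken over."""

    def __init__(self, probe: Callable[[str], bool], fail_threshold: int = 3):
        self.probe = probe                    # returns True if the host answered
        self.fail_threshold = fail_threshold  # consecutive misses before takeover
        self.failures: Dict[str, int] = {}
        self.taken_over: Set[str] = set()

    def check(self, host_ip: str) -> str:
        """Probe one host once and report the resulting state."""
        if host_ip in self.taken_over:
            # The IP now lives on the router itself, so probing it proves nothing.
            return "already-taken-over"
        if self.probe(host_ip):
            self.failures[host_ip] = 0        # any answer resets the counter
            return "ok"
        self.failures[host_ip] = self.failures.get(host_ip, 0) + 1
        if self.failures[host_ip] >= self.fail_threshold:
            self.taken_over.add(host_ip)      # here the IP would be enabled locally
            return "takeover"
        return "failing"


# Simulated probe results: one answer, then three consecutive misses.
answers = iter([True, False, False, False])
monitor = TakeoverMonitor(probe=lambda ip: next(answers), fail_threshold=3)
results = [monitor.check("10.0.0.10") for _ in range(4)]
print(results)  # ['ok', 'failing', 'failing', 'takeover']
```

Requiring several consecutive misses is a design choice to avoid taking over an IP because of a single lost ping. A standby router running the same logic would need the same rules, which is part of why this role does not map neatly onto a plain VRRP default gateway.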
Interesting,
So I could have a completely different router (both MT) connected concurrently, and if one fails the other picks up without skipping a beat, well, once RouterOS 7 comes out (to handle existing connections).
I like it; who doesn’t have an extra MT router kicking around? I have mine set up to be added manually if the main router fails.
Q1. My question is how does this work with (or perhaps interfere with) failover?
For example, if the main ISP fails and the router kicks over to the secondary ISP, is there a chance it may try to kick over to the other router?
Q2. I.e. how does VRRP handle multiple WANs?
Q3. What is the trigger: unable to reach the next hop? Detecting loss of power in the other router?
Q4. I assume one has to use a dedicated port on the main router connected to a dedicated port on the secondary router for monitoring, BUT if the unit is electrically dead, doesn’t the secondary have to be connected to both modems as well?
@anav, your post above is a typical example of where a moderator should convert a post into an initial one of a separate topic. Could you do that yourself, please?