@bpwl: Thanks for the feedback. As mentioned, it mostly works fine this way. However, if the main WAN is down, the host used for the check (e.g. “8.8.8.8”) is not reachable either: the default route that has “8.8.8.8” as its gateway goes down, but the route that has “8.8.8.8” as its dst-address stays active. That is nearly always the case for me, because the ISP modem still provides an internal gateway IP even when the ISP itself does not hand out a valid IP and the Internet does not work.
My solution for now is to check against DNS servers I don’t use myself, i.e. Cloudflare DNS and OpenDNS. That way Google DNS is still reachable once the failover is in use. I am not sure this is the best solution, though it is probably the easiest.
I might change the DNS servers I use to Cloudflare DNS, but then I will just check against Google DNS and OpenDNS instead.
Or is there a workaround for this issue that can be easily implemented?
The IP used for checking is bound to the interface for the check, and as such is NOT reachable if that interface is down. It should not be used as an available resource!
There are enough IP addresses that can be used for checking only, while you get the actual service (e.g. DNS) from a different address. https://www.lifewire.com/free-and-public-dns-servers-2626062
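To illustrate why the check IP is lost as a usable resource, here is a minimal Python sketch of a longest-prefix-match lookup. It is only a toy model, not RouterOS behavior in full: the route list, the gateway addresses 192.168.1.1 and 192.168.2.1, and the `lookup` helper are all assumptions made up for the example. It models the state while the main WAN is down: the recursive default route is inactive, but the /32 route to the check host is still active.

```python
import ipaddress

# Toy routing table: (destination prefix, next hop, active?).
# Gateway addresses are hypothetical; only 8.8.8.8 comes from the thread.
ROUTES = [
    ("8.8.8.8/32", "192.168.1.1 via WAN1", True),       # pins the check host to WAN1
    ("0.0.0.0/0", "8.8.8.8 (recursive, WAN1)", False),  # gateway check failed
    ("0.0.0.0/0", "192.168.2.1 via WAN2", True),        # failover default route
]


def lookup(dst, routes=ROUTES):
    """Return the next hop of the most specific active route matching dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop, active in routes:
        net = ipaddress.ip_network(prefix)
        if active and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, next_hop)
    return best[1] if best else None


print(lookup("1.1.1.1"))  # ordinary traffic follows the failover default route
print(lookup("8.8.8.8"))  # the check host is still sent towards the dead WAN1
```

In this model, any other destination leaves via WAN2, while 8.8.8.8 keeps matching the more specific /32 route and is therefore unreachable for as long as WAN1 is broken.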
Some also set multiple check IPs for the same recursive route (see #130).
@bpwl Thanks again. I expected this to be the case (since yesterday).
I guess there is no easy way around this? It isn’t that problematic because, as you mentioned, there are multiple safe IP addresses to check against, but I still think it is a weird limitation.
It is also something that is very seldom explained, which is why the failover Internet wasn’t working as expected yesterday, the first time I actually needed it. Sure, I tested it as far as possible before, but that basically meant disabling the interface altogether or unplugging the network cable. Of course that also took the interface itself down, so the failover worked, but not because of the gateway check. Testing with an invalid gateway IP also worked, because the DNS server was not blocked then.
What might have worked as a test is disconnecting the cable behind the modem, but to be honest I didn’t think of this, and it might not have worked either, as it depends on how the modem reacts to that scenario.
In hindsight it also makes total sense, as the route for e.g. “8.8.8.8” would still be active, as it is outside of the gateway check.
But now that I know about this, it is easy to avoid this issue.
Thanks for the information. As mentioned, it is fine in my case now. I just find it an odd limit, but I guess if one really needs it, there is always scripting to do this.
I now use two DNS servers to check (in case one is down, however unlikely that is) and one of the remaining providers for my real DNS queries, so it is fine now.
I don’t know if I’m the only one, but to me it always seemed kind of rude to use them for this. They provide free DNS, but they (AFAIK) don’t invite the whole world to constantly ping their servers. I’m sure they can handle it, but still… Also, since they didn’t make any promises regarding pings, what if they decide that they’ve had enough and block it? It’s probably not very likely, but that would be fun. That said, I don’t have a better alternative, i.e. some always-on servers that welcome people to use them for this.
“It’s probably not very likely, but that would be fun”
We do take a lot for granted…
For failover (and load balancing), my route list contains the same paths a second time, at a larger distance and without the “recursive based ping check”.
This is just in case the “well known and trusted” servers stop responding. The test will fail, but traffic will still flow if the path is usable. That leaves some time to find another “to Shanghai”.
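The effect of such duplicate routes can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Route` class, the route names, and the distances 1 to 4 are hypothetical, and a route without a check is simply treated as always active (a real router would still require its gateway to be reachable).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    name: str
    distance: int
    check_ok: Optional[bool]  # None means no ping check is configured


def pick_route(routes):
    """Pick the active route with the lowest distance (None if none is active)."""
    active = [r for r in routes if r.check_ok is not False]
    return min(active, key=lambda r: r.distance, default=None)


# Situation where every check host has stopped answering pings.
ROUTES = [
    Route("WAN1 checked", 1, False),
    Route("WAN2 checked", 2, False),
    Route("WAN1 unchecked", 3, None),
    Route("WAN2 unchecked", 4, None),
]

print(pick_route(ROUTES).name)
```

With all checks failing, the unchecked copy of the primary path at the larger distance takes over, so traffic keeps flowing as long as that path itself is usable.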
This is why I didn’t add any checks for my failover Internet.
So if the check hosts all stop responding, the failover Internet will still work until I can adjust the configuration accordingly (e.g. remove the checks from the main Internet).
One solution would be to use a script that actually does a DNS lookup and checks whether the server provides a response, in which case the Internet is working. It is more complicated to implement, but it would work, although it would probably produce more traffic/load on their DNS servers than a plain ping.
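As a rough idea of what such a lookup-based check involves, here is a minimal Python sketch that sends one DNS query over UDP and treats any well-formed reply as success. It is not a RouterOS script: it assumes the check runs on some host behind the router, and the function names, the query name `example.com`, and the fixed transaction ID are all made up for the example. Like the ping check, it only tests a given WAN if the route to the server is pinned to that WAN.

```python
import socket
import struct


def build_query(name, txid=0x1234):
    """Build a minimal DNS query for an A record with recursion desired."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def is_reply_to(data, txid=0x1234):
    """True if data looks like a DNS response carrying our transaction ID."""
    if len(data) < 12:
        return False
    reply_id, flags = struct.unpack("!HH", data[:4])
    return reply_id == txid and bool(flags & 0x8000)  # QR bit marks a response


def dns_probe(server, name="example.com", timeout=2.0):
    """Send one query to server:53 over UDP; True if a valid reply arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable networks
            return False
    return is_reply_to(data)
```

Calling `dns_probe("8.8.8.8")` at a modest interval would play the role of the ping check. Each probe costs the server a real lookup, which is the extra load mentioned above.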
Scripts can do many things; they are even more powerful than this. For example, if you wanted to switch back from the backup to the main connection only after it has been stable for some given time, that would be possible with a script. The beauty of this solution is that it’s all built-in functionality of the router. So you don’t have to write any script (I find writing RouterOS scripts very much not admin-friendly), and it’s less likely to break.
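The "switch back only after it has been stable for a while" idea is essentially a hold-down timer. The following Python sketch shows the logic only; the class name, the `hold_seconds` parameter, and the timestamps are hypothetical, and a real implementation on the router would have to be written as a RouterOS script.

```python
class FailbackTimer:
    """Return to the primary link only after it has been healthy
    for hold_seconds without interruption."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.on_primary = True
        self.healthy_since = None  # when the primary last became healthy again

    def update(self, primary_ok, now):
        """Feed one check result taken at time now; return the link to use."""
        if not primary_ok:
            # Any failure moves traffic to the backup and restarts the timer.
            self.on_primary = False
            self.healthy_since = None
        elif not self.on_primary:
            if self.healthy_since is None:
                self.healthy_since = now
            if now - self.healthy_since >= self.hold_seconds:
                self.on_primary = True
        return "primary" if self.on_primary else "backup"


timer = FailbackTimer(hold_seconds=60)
for t, ok in [(0, True), (10, False), (20, True), (50, True), (80, True)]:
    print(t, timer.update(ok, now=t))
```

Failover stays immediate in this sketch, while failback is delayed, which avoids flapping when the main connection comes and goes.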
So, ghostzero, yes, the solution is to put your clients’ traffic to a separate routing table. You create additional default routes in that table, then mark all traffic-to-be-routed-outside (like “in-interface=LAN-Bridge dst-address-type=!local” to exclude traffic destined to the router itself, like DNS requests) with that mark. Voila, your clients don’t use a route with dst=8.8.8.8, only your default routes (primary and failover in case of troubles).
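The effect of the separate routing table can be modelled in a few lines of Python. This is only an illustration of the idea, not router configuration: the table names, the `pick_table` and `out_interface` helpers, and the WAN labels are assumptions, and the tables show the state while the primary WAN is down.

```python
import ipaddress

# Hypothetical state while WAN1 is down and the failover route is in use.
TABLES = {
    # main table: still holds the /32 that pins the check host to WAN1
    "main": [("8.8.8.8/32", "WAN1"), ("0.0.0.0/0", "WAN2")],
    # separate table for client traffic: default routes only
    "clients": [("0.0.0.0/0", "WAN2")],
}


def pick_table(in_interface, dst_is_local):
    """Mimic the mark rule: LAN traffic that is not for the router itself."""
    if in_interface == "LAN-Bridge" and not dst_is_local:
        return "clients"
    return "main"


def out_interface(dst, in_interface, dst_is_local=False):
    """Longest-prefix match inside the table chosen by pick_table."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(prefix), wan)
        for prefix, wan in TABLES[pick_table(in_interface, dst_is_local)]
        if addr in ipaddress.ip_network(prefix)
    ]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(out_interface("8.8.8.8", "LAN-Bridge"))  # client traffic
print(out_interface("8.8.8.8", None))          # router's own check traffic
```

Client traffic to 8.8.8.8 never sees the /32 route and follows the active default route, while the router's own check traffic still uses the main table and stays pinned to the WAN being tested.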
@Chupaka thanks for the feedback. Makes sense, though I think it is indeed unnecessary for me and I will just check against hosts I do not use anyway.
Though I have to admit I had hoped MikroTik would provide an easier solution to this, e.g. just adding a list of IP addresses to check for a route to be active.
It is a nice solution though, as it avoids scripting, which is powerful but can be tricky and might break when migrating to different hardware or upgrading RouterOS, as Sob mentioned. So using recursive routing is preferred.
Wow... and could you please show me a screenshot from WinBox of where to specify the interface in the recursive route lookup? The route properties window is different from the one in ROS v6. Thanks
After spending the last hour trying to figure out why my dual-WAN failover from version 6 didn’t work in v7, reading multiple posts and watching a YouTube video, I tried this, and it seems to work with the expected behavior. This is on v7.2rc1 with a Cube60AC. It has wlan1 and wlan60. The wlan60 interface uses DHCP and so does wlan1. There is a single Ethernet interface, 88.1/24. I can disable either interface, one at a time, and it immediately shifts traffic. On the barn end of this setup, I’ve got one AP with 5G and a different AP with 60G. I rebooted the 60G unit, and it took 3 seconds for traffic to transfer to the 5G. Once it was back on, I dropped one ping during the switch back to the 60G system. 10.2.4.5 is the GW on the 60G interface (primary desired link); 10.3.44.1 is the GW on the backup link.
Works for me on this version… Hope it works for you all.
Just FYI, here is the old version that worked in 6.x, minus the firewall rules and dhcp-client commands, which are identical to the new commands.
In this working v6 example, the primary desired path gw is 10.3.127.1.