Attempting to evolve from caveman's failover

jaclaz · October 3, 2023, 3:00pm

Hello to all.

Completely new to Mikrotik RouterOS, and with a far from complete understanding of networking (so please be gentle if I ask too simple/basic questions and/or I use some incorrect terminology).
I spent some time in the last few days reading many forum posts, trying to get a basic understanding of the capabilities of RouterOS. While looking for a better solution to my current failover setup (none, or manual/caveman), I learned from posts and links by Sob on this thread:
http://forum.mikrotik.com/t/connecting-to-multiple-devices-with-same-ip-address/159052/1
how it is seemingly possible through some RouterOS “magic” with nat/firewall/mangle to have more than one device with the same IP connected to the network.

My current situation (small office/shop operation) is a connection to the internet through three ISPs. Each of them provides its own (proprietary/locked) modem/router, so each of them (each is a gateway) has the address 192.168.1.1/255.255.255.0. Connected to the network is a bunch of the usual stuff: a few PC’s, printers and a few other “proprietary” devices (POS and POS-like).

So I have all the devices on a 192.168.1.x/255.255.255.0 LAN with gateway pointing to 192.168.1.1, no DHCP (all devices have static IP’s), no VLANS or similar “advanced” routing/switching.

The Internet connections are as follows:

Primary: FTTC (vdsl)
Secondary: “old” aDSL
Tertiary: FWA (Lte 3G/4G)

Only one of the three routers/modems is physically connected to the network at any given time.

The primary connection is usually very stable and gives no problems, but (it happened three times this year) we had faults on the ISP side, leaving us without connection on the primary for one or two days each time, in two cases the secondary worked, in one case both primary and secondary were down (copper cables cut during some road works) and we had to use the tertiary.

So right now my “caveman” method of failover is to simply unplug the RJ45 cable (coming out of the “main switch”) from the back of the modem/router that has no connection, insert it into one of the other modem/routers, power the latter on and check that the internet works.

I have no access to the settings of the three modem/routers (I can change them only by calling the ISP assistance, who change settings remotely, but it is not “fast”: it takes from a couple of hours up to one day or more). Likewise, changing the settings on the POS-like devices needs a call to other assistance services (three different ones). Even if I had access to these settings (like I have for the PC’s and printers), changing them would take time and need some (basic) knowledge that other people/colleagues simply lack, while the unplugging and re-plugging can be done by everyone.

For the reasons above, we must assume that this 192.168.1.1 is carved in stone and cannot be changed.

What would be needed is a (hypothetical) device that could act (still manually) as an RJ45 physical switcher box A/B/C/D, similar to:
http://www.cablesonline.com/abrjswitbox3.html
but that could be instead automated via some ping (recursive) or netwatch or similar, or some other script running on a PC.
I actually found something (loosely) similar, a switcher box that can be piloted via RS232 but that besides not being exactly cheap (US$290), would add a whole new level of complication:
https://www.vpi.us/network-devices/gigabit-ethernet-switch-1044

I think (but may well be wrong) that if I introduce between the “main” LAN switch and the three modem/routers two Mikrotik routers (possibly RB750GR3?) I can do the following:

have the first router get the 192.168.1.1 address on the LAN and (say) 172.16.0.1 on the WAN, with some scripts/netwatch to connect to main/failover1/failover2 through three addresses like 172.16.0.10, 172.16.0.20, 172.16.0.30
have the second router be 172.16.0.2 on the WAN (that exists only between the two routers), mapping/natting the three fixed-address 192.168.1.1 modem/routers to the three addresses 172.16.0.10, 172.16.0.20, 172.16.0.30

If any of the two Mikrotik routers fail (or both), I can still use the old method of unplugging and plugging directly the modem/router to the switch, bypassing the Mikrotik routers completely.

Or (another idea, maybe folly) I could have 4 devices, 1 Mikrotik like the first router above and three other (any) small routers one of each of the cables connecting each modem/router, simply routing from 172.16.0.10 to 192.168.1.1, from 172.16.0.20 to 192.168.1.1 and from 172.16.0.30 to 192.168.1.1
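The address mapping described in both variants can be pictured with a small Python sketch. It is only an illustration of the idea, not router configuration: the port names (ether2, ether3, ether4) are assumptions made up for the example, while the addresses are the ones proposed above.

```python
# Hypothetical mapping: alias address on the inter-router link -> port
# where the corresponding ISP modem/router is plugged in.
# The port names are invented for illustration only.
ALIAS_TO_PORT = {
    "172.16.0.10": "ether2",  # primary (FTTC)
    "172.16.0.20": "ether3",  # secondary (aDSL)
    "172.16.0.30": "ether4",  # tertiary (FWA)
}
MODEM_IP = "192.168.1.1"  # identical on all three modem/routers


def translate(alias: str) -> tuple[str, str]:
    """Return (real destination address, outgoing port) for an alias."""
    return MODEM_IP, ALIAS_TO_PORT[alias]


if __name__ == "__main__":
    for alias in ALIAS_TO_PORT:
        print(alias, "->", translate(alias))
```

The point of the sketch is that the three modems stay distinguishable only because each alias is tied to a different physical port; the destination address alone is the same in all three cases.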

Do I make sense? (or am I completely off and better/easier solutions exist)

If the approach can work (or if any other suggested ones works) then I will probably need some help in choosing the right hardware/routers and configuring the whole stuff.

Thanks in advance for any reply/suggestion to solve the problem.

jaclaz

Filo · October 4, 2023, 8:26am

Hi,
from what I understood:

Clients have FIXED IPs (and FIXED Gateway-IPs)
Three Internet-Gateways are in place, all of them with DIFFERENT IPs
Caveman is plugging cables when failover should happen

I would totally redesign this network in order to:

If DHCP is not possible, use MIKROTIK-Router-IP as Gateway
Attach ALL Internet-Gateways and clients to MikroTik, they need valid IPs in your subnet
Create the failover-logic on your MikroTik for those three gateways

The last point is more generic, since it depends on how deep into the topic you are and how deep you want to go.
The easiest way to create a failover would be to create three standard routes (0.0.0.0/0) with different routing costs / distance.
The lowest distance count route will be used unless it is unreachable.

This means: The GATEWAY needs to be unreachable / turned off in order to get the next distance as the active route.
But you would also be able to ALTER the routing distance to make an alternative route active, instead of (un-)plugging cables.
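As a rough illustration of the distance rule just described, here is a minimal Python sketch (not RouterOS syntax). The gateway addresses are assumptions chosen for the example, and "reachable" only models whether the gateway itself answers, which is exactly the limitation mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Route:
    gateway: str     # next-hop address (hypothetical example values)
    distance: int    # lower distance is preferred
    reachable: bool  # does the gateway itself answer?


def active_route(routes):
    """Return the reachable route with the lowest distance, or None."""
    candidates = [r for r in routes if r.reachable]
    return min(candidates, key=lambda r: r.distance) if candidates else None


routes = [
    Route("192.168.2.1", 1, True),  # primary
    Route("192.168.3.1", 2, True),  # first backup
    Route("192.168.4.1", 3, True),  # second backup
]

if __name__ == "__main__":
    print(active_route(routes).gateway)  # primary while it is reachable
    routes[0].reachable = False          # primary gateway switched off
    print(active_route(routes).gateway)  # first backup takes over
```

Note that a modem that is powered on but has lost its upstream line still counts as reachable here, so this simple scheme would not fail over in that case.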

If you want to make sure, alternative (say: “backup”-) routes become active automatically, you would need to go into the “recursive routing for failover”-tutorial (which you can find here: https://help.mikrotik.com/docs/pages/viewpage.action?pageId=26476608 ) or use another approach involving NETWATCH and MANGLE-Rules for firewall I once mentioned here: http://forum.mikrotik.com/t/simpler-failover-for-two-gateways-i-found-working/169108/1
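The idea common to both approaches is to test a host beyond the gateway instead of the gateway itself. The Python sketch below shows only that idea; it is not how RouterOS implements recursive routes or Netwatch, and the uplink names and the probe function are hypothetical.

```python
def pick_uplink(uplinks, probe):
    """Return the first uplink, in priority order, whose probe succeeds.

    uplinks: list of uplink names, preferred first (hypothetical names).
    probe:   callable taking an uplink name and returning True when a
             remote host is reachable *through* that uplink.
    """
    for name in uplinks:
        if probe(name):
            return name
    return None


if __name__ == "__main__":
    # Simulated probe results: the primary line is down upstream,
    # even though its modem may still answer locally.
    status = {"isp1": False, "isp2": True, "isp3": True}
    print(pick_uplink(["isp1", "isp2", "isp3"], status.get))
```

Because every uplink is probed independently on each pass, the preferred uplink is selected again as soon as its probe succeeds, which gives failback as well as failover.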

That being said, it’s only meant as an example. We do not know your configuration for sure and there may also be some other things to consider.

Hope this will help you evolve from caveman to network-hero

edit
As I read later on - ALL Internet-Gateways have 192.168.1.1 and you’re not able to change that? The above solutions require each gateway to have a unique IP in your network and the MikroTik to have 192.168.1.1

If it is the case that ALL GATEWAYS have 1.1 and you are not able to change the internal addresses, you can solve this manually:

Connect all Gateways to a MikroTik
DISABLE LAN-Ports of GW2 and GW3 (better do that before Step1 )
In case of failing Internet: DISABLE LAN-Port of GW1 and ENABLE GW2 or GW3
No need to create different standard-routes (since 192.168.1.1 will be the way to go whichever Gateway is used)

Of course, this could also be scripted with MANGLE-Rules and NETWATCH-Tool on a MikroTik, as well. Automatic Failover is quite easy to do (Netwatch, Probe 8.8.8.8 for Ping, if not responding, deactivate Port for GW1, activate Port for GW2, probe again, if not responding in a given time, deactivate Port for GW2, activate Port for GW3…), automatic failback is another story since we’re not able to probe the preferred route for connectivity. For this you would surely need to change the internal addresses of the Gateways.
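The port-cycling logic in the paragraph above can be sketched in Python as a tiny state function. This is only a model of the decision, not a Netwatch script; the port names are hypothetical, and wrapping around from the last gateway back to the first is an assumption, since the description stops at GW3.

```python
def next_port(ports, current, probe_ok):
    """Decide which port should be enabled after one probe result.

    ports:    ordered list of port names, preferred first (hypothetical).
    current:  the port that is enabled right now.
    probe_ok: True if the ping probe (e.g. to 8.8.8.8) was answered.
    """
    if probe_ok:
        # The active link works, so nothing changes. This is also why
        # there is no failback: the other links are never tested.
        return current
    idx = ports.index(current)
    # Assumption: after the last gateway, start again from the first.
    return ports[(idx + 1) % len(ports)]


if __name__ == "__main__":
    ports = ["ether1", "ether2", "ether3"]
    active = "ether1"
    for result in (True, False, False, True):
        active = next_port(ports, active, result)
        print(result, "->", active)
```

Since only one port is enabled at a time, the probe can only ever say something about the currently active gateway, which matches the failback limitation described above.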

That brings me to a rarely stupid workaround…
IF you have access to the Gateways and IF you are able to create a DynDNS-Host for them to… let’s say probe them “from outside” with DynDNS or equal, the MikroTik would be able to check every Internet-Connection with the corresponding DynDNS-Host from inside. You’ll need to make sure to refresh DNS-Cache if your Internet-IPs are subject to change. If you have FIXED-IPs, great, probe them. But with this idea, it would be possible to create a failback if GW1 is responding to DynDNS again.

Best regards,
Martin!

jaclaz · October 4, 2023, 11:34am

Yes, the gateways have “fixed” IP’s (192.168.1.1).

The thread and links I found with the examples by Sob essentially use a Mikrotik router to have three “same IP” devices connected but “translated” to three different interfaces/IP’s, the context in those threads posts is different, it is about people having industrial machines (or radios/whatever) that have fixed IP’s from factory, in that context I believe that there are no issues whatsoever as the requirement/goal is to connect to more than one machines with the same IP address, I don’t think there is any issue with capacity/bandwidth over those links.

My scenario is different because the idea is to nat/masquerade the gateways(s) IP’s and maybe, even if possible in theory in practice there will be issues of some kind with the connection (double natting? reduced bandwidth? something else?), not that any of the current gateways are any good when it comes to speed, the main one is about 30 Mbit/s, and the two spare ones are around 20 Mbit/s.

Besides, from what I understand of networks/routers (but I am not at all an expert), in my case I cannot do everything on a single router. I need either 2 Mikrotik routers, one “replacing” the gateway and having 192.168.1.1 with a route for 0.0.0.0 pointing towards the second router (with the three current gateways at 192.168.1.1 somehow natted/masqueraded to three different IP addresses), or one small/simple router translating/relaying/routing/whatever for each one of the current gateways.

Thank you for the Netwatch reference and your modified way, if the overall approach makes sense, then I will test/study these examples.

As said, I could (a one-time task) make the ISPs change the three 192.168.1.1 addresses to (say) 192.168.2.1, 192.168.3.1 and 192.168.4.1 and bring the setup to a “normal” one, where a single Mikrotik router set as gateway with 192.168.1.1 can manage them. But what would happen if the Mikrotik router fails? I would have to bypass it and manually change the gateway on all devices connected to the network (which I can do for the PC’s, but which is a problem for the POS-like devices). Even if I could directly change the gateway on the POS-like devices, the issue would remain when I am not there: there is no one I can trust to make these changes, and anyway it would take some time.

I could do this one-time change and have two identical Mikrotik routers set up in an identical way, only one connected, and in case of failure of the connected one, replace it with the second. I have some experience with this approach with a self-made router running zeroshell (a now discontinued Linux router/firewall distro) on a repurposed thin client, but when I need to “switch” the main thin client with the second one I have to disconnect and reconnect several cables anyway.

I will have to think if it is physically possible to setup the cables “in parallel” so that all that would be needed in this case would be to power off the “main” (now failed) Mikrotik and power on the “spare” one.

Another approach I thought of (probably foolish) could be to find a reliable ethernet relay that the netwatch (or similar) script on the Mikrotik can somehow pilot, powering off and on the three gateways (this has an issue, as these stupid ISP provided modem/routers are very slow at booting, particularly one so I would need to keep them powered on and use the ethernet relay to switch on/off three small switches (hubs) between the Mikrotik and each gateway) and there is a physical problem as the LTE router/modem/gateway is not (at the moment) near the other two.

A further complication (I am throwing on the table anything I can think of) would be a set of “smart plugs” (still assuming the Mikrotik script can somehow pilot them), but it seems that they are all wireless and most of them have a proprietary app, though there are a few using Tasmota, which is open source and “local” and possibly can be triggered by commands sent over the wireless network.
Given the (I believe poor) reliability of these devices and the hypothetical Wi-Fi network I already discarded this approach as not suitable in practice.

Any other ideas?

jaclaz

Filo · October 5, 2023, 5:21am

Well… to be honest, the most simple solution for you is to use the manual switcher you already mentioned.
There are situations where you either try to solve a problem with technical overkill or keep it simple and doable for everyone.

If this is a remote location you don’t want to visit very often, I would go for an unmanaged switch and the manual switcher.
EVERYONE is able to switch from A to B or B to C. Plus: Only two devices could fail. Leave them on spare at the location, label the cables, another caveman can jump in and replace it.

There ARE indeed solutions which you may be able to implement. Automatic failover, VRRP on MikroTik, and many more options to create an administrative nightmare for everyone who is not you or “brainlinked” with you.

Is it worth it, or are you just trying to make a simple and easy setup that everyone understands and that lets you get away in peace?
Take a step back from your considerations and try to look at it from outside. Such projects can kill you over time and often it is better to revert complexity to a working setup for this special location.

If you can’t use standard-failover procedures mentioned in MikroTik-Tutorials due to the given limitations - don’t bend it over. Use the Hardware-Switch for the gateways and everyone will understand your 1-Page-Documentation on how to fix it, while you’re in the sun having a beer

edit
I guess there are also such switchers on the market which may be remote-controllable via VPN (for FAILBACK, if they forget to do it)

Best regards,
Martin!

jaclaz · October 5, 2023, 12:17pm

Well, I found the given “switcher box” that can be driven both manually (push button) and programmatically (via RS232), “Manual Ethernet Switch”:
https://www.vpi.us/network-devices/gigabit-ethernet-switch-1044
but it is (as I see it) stupidly expensive (at nearly 300 US$) and - being RS232 - would need an added module ethernet<-> RS232 (likely another 50-100 US$) but - besides the cost, I am not even sure how well it can work.

With the same kind of money you can buy two more than decent Mikrotik routers or switches.

As an outsider to the networking world, I would have thought that there were heaps of similar devices, or some other well-known, widely used alternative solution that I was not able to find, like some sort of managed switch that could be easily commanded over http to bring ports up or down.

From what I understand, Mikrotik devices running SwOS seem not to have a command line, and maybe (but I have to study more) using a router (RouterOS) as a switch is possible. Still, I have no idea if it is possible through some “magic” script/setup to obtain what I need/want.

The problem I have does not seem to me so much niche, I wonder if other people in similar condition have found a better solution.

The three gateways with same IP could be - as said - marginal, in the sense that even if I change them to different IP’s (once) the result seems to me not as robust/failproof as I would like it to be.

Another network device that seemingly does not exist is a very simple router (actually more like a network address translator) a (hypothetical) device with two ports that simply translates/routes a given address to another one as transparently as possible.

Another semi-random question is could a PoE enabled switch/router actually command the PoE supply to one port?
It seems like it is possible:
https://wiki.mikrotik.com/wiki/Manual:PoE-Out
so, could the POE be switched on and off and connect to a given port a relay of some kind?

jaclaz

Filo · October 6, 2023, 7:03am

Hi,

Well, indeed you CAN with Mikrotik. There are also tutorials integrating MT-Devices into SLACK or TELEGRAM and you may be able to command them from there (regardless of the security aspect).

MikroTik-Devices have a great CLI - indeed it is very easy to adapt. Scripting is no problem, too.

Your main problems seem:

a) Fixed IPs and Gateway on clients
→ Can be fixed by using the MikroTik as 192.168.1.1 (Gateway)

b) Fixed and SAME internal IPs on all Gateways
→ Needs to be fixed to three different internal IPs
→ If you change the internal IPs of the three gateways, any WAN-Failover-Tutorial will fit for you.

Only thing will be: Single Point of Failure on the MikroTik (which of course can be mitigated by creating a “Virtual Router” on more than one device)
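The “Virtual Router” idea can be pictured with a short Python sketch: several physical routers share one virtual gateway address, and the live router with the highest priority answers for it. This is a simplified model of how a VRRP-style election behaves, not MikroTik configuration; the router names and priority values are assumptions for the example.

```python
from dataclasses import dataclass

VIRTUAL_GATEWAY = "192.168.1.1"  # the address the clients keep using


@dataclass
class Router:
    name: str      # hypothetical device name
    priority: int  # higher priority wins the election
    alive: bool    # is the device currently up?


def master(routers):
    """Return the name of the live router with the highest priority."""
    alive = [r for r in routers if r.alive]
    return max(alive, key=lambda r: r.priority).name if alive else None


if __name__ == "__main__":
    pair = [Router("mt-main", 200, True), Router("mt-spare", 100, True)]
    print(VIRTUAL_GATEWAY, "is served by", master(pair))
    pair[0].alive = False  # the main router fails
    print(VIRTUAL_GATEWAY, "is served by", master(pair))
```

From the clients' point of view nothing changes when the main router fails, because the gateway address stays the same and only the device answering for it changes.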

I’m not aware of any use-case MikroTik-Devices could not help with - it’s a matter of energy invested in learning and adapting.

Regards,
Martin

jaclaz · October 6, 2023, 11:01am

Yes, I know what the problems are, and of course I know that by forcibly removing them those problems will be gone (but new ones may arise).

Still this is not “problem solving” it is “working around”, and - while there is nothing wrong in working around as opposed to solving - if the end result of the workaround is not satisfying, besides removing all the fun, there is also no real advantage.

I see it more like a game: there are rules on how to play and you must play along with those rules; you cannot invent your own rules on the spot and call it a day (unless it is Calvinball, which is actually fun):

https://calvinandhobbes.fandom.com/wiki/Calvinball

I will search/study/think a bit more, as I think there can be practical and working solutions without spending a fortune in professional/industrial devices.

jaclaz

Filo · October 6, 2023, 1:10pm

Well, maybe another user likes to join this topic, but I think, you can extract some ideas from this already. We discussed several ways:

Change the config of everything surrounding a central MikroTik device (change routers‘ IPs)
Adapt the given facts and build workarounds on MikroTik (like scripts in Netwatch and Ports enabling and disabling automatically)

Both ways are open for you with MikroTik, even to buy two or more of them to mitigate SPOF and even scripts and automation are open for everything.

Whatever you choose - good luck.

Regards,
Martin

jaclaz · October 7, 2023, 9:04am

Thank you very much.

In the meantime I looked around for ethernet relays and similar stuff (even if I probably won’t use them in the end). It seems there is an endless amount of “hobby” devices, “no name” and of dubious reliability, and - on the other end of the spectrum - professional PDU’s intended for racks with the usual (IMHO) crazy prices.
I found a few (maybe) good items in the lower price range (noted here only for future reference):
https://www.kmtronic.com/LAN-Relay-Controllers?product_id=95
https://www.waveshare.com/modbus-poe-eth-relay.htm
https://tinycontrol.pl/en/lan-controller-35/
and seemingly a good, documented one (it also has an online simulator) which stands out, in the higher price range:
https://www.netio-products.com/en/products/all-products
the rack PDU is also programmable/scriptable with LUA,
and a one-of-a-kind device, that could be useful in simpler projects/setups:
https://www.tyconsystems.com/tpdin-poe-relay

Finally a more complete range of DIN Rail devices:
https://relaydroid.com/

jaclaz

jaclaz · October 9, 2023, 7:55am

I expected a rabbit hole, but frankly not as deep as this one seems to be (about failover methods). I checked several (many, likely a couple of dozen) posts/blogs/tutorials about failover; many are incomplete, or have a later comment by someone correcting this or that “wrong” part. I would have expected there to be a few “canonical” ways that were “established” by this time.
Besides the “help official”:
https://help.mikrotik.com/docs/pages/viewpage.action?pageId=26476608
and your (Filo’s) very nice and simple one:
http://forum.mikrotik.com/t/simpler-failover-for-two-gateways-i-found-working/169108/1
I found also the one by Chupaka:
http://forum.mikrotik.com/t/advanced-routing-failover-without-scripting/136599/1
on that thread there is a reference to a nice presentation by Tomas Kirnak:
https://mum.mikrotik.com/presentations/US12/tomas.pdf
that, even if definitely “advanced” clears a lot of doubts about the terminology/methods that I had.
Right when I was convinced that - even if complex - the recursive check was the way to go, I found that there is another way through “Detect Internet”:
https://wiki.mikrotik.com/wiki/Manual:Detect_internet
https://help.mikrotik.com/docs/display/ROS/Detect+Internet
that seemingly is not much used, I found a seemingly valid example/explanation here:
http://forum.mikrotik.com/t/how-to-properly-use-detect-internet-for-isp-failover/138129/1
but not much more.

It will be a looong (besides steep) learning path.

Only as a side note, I found (on an Italian board dedicated to RouterOS) a (small) confirmation of my original idea/approach:
https://www.routerositalia.net/forum/viewtopic.php?f=1&t=3957
in the post (unfortunately not answered to/abandoned) the user asks about using a Mikrotik to replace a Draytek working with 192.168.0.1 on a LAN port and 192.168.0.2 on TWO WAN ports, so that the load balancing/failover router can effectively be bypassed in case of troubles.
It is good to know that I am not the only crazy guy with the idea of multiple same address gateways.

jaclaz

llamajaja · October 9, 2023, 11:12am

Start at para I… feel your pain. https://forum.mikrotik.com/viewtopic.php?t=182373

jaclaz · October 11, 2023, 1:31pm

Yep, I have read (and re-read, and re-re-read) that paragraph, but it still sounds to me (with all due respect to the author, anav, who surely posted it in good faith and as an attempt to help fellow board members) largely similar to Vogon poetry.

Some of the examples are convoluted/overcomplicated and a lot of info is missing, the “DTRIPLE WAN - RECURSIVE” (which could have been a solution/answer to my question) has been (IMHO) overcomplicated by introducing besides the three WANs also three (actually six) subnets and after posting a set of 9 (nine) additional routes with target-scope=14 ends with:

Then the rest of the routes are required, six with target scope of 13, and the last six with target scope of 12.

which makes little sense (to me), if there are nine routes with target-scope=14, there should be also nine with target-scope=13 and nine with target-scope=12, shouldn’t they?

In the DUAL WAN - RECURSIVE, it is not defined why some (explicit) IP address are chosen, and what elsewhere is called “virtual hop” becomes (AFAICU) “Bogus address”.

It is very confusing when different terminology is used by everyone that writes these tutorials (and often different from what is called in Mikrotik official wiki/documentation).

Very likely this is part of the difficulties that a non-native speaker has when learning a new language, when I started speaking (almost) English I called my home a “house”, and was promptly corrected with a “You mean an apartment, don’t you?” so, next time I used “apartment”, and was promptly corrected with a “You mean a flat?”

jaclaz

jaclaz · October 13, 2023, 7:55am

After much looking around I found two videos (from Indonesia, by Citraweb, luckily with English captions) that nicely explain two ways to manage two same-IP gateways on a single router: one addressing the ether interface with % (like “gateway=192.168.1.1%ether1”) and one making use of VRF (actually, I believe, what is called “VRF lite” in some other examples/tutorials).
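To show why the % notation helps, here is a small Python sketch that splits such a value into address and interface. It only illustrates the concept; the second interface name (ether2) is an assumption, and the parsing is not taken from RouterOS itself.

```python
def parse_gateway(value):
    """Split 'address%interface' into (address, interface or None)."""
    address, sep, interface = value.partition("%")
    return address, (interface if sep else None)


if __name__ == "__main__":
    # Same gateway address twice, but bound to different interfaces.
    gateways = ["192.168.1.1%ether1", "192.168.1.1%ether2"]
    parsed = [parse_gateway(g) for g in gateways]
    print(parsed)
    # The addresses are identical, the (address, interface) pairs are not.
    print(len({addr for addr, _ in parsed}), len(set(parsed)))
```

With the interface attached, the two gateways become two distinct next hops even though their IP address is the same, which is the property the multi-ISP setup needs.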

The context is slightly different (it is focused on load balancing) but the concepts are (IMHO) well explained.

Video #1:
https://www.youtube.com/watch?v=ZiybVYms6kw

Video #2:
https://www.youtube.com/watch?v=dWwKIP2Kqbo

Besides the actual usefulness of the content, it is good to know that I am not (yet) completely crazy and other people have to deal with multiple ISP’s with “fixed” same gateway IP.

Usually I hate videos (when compared to articles or blog or forum posts) but these ones are clear/slow enough.

jaclaz

jaclaz · October 16, 2023, 9:50am

Here is a graphical representation of my current situation:

jaclaz · October 16, 2023, 9:53am

I got my hands on some (old, slow) cheap routers, only capable of routing a LAN address to a WAN one, TP-Link TL-R460, here is a graphical representation of a possible setup.

jaclaz · October 16, 2023, 9:57am

The next evolution, replacing the #4 router with a RB750GR3 (thus being capable of automatic failover):

jaclaz · October 16, 2023, 10:01am

Or, I could replace three of the Tp-Link’s with a MT RB750GR3:

jaclaz · October 16, 2023, 10:02am

Then I could use two RB750GR3’s as follows.

The question is:
is there a way (some magic or protocol or whatever) that would allow me to do all this in a single router (even if that would mean using a “better” router, with more ports or some other advanced characteristics)?

mtest001 · October 16, 2023, 3:18pm

Hello,
I have the same need as you, i.e. 2 ISPs whose routers have fixed IPs (192.168.1.1). I have created two VRFs and I am able to route correctly on one or the other, but I did not manage to have the NAT working.

Do you mind sharing your Mikrotik router configuration ?

Thank you.

jaclaz · October 16, 2023, 4:06pm

I haven’t any, I am still studying if the whole thing is doable and - if it is - if it is worth the hassle.

I have read several VRF related threads, read (and re-read) the official Mikrotik wiki/docs, watched the two or three videos that are usually linked to on the forum and came out with very little understanding.

The only source I could (maybe) understand was this video:
https://www.youtube.com/watch?v=dWwKIP2Kqbo
(Indonesian but with English subtitles)

I believe the missing “magic” or the “trick” in your situation:
http://forum.mikrotik.com/t/failover-between-2-isps-using-gateways-with-same-ip-was-nat-traffic-to-vrf/170381/1
lies in the “routing mark”, the procedure is explained in the linked video.

From the little I understand of RouterOS, the same thing can probably be done in 2 or 3 different ways, so maybe you are attempting to use a different method from the one in the video. Consider that I am a complete beginner, so take my advice with lots of salt, but maybe you can replicate what they show.

jaclaz