Hello all,
I am doing some experiments with VRFs to implement an automatic failover between 2 ISPs and things are mostly working but I have some weird issues for which I would appreciate some help.
The context is as follows:
I have 2 ISPs
Both ISP-provided routers have the same IP 192.168.1.1
My LAN network is also 192.168.1.0/24
Since all of this is using the same subnet I have put each ISP router in a separate VRF:
The bridge IP address is 192.168.1.201
The router of the first ISP is connected to ether1, ether1 has the IP 192.168.1.202 and is connected to a dedicated VRF
The router of the second ISP is connected to ether2, ether2 has the IP 192.168.1.203 and is connected to another dedicated VRF
Then I have a netwatch script that changes the priority of my default route to go through one ISP or the other to implement the failover, but it’s not that part that I want to focus on for now.
Things are mostly working BUT I have the following issues:
I can only access the Internet from the LAN if the 2 IPs of the Mikrotik router in the 2 separate VRFs are assigned as /32 - that should not be the case. If I assign /24 addresses then I can ping google from the Mikrotik console but not from the LAN.
Checking for updates from the Mikrotik does not work, it cannot do the DNS resolution.
Below is my config - I removed the part relevant to netwatch because for now it’s not the issue:
Probably yes but if you have multiple interfaces in the same vrf, you’ll be natting between those interfaces too. Maybe you should also use an IP matcher.
Masq srcnat: in:bridge out:ether2, connection-state:new src-mac xx:xx:xx:xx:xx:xx, proto TCP (SYN), 192.168.1.104:56184->151.101.129.140:443, len 60
My assumption is that the return traffic is not routed using the correct routing table. It works when using /32 interface addresses because this creates a point-to-point link.
Indeed, and these routes are absolutely necessary, otherwise nothing works. This is also why I need to assign a /32 mask to the IP addresses in the VRFs, otherwise I end up with conflicting routes: 192.168.1.0/24 is automatically added as a dynamic (connected) route, so I cannot “leak” another route to 192.168.1.0/24 pointing at the main routing table.
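For anyone following along, here is a rough, untested sketch of what the /32 addresses plus leaked return routes could look like on RouterOS v7. It is not the actual config from this thread: the VRF names vrf-isp1 and vrf-isp2 are placeholders, and the gateway=bridge@main form for the return route is my assumption of how the leak is written.

```routeros
# Sketch only, assuming RouterOS v7. vrf-isp1 / vrf-isp2 are placeholder names.
/ip vrf
add name=vrf-isp1 interfaces=ether1
add name=vrf-isp2 interfaces=ether2

/ip address
# /32 addresses, so no connected 192.168.1.0/24 route appears in the VRF tables
add address=192.168.1.202/32 interface=ether1
add address=192.168.1.203/32 interface=ether2

/ip route
# Return routes towards the LAN, resolved in the main table (assumed syntax)
add dst-address=192.168.1.0/24 gateway=bridge@main routing-table=vrf-isp1
add dst-address=192.168.1.0/24 gateway=bridge@main routing-table=vrf-isp2
```

With a /24 on ether1 or ether2, the connected 192.168.1.0/24 route in that VRF table would compete with the leaked return route, which is the conflict described above.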
I did some tries without the return routes, using some firewall mangle rules instead, and I got some promising results, although I could not manage to make it work yet.
Hello,
I’m also trying to use VRFs, and I have a similar problem.
For testing I tried it in my lab, so I use 3 different subnets:
LAN = 192.168.88.0/24
WAN1 (VRF1) = 10.1.1.0/24
WAN2 (VRF2) = 192.168.89.0/24
I attach the .rsc export.
For now I have set up the DHCP server to hand out 8.8.8.8 and 1.1.1.1 as DNS servers, because using 192.168.88.1 does not work.
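For reference, handing out public resolvers from the DHCP server looks roughly like the line below. The network address is the LAN subnet listed above; the find filter is an assumption about how the DHCP network entry is defined on this router.

```routeros
# Sketch only: give LAN clients public DNS servers instead of the router (192.168.88.1)
/ip dhcp-server network
set [find address=192.168.88.0/24] dns-server=8.8.8.8,1.1.1.1
```

Clients pick up the new servers on their next lease renewal.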
All clients on the LAN work correctly and NAT is working.
Netwatch works fine; later I will set up a rule to disable the mangle rules or change the distance.
My only problem is that the router itself cannot resolve names…
If I ping an IP address it works, but not by name:
[admin@MikroTik] > ping www.google.com
invalid value for argument address:
invalid value of mac-address, mac address required
invalid value for argument ipv6-address
while resolving ip-address: could not get answer from dns server
Of course I can’t update packages either…
I tried with and without the latest mangle rules (output)…
To repeat, the only problem is that DNS on the MikroTik itself doesn’t work…
I also tried telnet from the MikroTik, but it does not connect…
I tried wrapping my head around that and I cannot clearly make out how a routing decision is made in that setup.
A packet from “LAN” (src 192.168.1.x) goes out to “WAN” (either VRF “starlink” or “orange”) … what would the return path look like when everything is 192.168.1.0/24? I know VRFs are a means to resolve IP overlapping, but I still cannot see how the router is able to decide the return path from WAN back into either VRF.
@aleab
The issue you are having is a known one.
There is no support for DNS in VRFs (yet; seemingly things are in the works).
I have a similar setup. In my case I “reversed” the VRF, putting it on the LAN side, so that the interfaces on the WAN side are in “main”, and thus DNS works normally.
@spippan
Check the same links above. It does work, but we (at least myself) don’t really know exactly why. The key is having the static route(s) to the ISP modem(s) as /32 and the return route(s) added to the VRF tables; in “main” the return route (LAN side) is added automatically (it comes out as DAC in /ip route print).
Maybe I’m missing something here… But what is the point of using VRFs for ISP failover? VRFs have nothing to do with “automatic failover”. Failover works without VRFs, so layering VRFs on top of failover mechanisms just makes the config even more complex.
The point is about having multiple ISP routers pre-set to the SAME IP address (usually 192.168.1.1) that you cannot modify (either because the routers themselves are not accessible or because you have some devices on the network with 192.168.1.1 set as gateway that as well cannot be changed or - to be changed - need to wait several hours or days for an intervention either remote or on site, that BTW may or may not be free).
In my particular case (and in my simplicity, caveman but attempting to evolve) I am now using an Ax Lite as a “transparent device”, i.e. it has 192.168.1.1 on the LAN side and connects to other devices (ISP routers) that also have 192.168.1.1. While it provides a failover feature, it can be bypassed at any time by simply taking the ethernet cable coming from the switch/network out of the LAN port of the Ax Lite and inserting it directly into a LAN port of the (chosen) router.
Only for the record, I have as a side-side project an alternative configuration with the three ports to the three ISP modems bridged with the network, with two of the three ports disabled/temporarily removed from bridge.
When the internet connection is not working, I can disable/remove the current “towards modem” port and enable/add another one. It needs a gratuitous ARP to update the MAC of the device with 192.168.1.1 in a timely manner. Manually it works, but I have still to find the time to better study the scripting syntax and produce a (even if half-@§§ed) working script to automate the failover.
Moreover, at the time I had set up GNS3 on a spare PC, and now I am having issues installing GNS3 on the laptop I am using; until I find a way to do so, I won’t be able to make progress with this approach.
I’m more just saying that having the same subnet on multiple interfaces is allowed without VRFs. It does mean the default route 0.0.0.0/0 needs to be %-qualified with the interface, so gateway=192.168.1.1%etherX-toWAN-Y.
Failover happens by using check-gateway=ping (or more complex netwatch/recursive routing approaches) on the primary route with distance=1, while the backup route gets distance=2. The fact that they have the same subnet should not matter on the WAN side, but the interface-qualified % is needed (which should be added by the DHCP client automatically).
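As a minimal sketch of that idea (interface names ether1 and ether2 are placeholders for the two WAN ports, and this is untested against the OP’s setup):

```routeros
# Sketch only: two default routes to the same gateway IP, told apart by interface
/ip route
# Primary: monitored with ping, preferred while the gateway answers
add dst-address=0.0.0.0/0 gateway=192.168.1.1%ether1 distance=1 check-gateway=ping comment="primary"
# Backup: takes over when the primary route becomes inactive
add dst-address=0.0.0.0/0 gateway=192.168.1.1%ether2 distance=2 comment="backup"
```

Note that check-gateway=ping only tests the ISP router itself, not whether the internet behind it is reachable, which is where the netwatch/recursive approaches come in.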
While you can generally pick your own LAN-side subnet so it does NOT conflict (further), and avoid these esoteric RouterOS questions… let’s assume the LAN absolutely has to be 192.168.1.1 and the two WANs have to be 192.168.1.1… AFAIK that too should be fine without VRFs. Where you MIGHT run into trouble is the firewall… basically any IP-based matchers (wherever they are, in filter/mangle/nat/address-list) always need to specify SOME interface-based match as well, since the IP alone is not unique.
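To illustrate what I mean by pairing IP matchers with an interface match, something along these lines (ether1 and ether2 are placeholder WAN interface names, not taken from anyone’s config here):

```routeros
# Sketch only: the same source subnet appears twice, so out-interface disambiguates
/ip firewall nat
add chain=srcnat action=masquerade src-address=192.168.1.0/24 out-interface=ether1
add chain=srcnat action=masquerade src-address=192.168.1.0/24 out-interface=ether2
```

A rule matching on src-address=192.168.1.0/24 alone could not tell LAN hosts from devices on either WAN segment.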
In terms of example, I’m more a Layer3 purist so each subnet should be unique so everything is routable across a larger network. When that’s not possible, another approach is to use “netmap” action in NAT to essentially remap some 192.168.1.0/24 - but this is more useful if you have some routed multi-site L3 architecture already. i.e. so rest of the network sees some edge with 192.168.1.1 as something else like 10/192/172.a.b.x… Basically NAT’s “netmap” lets you “alias subnet” like 192.168.1.x to 192.168.101.x to make it unique. Similar other tricks with “netmap” are possible as alternative to VRFs.
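A rough sketch of the “alias subnet” trick, assuming the 192.168.1.0/24 side should appear as 192.168.101.0/24 to whatever sits behind ether1 (both the interface name and the direction of the mapping are assumptions for illustration):

```routeros
# Sketch only: 1:1 remap between 192.168.1.x and 192.168.101.x using netmap
/ip firewall nat
# Outbound: hosts 192.168.1.x are seen as 192.168.101.x
add chain=srcnat action=netmap src-address=192.168.1.0/24 out-interface=ether1 to-addresses=192.168.101.0/24
# Inbound: traffic sent to 192.168.101.x is mapped back to 192.168.1.x
add chain=dstnat action=netmap dst-address=192.168.101.0/24 in-interface=ether1 to-addresses=192.168.1.0/24
```

netmap keeps the host part of the address, so 192.168.1.50 becomes 192.168.101.50 and so on.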
Anyway more food for thought. Maybe VRFs are the right approach if really everything is 192.168.1.0/24, but I try hard to avoid that case before getting to VRFs.
Last point: the OP has a Starlink, so kind of question 0 is why not use bypass mode to avoid 192.168.1.0/24 and also avoid a double NAT…