Hoping this is a simple config issue that I’m just not seeing. I have a CRS309 that acts as my core switch with multiple VRFs. Each VRF will have its own default route going up to my firewall, which has an interface in each of the VRFs. Attached is a high-level diagram. CRS309-multi-vrf-config.txt (3.58 KB)
Everything on my main routing table works fine going out through the firewall. Oddly, though, the routing table shows my configured default route for the MAIN table twice, as entry 0 and entry 1, with entry 1 being inactive. In any case, this is how it looks:
[admin@CRS309] /ip/route> print
Flags: D - DYNAMIC; I, A - ACTIVE; c, s, y - COPY; H - HW-OFFLOADED
Columns: DST-ADDRESS, GATEWAY, DISTANCE
#      DST-ADDRESS     GATEWAY              DISTANCE
0 AsH  0.0.0.0/0       172.19.0.9           1
1 IsH  0.0.0.0/0                            1
  DAcH 172.19.0.0/24   MainGateway          0
  DAcH 10.10.10.0/24   HomeWired            0
2 IsH  0.0.0.0/0       172.19.80.9          1
  DAcH 172.19.80.0/24  VRF-X-Gateway@vrf-x  0
  DAcH 10.11.11.0/24   VRF-X-LAN@vrf-x      0
3 IsH  0.0.0.0/0       172.19.81.9          1
  DAcH 172.19.81.0/24  VRF-Y-Gateway@vrf-y  0
  DAcH 10.12.12.0/24   VRF-Y-LAN@vrf-y      0
4 IsH  0.0.0.0/0       172.19.82.9          1
  DAcH 172.19.82.0/24  VRF-Z-Gateway@vrf-z  0
  DAcH 10.13.13.0/24   VRF-Z-LAN@vrf-z      0
I can ping the firewall interface from their respective VRFs no problem. Here’s an example from the VRF-Z
The hosts within each VRF can talk to each other no problem. Traffic between VRFs should route out to the firewall, which then routes it onward. Nothing shows up in the firewall pcaps, which makes sense since the default route for each VRF isn’t being installed. I tried check-gateway with both arp and ping; neither works, and the CRS still marks the routes inactive (that’s also why you will see ping or arp set on check-gateway for the various routes, since I was testing). Is there anything I’m missing here?
EDIT: Noting here in the original post that I tested 7.3, where l3hw-offload should apparently be enabled only on the main table, but this does not fix the issue of VRF breaking when offloading is enabled.
Ok, I think I found the issue. When I checked ALL traffic on the firewall itself, I noticed the CRS309 was pinging the gateways of the other VRFs using a source IP that’s on the main routing table. I guess if you don’t define a vrf-interface, the router will just use whatever is available on the main routing table. I explicitly defined vrf-interface for each routing entry, including the main one, and now routing is working as expected.
Spoke too soon, it stopped working altogether… This is odd. I’ll have to troubleshoot this some more and then post back here with my findings. What I did notice is that a test host I placed into one of the VRFs, say VRF-Z, for some reason has its traffic routed through the Main routing table. I’m unclear why the CRS would route the packet there.
Ok, I’m throwing in the towel. I can’t figure out why a multi-VRF setup doesn’t work here. If I just define the default route in the MAIN table, my main LAN works fine, but obviously the other VRFs do not work.
As soon as I introduce another routing rule for a completely DIFFERENT routing-table, the hosts that were on the main routing-table use that instead, which breaks connectivity.
It’s as if every subsequent rule overrules the previous ones. This is easy to see because my firewall sees ingress traffic from my hosts on the main routing-table, and then as soon as I add the second routing rule, the same hosts show up coming from VRF-X.
Anyone able to get multi-vrf routing with the same prefix to work? It looks to me that it can’t handle true VRF, meaning it can’t have multiple copies of the same prefix even if they are on separate VRFs pointing to different next hop addresses.
I’m not super familiar with VRF on Mikrotik specifically but I’m willing to configure this in my lab and use it as a learning experience. I’d rather do it with a matching RouterOS version since routing and feature support is so different across each right now.
Looking closer at your output, it’s pretty obvious on second inspection that it’s v7.x.x.
Okay, I think I got it. My test setup is using two v7.1.5 CHRs with one acting as the firewall:
When installing your configuration at first I also had inactive default routes. The problem seems to be that the next-hops of the static routes are always looked up in the main table unless specified otherwise. Changing the next-hops from something like gateway=172.19.80.9 to gateway=172.19.80.9@vrf-x allows it to look up the next hop in the appropriate table and then install the route:
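For reference, a minimal sketch of what the adjusted static routes could look like, using the gateway addresses and VRF names from the routing table earlier in the thread. The `routing-table` values are an assumption based on those names, not a copy of the tested config:

```
/ip route
# main table: the next hop is resolved in main by default
add dst-address=0.0.0.0/0 gateway=172.19.0.9
# per-VRF defaults: the @vrf suffix makes the next-hop lookup happen in that VRF's table
add dst-address=0.0.0.0/0 gateway=172.19.80.9@vrf-x routing-table=vrf-x
add dst-address=0.0.0.0/0 gateway=172.19.81.9@vrf-y routing-table=vrf-y
add dst-address=0.0.0.0/0 gateway=172.19.82.9@vrf-z routing-table=vrf-z
```

If the next hop resolves, the route should show the A (active) flag instead of I in the route print output.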
Here is a ping from vrf-z to vrf-x through the firewall (and implicitly relying on two of the working default routes in those tables):
And /tool sniff proof that the packet is actually leaving, hitting the firewall, and coming in and out tagged on VLANs 982 and 980 as you might expect:
Let me know how this goes when applied to your situation.
Thanks so much for following up and testing it on your end. So I tested this and inter-vlan routing works but it breaks internet connectivity again like before because my hosts that are on the main routing table are now using the default route configured on a separate VRF. So in this setup here:
Hosts that are on the main routing table are now using VRF-X as their routing table, and my firewall logs reflect that.
I am starting to feel like the CRS line does NOT support true VRF.
EDIT: I also tested by defining @main in the first entry, but same issue. As soon as I add the second default route for a different VRF, the hosts in the main table lose access to the internet.
The internet in this case is supposed to be pictured north of the firewall right? I added a loopback to the “firewall” in my test setup w/ 8.8.8.8/32 to represent the internet and when I attempt to ping from the router to there, it does still correctly use the main table and it takes Eth1/1 to the firewall like it’s supposed to.
Can you do me a favor and post a copy of your /routing route print detail? I’m beginning to wonder whether this might be a negative interaction with L3 hardware offloading. According to the documentation [1], VRF isn’t supported with L3 hardware offloading, but I just assumed that it would only install main table routes and leave the rest alone. It’d also be easy to rule out: just disable it with /interface ethernet switch set 0 l3-hw-offloading=no and see if you get different behavior.
I’ll have to wait for this weekend to troubleshoot further. I really hope that’s not an issue with clashing with hardware offloading! Can you confirm the config syntax you used to add the static routes? Did you also include vrf-interface?
I didn’t specify vrf-interface. I just specified the dst-address, the routing-table, and the next-hop gateway using the @ format (as in gateway=172.19.80.9@vrf-x). I’m frankly not even sure what vrf-interface does. The only thing the documentation says about it is “VRF interface name” and that it is a string with a default value of “10”??? [1]
Ok, I did some more testing and that is exactly the issue. If you enable hardware offloading, it breaks VRF completely. I was able to reproduce this behavior disabling and enabling the feature. If enabled, all VRFs will utilize the last configured routes you have in your routing tables. If disabled, they will stick to their respective routing tables.
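A rough sequence for repeating that comparison could look like the following. The `l3-hw-offloading` command is the one quoted earlier in the thread; the switch index `0` and the `where routing-table=...` filter are assumptions about this particular setup and may need adjusting:

```
# turn L3 hardware offloading off on the switch chip
/interface ethernet switch set 0 l3-hw-offloading=no
# check the routes in one VRF table, then test from a host in that VRF
/ip route print where routing-table=vrf-x
# turn offloading back on and repeat the same test to compare
/interface ethernet switch set 0 l3-hw-offloading=yes
```

Watching the firewall logs during each run shows which VRF interface the traffic actually arrives on.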
Normally this wouldn’t be a big deal for networks that use a single VLAN to cover their entire LAN. In my case, though, the main table carries multiple VLANs for servers, users, etc. with a lot of inter-VLAN traffic, and I have also broken off a VRF to hold guest wifi, IoT devices, etc.
This sounds like I should probably look at alternatives for my core switch and leave my Mikrotiks as just L2 switches.
Glad we isolated it. I still think this behavior deserves some inspection by Mikrotik. It’s all fine and well to not support VRF via hardware offloading, but the correct answer should be to invalidate the “hardware offloading” H flag for VRF routes so that only the main table ones are installed. Turning the hardware forwarding table into a FIFO route salad of the most recent additions from multiple VRFs just isn’t sane.
I haven’t exactly combed the patch notes for this either. Are you running v7.1.5?
Mikrotik doesn’t really do roadmaps but they did communicate [1] that a lot more software support for hardware offloading is coming in v7.x.x, including VRF. Still, for right now a capable CPU-based router like one of the CCRs is probably your best bet and leaving the CRSs to do line-rate switching.
Yes, I’m running 7.1.5, and I agree: if hw-offloading isn’t supported on VRFs, then it really shouldn’t break VRF functionality. I would expect it to work on the main table and simply not apply to custom VRFs. I opened a ticket with them, so maybe they have an avenue of communication to the devs for either a feature request or a bug fix?