Spanning VRFs Across Routers Without MPLS

This is my first post in the MikroTik forums.

We’re working on building a small private virtual overlay network using WireGuard as the transport. Basically, we have a few VMs running CHR/VyOS and some physical MikroTik devices, all interconnected using WireGuard tunnels. Each node has its own private ASN and is running eBGP.

So far, we’ve brought up the routers in the cloud and other sites, and the WireGuard tunnels are all working. BGP instances are up, and all are peered using eBGP. The sites are currently spread across three countries.

At two sites, we’ve got some test devices connected, and routing is working fine. Each site will have a firewall, which we plan to connect to the local MikroTik using iBGP. Currently, everything is in the main VRF.

In the future, we may need to add two more sites that must be completely isolated from the current setup. These new sites will use two different WireGuard interfaces and have overlapping subnets, and due to security requirements, we want them completely separated. The easiest way is to add a new VRF (let’s call it vrf11) on the routers that these two new sites connect to, and run a separate BGP session for that VRF.

But this setup doesn’t really scale. If we add more isolated sites (e.g., vrf12, vrf13, etc.), then we’ll need to create separate BGP sessions for each VRF.
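To put rough numbers on that concern, here is a minimal sketch in Python. It assumes the worst case where every tunnel on a router carries every VRF, and the tunnel and VRF counts are made-up illustration values, not figures from our setup.

```python
def sessions_per_vrf_model(tunnels: int, vrfs: int) -> int:
    """One BGP session per VRF on every tunnel (assumed worst case)."""
    return tunnels * vrfs


def sessions_shared_model(tunnels: int) -> int:
    """One BGP session per tunnel that carries all VRFs."""
    return tunnels


# Hypothetical router with 4 WireGuard tunnels.
for vrfs in (1, 3, 10):
    per_vrf = sessions_per_vrf_model(4, vrfs)
    shared = sessions_shared_model(4)
    print(f"{vrfs} VRF(s): {per_vrf} sessions per-VRF vs {shared} shared")
```

The per-VRF count grows with every isolated site we add, while the shared count only grows with the number of tunnels.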

From what I understand, this is where MPLS normally comes into play, where you can run one BGP session and carry multiple VRFs across it. Correct me if I’m wrong.

But since we’re using WireGuard over WAN links on the public internet, MPLS isn’t really an option. We’re not doing any L2VPN, only L3 connectivity across sites. And each site has local internet breakout.

So here’s what I’m trying to figure out:

- What’s the best way to set this up?
- Is there a way to span multiple VRFs across sites using just one BGP session per router?
- Or is MPLS absolutely required for that kind of multi-VRF transport?
- What limitations or issues should we expect if we stick with the model of one BGP session per VRF?

For scalability, you can run MPLS over your WireGuard tunnels and run BGP VPN on top of it. The downside is that it adds MPLS label overhead to every packet.


Thanks for the reply. I was checking this, and since my WireGuard transport is over the internet, the absolute maximum MTU I can get is 1440. For safety, I’ve set it to 1430 for now.

With this setup, if I add MPLS on top, I lose even more MTU, especially when more than one label is stacked on a packet.
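As a rough worked example of that loss, here is a small Python sketch. It assumes 4 bytes per MPLS label, and IPv4 and TCP headers without options (20 bytes each); the 1430 figure is my current tunnel MTU, and the function names are just for illustration.

```python
MPLS_LABEL_BYTES = 4    # each MPLS label stack entry is 4 bytes
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def inner_ip_mtu(tunnel_mtu: int, labels: int) -> int:
    """Largest IP packet that fits once the label stack is added."""
    return tunnel_mtu - labels * MPLS_LABEL_BYTES


def tcp_mss(ip_mtu: int) -> int:
    """TCP payload per segment for a given IP MTU (IPv4, no options)."""
    return ip_mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


for labels in (0, 1, 2):
    mtu = inner_ip_mtu(1430, labels)
    print(f"{labels} label(s): IP MTU {mtu}, TCP MSS {tcp_mss(mtu)}")
```

So with two labels the usable IP MTU inside the tunnel would drop from 1430 to 1422 under these assumptions.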

This might be fine for most services, but I’m not entirely sure where and when it might break. If it does break, I’m unclear on what the next approach should be to fix it.

Also, I have one more question:

Is VRF Lite something entirely different from what I asked in my first question? afaik, it’s about maintaining multiple routing tables for different slices and running BGP per table (VRF). Is that correct?

VRF lite is when you are not using BGP VPN to connect remote customer locations over an MPLS cloud.
The VRF is local to your router.
See examples in the manual:

In your scenario, VRF lite can be considered if you create a tunnel per customer, add the tunnel to the customer's VRF, and route locally via static routes or an IGP running over the tunnel.


As long as you compensate for the MPLS headers in your MTU calculations, application connectivity should be fine.

I’ve run BGP and VPNv4 over tunnels a number of times and it works fine. The main tradeoffs are added complexity and lower throughput.

But if you need multiple VRFs over a tunnel to a site, then BGP VPNv4 and MPLS is still the more scalable approach, unless you decide to automate the VRF lite approach so that you’re not managing the complexity manually.
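To illustrate why VPNv4 can carry overlapping subnets over a single session, here is a toy Python sketch. It models a VPNv4 route as a route distinguisher (RD) plus an IPv4 prefix; the RD values, prefixes, and next hops are invented for illustration, and this is not how any router stores routes internally.

```python
import ipaddress

# Toy VPNv4 table: the key is (route distinguisher, IPv4 prefix).
vpnv4_table = {}


def advertise(rd: str, prefix: str, next_hop: str):
    """Store a route under its RD so identical prefixes stay distinct."""
    key = (rd, ipaddress.ip_network(prefix))
    vpnv4_table[key] = next_hop
    return key


# Two hypothetical VRFs using the same overlapping subnet.
advertise("65001:11", "10.0.0.0/24", "172.16.0.1")
advertise("65001:12", "10.0.0.0/24", "172.16.0.2")

for (rd, prefix), next_hop in vpnv4_table.items():
    print(f"{rd}:{prefix} via {next_hop}")
```

Both entries survive because the RD makes each key unique, which is what lets one session carry many VRFs.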


i think before you dive deeper, it is better to resolve the overlapping subnets. overlap is not scalable, and it will likely turn into routine maintenance work that eventually pushes you into a network overhaul. that won’t be easy, even if the overlap is the reason you are thinking about VRF separation in the first place.


Depends on why the subnets overlap. In some networks, the overlap is intentional: it’s a cookie-cutter solution where each copy is supposed to be isolated from the other duplicated networks. I have seen this a number of times in L3VPN solutions from ISPs. There isn’t enough RFC 1918 space to give every large company non-overlapping space.

If it’s overlapped due to an oversight or some other reason, then I agree with you.

If it’s due to a network merger or migration, then it depends on a number of other factors as to whether it’s more beneficial to resolve the overlap or to work around it.

@StubArea51

true, your cookie-cutter analogy is indeed the main idea behind VRFs. but that only holds up to a certain point; beyond it, @op's VRF implementation idea will need serious workarounds for the other maintenance issues that arise.

if we read @op's first post again, his proposal is more like building a carrier routing carrier network strategy, so we need to ask @op a question back: will @op, and probably his DevOps team, be able to handle the maintenance work?

usually a crc strategy offers only one of the two choices he has: be a layer 2 provider or a layer 3 provider, as @mrz has already pointed out.

if @op wants to do it all, it won’t be easy and will probably be costly. i guess our MT forum here is not a total solution for @op's questions.

i was thinking of a cheaper approach for @op:

  1. QinQ for network separation.
  2. using an IX datacenter for their BGP exchange.

but that does not eliminate the choice i mentioned above: L2 or L3.

plus, don't forget that the underlying WireGuard transport has to be considered too.

additional reading to clarify my points:
https://info.support.huawei.com/hedex/api/pages/EDOC1100277644/AEM10221/04/resources/vrp/feature_0022432728_new.html