BGP Route Exchange Between Local VRFs on a MikroTik Router

irghost · December 16, 2024, 12:16pm

Overview:

I have a MikroTik router configured as an edge router with two separate VRFs (vrf1 and vrf2). Each VRF is connected to a different client via eBGP. The goal is to exchange routes between these two VRFs locally within the same router.

Details:

Clients and VRFs:

Client 1 is connected to vrf1 and advertises the route 192.168.11.0/24 via eBGP.

Client 2 is connected to vrf2 and advertises the route 192.168.22.0/24 via eBGP.

Objective:

Make the route 192.168.11.0/24 from vrf1 accessible in vrf2.

Make the route 192.168.22.0/24 from vrf2 accessible in vrf1.

Configuration:

VRF Setup:

Two VRFs (vrf1 and vrf2) are configured and mapped to separate interfaces.

BGP Sessions:

eBGP sessions are established in vrf1 and vrf2 to receive routes from Client 1 and Client 2, respectively.

Route Distinguisher (RD) and Route Target (RT):

Unique RDs are configured for vrf1 and vrf2.

RTs are used for importing and exporting routes between the VRFs.

Route Exchange:

Routes from vrf1 are exported with an RT that is imported by vrf2.

Routes from vrf2 are exported with an RT that is imported by vrf1.
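
The import/export relationship described above can be sketched as a small model. This is only an illustration of how route-target matching decides which routes a VRF imports, not RouterOS code. The RT values `65000:1` and `65000:2` and the `Vrf` class are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Vrf:
    name: str
    export_rts: set[str]          # RTs attached to routes exported from this VRF
    import_rts: set[str]          # RTs this VRF accepts on import
    local_routes: list[str] = field(default_factory=list)


def leaked_routes(target, vrfs):
    """Return (prefix, source VRF name) pairs that `target` imports from other VRFs."""
    result = []
    for src in vrfs:
        if src is target:
            continue
        # A route is imported when at least one exported RT matches an import RT.
        if src.export_rts & target.import_rts:
            result.extend((prefix, src.name) for prefix in src.local_routes)
    return result


# Hypothetical RT values; the prefixes are the ones from the post.
vrf1 = Vrf("vrf1", export_rts={"65000:1"}, import_rts={"65000:2"},
           local_routes=["192.168.11.0/24"])
vrf2 = Vrf("vrf2", export_rts={"65000:2"}, import_rts={"65000:1"},
           local_routes=["192.168.22.0/24"])

print(leaked_routes(vrf2, [vrf1, vrf2]))  # [('192.168.11.0/24', 'vrf1')]
```

The model only covers which prefixes cross between VRFs. Whether an imported route can actually forward traffic depends on next-hop resolution, which is a separate step.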

Problem:

Routes are being exchanged between the VRFs (i.e., the routes appear in the routing tables), but they do not work. Specifically, the immediate gateway field is empty for the leaked routes, and traffic does not flow as expected.

mrz · December 16, 2024, 1:26pm

You don’t need a BGP session between the VRFs; VPN import is done without an active session. See the manual:
https://help.mikrotik.com/docs/spaces/ROS/pages/328206/Virtual+Routing+and+Forwarding+-+VRF#VirtualRoutingandForwardingVRF-DynamicVrf-Literouteleaking

irghost · December 16, 2024, 5:34pm

I am aware that a separate BGP session between VRFs is not required for route leaking, as the route import/export process is handled by the Route Distinguisher (RD) and Route Target (RT) configurations.

The issue I am facing is that while routes are successfully exchanged between VRFs (vrf1 and vrf2), these leaked routes are non-functional because the immediate gateway field is empty in the routing table. As a result, the router cannot forward traffic based on these routes.

Updated Problem Description:

Routes from vrf1 are visible in vrf2 and vice versa, indicating that route exchange is occurring as configured.
However, the leaked routes have an empty immediate gateway field, making them unusable for traffic forwarding.
The issue appears related to the next-hop resolution for these leaked routes.

Key Question:

What additional configuration or troubleshooting steps are required to resolve the empty immediate gateway field for routes leaked between VRFs?

Supporting Information:

My RD, RT, and BGP configurations follow the standard practices as per MikroTik documentation, and the problem persists despite these configurations.

Let me know if you need logs, debug outputs, or more detailed configuration snippets for further analysis.

irghost · December 17, 2024, 6:01am

Questions:

How can I ensure that eBGP-learned routes in vrf1 are properly leaked into vrf2 with a valid immediate gateway (and vice versa)?

Is there a specific configuration required for next-hop resolution when redistributing eBGP-learned routes between VRFs?

Does the VPN configuration for route leaking need additional settings to handle eBGP-learned routes?

mrz · December 17, 2024, 11:23am

The gateway must be resolvable in that particular VRF. If it can’t be resolved, the immediate gateway will be empty and the route will not be active.
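
As a rough illustration of what "resolvable in that particular VRF" means, the sketch below looks up a gateway address against the connected networks of one VRF only. It is a simplified model that assumes resolution happens over connected routes and ignores scope and target-scope; the function name, the data layout, and the sample addresses are assumptions made for the example, not RouterOS internals.

```python
import ipaddress


def resolve_gateway(gateway, vrf, connected):
    """Return 'gateway%interface' if `gateway` sits on a connected network
    of `vrf`, otherwise None (the empty immediate gateway case).

    `connected` maps a VRF name to a list of (prefix, interface) pairs.
    """
    addr = ipaddress.ip_address(gateway)
    best = None
    for prefix, iface in connected.get(vrf, []):
        net = ipaddress.ip_network(prefix)
        # Keep the longest matching connected prefix.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return f"{gateway}%{best[1]}" if best else None


# Sample data for illustration only.
connected = {
    "vrf-CE3": [("10.3.7.0/30", "ether1")],
    "vrf-CE4": [("10.3.6.0/30", "ether6")],
}

print(resolve_gateway("10.3.7.2", "vrf-CE3", connected))  # 10.3.7.2%ether1
print(resolve_gateway("10.3.7.2", "vrf-CE4", connected))  # None
```

The second lookup returns nothing because no connected network in that VRF covers the gateway, which corresponds to a route with an empty immediate gateway.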

irghost · December 17, 2024, 11:48am

[admin@PE3] /ip/route> print where routing-table=vrf-CE4
Flags: D - DYNAMIC; A - ACTIVE; c - CONNECT, b - BGP, y - BGP-MPLS-VPN; H - HW-O>
Columns: DST-ADDRESS, GATEWAY, DISTANCE
     DST-ADDRESS      GATEWAY           DISTANCE
DAyH 192.168.11.0/24  10.3.7.2@vrf-CE3       200
DAb  192.168.22.0/24  10.3.6.2@vrf-CE4        20
DAc  10.3.6.0/30      ether6@vrf-CE4           0
DAy  10.3.7.0/30      vrf-CE3@vrf-CE3        200

[admin@PE3] /ip/route> print stats where routing-table=vrf-CE4
Flags: D - dynamic; X - disabled, I - inactive, A - active; 
c - connect, s - static, r - rip, b - bgp, o - ospf, i - is-is, d - dhcp, v - vp>
H - hw-offloaded; + - ecmp 
   DAyH  dst-address=192.168.11.0/24 routing-table=vrf-CE4 
         gateway=10.3.7.2@vrf-CE3 distance=200 scope=20 target-scope=10 

   DAb   dst-address=192.168.22.0/24 routing-table=vrf-CE4 
         gateway=10.3.6.2@vrf-CE4 immediate-gw=10.3.6.2%ether6 distance=20 
         scope=40 target-scope=10 

   DAc   dst-address=10.3.6.0/30 routing-table=vrf-CE4 gateway=ether6@vrf-CE4 
         immediate-gw=ether6 distance=0 scope=10 target-scope=5 
         local-address=10.3.6.1%ether6@vrf-CE4 

   DAy   dst-address=10.3.7.0/30 routing-table=vrf-CE4 gateway=vrf-CE3@vrf-CE3 
         immediate-gw=vrf-CE3 distance=200 scope=20 target-scope=10

[admin@PE3] /ip/route> print where routing-table=vrf-CE3
Flags: D - DYNAMIC; A - ACTIVE; c - CONNECT, b - BGP, y - BGP-MPLS-VPN; H - HW-O>
Columns: DST-ADDRESS, GATEWAY, DISTANCE
     DST-ADDRESS      GATEWAY           DISTANCE
DAb  192.168.11.0/24  10.3.7.2@vrf-CE3        20
DAyH 192.168.22.0/24  10.3.6.2@vrf-CE4       200
DAy  10.3.6.0/30      vrf-CE4@vrf-CE4        200
DAc  10.3.7.0/30      ether1@vrf-CE3           0

[admin@PE3] /routing/bgp/vpn> print 
Flags: X - disabled, I - inactive 
 0   name="bgp-mpls-vpn-1" 
     import.router-id=10.0.0.3 .route-targets=200:5 
     export.route-targets=300:5 .redistribute=connected,bgp 
     route-distinguisher="300:5" vrf=vrf-CE3 label-allocation-policy=per-vrf 

 1   name="bgp-mpls-vpn-2" 
     import.router-id=10.0.0.3 .route-targets=300:5 
     export.route-targets=200:5 .redistribute=connected,bgp 
     route-distinguisher="200:5" vrf=vrf-CE4 label-allocation-policy=per-vrf
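
To spot the symptom in output like the above without reading every record, a small script can flag routes that carry a `gateway` but no `immediate-gw`. This is a minimal sketch that assumes the blank-line-separated `key=value` layout shown in the pasted `print stats` output; the function name is made up for the example, and the sample text is copied from the first two records above.

```python
import re


def routes_missing_immediate_gw(text):
    """Return (dst-address, gateway) pairs for records without immediate-gw."""
    missing = []
    # Records are separated by blank lines; each holds key=value tokens.
    for record in re.split(r"\n\s*\n", text.strip()):
        fields = dict(re.findall(r"([\w.-]+)=(\S+)", record))
        if ("dst-address" in fields and "gateway" in fields
                and "immediate-gw" not in fields):
            missing.append((fields["dst-address"], fields["gateway"]))
    return missing


sample = """\
   DAyH  dst-address=192.168.11.0/24 routing-table=vrf-CE4
         gateway=10.3.7.2@vrf-CE3 distance=200 scope=20 target-scope=10

   DAb   dst-address=192.168.22.0/24 routing-table=vrf-CE4
         gateway=10.3.6.2@vrf-CE4 immediate-gw=10.3.6.2%ether6 distance=20
         scope=40 target-scope=10
"""

print(routes_missing_immediate_gw(sample))
# [('192.168.11.0/24', '10.3.7.2@vrf-CE3')]
```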

mrz · December 17, 2024, 7:12pm

There was a problem with resolving BGP gateways. The next beta version will include the fix.

irghost · December 18, 2024, 6:12am

Thank you for your prompt response regarding the issue with resolving BGP gateways in local VRF setups. As this functionality is critical for my MPLS deployment and requires BGP to configure routing based on communities, I am eager to test the upcoming fix in my specific scenario.

Would it be possible to provide early access to the beta version containing this fix? I can test it thoroughly in my environment and provide you with detailed feedback to help ensure the resolution works as expected.

fischerdouglas · December 18, 2024, 6:18pm

Did you notice the “H” flag on the leaked route whose next hop is in the other VRF?
I would bet a beer that your issue is related to that!

[admin@PE3] /ip/route> print where routing-table=vrf-CE4
Flags: D - DYNAMIC; A - ACTIVE; c - CONNECT, b - BGP, y - BGP-MPLS-VPN; H - HW-O>
Columns: DST-ADDRESS, GATEWAY, DISTANCE
     DST-ADDRESS      GATEWAY           DISTANCE
DAyH 192.168.11.0/24  10.3.7.2@vrf-CE3       200

As you can check in the current version of the MikroTik documentation on L3 Hardware Offloading,
RouterOS does not know how to handle VRFs with hardware offload:

Only the main routing table gets offloaded. If VRF is used together with L3HW and packets arrive on a switch port with l3-hw-offloading=yes, packets can be incorrectly routed through the main routing table. To avoid this, disable L3HW on needed switch ports or use ACL rules to redirect specific traffic to the CPU.

If you disable all hardware offload on those boxes, you may get the result you want, though probably with mediocre performance.

I also saw the “y” on your routes.
Your scenario seems to be an L3VPN over MPLS, right?
It is worth mentioning that RouterOS also does not support hardware offload of MPLS encap/decap (PE router).
Not for L2VPN (VPLS), nor even for a simple L3VPN.
P.S.1: They do not support that kind of hardware offload, but the switch chips they use do support it. So it is fair to conclude that this is a gap in RouterOS.
P.S.2: Someone will appear in the following posts screaming about Fast Path, saying that RouterOS supports Fast Path for MPLS PE. Usually they can’t tell kernel bypass from hardware offload.

I have been trying for MANY YEARS to use MPLS on MikroTik, and my recommendation is:
If you like to sleep and not be woken up in the middle of the night, do not let any RouterOS (v6 or v7) device run LDP or any other label exchange protocol with devices from other, more established vendors (Cisco, Juniper, Arista, Huawei, Nokia, etc.).

If it is a “pure” RouterOS network, then the chance of it working well is reasonable… I have implemented some of those myself.
But if it needs to exchange labels via LDP, RSVP, or any other protocol with other vendors, RUN AWAY!

mrz · December 18, 2024, 7:39pm

If you actually read the OP, you will see that it has nothing to do with label distribution protocols; there is zero need for such protocols in this particular setup.

Also, there are no known problems with LDP interoperability with Cisco or other vendors. LDP is implemented according to the RFC, the same way as by other vendors. If you have a specific problem, report it to support and don’t spread misinformation that LDP does not work with other vendors.

fischerdouglas · December 18, 2024, 10:37pm

Yep! I have read it… And I did not say that this problem was related to LDP.
I said that it is probably caused by incorrect behavior involving VRF and hardware offload.

I just took the opportunity to share my ugly experience with MikroTik and MPLS, and to advise a colleague so he can avoid losing nights of sleep.

About the tickets, I can send a list of several of them (including many from before Jira was used) for which I received an answer like “by now, this is the expected behavior.”