Hi there,
We have recently seen a problem where the MPLS L3VPN implementation does not seem to take into account the BGP best path selection algorithm.
We have two core routers, one with a lower router ID than the other, both advertising routes. The topology is as follows: -
To give some more info, the router ID is 1.1.1.1 for Core1 and 11.11.11.11 for Core2. I’m running OSPF as my IGP, and LDP is enabled on all relevant interfaces. Core1 and Core2 act as route reflectors for R1, R2 and R3.
The following filter is applied on Core1: -
[showlette@Core1] /routing bgp instance> /routing filter print
Flags: X - disabled
0 chain=VRF-OUT prefix=0.0.0.0/0 prefix-length=0-32 protocol=connect
invert-match=no action=accept set-bgp-local-pref=90 set-bgp-prepend=2
set-bgp-prepend-path=""
1 chain=VRF-OUT prefix=0.0.0.0/0 prefix-length=0-32 protocol=static
invert-match=no action=accept set-bgp-local-pref=90 set-bgp-prepend=2
set-bgp-prepend-path=""
2 chain=VRF-OUT prefix=0.0.0.0/0 prefix-length=0-32 bgp-local-pref=200
invert-match=no action=accept set-bgp-local-pref=200
set-bgp-prepend-path=""
3 chain=VRF-OUT prefix=0.0.0.0/0 prefix-length=0-32 invert-match=no
action=accept set-bgp-local-pref=90 set-bgp-prepend=2
set-bgp-prepend-path=""
The following filter is applied on Core2: -
[showlette@Core2] > /routing filter print
Flags: X - disabled
0 chain=VRF-OUT prefix=0.0.0.0/0 prefix-length=0-32 protocol=static
invert-match=no action=accept set-bgp-local-pref=200
set-bgp-prepend-path=""
1 chain=VRF-OUT prefix=0.0.0.0/0 prefix-length=0-32 protocol=connect
invert-match=no action=accept set-bgp-local-pref=200
set-bgp-prepend-path=""
2 chain=VRF-OUT prefix=0.0.0.0/0 prefix-length=0-32 bgp-local-pref=90
invert-match=no action=accept set-bgp-local-pref=90 set-bgp-prepend=2
set-bgp-prepend-path=""
3 chain=VRF-OUT prefix=0.0.0.0/0 prefix-length=0-32 invert-match=no
action=accept set-bgp-local-pref=200 set-bgp-prepend-path=""
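To make the intent of the two VRF-OUT chains explicit, here is a minimal Python sketch of how I expect them to behave. It assumes the first matching rule with action=accept ends processing of the chain, which is my understanding rather than something confirmed here. The class and function names (Route, Rule, apply_chain) are made up for illustration and are not RouterOS names.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Route:
    protocol: str                      # "connect", "static", "bgp", ...
    local_pref: Optional[int] = None   # None means not set yet
    prepend: int = 0                   # number of AS path prepends applied


@dataclass(frozen=True)
class Rule:
    set_local_pref: int
    set_prepend: Optional[int] = None
    match_protocol: Optional[str] = None
    match_local_pref: Optional[int] = None

    def matches(self, route: Route) -> bool:
        if self.match_protocol is not None and route.protocol != self.match_protocol:
            return False
        if self.match_local_pref is not None and route.local_pref != self.match_local_pref:
            return False
        return True


def apply_chain(chain, route: Route) -> Route:
    # Assumption: the first matching accept rule applies its actions and stops.
    for rule in chain:
        if rule.matches(route):
            prepend = route.prepend if rule.set_prepend is None else rule.set_prepend
            return replace(route, local_pref=rule.set_local_pref, prepend=prepend)
    return route


# Rules 0-3 from the printouts above, in order.
CORE1_VRF_OUT = [
    Rule(match_protocol="connect", set_local_pref=90, set_prepend=2),
    Rule(match_protocol="static", set_local_pref=90, set_prepend=2),
    Rule(match_local_pref=200, set_local_pref=200),
    Rule(set_local_pref=90, set_prepend=2),
]
CORE2_VRF_OUT = [
    Rule(match_protocol="static", set_local_pref=200),
    Rule(match_protocol="connect", set_local_pref=200),
    Rule(match_local_pref=90, set_local_pref=90, set_prepend=2),
    Rule(set_local_pref=200),
]

connected = Route(protocol="connect")
print(apply_chain(CORE1_VRF_OUT, connected))
print(apply_chain(CORE2_VRF_OUT, connected))
```

Under that assumption, a connected or static route leaves Core1 with local preference 90 and two prepends, and leaves Core2 with local preference 200 and no prepends, so Core2 should win on either attribute.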
I have advertised a standard IPv4 route (80.1.1.0/24) to see whether there is a difference in behaviour. I get the following in the log of R1 (with route logging added): -
09:26:50 route,debug,event Added candidate route
09:26:50 route,debug,event dst-prefix=80.1.1.0/24
09:26:50 route,debug,event attributes
09:26:50 route,debug,event protocol=BGP
09:26:50 route,debug,event scope=40
09:26:50 route,debug,event target-scope=30
09:26:50 route,debug,event next-hop= address=1.1.1.1
09:26:50 route,debug,event origin-type=BGP
09:26:50 route,debug,event origin-instance-id=0
09:26:50 route,debug,event bgp-peer-router-id=10.10.10.10
09:26:50 route,debug,event bgp-peer-flags=1
09:26:50 route,debug,event bgp-router-id=2.2.2.2
09:26:50 route,debug,event bgp-origin=IGP
09:26:50 route,debug,event bgp-nexthop=1.1.1.1
09:26:50 route,debug,event bgp-localpref=90
09:26:50 route,debug,event use-te-nexthop=yes
09:26:50 route,debug,calc Route to destination 80.1.1.0/24 received from 11.11.11.11 is better than route from 1.1.1.1, because LOCAL_PREFERENCE 200 is better than 90
So as expected, the route from 11.11.11.11 (Core2) is preferred due to a higher local preference.
Now I am advertising a couple of routes (a redistributed default, and a subnet that is present on both Core routers) for MPLS VPN, and I would have expected the same behaviour when setting local preference. I have added AS path prepending as well, for no other reason than to test whether either attribute would be evaluated. However, I get completely unexpected behaviour: -
09:26:51 route,debug,calc Route to destination
09:26:51 route,debug,calc rd
09:26:51 route,debug,calc type=0
09:26:51 route,debug,calc administrator=65009
09:26:51 route,debug,calc assigned-number=1
09:26:51 route,debug,calc pr=172.16.2.0/24 received from 11.11.11.11 is worse than route from 1.1.1.1, because RouterId 11.11.11.11 is worse than 10.10.10.10
09:26:51 route,debug,calc Route to destination
09:26:51 route,debug,calc rd
09:26:51 route,debug,calc type=0
09:26:51 route,debug,calc administrator=65009
09:26:51 route,debug,calc assigned-number=1
09:26:51 route,debug,calc pr=172.17.100.0/24 received from 11.11.11.11 is worse than route from 1.1.1.1, because RouterId 11.11.11.11 is worse than 10.10.10.10
09:26:51 route,debug,calc Route to destination
09:26:51 route,debug,calc rd
09:26:51 route,debug,calc type=0
09:26:51 route,debug,calc administrator=65009
09:26:51 route,debug,calc assigned-number=1
09:26:51 route,debug,calc pr=0.0.0.0/0 received from 11.11.11.11 is worse than route from 1.1.1.1, because RouterId 11.11.11.11 is worse than 10.10.10.10
So according to this, the VPN routes received are now preferred purely on router ID. That skips every step of the BGP Best Path process that comes before the router ID comparison, including local preference and AS path length. My guess is BGP Best Path selection is not being evaluated at all in this case.
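For comparison, this is roughly the order of evaluation I expected. It is a simplified Python sketch covering only the three attributes in play here (local preference, AS path length, router ID), not the full decision process. The Path fields and function names are hypothetical, the router IDs are the ones shown in the logs, and the AS path lengths are assumed from the set-bgp-prepend values in the filters.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Path:
    """One candidate path; field names are illustrative, not RouterOS names."""
    peer: str
    local_pref: int
    as_path_len: int
    router_id: str


def preference_key(path: Path):
    # Smaller key wins: higher LOCAL_PREF first, then shorter AS path,
    # and only then the lower router ID as a tie-breaker.
    return (-path.local_pref, path.as_path_len, int(IPv4Address(path.router_id)))


def best_path(paths):
    return min(paths, key=preference_key)


def deciding_step(a: Path, b: Path) -> str:
    # Name the first attribute on which the two paths differ.
    if a.local_pref != b.local_pref:
        return "LOCAL_PREF"
    if a.as_path_len != b.as_path_len:
        return "AS_PATH length"
    return "router ID"


# Assumed values: local-pref and prepend counts taken from the VRF-OUT filters.
via_core1 = Path(peer="1.1.1.1", local_pref=90, as_path_len=2, router_id="10.10.10.10")
via_core2 = Path(peer="11.11.11.11", local_pref=200, as_path_len=0, router_id="11.11.11.11")

winner = best_path([via_core1, via_core2])
print(winner.peer, deciding_step(via_core1, via_core2))  # 11.11.11.11 LOCAL_PREF
```

With these inputs the comparison is settled at local preference and never reaches router ID, which matches the IPv4 log for 80.1.1.0/24 but not the VPN logs above.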
If I change the router ID on Core1, I see the following instead: -
09:45:42 route,debug,calc Route to destination
09:45:42 route,debug,calc rd
09:45:42 route,debug,calc type=0
09:45:42 route,debug,calc administrator=65009
09:45:42 route,debug,calc assigned-number=1
09:45:42 route,debug,calc pr=0.0.0.0/0 received from 1.1.1.1 is worse than route from 11.11.11.11, because RouterId 12.12.12.12 is worse than 11.11.11.11
09:45:42 route,debug,calc Route to destination
09:45:42 route,debug,calc rd
09:45:42 route,debug,calc type=0
09:45:42 route,debug,calc administrator=65009
09:45:42 route,debug,calc assigned-number=1
09:45:42 route,debug,calc pr=172.17.100.0/24 received from 1.1.1.1 is worse than route from 11.11.11.11, because RouterId 12.12.12.12 is worse than 11.11.11.11
Is there any reason why MPLS VPN does not take into account the BGP Best Path Selection algorithm and instead seems to rely on some internal path selection criteria? BGP Best Path Selection exists to prevent loops, so not taking it into account could lead to the MPLS VPN looping traffic.