netzwerghh
Frequent Visitor
Topic Author
Posts: 74
Joined: Sun Aug 07, 2011 4:23 pm
Location: Hamburg, DE

BGP best path selection algorithm sometimes runs in wrong behaviour resulting in update loops

Wed Jun 15, 2022 2:21 pm

We operate five CCR2004 routers running RouterOS 7 in our backbone and maintain peering sessions at DE-CIX FRA, HAM, DUS, MUC and AMS-IX, as well as with our upstream providers. Routes between those five routers are exchanged via Linux BIRD route reflectors to avoid a full mesh of BGP sessions.
[Attachment: Zeichnung.png (network diagram)]
Most of the time this works as expected. Recently, however, the CPU load on our routers skyrocketed and the BGP announcement traffic to our route reflectors increased heavily for no apparent reason. Further investigation revealed a route announcement loop: our router DCHAM-RTR01 started playing announcement ping-pong with DCHAM-RTR02. We simplified our network diagram down to the relevant hosts. We analyzed the BGP announcements with Wireshark and found that the following announcements alternated at intervals of less than a second:
+2a0b:1300:81dc::/48 via 2001:7f8::20ad:0:1 med 225 localpref 200 as-path 8365 35548 209915
+2a0b:1306:1dc::/48 via 2001:7f8::20ad:0:1 med 225 localpref 200 as-path 8365 35548 209915
+2a0b:1300:81dc::/48 via 2001:7f8:3d::1b1b:0:1 med 100 localpref 200 as-path 6939 2914 209915
+2a0b:1306:1dc::/48 via 2001:7f8:3d::1b1b:0:1 med 100 localpref 200 as-path 6939 2914 209915
+2a0b:1300:81dc::/48 via 2001:7f8::20ad:0:1 med 225 localpref 200 as-path 8365 35548 209915
+2a0b:1306:1dc::/48 via 2001:7f8::20ad:0:1 med 225 localpref 200 as-path 8365 35548 209915
+2a0b:1300:81dc::/48 via 2001:7f8:3d::1b1b:0:1 med 100 localpref 200 as-path 6939 2914 209915
+2a0b:1306:1dc::/48 via 2001:7f8:3d::1b1b:0:1 med 100 localpref 200 as-path 6939 2914 209915
We ruled out route flapping by one of our peers: the announcements from AS6939 (Hurricane Electric) and AS8365 (Man-Da) are stable.
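To put a number on the ping-pong, a small script can count how often the next hop changes per prefix in such a capture. This is only a sketch: the line format is assumed to be exactly the one pasted above, and the function name `count_nexthop_changes` is made up for illustration.

```python
import re
from collections import defaultdict

# The eight announcement lines exactly as pasted above.
LOG = """\
+2a0b:1300:81dc::/48 via 2001:7f8::20ad:0:1 med 225 localpref 200 as-path 8365 35548 209915
+2a0b:1306:1dc::/48 via 2001:7f8::20ad:0:1 med 225 localpref 200 as-path 8365 35548 209915
+2a0b:1300:81dc::/48 via 2001:7f8:3d::1b1b:0:1 med 100 localpref 200 as-path 6939 2914 209915
+2a0b:1306:1dc::/48 via 2001:7f8:3d::1b1b:0:1 med 100 localpref 200 as-path 6939 2914 209915
+2a0b:1300:81dc::/48 via 2001:7f8::20ad:0:1 med 225 localpref 200 as-path 8365 35548 209915
+2a0b:1306:1dc::/48 via 2001:7f8::20ad:0:1 med 225 localpref 200 as-path 8365 35548 209915
+2a0b:1300:81dc::/48 via 2001:7f8:3d::1b1b:0:1 med 100 localpref 200 as-path 6939 2914 209915
+2a0b:1306:1dc::/48 via 2001:7f8:3d::1b1b:0:1 med 100 localpref 200 as-path 6939 2914 209915
"""

# Assumed line layout: "+<prefix> via <next hop> med <n> ..."
LINE_RE = re.compile(r"^\+(?P<prefix>\S+) via (?P<nexthop>\S+) med (?P<med>\d+)")


def count_nexthop_changes(log):
    """Count, per prefix, how often the announced next hop differs from the previous one."""
    last_nexthop = {}
    changes = defaultdict(int)
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # ignore anything that is not an announcement line
        prefix, nexthop = m["prefix"], m["nexthop"]
        if prefix in last_nexthop and last_nexthop[prefix] != nexthop:
            changes[prefix] += 1
        last_nexthop[prefix] = nexthop
    return dict(changes)


for prefix, n in count_nexthop_changes(LOG).items():
    print(prefix, "next-hop changes:", n)
```

In this excerpt every repeated announcement of a prefix carries a different next hop than the previous one, which is what a loop looks like, as opposed to an occasional reconvergence.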

We then ran /routing/route/print detail where dst-address=... several times to catch the moment the routing table changed on DCHAM-RTR01. There we found the cause:
[admin@ICHAM-RTR01] > /routing/route/print detail where dst-address=2a0b:1300:81dc::/48
Flags: X - disabled, F - filtered, U - unreachable, A - active; c - connect, s - static, r - rip, b - bgp, o - ospf, d - dhcp, v - vpn, m - modem, a - ldp-address, l - ldp-mapping, y - copy; H - hw-offloaded; + - ecmp, B - blackhole
b afi=ip6 contribution=candidate dst-address=2a0b:1300:81dc::/48 routing-table=main gateway=2001:7f8:1::a500:8365:1 immediate-gw=fe80::4a8f:5aff:fe00:2d36%bonding1 distance=200 scope=40 target-scope=30
belongs-to="BGP IP6 routes from 2a01:55e0::b000"
bgp.peer-cache-id=*B000003 .as-path="8365,35548,209915" .communities=35548:16,64800:42005,64800:41005,64800:40002,64800:49999 .originator-id=194.39.187.3 .local-pref=200 .med=220 .atomic-aggregate=yes .origin=igp
debug.fwp-ptr=0x20355D20

b afi=ip6 contribution=candidate dst-address=2a0b:1300:81dc::/48 routing-table=main gateway=2001:7f8:1::a500:8365:1 immediate-gw=fe80::4a8f:5aff:fe00:2d36%bonding1 distance=200 scope=40 target-scope=30
belongs-to="BGP IP6 routes from 2a01:55e0::b001"
bgp.peer-cache-id=*B000006 .as-path="8365,35548,209915" .communities=35548:16,64800:42005,64800:41005,64800:40002,64800:49999 .originator-id=194.39.187.3 .local-pref=200 .med=220 .atomic-aggregate=yes .origin=igp
debug.fwp-ptr=0x20355D20

Ab afi=ip6 contribution=active dst-address=2a0b:1300:81dc::/48 routing-table=main gateway=2001:7f8:3d::1b1b:0:1 immediate-gw=2001:7f8:3d::1b1b:0:1%vlan-de-cix-ham distance=20 scope=40 target-scope=10
belongs-to="BGP IP6 routes from 2001:7f8:3d::1b1b:0:1"
bgp.peer-cache-id=*B00025A .as-path="6939,2914,209915" .communities=64800:42002,64800:41001,64800:40001,64800:49999 .local-pref=200 .med=100 .atomic-aggregate=no .origin=igp
debug.fwp-ptr=0x2035F840

b afi=ip6 contribution=candidate dst-address=2a0b:1300:81dc::/48 routing-table=main gateway=2001:7f8::20ad:0:1 immediate-gw=2001:7f8::20ad:0:1%vlan-de-cix-fra distance=20 scope=40 target-scope=10
belongs-to="BGP IP6 routes from 2001:7f8::20ad:0:1"
bgp.peer-cache-id=*B000172 .as-path="8365,35548,209915" .communities=35548:16,64800:42001,64800:41002,64800:40001,64800:49999 .local-pref=200 .med=225 .atomic-aggregate=no .origin=igp
debug.fwp-ptr=0x203605A0
[admin@ICHAM-RTR01] > /routing/route/print detail where dst-address=2a0b:1300:81dc::/48
Flags: X - disabled, F - filtered, U - unreachable, A - active; c - connect, s - static, r - rip, b - bgp, o - ospf, d - dhcp, v - vpn, m - modem, a - ldp-address, l - ldp-mapping, y - copy; H - hw-offloaded; + - ecmp, B - blackhole
b afi=ip6 contribution=candidate dst-address=2a0b:1300:81dc::/48 routing-table=main gateway=2001:7f8:3d::1b1b:0:1 immediate-gw=2001:7f8:3d::1b1b:0:1%vlan-de-cix-ham distance=20 scope=40 target-scope=10
belongs-to="BGP IP6 routes from 2001:7f8:3d::1b1b:0:1"
bgp.peer-cache-id=*B00025A .as-path="6939,2914,209915" .communities=64800:42002,64800:41001,64800:40001,64800:49999 .local-pref=200 .med=100 .atomic-aggregate=no .origin=igp
debug.fwp-ptr=0x2035F840

Ab afi=ip6 contribution=active dst-address=2a0b:1300:81dc::/48 routing-table=main gateway=2001:7f8::20ad:0:1 immediate-gw=2001:7f8::20ad:0:1%vlan-de-cix-fra distance=20 scope=40 target-scope=10 belongs-to="BGP IP6 routes from 2001:7f8::20ad:0:1"
bgp.peer-cache-id=*B000172 .as-path="8365,35548,209915" .communities=35548:16,64800:42001,64800:41002,64800:40001,64800:49999 .local-pref=200 .med=225 .atomic-aggregate=no .origin=igp
debug.fwp-ptr=0x203605A0
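Comparing such snapshots by eye is tedious, so here is a rough sketch of how the print detail output could be parsed to see which route carries the A flag in each snapshot. The parsing rules (entries separated by blank lines, key=value pairs, a leading dot as shorthand for the bgp. attributes) are assumptions derived from the output above, not a documented format. The two samples are abridged copies of the active and candidate entries; `parse_routes` and `active_route` are made-up names.

```python
import re

# key=value pairs; values are either quoted strings or runs of non-whitespace.
PAIR_RE = re.compile(r'([\w.\-]+)=("[^"]*"|\S+)')


def parse_routes(text):
    """Split one snapshot into a list of attribute dicts, one per route entry."""
    # Drop the "Flags:" legend and prompt lines such as "[admin@...] > ...".
    lines = [ln for ln in text.splitlines() if not ln.startswith(("Flags:", "["))]
    routes = []
    for block in re.split(r"\n\s*\n", "\n".join(lines).strip()):
        if not block.strip():
            continue
        first = block.split(None, 1)[0]
        attrs = {k.lstrip("."): v.strip('"') for k, v in PAIR_RE.findall(block)}
        attrs["flags"] = first if "=" not in first else ""
        routes.append(attrs)
    return routes


def active_route(text):
    """Return the entry whose flags contain 'A' (active)."""
    return next(r for r in parse_routes(text) if "A" in r["flags"])


# Abridged from the first and second snapshot above (some attributes omitted).
SNAPSHOT_1 = """\
Ab afi=ip6 contribution=active dst-address=2a0b:1300:81dc::/48 gateway=2001:7f8:3d::1b1b:0:1 distance=20
bgp.peer-cache-id=*B00025A .as-path="6939,2914,209915" .local-pref=200 .med=100 .origin=igp

b afi=ip6 contribution=candidate dst-address=2a0b:1300:81dc::/48 gateway=2001:7f8::20ad:0:1 distance=20
bgp.peer-cache-id=*B000172 .as-path="8365,35548,209915" .local-pref=200 .med=225 .origin=igp
"""

SNAPSHOT_2 = """\
b afi=ip6 contribution=candidate dst-address=2a0b:1300:81dc::/48 gateway=2001:7f8:3d::1b1b:0:1 distance=20
bgp.peer-cache-id=*B00025A .as-path="6939,2914,209915" .local-pref=200 .med=100 .origin=igp

Ab afi=ip6 contribution=active dst-address=2a0b:1300:81dc::/48 gateway=2001:7f8::20ad:0:1 distance=20
bgp.peer-cache-id=*B000172 .as-path="8365,35548,209915" .local-pref=200 .med=225 .origin=igp
"""

for name, snap in (("snapshot 1", SNAPSHOT_1), ("snapshot 2", SNAPSHOT_2)):
    r = active_route(snap)
    print(name, "active gateway", r["gateway"], "med", r["med"])
```

Run against the two samples, this reports MED 100 as active in the first snapshot and MED 225 in the second.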
What is happening? We have three announcements for the same destination (one of them appears twice because two route reflectors distribute it). All three have the same local-pref (200) and the same AS-path length (3), but the MED differs. The BGP path selection rule says: take the route with the lowest MED.
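The comparison we expect can be written down as a few lines of Python. This is a simplified sketch, not RouterOS's actual algorithm: it only looks at local-pref, AS-path length and MED, it leaves out origin (igp on all routes here), the eBGP/iBGP preference and the router-id tie-breakers, and it assumes MED is compared across all paths regardless of neighbor AS, which is the assumption behind the reasoning above (RFC 4271 by default only compares MED between routes learned from the same neighboring AS). The `Path` class and the route labels are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Path:
    name: str                  # label used only in this sketch
    local_pref: int
    as_path: Tuple[int, ...]
    med: int


def preference_key(p: Path):
    # Higher local-pref wins, then the shorter AS path, then the lower MED.
    return (-p.local_pref, len(p.as_path), p.med)


def best_path(paths: Iterable[Path]) -> Path:
    return min(paths, key=preference_key)


# Attribute values taken from the route table output above.
HAM = Path("DE-CIX-HAM", 200, (6939, 2914, 209915), 100)
FRA = Path("DE-CIX-FRA", 200, (8365, 35548, 209915), 225)
AMS = Path("AMS-IX via DCHAM-RTR02", 200, (8365, 35548, 209915), 220)

print(best_path([HAM, FRA, AMS]).name)  # DE-CIX-HAM
print(best_path([HAM, FRA]).name)       # DE-CIX-HAM
```

Under these assumptions the MED 100 route wins in both cases, with or without the MED 220 route from the reflectors in the table.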

  1. DCHAM-RTR01 has 4 routes: two via AMS-IX at DCHAM-RTR02 (immediate-gw=...%bonding1) with MED 220, one via DE-CIX-HAM with MED 100 (the correct active one), and one via DE-CIX-FRA with MED 225.
  2. DCHAM-RTR01 publishes its active route to the route reflectors, which push it to DCHAM-RTR02.
  3. DCHAM-RTR02 receives the MED 100 route via DCHAM-RTR01 and marks it as active, because MED 100 is better than the MED 220 route via AMS-IX.
  4. DCHAM-RTR02 withdraws its MED 220 announcement via AMS-IX from the route reflectors, which in turn withdraw the two routes from DCHAM-RTR01.
  5. DCHAM-RTR01 has two routes left: one with MED 100 via DE-CIX-HAM and one with MED 225 via DE-CIX-FRA. Now the error happens: DCHAM-RTR01 marks the route with the higher MED (225) as active and pushes it to the route reflectors, replacing the MED 100 route.
  6. The MED 225 route is pushed to DCHAM-RTR02, replacing the MED 100 route.
  7. DCHAM-RTR02 correctly sees that the MED 220 route via AMS-IX is better than the MED 225 route via DE-CIX-FRA, activates the route via AMS-IX and pushes it to the route reflectors.
  8. DCHAM-RTR01 gets the MED 220 route via AMS-IX and now correctly activates the MED 100 route again, returning to step 1. The loop repeats.
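The steps above can be replayed in a toy simulation. Everything here is an assumption made for illustration: the route reflectors are modeled as a plain pass-through, MED is the only attribute compared, and the `faulty` switch simply mimics what we observed in step 5 (the higher MED wins once the reflected route is gone). It is a stand-in for the symptom, not a model of the actual defect in RouterOS.

```python
R1_EBGP = (100, 225)  # MEDs of DCHAM-RTR01's own routes: DE-CIX-HAM, DE-CIX-FRA
R2_EBGP = (220,)      # MED of DCHAM-RTR02's own route via AMS-IX


def r1_choice(reflected, faulty):
    """MED of the route DCHAM-RTR01 marks active."""
    if faulty and reflected is None:
        # Observed misbehaviour: with only its own two routes left,
        # the router activates the higher MED.
        return max(R1_EBGP)
    candidates = list(R1_EBGP)
    if reflected is not None:
        candidates.append(reflected)
    return min(candidates)


def simulate(rounds, faulty):
    """Return the MED that DCHAM-RTR01 has active in each round."""
    reflected_to_r1 = 220  # start of step 1: RTR02's AMS-IX route is present
    history = []
    for _ in range(rounds):
        r1_active = r1_choice(reflected_to_r1, faulty)
        history.append(r1_active)
        # RTR01 announces its active route; RTR02 compares it with its own.
        r2_active = min(R2_EBGP + (r1_active,))
        # RTR02 only keeps announcing the AMS-IX route while it is active.
        reflected_to_r1 = r2_active if r2_active in R2_EBGP else None
    return history


print(simulate(6, faulty=True))   # [100, 225, 100, 225, 100, 225]
print(simulate(6, faulty=False))  # [100, 100, 100, 100, 100, 100]
```

With the correct lowest-MED rule the network settles on the MED 100 route after the first round. With the faulty choice the active route alternates forever, which matches the alternating announcements in the capture.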

It seems that, for some reason, the MED comparison goes wrong in one special case. We cannot explain why this does not happen all the time but only when certain kinds of routes are present, yet this is what we observe. We also ruled out any error on the part of the route reflectors; they are working as intended.

Both routers are on 7.3.1.

There is an open support case: SUP-84730
 
netzwerghh

Re: BGP best path selection algorithm sometimes runs in wrong behaviour resulting in update loops

Tue Nov 15, 2022 8:06 pm

This is still happening in ROS 7.6. It seems the MED comparison sometimes just gets it wrong: same LOCAL-PREF, same path length (although a different path), same origin, different MED. But only sometimes; most of the time ROS gets it right. Or is there something I don't see? The router ID is the same on all BGP templates. I've also added a new supout file to the support request, which has not received a reply so far.
Flags: X - disabled, F - filtered, U - unreachable, A - active; c - connect, s - static, r - rip, b - bgp, o - ospf, d - dhcp, v - vpn, m - modem, a - ldp-address, l - ldp-mapping, y - copy; H - hw-offloaded; + - ecmp, B - blackhole
b afi=ip6 contribution=candidate dst-address=2a02:e00:ffe8::/48 routing-table=main gateway=2001:7f8:3d::1b1b:0:1 immediate-gw=fe80::de2c:6eff:fee0:7a83%bonding1 distance=200 scope=40 target-scope=30 belongs-to="bgp-IP6-2a01:55e0::b000"
bgp.peer-cache-id=*B000003 .as-path="6939,43289" .communities=64800:42002,65103:276,65101:4122,64800:41001,64800:40001,64800:49999,65104:150,65102:4000 .large-communities=6695:1000:2,6695:1001:1 .originator-id=194.39.187.2 .local-pref=300 .med=100 .atomic-aggregate=yes .origin=igp
debug.fwp-ptr=0x203460C0

b afi=ip6 contribution=candidate dst-address=2a02:e00:ffe8::/48 routing-table=main gateway=2001:7f8:3d::1b1b:0:1 immediate-gw=fe80::de2c:6eff:fee0:7a83%bonding1 distance=200 scope=40 target-scope=30 belongs-to="bgp-IP6-2a01:55e0::b001"
bgp.peer-cache-id=*B000004 .as-path="6939,43289" .communities=64800:42002,65103:276,65101:4122,64800:41001,64800:40001,64800:49999,65104:150,65102:4000 .large-communities=6695:1000:2,6695:1001:1 .originator-id=194.39.187.2 .local-pref=300 .med=100 .atomic-aggregate=yes .origin=igp
debug.fwp-ptr=0x203460C0

b afi=ip6 contribution=candidate dst-address=2a02:e00:ffe8::/48 routing-table=main gateway=2001:7f8:1::a500:9002:1 immediate-gw=2001:7f8:1::a500:9002:1%vlan-ams-ix distance=20 scope=40 target-scope=10 belongs-to="bgp-IP6-2001:7f8:1::a500:6777:2"
bgp.peer-cache-id=*B00006E .as-path="9002,43289" .communities=64800:42005,64800:41005,64800:40002,64800:49999 .local-pref=300 .med=220 .atomic-aggregate=no .origin=igp
debug.fwp-ptr=0x2035CD20

b afi=ip6 contribution=candidate dst-address=2a02:e00:ffe8::/48 routing-table=main gateway=2001:2000:3080:de1::1 immediate-gw=2001:2000:3080:de1::1%vlan-telia-iptransit distance=20 scope=40 target-scope=10
belongs-to="bgp-IP6-2001:2000:3080:de1::1"
bgp.peer-cache-id=*B000076 .as-path="1299,1299,43289" .communities=1299:30000 .local-pref=100 .atomic-aggregate=no .origin=igp
debug.fwp-ptr=0x20343180

Ab afi=ip6 contribution=active dst-address=2a02:e00:ffe8::/48 routing-table=main gateway=2001:7f8:1::a500:9002:1 immediate-gw=2001:7f8:1::a500:9002:1%vlan-ams-ix distance=20 scope=40 target-scope=10 belongs-to="bgp-IP6-2001:7f8:1::a500:6777:1"
bgp.peer-cache-id=*B00006B .as-path="9002,43289" .communities=64800:42005,64800:41005,64800:40002,64800:49999 .local-pref=300 .med=220 .atomic-aggregate=no .origin=igp
debug.fwp-ptr=0x2035CD20

Fb afi=ip6 contribution=filtered dst-address=2a02:e00:ffe8::/48 routing-table=main gateway=2001:7f8:1::a500:6939:1 immediate-gw=2001:7f8:1::a500:6939:1%vlan-ams-ix distance=20 scope=40 target-scope=10 belongs-to="bgp-IP6-2001:7f8:1::a500:6939:1"
bgp.peer-cache-id=*B00006C .as-path="6939,43289" .atomic-aggregate=no .origin=igp
debug.fwp-ptr=0x20350660
