The topology/setup
We have a core router that uses iBGP sessions to the rest of our network. It only establishes iBGP sessions to the sites that have a layer 2 fibre connection (we’ll call them A/B/C), and the Core is set up as a route reflector for these sites. Each of these sites has its own layer 2 connection to the Core; they are not bridged together, and each link has its own /30 range. The rest of our network uses OSPF, and redistribution is used between the Core and these 3 distribution sites.
Because OSPF has the lower AD, those routes are preferred, so A, for example, takes a longer path through the rest of the network when talking to B instead of going via Core->B. This is fine because I need MPLS and that doesn’t work over BGP. (There is a whole separate issue with this at the Core: I would like to run OSPF there as well, but for some reason it kills the SFP+ ports after 1-24 hours.)
I do have some route filters on A/B/C to lower the AD of the route to the Core’s loopback address, so that it is reached over the direct L2 service instead of via a redistributed OSPF route from a neighbor.
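To make the AD behaviour above concrete, here is a rough Python sketch of how I understand the selection between two routes for the same prefix. The distance values are the commonly used defaults (OSPF 110, iBGP 200) and may differ on your platform; the loopback prefix, next-hop names, and the overridden distance of 100 are all made up for illustration and are not our real config.

```python
from dataclasses import dataclass
from typing import Optional

# Commonly used default administrative distances.
# ASSUMPTION: check these against your own platform's documentation.
DEFAULT_AD = {"connected": 0, "static": 1, "ebgp": 20, "ospf": 110, "ibgp": 200}


@dataclass(frozen=True)
class Route:
    prefix: str
    protocol: str
    next_hop: str
    distance: Optional[int] = None  # explicit override, e.g. set by a route filter

    @property
    def effective_distance(self) -> int:
        # Use the override if one was set, otherwise the protocol default.
        return DEFAULT_AD[self.protocol] if self.distance is None else self.distance


def best_route(candidates):
    """Among routes for the SAME prefix, pick the one with the lowest AD."""
    return min(candidates, key=lambda r: r.effective_distance)


# Hypothetical Core loopback as seen from site A.
core_loopback = "10.255.0.1/32"
via_ospf = Route(core_loopback, "ospf", "ospf-neighbor")
via_ibgp = Route(core_loopback, "ibgp", "core-direct-l2")

# Without a filter, the redistributed OSPF route wins (110 < 200).
default_choice = best_route([via_ospf, via_ibgp])

# With a filter lowering the iBGP route's distance (100 is a made-up value),
# the route over the direct L2 service wins instead.
via_ibgp_filtered = Route(core_loopback, "ibgp", "core-direct-l2", distance=100)
filtered_choice = best_route([via_ospf, via_ibgp_filtered])

print(default_choice.protocol)   # ospf
print(filtered_choice.protocol)  # ibgp
```

This only models the tie-break between routes for an identical prefix; it says nothing about routes with different prefix lengths.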
The iBGP sessions are NOT established with loopback addresses. They use the directly connected IP addresses on each /30, and a connected route should always be preferred, so this should not be the issue.
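For reference, this is the lookup behaviour I am assuming when I say the connected route should always win: longest prefix match first, then lowest AD as a tie-break. It is a minimal Python sketch, not how any particular router implements it, and the addresses (taken from the 192.0.2.0/24 documentation range) and table entries are hypothetical.

```python
import ipaddress


def lookup(table, destination):
    """Return (network, protocol, distance) for the best match, or None.

    Longest prefix wins; lowest distance breaks ties between equal prefixes.
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, protocol, distance in table:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network, protocol, distance))
    if not matches:
        return None
    return max(matches, key=lambda m: (m[0].prefixlen, -m[2]))


# Hypothetical routing table at site A.
table_at_site_a = [
    ("192.0.2.0/30", "connected", 0),  # the /30 towards the Core
    ("192.0.2.0/24", "ospf", 110),     # a covering route learned via OSPF
    ("0.0.0.0/0", "ospf", 110),        # default route learned via OSPF
]

peer = "192.0.2.1"  # hypothetical Core address on the /30
chosen = lookup(table_at_site_a, peer)
print(chosen[0], chosen[1])  # 192.0.2.0/30 connected
```

Under this model the connected /30 is chosen for the peer address as long as nothing more specific exists. A more specific route for the peer (for example a /32) would win on prefix length regardless of AD, so that is one thing my assumption depends on.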
The problem
When an OSPF adjacency is lost further down in our network, all of the iBGP sessions drop.
I can’t work this out. It should have absolutely no effect whatsoever, because the iBGP sessions are established via IP addresses on the same segment. Even if there was a routing issue it shouldn’t matter, because the peers are directly connected and should never use any route other than that direct connection.
And if there was something like a massive layer 2 loop that pushed CPU usage at one of the sites to 100% and left it unable to process any packets, how would that affect things so much further upstream? It surely shouldn’t affect all 3 sites.