Another thing - the iBGP peering (R1<>R2) should be done using loopback IP addresses, and not the /30 addresses of the link between them. (this is a common BGP novice mistake)
Can you give more information on this? This is the first time I've heard this about iBGP peerings.
Ask and ye shall receive. I just happened to have a lab running in GNS3 that perfectly illustrates this:
BGP Lab Drawing.png
Configuration Details:
- R1, R2, and R3 are all peering with each other using iBGP in full mesh (no route reflectors).
- They are configured to use their loopback IP as the router ID
- They use the remote router's loopback IP as the peer address.
- Each uses its loopback interface as the source interface.
- The routers of AS100 are using OSPF as the IGP.
Why iBGP must peer using loopback IPs:
Suppose R2 and R3 were configured to peer using their link IPs (10.2.3.2 and 10.2.3.3).
Now suppose that this link fails. R2 will not be able to reach 10.2.3.3, and R3 will not be able to reach 10.2.3.2. The iBGP peering between them will no longer be possible, so the session will drop. Since this is a full mesh of iBGP peerings with no route reflectors, the only way R2 knows that AS600 exists is by learning it from R3 (R1 will not pass it along, because a route learned from one iBGP peer is not re-advertised to another iBGP peer). When the iBGP R2<>R3 session drops, R2 will lose its BGP path to AS600. This means that AS500 will no longer be able to reach AS600. (Conversely, R3 will lose its routing information about AS500.)
This is silly because R2 could still reach R3 via R1, and OSPF within the AS will quickly converge on this alternate path. There's an available path from AS500 to AS600, but it will fail thanks to AS100 having iBGP configured improperly.
Why can't OSPF save the day?
One may ask: if OSPF sees the way around via R1, why can't R2 reach 10.2.3.3 via R1 as well?
It can potentially, but not using a source IP of 10.2.3.2 - to see why, let's look at the various failure cases.
There are three possible modes of failure of the link:
1) The link fails in a way that both routers see the interface go down. In this case, those IP addresses are not reachable because no interfaces in that network are active.
2) The link fails in a way that both routers see the interface as physically up, but transmissions across the link all fail. In this scenario, both routers still have an active interface in the 10.2.3.0/24 network, and both will consider the network as being locally connected. So R2 will always send packets to 10.2.3.3 via its directly-connected interface. It would never dream of looking at OSPF for the rest of that /24 network's hosts. Same for R3.
3) R2 sees the link as physically down while R3 sees it as physically up. In this scenario, R2 would see reachability to 10.2.3.0/24 in OSPF via R1. R2 would be able to ping 10.2.3.3, but NOT using its 10.2.3.2 address. R2 would use one of its other interfaces' IP addresses to communicate with 10.2.3.3, because the interface that owns 10.2.3.2 is down. That doesn't matter though, because even if it DID use 10.2.3.2 as the source IP, R3 could not properly reply. R3 would send packets to 10.2.3.2 out via the directly connected link, which would fail to deliver the packets to R2. Obviously the same applies in reverse if R3 is the one with the interface down and R2 is the one with the interface up.
In all cases, there is no possible round-trip communication between 10.2.3.2 and 10.2.3.3. Thus, if this link fails, and iBGP is configured using these endpoint IPs, iBGP will fail.
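If it helps to see the three cases side by side, here is a rough Python sketch of the reasoning above. It is only a toy model, not anything a router actually runs: the LinkState fields and the function name are made up for illustration, and the model assumes a router can only use a link address while its own interface is up, and that a router with the interface up always sends out the direct link.

```python
from dataclasses import dataclass


@dataclass
class LinkState:
    """Hypothetical model of the R2<>R3 link (names are illustrative)."""
    r2_iface_up: bool           # does R2 see its interface as up?
    r3_iface_up: bool           # does R3 see its interface as up?
    link_carries_traffic: bool  # do frames actually cross the link?


def link_ip_round_trip(state: LinkState) -> bool:
    """True if 10.2.3.2 <-> 10.2.3.3 can exchange packets in both directions."""
    # A router whose interface is down cannot use that interface's address.
    both_addresses_active = state.r2_iface_up and state.r3_iface_up
    # A router whose interface is up treats 10.2.3.0/24 as connected and
    # sends out the direct link, so delivery depends on the link working.
    return both_addresses_active and state.link_carries_traffic


healthy = LinkState(True, True, True)
case_1 = LinkState(False, False, False)  # both sides see the interface down
case_2 = LinkState(True, True, False)    # both sides up, but nothing gets across
case_3 = LinkState(False, True, False)   # R2 sees down, R3 sees up

for name, state in [("healthy", healthy), ("case 1", case_1),
                    ("case 2", case_2), ("case 3", case_3)]:
    print(name, "round trip possible:", link_ip_round_trip(state))
```

Only the healthy link yields a working round trip; every failure case breaks at least one of the two conditions.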
The loopback IP is ALWAYS reachable
So long as a router has at least one OSPF neighbor available, its loopback IP address will be advertised into OSPF, so the loopback IP stays reachable as long as any path to the router survives a topology change. So if R2 and R3 are properly configured to run iBGP between their respective loopback IPs, the iBGP session between them will not drop when the R2<>R3 link fails.
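To make the contrast concrete, here is a small Python sketch of the AS100 triangle. It assumes R1, R2 and R3 are each directly linked to the other two, as the lab description implies, and it treats "OSPF has converged" as nothing more than "some path exists in the graph". The function and variable names are illustrative only.

```python
from collections import deque


def reachable(links, src, dst):
    """Breadth-first search: is there any path from src to dst over the links?"""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Assumed physical topology inside AS100 (a triangle).
full_topology = {("R1", "R2"), ("R1", "R3"), ("R2", "R3")}
after_failure = full_topology - {("R2", "R3")}

# Peering on loopbacks only needs *some* path between the routers.
loopback_session_up = reachable(after_failure, "R2", "R3")
# Peering on link addresses needs that one specific link to work.
link_session_up = ("R2", "R3") in after_failure

print("loopback peering survives:", loopback_session_up)
print("link-address peering survives:", link_session_up)
```

With the R2<>R3 link removed, the loopback peering still has the path through R1, while the link-address peering has nothing left to run over.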
Remember: iBGP does not act like an IGP such as OSPF
By default, the next hop IP of an iBGP-learned route is the next hop IP learned from the original eBGP peering. So when R3 learns about 6.0.0.0/8 from AS600, the next hop will be 10.3.6.6 - and then R3 will tell R1 and R2 about this prefix. R1's entry will show the next hop as 10.3.6.6, which in turn must be reachable via some other route (typically OSPF). Thus iBGP routes tend to be "recursive next hop" type routes. This is so that each router can make a BGP decision based on how far away the next-hop destination is. This simple topology doesn't make it obvious, but if you were to imagine a much larger network, and multiple ways to reach the remote ASNs, the reason for this behavior becomes more apparent.
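Here is a rough Python sketch of what "recursive next hop" means from R1's point of view. The two tables are simplified stand-ins, not real router output: the 6.0.0.0/8 prefix and the 10.3.6.6 next hop come from the lab, but the /24 mask on the R3<>AS600 link and the "via R3" label are assumptions made for the example.

```python
import ipaddress


def longest_match(table, address):
    """Return the value of the most specific prefix in table covering address."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, value in table.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, value)
    return None if best is None else best[1]


def resolve(destination, bgp_table, igp_table):
    """Two-step lookup: BGP gives a next hop, then the IGP says how to reach it."""
    bgp_next_hop = longest_match(bgp_table, destination)
    if bgp_next_hop is None:
        return None  # no BGP route at all
    return longest_match(igp_table, bgp_next_hop)  # None if next hop is unreachable


# R1's view: the iBGP route keeps the next hop learned over eBGP by R3.
r1_bgp = {"6.0.0.0/8": "10.3.6.6"}
# Assumed OSPF entry for the R3<>AS600 link (mask and label are hypothetical).
r1_igp = {"10.3.6.0/24": "via R3"}

print(resolve("6.1.2.3", r1_bgp, r1_igp))
print(resolve("6.1.2.3", r1_bgp, {}))
```

The second call shows the failure mode: the BGP route is present, but without an IGP route to 10.3.6.6 the next hop cannot be resolved and the route is unusable.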
Say that router X is comparing two paths to Google learned via iBGP from routers Y and Z. If the BGP attributes of the paths via Y and Z are otherwise equal, router X is going to choose based on the IGP distance to router Y or Z. Under normal conditions, the path from router X to router Y might be the shortest, so X will prefer the Google path learned from router Y. But if the topology changes and router Z is now closer to X than router Y, router X will start preferring to reach Google via router Z. Router X will proceed to update its eBGP neighbors about this change as well. The eBGP neighbors of router X may not like something about the path through your network via Z and may choose one of their other peers instead. Whenever X sees that it's shorter to go through Y again, X will update its eBGP peers, who may now prefer your network again over their other peers.
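The Y-versus-Z choice boils down to something like the Python sketch below. It covers only this one tie-break (lowest IGP cost to the next hop) and assumes every earlier BGP attribute already compared equal. The cost numbers are invented for illustration, and breaking exact ties by name is just a way to keep the toy deterministic; a real router has further tie-breakers.

```python
def pick_exit(igp_cost_to_peer):
    """Pick the iBGP peer with the lowest IGP cost (ties broken by name)."""
    return min(sorted(igp_cost_to_peer), key=igp_cost_to_peer.get)


# Hypothetical IGP costs from router X to the two exit routers.
normal_conditions = {"Y": 10, "Z": 20}
after_topology_change = {"Y": 30, "Z": 20}

print("normally X exits via:", pick_exit(normal_conditions))
print("after the change X exits via:", pick_exit(after_topology_change))
```

Nothing about the Google routes themselves changed between the two calls; only the internal distance did, which is exactly why X needs to keep reaching its iBGP peers and their next hops through the IGP.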
I didn't mean to go quite so far into BGP theory in general, but I think it's important to understand why iBGP preserves the next hop IP where an IGP will cause the next hop to be the neighboring router's IP. It's because of this behavior that you want the iBGP neighbors to be able to reach each other in all circumstances, which is the reason you should use loopback interfaces for iBGP.