BGP / Configuration Sync

Hi,

I have a CCR1009-8G-1S-1S+ currently in colo doing firewall and router duties. Our DC provider is announcing a /24 we own using their ASN and routing this to us on a /29 from their IP pool. Plan is to add another CCR1009-8G-1S-1S+ in a couple of weeks and I am looking for some feedback and possibly guidance on how I should be configuring things.

We have confirmed that we can do BGP to the DC provider using our own ASN. We will have a /30 (using private IPs) on each CCR connecting us to the DC provider, then run BGP on each router announcing the same /24 to the internet. I will use VRRP on the inside for all gateways so that when the master router is unavailable, outbound traffic will go via the secondary unit. I presume we will have no control over incoming traffic as it could go to either router, so both routers will need the same firewall rules, states and routing tables. I will put some scripts in place to keep these in sync.

Does this seem like a good approach?

Thanks,
PK

I have drawn up my proposed layout.
[Attachment: 1.png]
I will need to have a BGP peer from r1 to the colo provider and r2 to the colo provider to announce our public /24 - do I need to have a BGP peering between r1 and r2?

Sounds about right. I have the same setup on a site and it has worked without an issue for years now.

But I use BGP MED so that the 2nd CCR is pretty much standby. The moment the primary CCR goes down or I need to disable the BGP peer for maintenance, all traffic goes through the 2nd CCR.

I advertise multiple /24s so using filters I can ‘load balance’ the traffic between the 2 CCRs for each /24.
For example 2 /24s via CCR1 and 2 /24s via CCR2. (I usually employ this method when having a large inbound DDoS attack so that the affected /24 gets routed via CCR2 allowing the rest of the subnets to keep working via CCR1)
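A rough sketch of how the MED-based split above might look as RouterOS v6 routing filters. The two prefixes, the chain name isp-out and the MED values are placeholders I picked for illustration, and it assumes the upstream honours MED on both sessions:

```routeros
# CCR1: preferred for the first /24, backup for the second (lower MED wins)
/routing filter
add chain=isp-out prefix=198.51.100.0/24 action=accept set-bgp-med=0
add chain=isp-out prefix=203.0.113.0/24 action=accept set-bgp-med=100
add chain=isp-out action=discard

# CCR2: mirror image of CCR1
/routing filter
add chain=isp-out prefix=198.51.100.0/24 action=accept set-bgp-med=100
add chain=isp-out prefix=203.0.113.0/24 action=accept set-bgp-med=0
add chain=isp-out action=discard
```

The chain would be attached to the eBGP peer with out-filter=isp-out. Swapping the MED values for one prefix should then move that /24 to the other router, which is the kind of move described for the DDoS case.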

On the inside, I use separate VLANs for each subnet so that I can have some of them running as primary/active vrrp on CCR1 and some on CCR2.

This way incoming and outgoing traffic can be balanced between both CCRs.


You need to have iBGP between R1 and R2. Imagine a scenario where for some reason the BGP peer on CCR1 goes down.
VRRP would still be primary on CCR1 and all hosts would stop being able to reach the internet since CCR1 would have no routes to it.
Having iBGP will allow you to do full or partial failover.
Meaning you would either failover the BGP side or the VRRP side or both.

Check this presentation: https://mum.mikrotik.com/presentations/GR15/MUMGreece-Athens-2015-Nikalexis_Nikos.pdf
It explains much of this.

Thanks for the reply. My plan would be the same: run everything through the first router, with the second only passing traffic in the event of the first being unavailable.

Ah yes I had never thought of the issue of the peering connections going down while the router was still up. Will ensure I create a peering between the first and second router.

Been having a read through the presentation you mentioned, very useful.

Thanks again for your advice.

Regards,
PK

Note that there is currently no form of inter-chassis state synchronization tech for Mikrotik - so you can’t have seamless failover if you’re using connection tracking on these routers.

Also regarding the vrrp configuration if you use vlans you’d probably want to not have them as slaves to the vrrp interface.

I used it this way for a few months and I noticed that it would drop packets on high traffic (or attack) and also caused a slight cpu overhead.
Full disclosure, I haven’t tested whether this has been resolved in recent versions of RouterOS, nor have I ever filed a bug report.
I also presume that this is a CCR/Tilera thing only since in the presentation they use a PPC based router. I’m sure they would have noticed if there was a similar issue since they do mainly VoIP which would have been greatly affected.

I instead have the vlan interfaces directly slaved to the sfp interface (or bonding in your case) and I use the On Master/On Backup scripting capabilities of the vrrp interface to disable and enable the vlan interfaces whenever they change into master or backup state.

Also I have two vrrp interfaces on each router so that one is master on one router and the other master on the other.
This way I can easily manipulate which vlan is active on which router by simply editing the vrrp interfaces (and BGP filters) if/when needed (ie: during an attack).

I have been using it this way for years now and the failover has worked properly on a couple of unplanned downtimes and attacks.

One caveat with this method though is that you need to set the MAC address of the bonding interface to be the same on both routers; otherwise, during the failover, traffic will stop until all hosts and switches learn the new MAC address of the 2nd router.
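For anyone wanting to try this, a minimal sketch of the method in RouterOS v6 syntax. The interface names (bond1, vlan-hb, vlan10, vlan20), VRIDs, priorities and the MAC address are all made up, and running the VRRP interfaces on a separate heartbeat VLAN is my assumption about the layout, not something stated above:

```routeros
# Same on both routers: pin the bond's MAC so hosts and switches never see it change.
# 02:00:00:00:00:01 is a placeholder locally administered address.
/interface bonding set [find name=bond1] forced-mac-address=02:00:00:00:00:01

# Router 1: master for vrrp-a (higher priority), backup for vrrp-b.
# The gateway VLANs sit directly on bond1 and are toggled on VRRP state changes.
/interface vrrp
add name=vrrp-a interface=vlan-hb vrid=1 priority=200 \
    on-master="/interface vlan enable [find name=vlan10]" \
    on-backup="/interface vlan disable [find name=vlan10]"
add name=vrrp-b interface=vlan-hb vrid=2 priority=100 \
    on-master="/interface vlan enable [find name=vlan20]" \
    on-backup="/interface vlan disable [find name=vlan20]"
```

Router 2 would carry the same config with the two priorities swapped, so each router is master for one group and backup for the other.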

While state synchronisation would be nice it isn’t the end of the world, clients will just need to retry.

One thing I can’t get my head around is how to give each router an accessible IP so I can reach both independently. Both routers will have the same config in terms of firewall rules etc., each running its own LNS with a different public IP.

So outbound traffic can originate from either router but how will inbound traffic decide which path to take?

My ideal setup would be:
[Attachment: Screen Shot 2017-07-18 at 09.27.36.png]
With each router having an independent IP for management out of the /24 that is announced, and an independent IP for the LNS server running on it.

Is this possible with a single /24?

Thanks,
PK

Make the two /30’s public IPs (which is industry best practice), or alternatively, simply assign each router a /32 on a loopback from your /24. Connected routes will have a higher preference than BGP/Static routed IPs, so your IGP should be able to route the traffic adequately.
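The loopback option could look roughly like this in RouterOS v6, where an empty bridge is the usual way to get a loopback interface. The addresses come from 12.34.56.0/24, the placeholder /24 used in this thread, and the interface name is just a convention:

```routeros
# r1: an empty bridge acts as the loopback interface
/interface bridge add name=loopback
/ip address add address=12.34.56.1/32 interface=loopback

# r2
/interface bridge add name=loopback
/ip address add address=12.34.56.2/32 interface=loopback
```

Each router then owns one address out of the announced /24 that stays up regardless of which physical link is in use, provided the other router learns a route to it.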

Thanks - think this makes sense in my head. I keep forgetting that the routers will have iBGP between them so what one knows the other will too!

Doing the config tomorrow so will let you know how I get on.

Regards,
PK

Also don’t forget OSPF! Otherwise you will need static routes (which can be a PITA if your network changes often).

Without OSPF or static routes iBGP will be pretty much useless (all the routes will be inactive).
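A minimal sketch of the OSPF side in RouterOS v6 syntax, assuming a 10.0.0.0/30 link between the routers and a /32 loopback on each (12.34.56.1 and 12.34.56.2 here). All of these addresses are placeholders:

```routeros
# r1 (r2 is the same apart from its own router-id and loopback /32)
/routing ospf instance set [find default=yes] router-id=12.34.56.1
/routing ospf network
add network=10.0.0.0/30 area=backbone
add network=12.34.56.1/32 area=backbone
```

With this in place each router should learn the other's loopback through OSPF, which gives the iBGP routes a resolvable next hop.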

Another alternative would be selecting the ‘Nexthop Choice: force self’ option on the iBGP peers, but I am not sure if this is the best approach (I need to look into this).

In general, iBGP’s purpose is to distribute routing policy throughout the AS, and it’s very “next AS hop” centric… which is why the next hop remains unchanged as iBGP messages propagate your AS. So if one border router chooses to use an eBGP peer’s advertised route, the border router will push this decision into iBGP. iBGP tends to look like a list of decisions that each eBGP border router will make. The distance to reach that next hop comes into the decision making process. It’s like picking a plane ticket based on which airport you may fly from. You might find a really cheap ticket from an airport that is a 12hr drive from your home, so it’s just not worth it, whereas if you lived next door to that airport, this ticket would be a no-brainer, or if you lived about 2hrs away, you may choose it if it’s significantly cheaper than your local airport, etc. But the distance to reach that egress point from your own system is important to BGP, which is why iBGP keeps the original next hop learned from eBGP sessions.

Setting next-hop-self in iBGP obfuscates this process and typically works against the goals of BGP.
Note that iBGP’s default administrative distance is 200 - meaning that other protocols such as OSPF take precedence over it.

So I now have two BGP sessions established with our transit provider announcing our /24 over both and receiving a default route back. I can ping out of either connection and inbound traffic goes to r1. I also have a BGP session established from r1 to r2 but do not have any routes announced over this yet.

Ideally I would like r1 to handle all traffic, and in the event of a loss of transit to r1 or r1 being unavailable, all traffic should flow through r2, including all NAT’d connections.

I tried removing my routing filters on the BGP session between r1 and r2 which removed my announcement from the advertisements table within Routing>BGP.

Two questions I have:

  1. How should BGP be configured between r1 and r2? Currently I have them directly cabled to each other with a /30 as a link. BGP is built on these IPs using our public ASN.

  2. How do I configure r2 to be able to NAT when traffic arrives at it? Do I need to mirror the IP address / firewall rules to be the same as r1?

Thanks,
PK

Hopefully you’re using stateless 1:1 nat (action=netmap) so that either router can successfully perform the correct NAT w/o needing any stateful information. If you’re using stateful NAT, then if connectivity moves from one router to the other, then things will temporarily break due to the fact that the routers cannot share their state information.
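For reference, a 1:1 netmap pair in RouterOS v6 might look like the sketch below. The private range 192.168.56.0/24 is an assumption, and 12.34.56.0/24 is the placeholder public block from this thread:

```routeros
/ip firewall nat
# inbound: public address -> same host offset in the private range
add chain=dstnat dst-address=12.34.56.0/24 action=netmap to-addresses=192.168.56.0/24
# outbound: private address -> matching public address
add chain=srcnat src-address=192.168.56.0/24 action=netmap to-addresses=12.34.56.0/24
```

Because the mapping is a fixed function of the address, the same two rules can sit on both routers. If the routers themselves use addresses from the /24 (management or LNS IPs, say), the dstnat match would need narrowing so it does not catch them.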

Another thing - the iBGP peering (R1<>R2) should be done using loopback IP addresses, and not the /30 addresses of the link between them. (this is a common BGP novice mistake)

If you want simple failover behavior, then this is pretty easy, especially if ISP2 allows you to send community lists on your advertisements.
If ISP2 has a community that instructs their network to lower the local_preference of a prefix, then you should use it to lower that preference below the preference of routes received from their peers / transit providers. This means that your announcements going to ISP1 will go around the Internet and via some path they will reach ISP2. You want ISP2 to choose this path instead of going directly to you across your link.

If they do not support communities to lower their Local_Pref, then you must use AS-Path-Prepend. Prepend all advertisements something ridiculous, like 5x prepend. ISP2 may still prefer going directly to you due to local_preference values inside their own network, but once your announcements via ISP2 go outside of their network, those announcements are going to look pretty bad at the AS-PATH decision point compared to your announcements via ISP1.

Then on your side, just make the in-filter from ISP2 always set local-pref=90 and you’re done. Your outbound traffic will always choose ISP1 over ISP2 (unless ISP2 sends you a more specific route).
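Put together, the backup-side filters could look something like this in RouterOS v6. The chain names are arbitrary and the community 64500:80 is a made-up value; use whatever ISP2 actually documents for lowering local-pref, if they offer one at all:

```routeros
/routing filter
# inbound from ISP2: make everything learned here less attractive than ISP1
# (routes without an explicit local-pref are treated as 100)
add chain=isp2-in action=accept set-bgp-local-pref=90
# outbound to ISP2: lengthen the AS path and tag with the (hypothetical) community
add chain=isp2-out prefix=12.34.56.0/24 action=accept set-bgp-prepend=5 \
    set-bgp-communities=64500:80
add chain=isp2-out action=discard
```

Drop whichever of the prepend or the community does not apply. The chains would be attached to the ISP2 peer with in-filter=isp2-in and out-filter=isp2-out.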

If you have a /23 block or larger, then you can easily force failover w/o tweaking metrics. Suppose you have a /22. Announce only the /22 into ISP2, and announce two /23 prefixes into ISP1. That will force the primary/backup behavior you want, and no amount of metric tuning by ISP2 could affect this. I generally try to avoid this tactic if possible just because it contributes to an ever-expanding global BGP table. Many organizations do this, and as a result, the global BGP table is sitting at about 641k prefixes right now.
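As a sketch of the more-specifics approach in RouterOS v6, with 10.20.0.0/22 standing in for a real public block and arbitrary chain names:

```routeros
# originate the aggregate and both halves
/routing bgp network
add network=10.20.0.0/22 synchronize=no
add network=10.20.0.0/23 synchronize=no
add network=10.20.2.0/23 synchronize=no

/routing filter
# primary (ISP1): only the two more-specific /23s
add chain=isp1-out prefix=10.20.0.0/23 action=accept
add chain=isp1-out prefix=10.20.2.0/23 action=accept
add chain=isp1-out action=discard
# backup (ISP2): only the covering /22
add chain=isp2-out prefix=10.20.0.0/22 action=accept
add chain=isp2-out action=discard
```

Longest-prefix match sends traffic to the /23s while the ISP1 session is up. If that session drops, the /23s are withdrawn and the /22 via ISP2 takes over.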

Can you give more information on this? This is the first time I’ve heard this about iBGP peerings.

Ask and ye shall receive. I just happened to have a lab running in GNS3 that perfectly illustrates this:
[Attachment: BGP Lab Drawing.png]
Configuration Details:

  • R1, R2, and R3 are all peering with each other using iBGP in full mesh (no route reflectors).
  • They are configured to use their loopback IP as the router ID
  • They use the remote router’s loopback IP as the peer address.
  • Each uses its loopback interface as the source interface.
  • The routers of AS100 are using OSPF as the IGP.

Why iBGP must peer using loopback IPs:
Suppose R2 and R3 were configured to peer using their link IPs (10.2.3.2 and 10.2.3.3).
Now suppose that this link fails. R2 will not be able to reach 10.2.3.3, and R3 will not be able to reach 10.2.3.2. This iBGP peering session will no longer be possible, so the session will drop. Since this is a full mesh of iBGP peerings, the only way R2 knows that AS600 exists is by learning it from R3. When the iBGP R2<>R3 session drops, R2 will lose its BGP path to AS600. This means that AS500 will no longer be able to reach AS600. (R3 will conversely lose its routing information about AS500).

This is silly because R2 could still reach R3 via R1, and OSPF within the ASN will quickly converge on this alternate path. There’s an available path from AS500 to AS600 but it will fail thanks to AS100 having iBGP configured improperly.

Why can’t OSPF save the day?
One may ask: if OSPF sees the way around via R1, why can’t R2 reach 10.2.3.3 via R1 as well?
It can potentially, but not using source IP of 10.2.3.2 - to see why, let’s look at the various failure cases.

There are three possible modes of failure of the link:

  1. The link fails in a way that both routers see the interface go down. In this case, those IP addresses are not reachable because no interfaces in that network are active.
  2. The link fails in a way that both routers see the interface as physically up, but transmissions across the link all fail. In this scenario, both routers will still have a link into the 10.2.3.0/24 network, and both will consider the network as being locally connected. So R2 will always send packets to 10.2.3.3 via its directly-connected interface. It would never dream about looking at OSPF for the rest of that /24 network’s hosts. Same for R3.
  3. R2 sees the link as physically down while R3 sees it as physically up. In this scenario, R2 would see reachability to 10.2.3.0/24 in OSPF via R1. R2 would be able to ping 10.2.3.3, but NOT using its 10.2.3.2 address. R2 would use one of its other interfaces’ IP address to communicate with 10.2.3.3 because the interface with the source IP address 10.2.3.2 is down. That doesn’t matter though, because even if it DID use 10.2.3.2 as the source IP, R3 could not properly reply. R3 would send packets to 10.2.3.2 out via the directly connected link, which would fail to deliver the packets to R2. Obviously this is the same case if R3 is the one with interface down and R2 is the one with interface up…

In all cases, there is no possible round-trip communication between 10.2.3.2 and 10.2.3.3. Thus, if this link fails, and iBGP is configured using these endpoint IPs, iBGP will fail.

The loopback IP is ALWAYS reachable
So long as one of the routers has an OSPF neighbor available, its loopback IP address will be advertised into OSPF, so no matter what changes happen to the topology, the loopback IP is reachable. So if R2 and R3 are properly configured to use iBGP via their respective loopback IPs, the iBGP session between them will not drop if the physical topology changes.

Remember: IBGP does not act like an IGP such as OSPF
The next hop IP of iBGP is always going to be the next hop IP learned from the original EBGP peering. So when R3 learns about 6.0.0.0/8 from AS600, the next hop will be 10.3.6.6 - and then R3 will tell R1 and R2 about this prefix. R1’s entry will show the next hop as 10.3.6.6, which in turn must be reachable via some other route (typically OSPF). Thus iBGP routes tend to be “recursive next hop” type routes. This is so that each router can make a BGP decision based on how far away the next-hop destination is. This simple topology doesn’t make it obvious, but if you were to imagine a much larger network, and multiple ways to reach the remote ASNs, the reason for this behavior becomes more apparent.
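If the lab routers were running RouterOS v6 (the lab platform isn't stated, so that is an assumption), something like the following could be run on R1 to see the recursive lookup for yourself. The addresses are the ones from the lab description:

```routeros
# list BGP-learned routes, then look at the AS600 prefix in detail
/ip route print where bgp
/ip route print detail where dst-address=6.0.0.0/8
# check that the eBGP next hop itself resolves through some other route
/ip route check 10.3.6.6
```

If the last command cannot resolve 10.3.6.6, the iBGP route has nothing to recurse through and should show up as inactive.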

Say that router X is comparing two paths to Google learned via iBGP from routers Y and Z. If the metrics to reach Google via Y and Z are otherwise equal, router X is going to choose based on the distance to router Y or Z. Under normal conditions, the path from router X to router Y might be the shortest, so it will prefer the Google path from router Y’s peer. But if the topology changes and router Z is now closer to X than router Y, router X will start preferring to reach Google via router Z. Router X will proceed to update its eBGP neighbors about this change as well. The eBGP neighbors of router X may not like something about the path through your network via peer Z and choose one of their other peers instead. Whenever X sees that it’s shorter to go through Y again, X will update the eBGP peers who may now like your network again over their other peers.

I didn’t mean to go quite so far into BGP theory in general, but I think it’s important to understand why iBGP preserves the next hop IP where an IGP will cause the next hop to be the neighboring router’s IP. It’s because of this behavior that you want the iBGP neighbors to be able to reach each other in all circumstances, which is the reason you should use loopback interfaces for iBGP.

Wow! Thanks for the detailed explanation! :smiley:

I figured it had to do with a failure scenario but I hadn’t analyzed it enough in my mind to understand what would happen exactly.

It makes perfect sense now.

I quickly tried it on my network (similar topology as your lab) but I couldn’t get it to work.
The moment the iBGP peering is established it gets disconnected immediately.
But I did see (and confirm) what happens when an iBGP peer goes down.

I’ll have to look into it.
I’ve had this setup for well over a decade and I am baffled I haven’t noticed that issue before!
It kinda helped that the BGP network (not internet) consists only of transit ASes with no filtering whatsoever and tons of alternative paths, so the problem was not easily visible (everything kept working, albeit using longer paths depending on each router’s view of the network).

Great day today! I learned something new :smiley:

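I’d say to make sure that the two ends agree on addresses: the remote address configured on each router should be the same IP the other router uses as its local (update source) address.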
In WinBox, the neighbor configuration points basically look like this:
General Tab:

  • Remote Address: remote router’s loopback IP
  • Remote AS: same as local ASN (That’s what causes it to be an iBGP session)
  • Nexthop Choice: default

I’m not immediately certain as to whether BGP requires the timers to match with the peer’s settings the same way OSPF does, but it might help to make sure those match. Route Reflect is not a part of this, but if your network uses RR, then obviously the RR should have that option enabled on all RR-clients. You don’t need to check “multihop” for iBGP peers.

Advanced tab:
Be sure that Update Source is set to be your loopback interface. You can use different update sources for each peer, so you need to remember to go set that every time. I suspect that this is the missing piece of the puzzle for you, but it may be other things like firewall filter rules, etc.
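The CLI equivalent of those WinBox settings would be roughly the following in RouterOS v6. The ASN 64500, the peer name and the 12.34.56.x loopback addresses are placeholders, and it assumes a loopback interface named loopback already exists on each router:

```routeros
# on r1; r2 mirrors this with the addresses swapped
/routing bgp instance set default as=64500 router-id=12.34.56.1
/routing bgp peer
add name=ibgp-r2 instance=default remote-address=12.34.56.2 remote-as=64500 \
    update-source=loopback nexthop-choice=default
```

The session is iBGP simply because remote-as equals the instance's own AS. It will only establish if each loopback is reachable from the other router through the IGP or a static route.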

Yeah I tried those already.

If it was a firewall issue it would have timed out, but in my case the BGP peering establishes and then disconnects right away on both ends.
Not a timer issue either - those are configured properly and have been working for years, I don’t see why they would cause a problem by simply changing the remote-address of the peers.
I also tried update source option but it didn’t work either. I don’t remember what exactly this does. Should I select the interface or enter the loopback IP on this field?

I’ll look more into it later on because it’s still daytime here and I don’t want to cause any more flapping on the network now :slight_smile:


Just a sidenote since I kinda hijacked the topic with this.
The use of loopback IPs on the iBGP peers is only needed when having 3 or more routers in the same AS, correct?

I mean, with only 2 routers (like packetkicker wants) there’s no alternative path from R1 to R2.
But I get it’s a best practice to adopt either way :wink:

Heh - unhijacking the thread…

It may seem that way, but if you have a pair of failover routers with a direct cable between the two, then you actually have two paths between the routers. You have the direct cable AND you have the customer-facing LAN interfaces where they’re doing the failover protection (the VRRP-protected network)

In this exact scenario, it doesn’t make much difference, but suppose your scenario were such that you had 2 border routers, but they were at different locations and they had different circuits connecting them. It’s easily obvious that one circuit may fail, causing the routers to use the other circuit. This would be a valid case where a 2-router BGP configuration should still use loopback interfaces.

Basically, I can’t think of any reason in a “best practice” deployment where you would actively decide to use a physical interface to peer two iBGP routers because you didn’t want them talking to each other when that link was down while other paths were available. In short - I’d say it’s easier to just make your habits follow best practice in all cases so you don’t need to remember “do this in this situation, but do something else in all other cases”

Hi,

Thanks for the advice.

I have setup /32 loopbacks on each router and moved to these IPs for peering. I changed the outbound routing filter for R2 to be:

/routing filter add action=accept chain=isp-out prefix=12.34.56.0/24 set-bgp-prepend=3

This has forced all inbound traffic to R1 and using VRRP forces outbound traffic to R1. I have added a netwatch event on R1 to check if it can reach the internet and if not change the VRRP priority to force outbound traffic to R2.

/tool netwatch
add down-script=vrrp-prio-down host=8.8.8.8 interval=5s up-script=vrrp-prio-up
/system script
# 8.8.8.8 unreachable: drop the VRRP priority so r2 takes over the gateways
add name=vrrp-prio-down owner=admin policy=ftp,reboot,read,write,policy,test,password,sniff,sensitive source="/interface vrrp set priority=1 [find priority=255]"
# 8.8.8.8 reachable again: restore the original priority
add name=vrrp-prio-up owner=admin policy=ftp,reboot,read,write,policy,test,password,sniff,sensitive source="/interface vrrp set priority=255 [find priority=1]"

Seems to work OK in the testing I have done so far. If BGP peering is lost between R1 and the ISP then traffic fails over to R2, and if R1 is offline traffic fails over to R2. For now this is ideal as it offers protection against hardware failure and gives me comfort to perform software updates etc. without causing an outage.

Thanks again for the help.

Regards,
PK