BGP instance suddenly disabled?

Hi
A strange thing happened to us today: a CCR1036 running RouterOS v6.36 suddenly disabled (all by itself!) a single running BGP instance.

We investigated the issue. The system is only manageable via private IPs and is not exposed to remote access, so we are quite sure it did this by itself.

This instance uses its own VRF routing table and speaks to exactly one external peer.

One of our tech staff told a story about previous RouterOS releases where simply viewing a full BGP table could crash the BGP instances. However, another instance on this box, running a /24-filtered full table, was still running fine without issues (it also uses its own VRF routing table).

We upgraded the RouterBOARD firmware only about 10 hours earlier, so this might be connected, though we don’t see how. The system’s uptime was therefore only about 10 hours when the issue arose; before that, this BGP instance had been stable and working for several weeks.

Any hints are greatly appreciated,
hk

Don’t rule out a junior admin clicking the disable button and not owning up to it.

I was once involved with a total outage of a service provider’s VoIP system. The issue ended up being an obscure checkbox being checked deep down inside the configurations somewhere - nobody admitted to it. We had to do lots of log diving, looking for security breaches, etc… and eventually the guilty party fessed up.

Not ruling the junior out, but on the other hand I’m quite a trusting guy, because people at our company are allowed to make errors and not get fired :slight_smile:

On the positive side: we now do extensive BGP monitoring for the MikroTik boxes in our network…
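
A minimal sketch of what such a check could look like, assuming the state of each BGP instance has already been collected by some other means (SNMP, the API, a script; that step is not shown) into a name-to-state mapping. The function name `diff_bgp_state`, the instance names `vrf-a` and `vrf-b`, and the state strings are all hypothetical and only serve the illustration:

```python
from typing import Dict, List


def diff_bgp_state(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return one alert line per instance whose state changed between two polls."""
    alerts = []
    # Look at every instance seen in either poll, so that an instance
    # that vanished or newly appeared is reported as well.
    for name in sorted(previous.keys() | current.keys()):
        old = previous.get(name, "missing")
        new = current.get(name, "missing")
        if old != new:
            alerts.append(f"{name}: {old} -> {new}")
    return alerts


# Hypothetical snapshots from two consecutive polls.
previous = {"vrf-a": "established", "vrf-b": "established"}
current = {"vrf-a": "disabled", "vrf-b": "established"}

for alert in diff_bgp_state(previous, current):
    print(alert)
```

Comparing against the previous poll, rather than only checking for "established", also catches an instance that was disabled or removed on purpose or by accident, which is the case discussed in this thread.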

That’s good - but just to make sure there wasn’t a language thing - by “rule out” I didn’t mean “remove administrative privileges” - I meant don’t forget to consider the possibility that it might have been simple human error that caused the bounce.

One of my complaints about ROS’s BGP implementation is that it seems too eager to bounce the neighbor connections whenever you change the BGP configuration. Perhaps some changes are being made which cause BGP to restart the sessions.

Hi,
No language problem here, he still has access :wink:

I can only second your opinion on MikroTik’s BGP: while it likes to restart sessions when its configuration changes, it on the other hand needs an extra push to pick up updated filter rules :slight_smile:

regards
hk