RouterOS 3.20 BGP peer stability issue

Hello all,
We had been using the RB1000 as our border routing platform successfully until the upgrade to RouterOS v3.20. After the upgrade to 3.20, peers were dropping off at random.

We had to downgrade back to RouterOS v3.17. Everything is OK and stable there, with ~800k prefixes in the router and peering with ~100 peers.
One of our peers reported exactly the same problem. Is this a known issue?

Astib

Please work with support on this: support@mikrotik.com. Your additional input and troubleshooting with support will help get this resolved (and we will appreciate it).

OK, I will contact support@ soon.

I also have this BGP peer stability issue.
However, I'm using a server running ROS 3.20 with the routing-test package, because the RB1000 doesn't work with a full Internet routing table.
Simple topology: BGP A - BGP B
When the customer's MikroTik (BGP B) disables its BGP peering to A for, say, a day or more, and B then reactivates the peering, the session won't come up. I have to manually disable and re-enable
the BGP process on A to wake the peering up. I think this is caused by a BGP process on A stuck in the OpenSent/Active state while looking for a neighbor that had been manually disabled.
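
The suspicion above (a session on A that never leaves a non-established state until someone resets it by hand) can be pictured with the standard BGP session states. This is only a minimal sketch: the state names come from RFC 4271, but the transition table is a simplified, textbook-style subset, the event names are made up for illustration, and none of it reflects how RouterOS actually implements BGP.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# Simplified subset of transitions; event names are hypothetical labels.
TRANSITIONS = {
    (State.IDLE, "manual_start"): State.CONNECT,
    (State.CONNECT, "tcp_connected"): State.OPEN_SENT,
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "retry_timer"): State.CONNECT,
    (State.ACTIVE, "tcp_connected"): State.OPEN_SENT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
}


def step(state, event):
    """Return the next state; unknown events leave the state unchanged."""
    if event == "manual_stop":  # disabling the peer always drops back to Idle
        return State.IDLE
    return TRANSITIONS.get((state, event), state)


def run(state, events):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state


# A keeps retrying a neighbor that is administratively disabled.
stuck = run(State.IDLE, ["manual_start", "tcp_failed", "retry_timer", "tcp_failed"])

# Disable and re-enable on A, then a normal session setup.
recovered = run(stuck, ["manual_stop", "manual_start", "tcp_connected",
                        "open_received", "keepalive_received"])

print(stuck.value, "->", recovered.value)
```

In this model a successful TCP connection alone would be enough to leave Active, so the reported behaviour would mean those events are not being acted on by A. That is just the hypothesis from the post restated, not a confirmed diagnosis.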

I have read the 3.21 and 3.22 changelogs, but nothing in them addresses this issue, so I'll wait for the next release, perhaps.

rgds,

We see the same problem on an RB1000 with ROS 3.16 and routing-test, carrying a full routing table.
Doing a Resend solves the problem. The problem appears after a reboot of the RB1000.
I have read every changelog since 3.16 but found nothing regarding this.
We have a second problem on this router. After some changes on a tower, this router
lost a single OSPF route to a client behind this tower. Every other OSPF router in our
network knows this route. The RB1000 never learned this route again (3 weeks). Even a
reboot of the router announcing this route did not help. A reboot of the RB1000
solved this.
No other known problems on this RB1000. Uptime was 99 days before this manual reboot.

So looking forward to a new stable ROS for BGP/OSPF on RB1000.

Stefan

I didn't fully test it, but the rebuild of the full routing table in 3.22 is very fast and smooth and takes about 10 secs (before, it was about 40 secs).
I'm going to test soon with 2 full routing tables.

After 10 mins the peers still hadn't been able to build the second routing table. :confused:

 /system resource> pr
                   uptime: 1d16h11m11s
                  version: "3.22"
              free-memory: 1708808kB
             total-memory: 1945820kB
                      cpu: "Intel(R)"
                cpu-count: 2
            cpu-frequency: 3000MHz
                 cpu-load: 0
           free-hdd-space: 89074kB
          total-hdd-space: 122703kB
  write-sect-since-reboot: 622
         write-sect-total: 52759440
        architecture-name: "x86"
               board-name: "x86"



 /system resource pr
                   uptime: 1d15h8m49s
                  version: "3.22"
              free-memory: 756952kB
             total-memory: 1027492kB
                      cpu: "Intel(R)"
                cpu-count: 1
            cpu-frequency: 3200MHz
                 cpu-load: 1
           free-hdd-space: 443389kB
          total-hdd-space: 484630kB
  write-sect-since-reboot: 42258
         write-sect-total: 42258
        architecture-name: "x86"
               board-name: "x86"
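
For anyone comparing the two boxes, the memory actually in use can be worked out from the printouts above. The following is a rough sketch, assuming the output is pasted as plain `key: value` text; the helper names (`parse_resource`, `memory_used_kb`) are made up for this example, and only the memory lines from each printout are included.

```python
import re

# Excerpts of the two "/system resource print" outputs quoted above.
ROUTER_1 = """
              free-memory: 1708808kB
             total-memory: 1945820kB
"""

ROUTER_2 = """
              free-memory: 756952kB
             total-memory: 1027492kB
"""


def parse_resource(text):
    """Turn 'key: value' lines into a dict, stripping spaces and quotes."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


def kb(value):
    """Convert a string such as '1708808kB' to an integer number of kB."""
    match = re.fullmatch(r"(\d+)kB", value)
    if match is None:
        raise ValueError(f"unexpected memory value: {value!r}")
    return int(match.group(1))


def memory_used_kb(fields):
    """Memory in use = total-memory minus free-memory."""
    return kb(fields["total-memory"]) - kb(fields["free-memory"])


print(memory_used_kb(parse_resource(ROUTER_1)))  # 237012
print(memory_used_kb(parse_resource(ROUTER_2)))  # 270540
```

On these figures neither router looks short of memory at the moment the output was taken, and cpu-load is 0 and 1 respectively, although that on its own does not explain why the second table never finished building.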

Should I try routing-test?

If you're going to use prefix filters and manipulate routes, then I suggest you should, in fact must, or you will end up going nuts trying to track down the anomalies that show up.

And after switching to routing-test, try testing eBGP multihop across around 3 hops and see what happens next: will you still receive full routes, or will it get stuck after a few routes?

And for the information of others who are facing the BGP stuck problem like me: after disabling and then re-enabling the peering, you must do the same for your other peer(s), or the router won't re-forward the prefixes received from the first hung peer.

BTW, for the admins, may I suggest that when the prefix filter is changed in the prefix-in and prefix-out selection on a BGP peer,
the BGP peering should not automatically go down for a while and come up again. It should have a route refresh capability; on Cisco, for example, the command is clear ip bgp … in/out.
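
To make the difference in the suggestion above concrete, here is a toy model contrasting a session bounce with an in-place route refresh after a filter change. It is only a sketch under stated assumptions: the function names, the max-prefix-length filter and the documentation prefixes are all invented for illustration, and it does not describe how RouterOS or Cisco IOS implement either behaviour.

```python
import ipaddress


def accept_max_len(max_len):
    """Hypothetical inbound filter: accept prefixes no longer than max_len."""
    return lambda prefix: ipaddress.ip_network(prefix).prefixlen <= max_len


def hard_reset(received, new_filter):
    """Session bounce: everything is withdrawn, then relearned through the new filter."""
    during = set()  # nothing from this peer is installed while the session is down
    after = {p for p in received if new_filter(p)}
    return during, after


def soft_refresh(installed, received, new_filter):
    """Route refresh: the session stays up and routes are re-evaluated in place."""
    after = {p for p in received if new_filter(p)}
    during = installed & after  # routes accepted by both filters are never removed
    return during, after


received = {"192.0.2.0/24", "198.51.100.0/25", "203.0.113.0/24"}
installed = {p for p in received if accept_max_len(25)(p)}  # old filter: up to /25
new_filter = accept_max_len(24)                             # new filter: up to /24

print(hard_reset(received, new_filter)[0])               # empty set during the bounce
print(sorted(soft_refresh(installed, received, new_filter)[0]))
```

Both paths end with the same accepted routes; the difference is only what stays installed while the change is being applied.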

Hello all,
Regarding the instability issue I reported: we are testing 3.24 on our border routers, and all BGP peers have been stable for several days. We run several tens of peers and 3x full tables on RB1000.
Good work, looks like another BGP stable release after 3.17!

Was it stable using the routing or the routing-test package on 3.24?

We run the “routing” package and everything works for us :slight_smile:

Do they still make changes in the regular package?

Yes, bugs are fixed in the regular package; however, the priority is routing-test.