We recently upgraded two CCR2004-1G-12S+2XS routers from RouterOS v7.17.1 to 7.18.1, and the iBGP session between them is now misbehaving. Routes appear to be neither advertised nor received. Using routing/bgp/advertisements print, I can see that one of the routers is advertising correctly, but the routes are not received on the other router. The other router is not advertising at all. If I reboot one of them, the roles flip: the other router advertises, and the first one stops.
Using the session print command, I can see a prefix-count of 0 on both routers, but an uptime as long as the system uptime.
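For reference, the checks described so far look roughly like this on RouterOS v7. This is only a sketch: the full menu paths are assumed from the v7 command tree, and the output columns may differ between releases.

```
# Show what this router is advertising to its BGP peers
/routing/bgp/advertisements/print

# Show session state, including uptime and prefix-count
/routing/bgp/session/print

# Count the BGP routes actually installed in the routing table
/ip/route/print count-only where bgp
```

A session with a long uptime and a prefix-count of 0, together with a BGP route count of 0, matches the symptom described here.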
I have tried removing the output-filter, but that has no effect.
I have checked with a packet capture on both ends:
Router1: correctly advertises routes over the iBGP session.
Router2: correctly receives the advertised routes.
However, no routes appear in the routing table on router2, and there is no input filter on router2. I have also tried using a blank (accept) filter, but there was no change.
UPDATE: Looking at the BGP debug log on the receiving router, I can confirm that it parses the UPDATE message as expected. However, still no routes appear in the routing table.
I believe @pe1chl mentioned this issue in one of the release topics. Routes are being received but not installed in the routing table. We’ll need MikroTik to look into this. Create a support ticket at https://help.mikrotik.com/servicedesk/servicedesk/ if you haven’t already done so.
More or less by accident, I tried changing the affinity from “remote as” to “main.” The routes are now received, but this may well be by chance. It looks like a race condition or something similar in route handling or BGP in 7.18.1.
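A minimal sketch of that affinity change, for anyone who wants to try it. The connection name ibgp-r2 is a placeholder, and I am assuming the setting is the input.affinity property on the BGP connection (there is a matching output.affinity). Which direction was changed is not stated above, so treat the choice of input here as a guess.

```
# "ibgp-r2" is a placeholder connection name; substitute your own
/routing/bgp/connection/set [find name="ibgp-r2"] input.affinity=main

# Confirm the setting afterwards
/routing/bgp/connection/print detail where name="ibgp-r2"
```

If this really is a race condition, the change may only alter the timing rather than fix anything, so it is worth re-checking the session after the next reboot.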
I’m having a similar issue with eBGP, though, between a CCR2116 and a couple of CHR instances in a datacenter.
On the CCR2116 I’m using affinity “alone” for both input and output, since it is a multi-core system, while on the CHRs (2 cores) I was using “main”. All routes were installed properly at startup, but after some hours they disappeared on one of the CHRs, and I had to restart the BGP session to make them appear again. Then the same thing happened on the other CHR.
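One way to restart a session is to toggle the connection off and on again. This is a sketch with a placeholder connection name, and not necessarily how the restart was done in this case.

```
# "ebgp-ccr2116" is a placeholder connection name; substitute your own
/routing/bgp/connection/disable [find name="ebgp-ccr2116"]
/routing/bgp/connection/enable [find name="ebgp-ccr2116"]
```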
Now I’m testing both CHRs with the affinity field left unset (which means it defaults to “alone”), and I’m monitoring whether I lose the routes again.
So far so good, but it needs more time under observation. In my case, changing the affinity on the CHRs was a recent modification, and since I had never had these failures before, I decided to go back to the default “alone”, even though the virtual machines don’t have many cores.