I have an RB1000 as the core router for a small network (approx. 600 home users). It announces a /21 prefix via BGP. The internal network is routed with OSPF, and the core announces the default route into the internal OSPF network. When the RB1000 receives the routes from the remote peer, its CPU goes to 100% for approx. 3 or 4 minutes: Winbox crashes, SSH doesn’t work, OSPF goes down, and only telnet works (very slowly). I bought another RB1000 and the problem is the same. I’ve tested every version from 3.17 to 3.22, with both the routing and routing-test packages. The problem persists!
I think this is a bug. Is there any workaround? If this problem persists I’ll have to change the core to another vendor. In the future I would like to use 2 RB1000s in VRRP, but if the problem persists that scenario is impossible.
Sometimes, when adding or removing BGP filters (which triggers a route reload), the CPU hangs at 100% for some time and OSPF goes down with “discarding description packet: wrong neighbor state” or “invalid sequence number”.
Other times I need to SSH in and reboot my box.
I have 1 upstream peer with full table, 1 downstream peer with full table, 1 transit peer with full table and 1 upstream ipv6 peer with full table.
I have this problem on 3.22 with routing-test, but I’ve already tried older releases too.
Please send a supout to support at mikrotik.com and explain your problem. It will never get fixed unless more people help them figure out the problem.
OK, I will send it ASAP, but I think it’s a “general” problem: get an RB1000, configure a BGP peer and import a full table, and during the download the CPU is at 100%, without any other strange configuration!
Ciao Michele and Giuseppe.
I’ve encountered the same issue some time ago and couldn’t get it fixed with only one machine.
I then set up a BGP machine doing EBGP with the provider (full routing table), and IBGP from it to a second machine (I called it the Gateway), propagating only the default route.
The Gateway is linked via IBGP with all the EBGP peers I have, and via OSPF with the internal network.
I’m working on setting up a second Gateway machine to work with the first one to improve redundancy and failover.
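To make the "propagate default only" idea concrete, here is a minimal Python sketch of the export policy between the BGP machine and the Gateway. It is only a model, not RouterOS configuration: the function name `export_to_gateway` and the sample prefixes (documentation ranges) are made up for illustration. The point is that the Gateway ends up with a single BGP route instead of the full table.

```python
import ipaddress

DEFAULT_ROUTE = ipaddress.ip_network("0.0.0.0/0")

def export_to_gateway(border_prefixes, originate_default=True):
    """Hypothetical export policy of the BGP machine towards the Gateway.

    Every specific prefix learned over EBGP is dropped; only a default
    route is passed on (originated locally if requested).
    """
    exported = {
        prefix
        for prefix in map(ipaddress.ip_network, border_prefixes)
        if prefix == DEFAULT_ROUTE
    }
    if originate_default:
        exported.add(DEFAULT_ROUTE)
    return sorted(str(prefix) for prefix in exported)

# Toy stand-in for a full table (documentation prefixes, not real routes).
full_table = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]
print(export_to_gateway(full_table))
```

With a real full table the input would be roughly 300k prefixes, but the Gateway would still only receive the one default route.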
The systems are x86 quad-core Xeon machines running RouterOS 3.22, but I think an RB1000 would be a better choice.
Hope it Helps
Renato
I have 3 peers, and it’s not full-mesh IBGP but centralized (all peers connected to the Gateway).
With only one peer, I don’t see the use of BGP filters, since the full routing table is then only “an expanded default route”.
Since you propagate a /21, I assume you have an ISP license. Shouldn’t you have to peer with at least 2 providers?
I’ve tested for this problem on an x86 box (P4, 2.8 GHz, 1 GB RAM) and it appears there too. BGP convergence takes very long (at least 3-4 minutes) and the CPU is at 100% the whole time.
Doesn’t look like a RAM shortage as the free RAM is still at 780 MB or so.
Hi, I have the same problem with an RB1000, even with 300 MB of RAM still free. When I change a filter it hangs for more than one minute. Sometimes Winbox/SSH hangs too.
Setup: one external BGP session, one internal, OSPF and VRRP.
Funny that the same config on another router, an x86 P4 1.8, works better: 100% CPU lasts only several seconds (15-20) after a filter change. But prefixes load much more slowly than on the RB1000.
Mr Mikrotik, when you see so many posts, it means you really have something to do.
Of course I’ll provide supout when asked. Even read-only access.
Cheers.
Darek
CPU = C2D i7400, 2.8 GHz
RAM = 2 GB 1066 MHz DC (2x1 GB)
When I take a full table from one peer, the CPU goes to 17% (I can see this over telnet; if I have Winbox open, Winbox blocks).
Taking a full table of 300k routes takes about 15 sec.
When the peer goes down and MT starts deleting the 300k routes, the CPU goes to 50% and deleting takes about 30 sec.
If I take full tables from 3 peers (900k routes), does the CPU load go x3 (17% x 3, 50% x 3 (100%))?
Or does the time go x3 (download = 20 sec x 3 = 60 sec, deleting = 40 sec x 3 = 120 sec)?
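For the "time x3" case, the arithmetic can be sketched in a few lines of Python. This assumes (and it is only an assumption) that route processing is serialized and runs at a constant rate, so time grows linearly with the number of routes. The inputs are the measured figures above (about 15 sec to install and about 30 sec to delete 300k routes); the function names are made up for the example.

```python
def routes_per_second(routes, seconds):
    """Throughput implied by one measurement."""
    return routes / seconds

def estimate_seconds(total_routes, rate):
    """Time needed for total_routes at a constant rate (linear-scaling assumption)."""
    return total_routes / rate

ROUTES_PER_PEER = 300_000
PEERS = 3

# Rates implied by the single-peer measurements in the post.
install_rate = routes_per_second(ROUTES_PER_PEER, 15)  # routes installed per second
delete_rate = routes_per_second(ROUTES_PER_PEER, 30)   # routes deleted per second

install_time = estimate_seconds(PEERS * ROUTES_PER_PEER, install_rate)
delete_time = estimate_seconds(PEERS * ROUTES_PER_PEER, delete_rate)

print(f"install: {install_time:.0f} s, delete: {delete_time:.0f} s")
```

Under that assumption three full tables would take about 45 sec to install and about 90 sec to delete. Whether the router really behaves linearly (or instead pushes CPU load up) is exactly the open question.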
Any (full table capable) Cisco/Juniper/Quagga/whatever router I’ve ever worked with.
For example, a Juniper J4350 (a Celeron 2.5 GHz based software router) will converge 2 full tables
in about 20-30 seconds. CPU usage goes up, of course, but the router stays responsive and keeps processing traffic while converging, too…
In fact, the ROS problem might be a simple process scheduling issue.
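To illustrate what a scheduling issue could look like, here is a toy Python sketch. It says nothing about how RouterOS actually works internally; it only models the general idea that a route-processing task which gives up the CPU after small batches leaves room for a management session, while one that processes the whole table in a single go does not. All names (`install_routes`, `management_session`, `run_round_robin`) are hypothetical.

```python
from collections import deque

def install_routes(n_routes, batch_size, log):
    """Install routes in batches, yielding to the scheduler after each batch."""
    done = 0
    while done < n_routes:
        done += min(batch_size, n_routes - done)
        log.append(("bgp", done))
        yield

def management_session(n_ticks, log):
    """Stand-in for Winbox/SSH: needs a short time slice now and then."""
    for tick in range(n_ticks):
        log.append(("mgmt", tick))
        yield

def run_round_robin(tasks):
    """Very small cooperative scheduler: one step per task, in turn."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not requeue it
        queue.append(task)

fine, coarse = [], []
# Small batches: management gets a turn after every 100k routes.
run_round_robin([install_routes(300_000, 100_000, fine), management_session(3, fine)])
# One huge batch: management only runs once the whole table is installed.
run_round_robin([install_routes(300_000, 300_000, coarse), management_session(3, coarse)])

print(fine[:2])
print(coarse[:2])
```

In the first run the management task gets its first turn after 100k routes, in the second only after all 300k, which is roughly the "box is unreachable until BGP is done" behaviour described in this thread.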
Hello guys!
I’ve performed another test in a lab environment: I tested Quagga with 2 REAL full tables.
My old P4 2.0 GHz with 512 MB RAM and Debian Linux (lenny) gets the full tables in a few seconds, with the CPU at 50% and memory usage at 380 MB.
Best Regards Giuseppe!
PS: I’ve been testing the RB1000 with ONE full table on 3.23, with both routing and routing-test. The problem persists.
And let me guess, not one of you sent details to support at mikrotik about it and asked for a fix? Without people reporting their problems (opening a ticket) it will never get fixed. Please guys, if you have problems with BGP, email support. We need stable BGP and your help to get it there.