lan
newbie
Topic Author
Posts: 32
Joined: Mon Oct 22, 2007 4:06 pm
Location: Verona, Italy

BGP full table Large CPU usage on rb1000 3.22

Sun Apr 19, 2009 9:08 pm

Hello!

I have an RB1000 as the core router for a small network (approx. 600 home users). It announces a /21 prefix via BGP. The internal network is routed with OSPF, and the core announces a default route into the internal OSPF network. When the RB1000 receives the routes from the remote peer, its CPU goes to 100% for approximately 3 or 4 minutes. Winbox crashes, SSH doesn't work, OSPF goes down, and only telnet works (very slowly). I bought another RB1000 and the problem is the same. I have tested every version from 3.17 to 3.22, with both the routing and routing-test packages. The problem persists!

I think this is a bug. Is there any workaround? If this problem persists I must change the core to another vendor. In the future I would like to use two RB1000s in VRRP, but if the problem persists this scenario is impossible.


Best regards
Giuseppe
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Sun Apr 19, 2009 9:30 pm

i have a similar problem.

Sometimes, when adding or removing BGP filters (which triggers a route reload), the CPU hangs at 100% for some time and OSPF goes down with "discarding description packet: wrong neighbor state" or "invalid sequence number".

Other times I need to SSH in and reboot my box.

I have 1 upstream peer with a full table, 1 downstream peer with a full table, 1 transit peer with a full table and 1 upstream IPv6 peer with a full table.

I have this problem on 3.22 with routing-test, but I have already tried older releases too.

Thanks, bye.

Michele Pietravalle
 
changeip
Forum Guru
Posts: 3830
Joined: Fri May 28, 2004 5:22 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Mon Apr 20, 2009 5:31 am

please send a supout to support at mikrotik.com and explain your problem. it will never get fixed if more people don't help them figure out the problem.
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Mon Apr 20, 2009 9:56 am

OK, I will send it ASAP, but I think it's a "general" problem: get an RB1000, configure a BGP peer and import a full table; during the download the CPU is at 100%... without any other strange configuration! :)
 
Muqatil
Trainer
Posts: 573
Joined: Mon Mar 03, 2008 1:03 pm
Location: London - UK

Re: BGP full table Large CPU usage on rb1000 3.22

Mon Apr 20, 2009 11:24 am

Ciao Michele and Giuseppe.
I've encountered the same issue some time ago and couldn't get it fixed with only one machine.
I then set up a BGP machine doing EBGP with the provider (full routing table), then IBGP to a second machine (I call it the Gateway), propagating only a default route.
The Gateway is linked in IBGP with all the EBGP peers i got, and OSPF with the internal network.
I'm working on setting up a second Gateway machine to work with the first one to improve redundancy and failover.
The systems are x86 quad-core Xeons running RouterOS 3.22. But I think an RB1000 would be a better choice.
Hope it Helps
Renato
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Mon Apr 20, 2009 11:46 am

ciao :)

Yes, it's a good "workaround"... how many peers? Full iBGP mesh?

But what if a network has only one peer? I don't like that changing a small filter can crash the RB1000!!

Thanks!
 
Muqatil
Trainer
Posts: 573
Joined: Mon Mar 03, 2008 1:03 pm
Location: London - UK

Re: BGP full table Large CPU usage on rb1000 3.22

Mon Apr 20, 2009 12:35 pm

I got 3 peers and it's not full mesh IBGP, but centralized ( all peers connected to the Gateway)
With only one peer, I don't see the use of BGP filters, since the full routing table is only "an expanded default route".
Since you announce a /21, I assume you have an ISP license; shouldn't you have to peer with at least 2 providers?
 
lan
newbie
Topic Author
Posts: 32
Joined: Mon Oct 22, 2007 4:06 pm
Location: Verona, Italy

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Apr 21, 2009 4:59 pm

Hello.

I tested 3.23 and 3.23 routing-test. The problem is the same.

To MikroTik:
Would upgrading the RAM to 2 GB be a possible solution?


Thanks
Best regards

Giuseppe
 
bofh666ie
just joined
Posts: 12
Joined: Sat Feb 14, 2009 9:57 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Fri Apr 24, 2009 1:41 am

I've tested against this problem with an x86 (P4, 2.8GHz) with 1GB RAM and it appears in this case too. BGP convergence takes very long (at least 3-4 minutes) and the CPU is at 100% the whole time.

Doesn't look like a RAM shortage as the free RAM is still at 780 MB or so.
 
oskrobad
just joined
Posts: 13
Joined: Sun Mar 25, 2007 9:18 pm
Location: Warsaw, Poland

Re: BGP full table Large CPU usage on rb1000 3.22

Thu Apr 30, 2009 6:20 pm

Hi, I have the same problem with an RB1000, even with 300 MB of RAM still free. When changing a filter it hangs for more than one minute. Sometimes winbox/ssh hang too.
BGP: one external session, one internal; OSPF, VRRP.

Funny that the same config on another router, an x86 P4 1.8, works better: 100% lasts only several seconds (15-20) after a filter change. But prefixes load much more slowly than on the RB1000.
Mr Mikrotik, when you see so many posts, it means you really have something to do.
Of course I'll provide supout when asked. Even read-only access.
Cheers.
Darek
 
lan
newbie
Topic Author
Posts: 32
Joined: Mon Oct 22, 2007 4:06 pm
Location: Verona, Italy

Re: BGP full table Large CPU usage on rb1000 3.22

Fri May 01, 2009 12:19 pm

oskrobad wrote:
Mr Mikrotik, when you see so many posts, means you really have something to do.
Of course I'll provide supout when asked. Even read-only access.

Best regards
Giuseppe
 
marko_bg
Member Candidate
Posts: 119
Joined: Sat Jun 03, 2006 11:48 am

Re: BGP full table Large CPU usage on rb1000 3.22

Wed May 13, 2009 1:18 am

CPU = C2D i7400 2,8Ghz
RAM = 2GB 1066mhz DC (2x1GB)

When taking a full table from one peer, the CPU goes to 17% (as seen over telnet; if I have winbox open, winbox blocks).
Taking a full table of 300k routes takes about 15 sec.

When the peer goes down and MT starts deleting the 300k routes, the CPU goes to 50% and deleting takes about 30 sec.

If I take full tables from 3 peers, 900k routes, does the CPU go up x 3 (17% x 3, 50% x 3 (100%))?
Or does the time go up x 3 (download = 20 sec x 3 = 60 sec, deleting = 40 sec x 3 = 120 sec)?
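
The two guesses in this question can be put side by side with a few lines of Python. This is only the arithmetic from the post written out, not a model of how RouterOS behaves; the function names are made up for illustration, and the single-peer figures are the ones the post uses in its own multiplication (20 s and 40 s).

```python
def cpu_scaling(cpu_pct, peers):
    """Hypothesis 1: same duration, CPU load multiplied (capped at 100%)."""
    return min(100.0, cpu_pct * peers)


def time_scaling(seconds, peers):
    """Hypothesis 2: same CPU load, duration multiplied."""
    return seconds * peers


# Single-peer figures as used in the post's own multiplication: (CPU %, seconds).
single_peer = {"download": (17.0, 20.0), "deleting": (50.0, 40.0)}
peers = 3

for phase, (cpu, secs) in single_peer.items():
    print(f"{phase}: CPU model -> {cpu_scaling(cpu, peers):.0f}% for {secs:.0f} s; "
          f"time model -> {cpu:.0f}% for {time_scaling(secs, peers):.0f} s")
```

Once the CPU model reaches the 100% cap, any further work can only show up as extra time, so on a loaded box the two effects are hard to tell apart.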
 
changeip
Forum Guru
Posts: 3830
Joined: Fri May 28, 2004 5:22 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed May 13, 2009 1:39 am

Probably both. CPU & time (and memory)

What device on this planet can take a full set of routes from 3 peers and not have any problems?
 
bofh666ie
just joined
Posts: 12
Joined: Sat Feb 14, 2009 9:57 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed May 13, 2009 2:50 am

Any (full table capable) Cisco/Juniper/Quagga/whatever router I've ever worked with.
For example, a Juniper J4350 (Celeron 2.5 GHz based software router) will converge 2 full tables in about 20-30 seconds. CPU usage, of course, goes up but the router stays responsive and will process traffic while converging, too...

In fact, the ROS problem might be a simple process scheduling issue.
 
lan
newbie
Topic Author
Posts: 32
Joined: Mon Oct 22, 2007 4:06 pm
Location: Verona, Italy

Re: BGP full table Large CPU usage on rb1000 3.22

Wed May 13, 2009 12:12 pm

Hello guys!
I've performed another test in a lab environment: I tested Quagga with 2 REAL full tables.
My old P4 2.0 GHz with 512 MB of RAM and Debian Lenny gets the full tables in a few seconds, with the CPU at 50% and memory usage of 380 MB.

Best Regards Giuseppe!

PS: I've been testing the RB1000 with ONE full table on 3.23, both routing and routing-test. The problem persists. :?
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed May 13, 2009 12:49 pm

Yes, I can reproduce the problem only with ROS!

Only one full table, no OSPF. Only an RB1000 connected to an upstream carrier.

If I try to add or modify a routing filter... CPU at 100% and winbox disconnects!
 
changeip
Forum Guru
Posts: 3830
Joined: Fri May 28, 2004 5:22 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed May 13, 2009 7:05 pm

and let me guess, not one of you sent details to support at mikrotik about it and asked for a fix? Without people reporting their problems (opening a ticket) it will never get fixed. Please guys, if you have problems with BGP, email support. We need stable BGP and your help to get it there.
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed May 13, 2009 7:12 pm

opened a new ticket with supout some hours ago! :D
 
changeip
Forum Guru
Posts: 3830
Joined: Fri May 28, 2004 5:22 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed May 13, 2009 8:01 pm

thank you. i encourage anyone else with BGP problems to do the same. We all want stable BGP and we all know it's not there yet.
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Thu May 14, 2009 9:31 am

reply from Mikrotik:
Hello,

This is a known problem. v3.24 will have some improvements, and we are still working to improve current situation.

Regards,
Maris

Good! :D
 
lan
newbie
Topic Author
Posts: 32
Joined: Mon Oct 22, 2007 4:06 pm
Location: Verona, Italy

Re: BGP full table Large CPU usage on rb1000 3.22

Fri May 15, 2009 10:19 am

we are working on the problem, there are already some
improvements that will be included in v3.24 routing-test

:D
 
sioannou
Member Candidate
Posts: 121
Joined: Tue Apr 29, 2008 3:14 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Sat May 16, 2009 4:06 pm

Hello, I had the same problem with 3 BGP peers. 3.22 was causing a lot of problems: at least once every two days the system would hang and need a reboot. I sent a support request to MikroTik and they said to try the 3.23 routing-test. One of the three peers has now been stable for 60 days; the other two drop about once a week with subcode 0, which supposedly means the other side is not reachable (not true).

Hopefully they will find a fix soon; this is becoming a more common problem in the MikroTik community.
 
marko_bg
Member Candidate
Posts: 119
Joined: Sat Jun 03, 2006 11:48 am

Re: BGP full table Large CPU usage on rb1000 3.22

Sat May 16, 2009 4:12 pm

Off topic:

sioannou,
can you look this:
http://forum.mikrotik.com/viewtopic.php ... 96#p155696

thanks
 
marko_bg
Member Candidate
Posts: 119
Joined: Sat Jun 03, 2006 11:48 am

Re: BGP full table Large CPU usage on rb1000 3.22

Sat May 16, 2009 4:15 pm

btw, v3.23

winbox works worse on the routing-test package

if I try to change some rules, the CPU goes to 50% and it needs much more time, 30-40 sec

with the standard package it also goes to 50%, but the job ends in 5-10 sec.

3 peers, 3 full tables, about 900k prefixes/routes
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Sat May 16, 2009 5:12 pm

Good,
my RB1000, with either the standard or the routing-test package, reaches 100% CPU usage and OSPF goes down, even with only 1 full table....
 
lan
newbie
Topic Author
Posts: 32
Joined: Mon Oct 22, 2007 4:06 pm
Location: Verona, Italy

Re: BGP full table Large CPU usage on rb1000 3.22

Fri May 22, 2009 9:18 pm

Hello!
3.24 is out and I've tested it in the lab (with a REAL full table). I set up an iBGP link between two RB1000s and ran 8 tests (enabling and disabling the peer, or rebooting).

In 5 cases the RB1000 works OK: 280k prefixes received in 40 seconds (CPU at 100%), OSPF is OK, winbox too.
In 3 cases winbox closes and the CPU goes to 100% for 4 or 5 minutes, and of course OSPF CRASHES.

I think the problem is partially fixed...

The bug persists :? :? :? :?

PS: ROUTING-TEST of course ;)
PS2: Please mikrotik FIX FIX FIX the bug!!!
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Sat May 23, 2009 11:16 am

Hey guys, be careful: my RB1000 with 3.24 routing-test rebooted automatically because of a kernel failure after I disabled the PPTP server...

i'm writing to mikrotik.
 
mrz
MikroTik Support
Posts: 7053
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: BGP full table Large CPU usage on rb1000 3.22

Mon May 25, 2009 11:50 am

lan wrote:
In 3 cases winbox closes and the CPU goes to 100% for 4 or 5 minutes, and of course OSPF CRASHES.

Did it actually crash (was an autosupout generated)? Or was OSPF unable to receive hello packets? Try increasing the OSPF dead-interval; it might help.
If OSPF crashed, then send the supout file to support@mikrotik.com
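
A rough way to reason about the dead-interval suggestion: an OSPF neighbor is declared down when no hello has been seen from it for the dead interval, so a router that cannot send or process hellos for longer than that loses the adjacency. The Python sketch below checks this against the stall durations reported in this thread. The 10 s hello / 40 s dead values are the commonly used broadcast-network defaults and are an assumption about these routers; the function names are made up, and the model pessimistically assumes that no hello gets through at all while the CPU is pegged.

```python
import math


def adjacency_survives(stall_s, dead_interval_s):
    """True if a neighbor would NOT time the router out during a stall.

    Simplified model (an assumption, not RouterOS internals): no hellos are
    sent or processed for stall_s seconds, and the neighbor's dead timer was
    reset right at the start of the stall.
    """
    return stall_s < dead_interval_s


def min_dead_interval(stall_s, hello_interval_s=10.0):
    """Smallest multiple of the hello interval that outlasts the stall."""
    return (math.floor(stall_s / hello_interval_s) + 1) * hello_interval_s


# Durations mentioned in the thread: about 40 s, and 4 to 5 minutes.
for stall in (40, 240, 300):
    print(stall, adjacency_survives(stall, 40.0), min_dead_interval(stall))
```

On this model a 4 to 5 minute freeze needs a dead interval above 300 s, which delays detection of real failures by the same amount. Hello and dead intervals also have to match on every router on the segment, so this is a trade-off rather than a fix.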
 
mrz
MikroTik Support
Posts: 7053
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: BGP full table Large CPU usage on rb1000 3.22

Tue May 26, 2009 7:54 am

Try not to use winbox when testing BGP. When you open BGP window in winbox, it will generate a lot of traffic and use a lot of CPU to get all BGP advertisements. Also disable SNMP, monitoring full BGP is useless anyway, but takes a lot of router's resources.
 
marko_bg
Member Candidate
Posts: 119
Joined: Sat Jun 03, 2006 11:48 am

Re: BGP full table Large CPU usage on rb1000 3.22

Thu May 28, 2009 9:03 pm

MikroTik must disable automatically showing the full advertisements...

it should apply some filter, like in IP Routes

btw, winbox in 3.24 works worse!

if I enable/disable routing filter rules, winbox does not show the change; I must close and reopen winbox to see it.
 
rpingar
Long time Member
Posts: 593
Joined: Fri May 28, 2004 2:46 pm
Location: Italy

Re: BGP full table Large CPU usage on rb1000 3.22

Mon Jun 15, 2009 8:48 pm

Has anyone tested 3.25 routing-test with multiple peers?

regards
Ros
 
MichelePietravalle
Trainer
Posts: 100
Joined: Sun Apr 19, 2009 9:03 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Mon Jun 15, 2009 9:01 pm

Wait, I will try tonight :)

i hope...

Michele Pietravalle
 
ronniee
Member Candidate
Posts: 125
Joined: Sun Jan 15, 2006 9:32 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Fri Jun 18, 2010 3:52 pm

Hi,

I need to receive 100,000 BGP prefixes on an RB1000U.
Right now I run RouterOS ver. 4.4.

My router crashed when we tested just now; I waited 5 minutes with no response from the router. Maybe it froze while receiving the prefixes.

I read that the RB1000 works with 300k prefixes too, maybe with some delays while receiving them.

If somebody uses an RB1000 with BGP and >100k prefixes, please tell me which version of RouterOS is best.
It's very important for me, my clients are down.

thanks,
ronniee
 
ste
Forum Guru
Posts: 1924
Joined: Sun Feb 13, 2005 11:21 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Fri Jun 18, 2010 4:10 pm

3.30 routing-test is working
 
Chupaka
Forum Guru
Posts: 8709
Joined: Mon Jun 19, 2006 11:15 pm
Location: Minsk, Belarus

Re: BGP full table Large CPU usage on rb1000 3.22

Fri Jun 18, 2010 5:01 pm

try 4.10
 
ronniee
Member Candidate
Posts: 125
Joined: Sun Jan 15, 2006 9:32 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed Jun 23, 2010 5:47 pm

thanks
it's working on 4.10 with 100k BGP route prefixes
the CPU is at 30-40%.

Is the RB1100 more powerful than the RB1000?
Its CPU frequency is lower than the RB1000's.
 
azg
Frequent Visitor
Posts: 57
Joined: Thu Jun 17, 2010 1:40 pm

Re: BGP full table Large CPU usage on rb1000 3.22 / x86

Wed Jul 21, 2010 12:07 pm

i can confirm BGP in 4.10 is improved: on a x86 system for the first time i was able to accept the full 320'000 prefixes from one peer, and the same BGP session was up for two weeks now.
i still use traffic shaping for the BGP session itself, to less than 100kbps. with 4.6 shaping was the only way to get any BGP to work at all on 100Mbps provider links.
however it is still a rough ride: winbox disconnects and can not be used for several minutes after a BGP session with a large table is established. so be prepared that winbox can be unavailable exactly when you have network problems that cause BGP sessions to go down and come back up.
with 4.6 even ssh used to disconnect, and OSPF lost neighbors. in 4.10 that seems no longer the case.

unfortunately mikrotik does not document any of the changes in BGP - the changelog between 4.6 and 4.10 does not contain a word about it. why?
 
xxiii
Member Candidate
Posts: 234
Joined: Wed May 31, 2006 12:55 am

Re: BGP full table Large CPU usage on rb1000 3.22

Mon Jul 26, 2010 11:32 pm

Last I heard, you also need to disable SNMP when doing large BGP (or possibly even small) tables.

We are finding this extremely inconvenient.
 
locu
just joined
Posts: 8
Joined: Tue Oct 07, 2008 3:50 am

Re: BGP full table Large CPU usage on rb1000 3.22

Thu Aug 05, 2010 3:45 am

For the first time we've attempted to bring in a full route via bgp4 on a RB1100 using 4.10. 100% pegged CPU for as long as 30 minutes for connect or disconnect of BGP. Winbox doesn't disconnect, but it sure does take significantly longer to arrange the routes than a 350Mhz Cisco with half the power does.

There's clearly an issue here. I'm curious if anyone has actually successfully gotten multiple upstreams with full routes functioning reliably on a Mikrotik device. This is our first attempt and unfortunately all roads are pointing back to Cisco at the moment. =(
 
mrz
MikroTik Support
Posts: 7053
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: BGP full table Large CPU usage on rb1000 3.22

Sat Aug 07, 2010 11:22 pm

30min is not normal. Make sure you have snmp disabled. Also do not open route window or bgp advertisement window in winbox while receiving routes.
 
azg
Frequent Visitor
Posts: 57
Joined: Thu Jun 17, 2010 1:40 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Sun Aug 08, 2010 11:08 am

mrz: sure "30 minutes is not normal". but this is not the first time this issue comes up.
last november you wrote under the thread "Taking full BGP routes | resources RB1000": "RB1000 can receive full BGP feed and install routes in few minutes and functions without any noticeable problems."

would you please document to us what exact configuration you tested with?

have you used reasonably fast internet connections (e.g. 100Mbps)?
have you, with this connectivity, tested against cisco equipment, or only against your own?
how many peers, prefixes each? filters?
which board, or what x86 configuration?

without this information, broad statements like "no noticeable problems" are difficult to put into perspective.
you can always contact customers for testing with real-world infrastructure that is difficult to build in the lab.
 
mrz
MikroTik Support
Posts: 7053
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: BGP full table Large CPU usage on rb1000 3.22

Sun Aug 08, 2010 12:38 pm

BGP was tested against cisco routers, receiving full bgp routing table (~350k prefixes) from our ISP (which has cisco routers).
It was also tested with 3 feeds between mikrotik devices.
Please contact support and send supout files, so that your specific configuration could be examined.
 
maxrate
Frequent Visitor
Posts: 94
Joined: Mon Oct 23, 2006 10:55 pm
Location: Toronto

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Aug 24, 2010 3:33 am

I have tried this on a RB1000 and an RB1100. 3 Full BGP feeds, CPU goes to 100%, sometimes router crashes, sometimes I get disconnected (winbox.exe). Slow convergence time (often up to almost 5 minutes) - RAM gets consumed quite a bit (understandably). Has anyone tried this using x86 hardware? This is becoming a very large problem. I have large bodies of customers go down because the core (RB1000 or RB1100) crashes. SNMP is disabled.

Sometimes it is just faster to reboot the router and let BGP start up again than to let it try to fix itself. One time the crash happened late, so I gave ROS 40 minutes to work through 100% CPU; I finally gave up waiting and rebooted.

I've also found that the terminal window won't properly respond to a restart command when this happens. Everything slows down, even routing.

I am on a peering exchange and I put a dedicated RB1100 facing the exchange (not full routes), doing some iBGP to the RB1000. This helps, but not by much. I love RouterOS, but I'll have to scrap it if this continues, as my customers just aren't very happy about the outages. What's frustrating is that the outages are caused by the router, not by a communications link or power failure! Even building a supout.rif is problematic at times. MikroTik can never find anything wrong when I do send them a supout.rif. I am running 4.11; this has been happening for over a year on various versions of RouterOS. The suggestion is always 'try upgrading to the latest version', and that never helps. I do recall an older version (3.??) working fairly well.
 
lan
newbie
Topic Author
Posts: 32
Joined: Mon Oct 22, 2007 4:06 pm
Location: Verona, Italy

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Aug 24, 2010 9:23 am

If you have a single peer with a full table, the RB1000 (4.11) works fine. If you have several peers, it is better to use a Supermicro with a quad-core Xeon and 4 GB of RAM. With this hardware your core router works fine!
 
azg
Frequent Visitor
Posts: 57
Joined: Thu Jun 17, 2010 1:40 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Aug 24, 2010 1:16 pm

the problem maxrate describes starts to happen on a x86 as well - it actually takes less than three peers to bring it down. my x86 mikrotik is connected via a 100Mbps Ethernet link to the Equinix-1 data center in Zurich, Switzerland, to a Cogent 100Mbps port. above in this thread, i asked mrz whether they ever tested against cisco, and at what leased line speed. his reply indicated testing, but no speed information.... plus "turn to support", which, as maxrate hints, tends to result in endless loops. i did not write to support... this forum is more transparent.
however i did some testing: from my home mikrotik (DSL 5M down, 0.5M up) i opened a BGP session over L2TP to my BGP-mikrotik, trying to just get the full table. BOOM the session towards Cogent goes down, my AS disappeared from the global table for a moment. both routers run 4.10 on x86. if i send just a handful of prefixes to my home mikrotik then it works.

here is what i suspect is the issue: the linux kernel, when the routing table gets large, it gets very slow adding and removing routes. because this code runs in the kernel, it freezes the rest of the system while adding/removing routes. if you do that repeatedly like with BGP, then the system is frozen most of the time, the processes no longer run properly. winbox freezes, the router stops to respond or takes very long, OSPF can go down, L2TP can go down, and BGP sessions themselves can go down because of TCP timeout.
it is very brittle whether it "works" or not. a faster CPU helps a bit, and slower WAN links with larger delay help too. yes, if the WAN links are slower, then BGP TCP sessions across them will have longer timeouts, making them more tolerant to a router that is more frozen than alive. that is why i asked mrz about the line speed he used for his tests.
if my suspicion above is right, then VoIP or Video traffic would get massively delayed because of the freezing when it crosses a mikrotik that has a BGP full table and does route updates (from BGP or OSPF).

mikrotik needs to decouple the BGP process from the linux kernel. this means it takes more RAM, but okay.
andy
ps: i have no second BGP peer ready, but if anyone with an AS wants to help testing it could be done through a tunnel.
 
blake
Member
Posts: 426
Joined: Mon May 31, 2010 10:46 pm
Location: Arizona

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Aug 24, 2010 3:56 pm

I'm no expert, but I think you're a little off in your diagnosis. RouterOS has to perform its route selection algorithm first before any routes are installed in the kernel's FIB. Routing daemons normally run in user-space, process routes in the various RIBs, then install the best paths in the kernel's FIB.

This whole issue sounds like a user-space BGP daemon problem.
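
To make the RIB/FIB split concrete, here is a toy Python sketch: every peer's candidates sit in the RIB, a selection step picks one winner per prefix, and only the winners would be pushed to the kernel's FIB. It is not RouterOS code; the class and function names are made up, and the selection uses just two criteria (highest local-pref, then shortest AS path), while the real BGP decision process has more steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str       # e.g. "10.0.0.0/8"
    next_hop: str
    local_pref: int   # higher wins
    as_path_len: int  # shorter wins


def best_path(candidates):
    """Pick one route: highest local-pref, then shortest AS path, then lowest next hop."""
    return min(candidates, key=lambda r: (-r.local_pref, r.as_path_len, r.next_hop))


def build_fib(rib):
    """rib maps prefix -> list of candidate routes (one per peer).

    Returns prefix -> next hop, i.e. only the winners that would be installed.
    """
    return {prefix: best_path(routes).next_hop for prefix, routes in rib.items() if routes}


rib = {
    "10.0.0.0/8": [Route("10.0.0.0/8", "192.0.2.1", 100, 3),
                   Route("10.0.0.0/8", "192.0.2.2", 100, 2)],
    "172.16.0.0/12": [Route("172.16.0.0/12", "192.0.2.1", 200, 5),
                      Route("172.16.0.0/12", "192.0.2.2", 100, 1)],
}
print(build_fib(rib))
```

With three full feeds the RIB holds roughly three candidates per prefix, but the FIB still gets one entry per prefix. Changing a filter can change the winner for a large share of the table at once, which is where both the user-space selection work and the kernel updates pile up.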
 
azg
Frequent Visitor
Posts: 57
Joined: Thu Jun 17, 2010 1:40 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Aug 24, 2010 4:31 pm

blake, from my experience with a process alone you can not bring down other processes the way we see here, the scheduler would keep the system minimally responsive. that is why i suspect that the CPU is stuck in the kernel (during inserting a new route into the FIB) -- i had similar things in my own drivers.
maybe it can be measured with ICMP ECHO requests - if the kernel is all fine, then the replies should come back during large routing table changes with no jitter. i'll try.
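
A minimal Python sketch of how such a ping measurement could be summarized, assuming the round-trip times have already been collected (one value per echo request, None for a lost reply). The function name and the 100 ms "slow" threshold are arbitrary choices for illustration, and "jitter" here simply means the mean absolute difference between consecutive replies.

```python
import statistics


def summarize_rtts(rtts_ms, timeout_marker=None, stall_threshold_ms=100.0):
    """Summarize ping round-trip times collected while routes are changing.

    rtts_ms: one entry per echo request, in send order; timeout_marker marks
    a lost reply. stall_threshold_ms is an arbitrary cut-off for 'slow'.
    """
    answered = [r for r in rtts_ms if r is not timeout_marker]
    # Jitter: mean absolute difference between consecutive answered probes.
    diffs = [abs(b - a) for a, b in zip(answered, answered[1:])]
    return {
        "sent": len(rtts_ms),
        "lost": len(rtts_ms) - len(answered),
        "median_ms": statistics.median(answered) if answered else None,
        "max_ms": max(answered) if answered else None,
        "jitter_ms": statistics.mean(diffs) if diffs else 0.0,
        "slow": sum(1 for r in answered if r > stall_threshold_ms),
    }


# Made-up sample data: a quiet baseline and a run during a table reload.
baseline = [1.0, 1.2, 1.1, 1.0]
during_reload = [1.0, 250.0, None, None, 900.0, 1.1]
print(summarize_rtts(baseline))
print(summarize_rtts(during_reload))
```

Comparing a baseline run with one taken during a table reload would show whether replies are merely delayed or missing entirely, which is the distinction this test is after.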
 
gustkiller
Member
Posts: 419
Joined: Sat Jan 07, 2006 5:15 am
Location: Brazil

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Aug 24, 2010 4:47 pm

i think mikrotik should use the bird bgp daemon, as it's one of the best bgp daemons and kicks quagga ass! it was, if i recall correctly, the only bgp daemon that survived tests that crashed quagga.
 
ste
Forum Guru
Posts: 1924
Joined: Sun Feb 13, 2005 11:21 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Aug 24, 2010 5:55 pm

I had a small linux-box (600MHz VIA) with quagga which does not show this behavior at all.
It shows other problems but it was responsive while starting up the bgp-sessions.
BGP-Session startup was much faster.

We have 3 RB1000s running BGP on 3.30 routing-test. Older versions were a nightmare.
These boxes run stable. We do not dare to update to 4.x.
 
alexandrecorrea
just joined
Posts: 22
Joined: Fri Sep 22, 2006 6:18 pm
Location: Sacramento, MG, Brasil

Re: BGP full table Large CPU usage on rb1000 3.22

Wed Aug 25, 2010 5:37 am

Running two RB1000s with 2 peers each (full table from each peer).

Sometimes it is better to reboot the router!

When changing a routing filter, the CPU goes to 100% for a long time (15-20 minutes)!

I have the same setup running on Vyatta 5.x on a Celeron with 512 MB of RAM and it's much faster!


As gustkiller said... BIRD is better for BGP!
 
lan
newbie
Topic Author
Posts: 32
Joined: Mon Oct 22, 2007 4:06 pm
Location: Verona, Italy

Re: BGP full table Large CPU usage on rb1000 3.22

Wed Aug 25, 2010 10:16 am

A Celeron with 512 MB is quite small... try a Xeon!!
 
Ozelo
Member
Posts: 338
Joined: Fri Jun 02, 2006 3:56 am

Re: BGP full table Large CPU usage on rb1000 3.22

Thu Aug 26, 2010 3:52 pm

We receive one full table on an RB1000 and it takes no more than 29 seconds @ 100% CPU to get 324k IPv4 routes. Acceptable for the hardware it is.
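
For comparison, the single-feed load times reported in this thread can be turned into rough prefixes-per-second rates. The Python below is plain arithmetic on numbers quoted in the posts; the setups differ (hardware, RouterOS version, eBGP vs. iBGP, filters), so it is only a ballpark and not a benchmark.

```python
# (who / hardware, prefixes received, seconds) as reported in the posts.
reports = [
    ("marko_bg, x86 Core 2 Duo", 300_000, 15),
    ("lan, RB1000 on 3.24 (good runs)", 280_000, 40),
    ("Ozelo, RB1000", 324_000, 29),
]


def rate(prefixes, seconds):
    """Average prefixes installed per second over the whole transfer."""
    return prefixes / seconds


for who, prefixes, seconds in reports:
    print(f"{who:32s} {rate(prefixes, seconds):8.0f} prefixes/s")
```

The failure cases in the thread (4 to 5 minutes, or 15 to 30 minutes after a filter change) are well over an order of magnitude slower than any of these, which points to something pathological rather than ordinary slowness.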
 
mrz
MikroTik Support
Posts: 7053
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: BGP full table Large CPU usage on rb1000 3.22

Thu Aug 26, 2010 4:15 pm

It looks like BGP can become slow if routing filters are changed. Anyone who experiences BGP problems after routing filter changes, please contact support and specify what routing filters you had and what changes you made.
 
gustkiller
Member
Posts: 419
Joined: Sat Jan 07, 2006 5:15 am
Location: Brazil

Re: BGP full table Large CPU usage on rb1000 3.22

Thu Aug 26, 2010 5:27 pm

i'm experiencing some problems with updates.. sometimes a peer withdraws a route but mikrotik keeps it in the routing table.
 
mhosts
newbie
Posts: 36
Joined: Tue Nov 03, 2009 4:43 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed Sep 08, 2010 11:33 pm

not to hijack the thread or anything... But i'm in a similar situation, I have an RB1000 ready to start taking BGP routes and if this is going to be problematic I'd rather use another router as the border.

Does anyone know of a way to test the behaviour of this anywhere? IE a BGP test environment that isn't public but rather to see how the router will behave before going to production? basically a test bgp feed.

Thanks,
 
azg
Frequent Visitor
Posts: 57
Joined: Thu Jun 17, 2010 1:40 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Thu Sep 16, 2010 10:22 pm

mrz wrote:
It looks like BGP can become slow if routing filters are changed. Anyone who experience BGP problems after routing filter changes, please contact support and specify what routing filters you had and what changes you have made.

it seems to me that it is not the route filters themselves, but merely a winbox issue.

i tested on a x86 4.11 router with a full table (325000 prefixes) with one EBGP peer and one IBGP peer:
i changed the BGP Weight parameter in the "accept" filter that is responsible for most of the prefixes from EBGP -- this influences the whole table. when i do this in winbox without having any window open (i.e. closing all windows before clicking OK in the dialog box), then the CPU load goes near 100%, and it takes long... meanwhile, on a different router i monitor a 4.5Mbps stream of data from the router to winbox... (the maximum the network in between can do) and this with no windows open!

if i change the filter and then *immediately close winbox*, wait 20 seconds, and log back in with winbox, surprise, the routing table is already updated. wow...
 
maxrate
Frequent Visitor
Posts: 94
Joined: Mon Oct 23, 2006 10:55 pm
Location: Toronto

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Sep 28, 2010 1:03 am

I posted earlier in this thread. I am running ROS 4.8 on an RB1000 with 3 full BGP feeds. We're down to 414 MB of RAM, which seems to be close to the point where the router crashes. I am generating a supout.rif file (which has taken over 5 minutes now and I am still waiting) and I'm going to submit it to MikroTik. I like MikroTik, but the entire 4.xx series has been flawed for BGP, in my opinion. I hope they can fix this issue as it's limiting what I can do with a MikroTik router :(
 
mhosts
newbie
Posts: 36
Joined: Tue Nov 03, 2009 4:43 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed Sep 29, 2010 4:55 am

I posted earlier in this thread - I am running ROS 4.8 on an RB1000 with 3 full BGP feeds. We're down to 414M of RAM, which seems to be close to the point where the router crashes. I am generating a supout.rif file (it has been running for over 5 minutes now and I'm still waiting) and I'm going to submit it to Mikrotik. I like Mikrotik, but in my opinion the entire 4.xx release series has been flawed for BGP. I hope they can fix this issue, as it's limiting what I can do with a Mikrotik router :(
Thanks for the info!

BGP implementation on Mikrotik RB1000 is not going to happen after a comment like that.

I think I'm going to look into the Juniper J2320 in the meantime... Hopefully Mikrotik can get their stuff together. It really has the potential to be a great cost-effective router. The fact that they are even being mentioned in the same sentences as Juniper and Cisco all over the net is awesome.

I just wish they would get the message and work on stability rather than new features.
 
maxrate
Frequent Visitor
Posts: 94
Joined: Mon Oct 23, 2006 10:55 pm
Location: Toronto

Re: BGP full table Large CPU usage on rb1000 3.22

Wed Sep 29, 2010 6:15 am

mhosts: I really hate to say anything negative about Mikrotik, but yes, BGP seems to be a problem. Last night I generated my supout.rif file and then gracefully rebooted the router to 'beat' the failure. Guess what? Around 2pm today my core router took a nose dive again and our customers were offline for about 9 minutes. I had to remote in using a backdoor router and command the remote power switch to power cycle the RB1000 to resolve the issue. I find the convergence time slow as well. I am happy I made that supout.rif file just before the crash. I have submitted it to Mikrotik and haven't heard anything yet, but in all fairness it takes them a good day to take a look. I'm looking at Vyatta right now, and at those same Junipers you are looking at. I'm also looking at the ImageStream Rebel routers for a BGP solution. I have been running an ImageStream as an LNS server for about 3 years, never a problem. I just bought a new Rebel router to replace the old one (nothing was wrong with the old one, but according to the documentation it would only route about 45 mbps; apparently the new one does a gig or two). Apparently they work well for BGP. I'm going to stick with Mikrotik for all the routing except BGP until ROS 5.xx comes out; apparently they are fixing the BGP issue. Too bad: I purchased several RB1100's thinking there was something wrong with the RB1000's. I have a few 1000's as well. Sorry for the run-on message.
 
azg
Frequent Visitor
Posts: 57
Joined: Thu Jun 17, 2010 1:40 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Wed Sep 29, 2010 1:07 pm

BGP on 4.11 (x86 w/ 1GB) has been stable for weeks in my case. I have two routers for two upstream peers, full tables, IBGP between the routers, and around 20 route filters per router. I'd say it is actually good... There ARE however some bugs as discussed in this thread.

For me the real problem is that Mikrotik is not transparent about known issues. There is nothing in the Changelog, even for bugs that were clearly fixed. There is no public listing of Known Issues.
Sometimes I think Mikrotik does not itself know what changed, because they may import a newer version of some external code. On the other hand, some of the information from Mikrotik in these forums indicates they do have a deep understanding of the protocols, and the product is in many ways excellent and well thought through. But if Mikrotik knows about the issues, why then this 'russian-style' denial and deception about bugs? Do the bugs go away if you hide them?

I believe BGP on 4.11 is quite good, but as we can see in this thread, people are turned off and lose trust in Mikrotik BGP because of the uncertainty about where the bugs are. If everyone knew what the bugs are, we could make an educated decision about whether it's acceptable or not. And it would save a lot of time.
 
gustkiller
Member
Posts: 419
Joined: Sat Jan 07, 2006 5:15 am
Location: Brazil
Contact:

Re: BGP full table Large CPU usage on rb1000 3.22

Wed Sep 29, 2010 2:41 pm

The problem with running BGP on 4.x versions is the lack of performance features such as multiqueue support for the newer Intel NICs.

We're running 5.04 beta for BGP with about 5 feeds, but only partial routes, and no problems at all. We're receiving 2 new HP servers to run two gig internet feed and enable full routing. I hope it works as well as it has with partial routes.
 
mhosts
newbie
Posts: 36
Joined: Tue Nov 03, 2009 4:43 pm

Re: BGP full table Large CPU usage on rb1000 3.22

Tue Oct 05, 2010 11:22 pm

maxrate: No worries on the "long-winded" message. I've been there too.

The reason it's frustrating me is that there doesn't seem to be a stable in-between solution for small hosts. Even the Juniper J series and SRX platforms have their fair share of stability issues. People can't even get VRRP stable on SRX. If you read some of the posts on the Juniper forums, people are recommending "no less" than the J6350 (3U) for doing 2x 100mbit transit and BGP!! That's a $10k router (used/greymarket) for 200mbit of traffic!

There should be a way to achieve a stable 200mbit of routing + partial BGP tables on an RB1000u. According to their datasheets it can route 300k pps and there's more than enough RAM to fit the bill.
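As a rough sanity check on those numbers, here is a small Python sketch that converts 200mbit into packets per second for a few average packet sizes and compares the result with the 300k pps datasheet figure. The packet sizes are assumptions picked for illustration, and layer-2 overhead is ignored.

```python
def packets_per_second(bits_per_second, avg_packet_bytes):
    """Packets per second needed to carry a bit rate at a given average packet size."""
    return bits_per_second / (avg_packet_bytes * 8)


DATASHEET_PPS = 300_000    # forwarding figure quoted from the datasheet
TARGET_BPS = 200_000_000   # 200mbit of routed traffic

if __name__ == "__main__":
    # 64, 500 and 1500 bytes are illustrative sizes, not measured traffic.
    for size in (64, 500, 1500):
        pps = packets_per_second(TARGET_BPS, size)
        verdict = "within" if pps <= DATASHEET_PPS else "above"
        print(f"{size:>5} byte packets: {pps:>10,.0f} pps ({verdict} the datasheet figure)")
```

With typical mixed traffic the forwarding load sits well under the datasheet number; only a flood of minimum-size packets would exceed it. Note that this says nothing about the CPU cost of BGP table updates, which is what this thread is actually about.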

I think azg said it perfectly:
why then this 'russian-style' denial and deception about bugs? Do the bugs go away if you hide them?
If I knew that specific things would cause stability issues, then I would avoid them and see if the router still meets my needs. Open discussion and knowledge sharing from the company would take the guessing game out of it.

There's my long-winded message.
 
ropix
Trainer
Posts: 13
Joined: Thu Dec 22, 2011 6:42 am
Location: behind IP>firewall
Contact:

Re: BGP full table Large CPU usage on rb1000 3.22

Fri Feb 28, 2014 4:30 am

Actually, this problem has reached my router too. I am using a CCR1016-12G with v6.10 and running 2 eBGP sessions with full routing tables. There was no problem until I changed a routing filter parameter (set-bgp-prepend); suddenly my winbox access stopped responding, although ssh and telnet kept working fine.
I tried to change the winbox port in IP services, but it did not work and this notification appeared in the terminal:
action timed out - try again, if error continues contact MikroTik support and send a supout file (13)
So I think something happens to the winbox service if I run BGP with a full routing table and change some routing filter parameters.
I have not yet sent a supout file to Mikrotik support because I am afraid that when I create it my router will reboot (as happened previously).

One more problem I found: my MRTG machine reports SNMP errors for this router. When I try to check it via the terminal in the snmp menu, the command "snmp print" freezes and shows the same error:
action timed out - try again, if error continues contact MikroTik support and send a supout file (13)
