netzwerghh
Frequent Visitor
Topic Author
Posts: 74
Joined: Sun Aug 07, 2011 4:23 pm
Location: Hamburg, DE

RouterOS 7.2 CCR2004 BGP full table device lockup / horrible "performance" with activated MPLS

Mon Apr 11, 2022 6:30 pm

Hi,

I'm testing RouterOS 7 on a CCR2004 for our backbone. We currently run some of our peering routers on CCR2004s with RouterOS 6 and are not really happy: we never get more than 300 Mbit/s of throughput for traffic that has to go to the outside, while internal traffic (OSPF routes only) reaches near wire speed. Once external traffic reaches that level, packet loss sets in. We thought it might have something to do with a bad BGP implementation in RouterOS 6, so we decided to try v7 in a test setup.
We put a CCR2004 with RouterOS 7.2 at the edge of our network and integrated it into our OSPF/MPLS network, which has worked fine so far. After that we set up BGP sessions to our route reflectors and one of our upstream providers. In this first test we do not announce any prefixes to the upstream or the route reflectors (the output filters reject all prefixes), so the router does not attract any traffic.
It receives 3 full BGP tables (2x RR, 1x upstream). The router reads those 2.4 million routes pretty fast and also installs them pretty fast. We expected high CPU load during this phase, but the load never really went down. It even got worse over time. Eventually it got so bad that even the SNMP reads from our traffic monitoring all timed out.
Those 3 full BGP tables aren't static, of course. They are real-time live BGP feeds of the DFZ (Default-Free Zone) with 5 to 20 updates per second. I've tried playing with affinity: all BGP sessions in one process, all in main, one process per feed. No difference. Performance is effectively nonexistent. It's just shit. The only traffic the router has to handle is OSPF/MPLS/BGP/Winbox/SNMP. If I disable all BGP sessions, everything goes back to normal after a few minutes. But then, of course, all routes are gone as well.
The router used to have a v6 config, which was upgraded and then tweaked for v7.
Also, at the moment we are only analyzing IPv4. With IPv6 BGP activated as well, it got even worse.

Has anybody tried to achieve something similar? Any tips and hints?

Here is the BGP part of the config:
/routing bgp template
set default as=XXXXXX disabled=no input.affinity=instance output.affinity=instance \
    .network=bgp-networks router-id=194.XXX.XXX.0 routing-table=main
/routing bgp connection
add address-families=ip as=XXXXXX cisco-vpls-nlri-len-fmt=auto-bits connect=yes \
    disabled=yes input.affinity=instance listen=yes local.address=194.XXX.XXX.0 .role=\
    ibgp name=NETZWERGE.RR01 output.affinity=instance .filter-chain=REJECT_ALL \
    .network=bgp-networks remote.address=194.XXX.XXX.6/32 .as=XXXXXX .port=179 \
    router-id=194.XXX.XXX.0 routing-table=main templates=default
add address-families=ip as=XXXXXX cisco-vpls-nlri-len-fmt=auto-bits connect=yes \
    disabled=yes input.affinity=instance listen=yes local.address=194.XXX.XXX.0 .role=\
    ibgp name=NETZWERGE.RR02 output.affinity=instance .filter-chain=REJECT_ALL \
    .network=bgp-networks remote.address=194.XXX.XXX.7/32 .as=XXXXXX .port=179 \
    router-id=194.XXX.XXX.0 routing-table=main templates=default
add address-families=ip as=XXXXXX cisco-vpls-nlri-len-fmt=auto-bits connect=yes \
    disabled=yes listen=yes local.address=185.XXX.XXX.1 .role=ebgp name=AS-Upstream \
    output.filter-chain=REJECT_ALL .network=bgp-networks remote.address=\
    185.XXX.XXX.2/32 .as=YYYYYY .port=179 router-id=194.XXX.XXX.0 routing-table=main \
    templates=default
/routing filter rule
add chain=REJECT_ALL disabled=no rule="reject;"
CCR2004-BGP-v7-Test.PNG
Last edited by netzwerghh on Tue Apr 12, 2022 11:21 am, edited 1 time in total.
 
netzwerghh

Re: RouterOS 7.2 CCR2004 BGP full table device lockup / horrible "performance"

Tue Apr 12, 2022 10:36 am

Update on this: it seems that the kernel is nearly locked up all the time. The FIB process has high kernel times:
[admin@XXXXX] /routing/filter> /routing/stats/process/print 
Columns: TASKS, PRIVATE-MEM-BLOCKS, SHARED-MEM-BLOCKS, PSS, RSS, VMS, RETIRED, ID, PID, RPID, PROCESS-TIME, KERNEL-TIME, CUR-BUSY, MAX-BUSY, CUR-CALC, MAX-CALC
 # TASKS                         PRIVATE-MEM-BLOCKS  SHARED-MEM-BLOCKS  PSS  RSS  VMS  RETIRED  ID       PID  RPID  PROCESS-TIME  KERNEL-TIME  CUR-BUSY  MAX-BUSY   CUR-CALC  MAX-CALC  
 0 routing tables                43.8MiB             70.2MiB              0    0    0        1  main     108     0  1m35s840ms    10s560ms     0ms       2s800ms    10ms      2m54s290ms
   rib                                                                                                                                                                                  
 1 fib                           4352.0KiB           0                    0    0    0           fib      124     1  38s720ms      54m13s650ms            2m45s10ms            2m45s10ms 
 2 ospf                          768.0KiB            256.0KiB             0    0    0           ospf     128     1  3s170ms       2s110ms                10ms                 10ms      
 3 pimsm                         256.0KiB            0                    0    0    0           pim      129     1  560ms         510ms                  60ms                 60ms      
 4 fantasy                       0                   0                    0    0    0           fantasy  131     1  430ms         480ms                  10ms                 10ms      
 5 configuration and reporting   32.5MiB             512.0KiB             0    0    0           static   132     1  1m16s170ms    580ms                  1s700ms              7s290ms   
 6 ldp                           1024.0KiB           512.0KiB             0    0    0           mpls     130     1  43s200ms      18s810ms               3s330ms              3s330ms   
   Copy                                                                                                                                                                                 
 7 rip                           256.0KiB            0                    0    0    0           rip      127     1  510ms         530ms                  20ms                 20ms      
 8 routing policy configuration  512.0KiB            768.0KiB             0    0    0           policy   125     1  450ms         580ms                  20ms                 20ms      
 9 BGP service                   512.0KiB            0                    0    0    0           bgp      126     1  4s270ms       9s730ms                10ms                 20ms      
10 BGP Input 194.XXX.XXX.6        8.0MiB              18.2MiB              0    0    0     1192  26       464     1  6s930ms       5s100ms                10ms                 10ms      
   BGP Input 194.XXX.XXX.7                                                                                                                                                               
   BGP Input 185.XXX.XXX.65                                                                                                                                                              
11 BGP Output 194.XXX.XXX.6       0                   0                    0    0    0           27       465     1  12s610ms      130ms                  3s890ms              3s890ms   
   BGP Output 194.XXX.XXX.7                                                                                                                                                              
   BGP Output 185.XXX.XXX.65                                                                                                                                                             
12 Global memory                                     256.0KiB                                   global     0     0                                                                      
[admin@XXXXX] /routing/filter> /routing/route/print count-only 
2641682
[admin@XXXXX] /routing/filter> /system/resource/cpu/print 
Columns: CPU, LOAD, IRQ, DISK
#  CPU   LOAD  IRQ  DISK
0  cpu0  29%   0%   0%  
1  cpu1  40%   0%   0%  
2  cpu2  64%   0%   0%  
3  cpu3  53%   0%   0%
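
To put the fib row of the process table above into numbers, here is a minimal Python sketch. It assumes RouterOS prints durations as concatenated parts such as `54m`, `13s` and `650ms`; only those three units appear in the output above, and the longer units in the lookup table are a guess. The helper name `parse_duration_ms` is made up for this illustration.

```python
import re

# Milliseconds per unit suffix. Only m, s and ms occur in the output above;
# w, d and h are an assumption about how longer durations would be printed.
UNIT_MS = {"w": 604_800_000, "d": 86_400_000, "h": 3_600_000,
           "m": 60_000, "s": 1_000, "ms": 1}

# "ms" must come before "m" in the alternation so that "650ms" is matched
# as milliseconds instead of stopping at the "m".
PART = re.compile(r"(\d+)(ms|w|d|h|m|s)")


def parse_duration_ms(text: str) -> int:
    """Convert a string such as '54m13s650ms' to whole milliseconds."""
    total, pos = 0, 0
    for part in PART.finditer(text):
        if part.start() != pos:  # unexpected characters between parts
            raise ValueError(f"cannot parse duration: {text!r}")
        total += int(part.group(1)) * UNIT_MS[part.group(2)]
        pos = part.end()
    if pos == 0 or pos != len(text):  # empty input or trailing garbage
        raise ValueError(f"cannot parse duration: {text!r}")
    return total


# Values copied from the fib row of the table above.
fib_process_ms = parse_duration_ms("38s720ms")
fib_kernel_ms = parse_duration_ms("54m13s650ms")
print(f"fib kernel/process ratio: {fib_kernel_ms / fib_process_ms:.1f}")
```

For the fib row this gives 3,253,650 ms of kernel time against 38,720 ms of process time, roughly 84 times as much, whereas the main routing process spends far more time in PROCESS-TIME than in KERNEL-TIME.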
 
netzwerghh

Re: RouterOS 7.2 CCR2004 BGP full table device lockup / horrible "performance"

Tue Apr 12, 2022 11:20 am

Another update on this: after disabling MPLS on the router, everything seems to work as expected. The CPU is down to a nearly idle state, SNMP is responding, and /routing/route/print is fast again. It seems there is a problem with MPLS in general or with my MPLS configuration.
But now I cannot use my VPLS tunnels anymore. Do I really have to switch over to EoIP?
 
mrz
MikroTik Support
Posts: 7042
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: RouterOS 7.2 CCR2004 BGP full table device lockup / horrible "performance" with activated MPLS

Tue Apr 12, 2022 11:34 am

What is your MPLS config? Are you using LDP and trying to distribute labels for all of the BGP routes?
 
netzwerghh

Re: RouterOS 7.2 CCR2004 BGP full table device lockup / horrible "performance" with activated MPLS

Tue Apr 12, 2022 11:43 pm

Hi mrz,

Not that I know of. I'm just redistributing the loopbacks:
/mpls ldp
add lsr-id=194.XXX.XXX.0 transport-addresses=194.XXX.XXX.0
/mpls ldp advertise-filter
add advertise=yes disabled=no prefix=194.XXX.XXX.0/24 vrf=main
add advertise=yes disabled=no prefix=185.XXX.YYY.0/32 vrf=main
add advertise=yes disabled=no prefix=185.XXX.XXX.0/29 vrf=main
add advertise=no disabled=no prefix=0.0.0.0/0 vrf=main
/mpls ldp interface
add disabled=no interface=bonding1
But it seems that you are right in some way, although I do not see any of my BGP routes in the MPLS forwarding tables on any of my routers. It looks like the routing engine is at least processing the BGP routes in some special way when MPLS is enabled, but I am unable to find out where this happens or how to make it stop.
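
To reason about which prefixes a filter list like the one above lets through, here is a minimal Python sketch. It assumes that the advertise-filter rules are evaluated top to bottom, that the first rule whose prefix contains the candidate prefix decides, and that a prefix matching no rule is advertised. These semantics are an assumption and should be checked against the RouterOS documentation. The addresses are documentation ranges standing in for the masked ones, and `should_advertise` is a hypothetical helper, not a RouterOS function.

```python
from ipaddress import ip_network

# Ordered (prefix, advertise) rules mirroring the shape of the filter above.
# The prefixes are placeholders from the documentation address ranges.
RULES = [
    (ip_network("192.0.2.0/24"), True),
    (ip_network("198.51.100.0/32"), True),
    (ip_network("203.0.113.0/29"), True),
    (ip_network("0.0.0.0/0"), False),  # catch-all: do not advertise the rest
]


def should_advertise(prefix: str, rules=RULES, default: bool = True) -> bool:
    """Return the decision of the first rule that contains the prefix."""
    candidate = ip_network(prefix)
    for rule_prefix, advertise in rules:
        if candidate.subnet_of(rule_prefix):
            return advertise
    return default  # assumed behaviour when no rule matches


print(should_advertise("192.0.2.1/32"))   # loopback inside the first rule
print(should_advertise("8.8.8.0/24"))     # a typical full-table route
```

Under these assumptions a loopback inside the first range is advertised, while an arbitrary full-table route falls through to the catch-all and is not. The sketch only models what is advertised to LDP neighbours; it says nothing about how the routing engine handles those routes locally when MPLS is enabled.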
