ofca
Member Candidate
Topic Author
Posts: 228
Joined: Fri Aug 20, 2004 7:18 pm

CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 9:46 am

Image
So, this RB3011 was very busy computing uptime at the moment of the upgrade, and the same process of being busy computing uptime suddenly became very CPU-intensive post-upgrade. Pretty much the same thing happens on RB2011s. Anyone else seeing this? If I didn't know better, I'd suspect MikroTik started mining Monero or something ;)
 
ofca
Member Candidate
Topic Author
Posts: 228
Joined: Fri Aug 20, 2004 7:18 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 11:02 am

Oh, that's great. People started using it, and while this router peaked at ~60% CPU on 6.49.6 stable (~400 Mbit/s of routed traffic), it's now reaching 100% and losing packets on 7.2.1 pre-alpha at ~120 Mbit/s. Guess I'll take a supout before downgrading. Are there any better tools to figure out what's wrong other than:
> /tool/profile cpu=all duration=10s
Columns: NAME, CPU, USAGE
NAME          CPU  USAGE
snmp            0  0.5% 
ethernet        0  2.5% 
console         0  0%   
firewall        0  5%   
networking      0  11.5%
logging         0  0%   
management      0  12.5%
wireless        0  3%   
encrypting      0  5%   
routing         0  21.5%
ssl             0  1%   
profiling       0  0%   
bridging        0  0%   
unclassified    0  4%   
cpu0               66.5%
snmp            1  0%   
ethernet        1  5%   
firewall        1  10%  
networking      1  30.5%
management      1  1.5% 
wireless        1  3.5% 
encrypting      1  14.5%
routing         1  13%  
ssl             1  2%   
bridging        1  3.5% 
unclassified    1  8%   
cpu1               91.5%
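
For what it's worth, the per-classifier rows can be totalled with a few lines of script to see which classifiers dominate across both cores. This is only a sketch in Python, not a RouterOS tool: `parse_profile` and `top_consumers` are made-up helper names, and it assumes the three-column row layout shown above (name, CPU index, usage with a trailing %).

```python
from collections import defaultdict


def parse_profile(text):
    """Return {cpu_index: {name: usage_percent}} from /tool/profile rows."""
    per_cpu = defaultdict(dict)
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "routing 0 21.5%".
        # Headers and the per-core "cpuN" total rows do not match and are skipped.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].endswith("%"):
            name, cpu, usage = parts
            per_cpu[int(cpu)][name] = float(usage.rstrip("%"))
    return dict(per_cpu)


def top_consumers(per_cpu, n=3):
    """Sum each classifier across all cores and return the n largest."""
    totals = defaultdict(float)
    for rows in per_cpu.values():
        for name, usage in rows.items():
            totals[name] += usage
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    # A few rows copied from the output above; paste the full text to rank everything.
    sample = """\
NAME          CPU  USAGE
networking      0  11.5%
routing         0  21.5%
cpu0               66.5%
networking      1  30.5%
routing         1  13%
cpu1               91.5%
"""
    for name, usage in top_consumers(parse_profile(sample)):
        print(f"{name:12s} {usage:5.1f}%")
```

Fed the full output above, this ranks networking (42%), routing (34.5%) and encrypting (19.5%) highest, summed over both cores.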
 
jookraw
Member Candidate
Posts: 141
Joined: Mon Aug 19, 2019 3:06 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 11:52 am

do you have fast-track enabled?

also, there is no route-cache on rOS v7.x.x like rOSv6 had
 
pe1chl
Forum Guru
Posts: 10183
Joined: Mon Jun 08, 2015 12:09 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 12:09 pm

Version 7 will use more CPU than version 6. If you are already using most of your capacity and do not want to lose it, do not upgrade to v7 until you buy new hardware.
 
ofca
Member Candidate
Topic Author
Posts: 228
Joined: Fri Aug 20, 2004 7:18 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 12:18 pm

jookraw wrote:
> do you have fast-track enabled?
>
> also, there is no route-cache on rOS v7.x.x like rOSv6 had

One of the culprits was 10ms keepalive time on BGP sessions. For some reason crossfig or whatever it's called decided, that keepalive=1s in 6.x means keepalive=10ms in 7.x (which is impossible to set by hand). Not only this: routing engine decided to obey and happily spammed keepalives 100 times per second. Setting it back to 1s and restarting sessions reduced CPU usage by 20%
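
To put a number on that: the message rate per session is just the reciprocal of the interval. A tiny Python sketch (the function name is made up for illustration, and the `sessions` parameter is a hypothetical knob, not a count taken from this router):

```python
def keepalives_per_second(keepalive_ms: int, sessions: int = 1) -> float:
    """Keepalive messages sent per second across `sessions` BGP sessions."""
    if keepalive_ms <= 0:
        raise ValueError("keepalive interval must be positive")
    return sessions * 1000 / keepalive_ms


if __name__ == "__main__":
    print(keepalives_per_second(10))    # 100.0, the interval after the upgrade
    print(keepalives_per_second(1000))  # 1.0, the intended 1s setting
```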

There's NAT and some firewall here, and fast path/fast track aren't active. That being said, I'm not expecting 0% CPU load from this router. Just something reasonable ;)

btw. if there's no route-cache on v7, then what does "/ip settings set route-cache=yes" do?
 
jookraw
Member Candidate
Posts: 141
Joined: Mon Aug 19, 2019 3:06 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 12:27 pm

ofca wrote:
> btw. if there's no route-cache on v7, then what does "/ip settings set route-cache=yes" do?

To be frank, I have no idea... this is what I got from support when I opened a ticket for high CPU usage on my RB4011, before fasttrack was fixed.
 
ofca
Member Candidate
Topic Author
Posts: 228
Joined: Fri Aug 20, 2004 7:18 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 12:33 pm

jookraw wrote:
> ofca wrote:
> > btw. if there's no route-cache on v7, then what does "/ip settings set route-cache=yes" do?
> to be frank, I have no idea... this info was what I got from support when I opened a ticket for high cpu usage on my RB4011 before the fasttrack was fixed.

Maybe there wasn't, but now there is? Who knows.
Even after fixing the impossible BGP configuration, and noticing that IPsec wasn't taking advantage of HW acceleration, it's still at 75% CPU usage with 100 Mbit/s of traffic. Well, at least there's no packet loss and end users are unaffected, so I can postpone the downgrade and see what else is broken. :)
 
mrz
MikroTik Support
Posts: 7038
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 12:37 pm

route cache setting is doing nothing, it will be removed in the future.
 
ofca
Member Candidate
Topic Author
Posts: 228
Joined: Fri Aug 20, 2004 7:18 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 12:51 pm

mrz wrote:
> route cache setting is doing nothing, it will be removed in the future.

Thanks for clearing this up. Please figure out why keepalive-time=1s gets converted to 10ms when upgrading from 6 to 7. I've seen this a few times already, but didn't investigate until now.
btw. do you have any suggestions other than /routing/stats/process/print or /tool/profile ?
 
andkar
newbie
Posts: 47
Joined: Tue Aug 11, 2020 9:20 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 12:59 pm

Hi,

See this thread: viewtopic.php?t=185242. It seems to be a bridge bandwidth issue with the RB3011 and v7.
 
msatter
Forum Guru
Forum Guru
Posts: 2897
Joined: Tue Feb 18, 2014 12:56 am
Location: Netherlands / Nīderlande

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 2:07 pm

MikroTik on the lack of a route cache in v7:

viewtopic.php?p=882429#p882429

and read on down.
 
pe1chl
Forum Guru
Posts: 10183
Joined: Mon Jun 08, 2015 12:09 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 2:13 pm

ofca wrote:
> mrz wrote:
> > route cache setting is doing nothing, it will be removed in the future.
> Please figure out why keepalive-time=1s gets converted to 10ms when upgrading from 6 to 7. I've seen this a few times already, but didn't investigate until now.

1 second is an unreasonably low keepalive time. You would normally not set the keepalive time but rather set the hold time, and the keepalive time will be 1/3 of that.
Indeed 3s is the lowest hold time, and it would result in a 1s keepalive time, but I think in cases where you want fast BGP response to link down it is better to use BFD.
(unfortunately BFD does not yet work in v7 but it is "promised" to arrive soon)
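
The one-third relationship is easy to sketch. RFC 4271 requires a hold time of either zero or at least 3 seconds, and suggests a keepalive interval of one third of the hold time. The Python below is just that arithmetic with a made-up function name; it is not a claim about how RouterOS computes its timers internally.

```python
def keepalive_from_hold(hold_s: float) -> float:
    """Suggested keepalive interval (seconds) for a given BGP hold time."""
    if hold_s == 0:
        return 0.0  # a hold time of 0 means no keepalives are sent
    if hold_s < 3:
        raise ValueError("hold time must be 0 or at least 3 seconds")
    return hold_s / 3


if __name__ == "__main__":
    print(keepalive_from_hold(3))    # 1.0, the lowest hold time gives a 1s keepalive
    print(keepalive_from_hold(180))  # 60.0
```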
 
mrz
MikroTik Support
Posts: 7038
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 2:48 pm

Problem confirmed, but bug aside, as pe1chl mentioned, 1s is an unreasonably low value for keepalive.
 
ofca
Member Candidate
Topic Author
Posts: 228
Joined: Fri Aug 20, 2004 7:18 pm

Re: CPU usage 6.49.6 -> 7.2.1

Wed Apr 27, 2022 6:18 pm


pe1chl wrote:
> ofca wrote:
> > Please figure out why keepalive-time=1s gets converted to 10ms when upgrading from 6 to 7. I've seen this a few times already, but didn't investigate until now.
> 1 second is an unreasonably low keepalive time. You would normally not set the keepalive time but rather set the hold time, and the keepalive time will be 1/3 of that.
> Indeed 3s is the lowest hold time, and it would result in a 1s keepalive time, but I think in cases where you want fast BGP response to link down it is better to use BFD.
> (unfortunately BFD does not yet work in v7 but it is "promised" to arrive soon)

The only thing that's unreasonable is converting 1000 milliseconds to 10 milliseconds when upgrading from 6.49.6 to 7.2.1. When BFD arrives one day, I'll use it to get response times faster than 3 seconds, but until then I guess I'll have to live with a 3-second lag before redundancy kicks in after the usual fiber vs. rat or fiber vs. runaway excavator incident ;)

btw. I have some rare cases of hold-time=10s, but still keep keepalive=1s there.
 
mrz
MikroTik Support
Posts: 7038
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: CPU usage 6.49.6 -> 7.2.1

Thu Apr 28, 2022 9:59 am

Increasing the frequency of keepalives does not make BGP converge faster. Hold time is the one that controls it. The only reason to set very frequent keepalives is when the latency or packet drop on the working link is so high, that you need to send 10 keepalives to make sure that at least one of them will reach the destination within 10 seconds.
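
A rough way to see this is an idealized model: the peer declares the session dead one hold time after the last keepalive it received. The Python sketch below rests on assumptions that are not from this thread (no latency or loss, keepalives sent at exact multiples of the interval, only keepalives reset the hold timer), uses a made-up function name, and works in integer milliseconds to avoid float rounding.

```python
def detection_delay_ms(failure_ms: int, keepalive_ms: int, hold_ms: int) -> int:
    """Milliseconds between a link failure and hold-timer expiry on the peer."""
    # Last keepalive sent at or before the failure; assumed to have arrived.
    last_rx_ms = (failure_ms // keepalive_ms) * keepalive_ms
    # The hold timer was restarted at last_rx_ms and expires hold_ms later.
    return last_rx_ms + hold_ms - failure_ms


if __name__ == "__main__":
    # Link fails 5.5s into the session in each case.
    print(detection_delay_ms(5500, 1000, 3000))   # 2500: keepalive 1s, hold 3s
    print(detection_delay_ms(5500, 10, 3000))     # 3000: keepalive 10ms, hold 3s
    print(detection_delay_ms(5500, 1000, 10000))  # 9500: keepalive 1s, hold 10s
```

In this model the delay always falls between hold time minus keepalive interval and the full hold time, so shrinking the keepalive interval only pushes detection closer to the full hold time.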
 
pe1chl
Forum Guru
Posts: 10183
Joined: Mon Jun 08, 2015 12:09 pm

Re: CPU usage 6.49.6 -> 7.2.1

Thu Apr 28, 2022 10:41 am

mrz wrote:
> Increasing the frequency of keepalives does not make BGP converge faster. Hold time is the one that controls it. The only reason to set very frequent keepalives is when the latency or packet drop on the working link is so high, that you need to send 10 keepalives to make sure that at least one of them will reach the destination within 10 seconds.

That would not work, as BGP is running over TCP not over UDP. New keepalives are inserted above TCP and it is the TCP re-try mechanism that governs sending them to the other side. A well-implemented TCP would not even try to send the newly added data before the (re-)transmission timers of the existing data kicked in (or an ACK is received).

Fiddling with the BGP timers is usually done to get quicker detection of a link state change when the underlying layers do not provide that information. But BFD is a more suitable mechanism for that.
