blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Dude 4.0 beta3: high CPU on cisco devices

Fri Apr 01, 2011 4:53 pm

I don't know if this was introduced in the latest version of the Dude, but I am getting very high CPU utilization on my Cisco devices when the Dude is enabled.

I have deleted most of my custom probes and functions but still getting like 40% CPU caused by SNMP server when the Dude server is active.

With the Dude switched off, the CPU goes down to almost nothing (which is what is expected from this device).

Anyone having similar experiences?

thanks


Andreas

P.S. Also starting a new thread with Lebowski (hope he agrees to join in) ;) to discuss probe design using snmpwalks in functions - might be related...

P.P.S. Win7, 32bit, dedicated Dude machine - 4.0beta3
 
gsandul
Member Candidate
Posts: 154
Joined: Mon Oct 19, 2009 1:42 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Fri Apr 01, 2011 5:47 pm

I have deleted most of my custom probes and functions but still getting like 40% CPU caused by SNMP server when the Dude server is active.
There are some conditions which can lead to high CPU on Cisco. I know of two (I have actually experienced both):
1) You have a lot of interfaces on your device. This can happen, for example, on a router holding a lot of PPTP sessions.
2) You have a lot of routes on your device. This can happen, for example, on a router holding eBGP sessions.
In such cases the high CPU occurs when The Dude is trying to construct the tables for the device's SNMP tabs.
snmp_tab.png
The solution is to disable the corresponding SNMP subtrees on the Cisco device.
 
blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Re: Dude 4.0 beta3: high CPU on cisco devices

Fri Apr 01, 2011 6:06 pm

The solution is to disable the corresponding SNMP subtrees on the Cisco device.
Uh... how would I go about that? :)

Besides that, you also contributed a lot to the Cisco probe back then. Would you like to do some experimenting on how to get the Dude to use less CPU?
I didn't mean that I only want to discuss this with Lebowski; he just posts a lot about that ;)



Andreas
 
gsandul
Member Candidate
Posts: 154
Joined: Mon Oct 19, 2009 1:42 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Mon Apr 04, 2011 9:25 am

Uh... how would I go about that? :)
This is done using an SNMP view.
A sample Cisco config could be:
snmp-server view noifread iso included
snmp-server view noifread interfaces excluded
snmp-server view noifread iproute excluded

snmp-server community TheSuperSecret view noifread ro 1
You should also read this:
http://www.cisco.com/en/US/tech/tk648/t ... 48e6.shtml
Besides that, you also contributed a lot to the Cisco probe back then.. would you like to do some experimenting how to get the Dude to use less CPU?
I will. But I do not really believe The Dude probe could lead to high CPU on Cisco.
 
blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Re: Dude 4.0 beta3: high CPU on cisco devices

Mon Apr 04, 2011 1:15 pm

I bow before you - thank you for this! :)

Now I only need time to simulate, test and then deploy this!


Andreas
 
lebowski
Forum Guru
Posts: 1619
Joined: Wed Aug 27, 2008 5:17 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Mon Apr 04, 2011 4:59 pm

I was out all last week but gsandul is sharp as usual. Call manager training...

I have to agree with gsandul that the Dude doesn't run the CPU up very high on Cisco devices unless you build the probe poorly, e.g. oid_column("1.3.6"), that would be bad :). I did turn off SNMP of BGP routes for the external routers though.
 
blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Re: Dude 4.0 beta3: high CPU on cisco devices

Mon Apr 04, 2011 5:28 pm

totally not that kind of probe :)
I even tried to remove the oid_column part to remove the (partial) SNMP walk but still...
if(array_size(oid_raw("1.3.6.1.4.1.9.2.1.57.0", 10 ,29)), oid("1.3.6.1.4.1.9.2.1.57.0", 10, 29)+1 ,"False")
and:
if(Cisco_CPU() <>"False",if(Cisco_CPU() -1 < 80, "", concatenate("Warning: high CPU: ", Cisco_CPU(), "%")), "CPU polling fault")
Probes for interfaces look similar...
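
For readers who do not know the Dude function syntax, the logic of the two expressions above can be modelled roughly as follows. This is only a sketch in Python, not Dude code: `cisco_cpu` and `probe_error` are hypothetical names, `None` stands in for an empty SNMP answer, and the +1 offset is presumably there so that a legitimate 0% reading is not mistaken for a failure.

```python
from typing import Optional, Union

THRESHOLD = 80  # percent, taken from the probe expression in the post


def cisco_cpu(raw: Optional[int]) -> Union[int, str]:
    """Model of the Cisco_CPU() function: raw value + 1, or "False" if nothing came back."""
    if raw is None:  # stands in for an empty oid_raw() result
        return "False"
    return raw + 1


def probe_error(raw: Optional[int]) -> str:
    """Model of the probe's error expression; an empty string means "no error"."""
    value = cisco_cpu(raw)
    if value == "False":
        return "CPU polling fault"
    if value - 1 < THRESHOLD:
        return ""
    # Like the original, this concatenates the offset value, not value - 1.
    return "Warning: high CPU: " + str(value) + "%"


if __name__ == "__main__":
    for sample in (None, 0, 79, 80):
        print(repr(sample), "->", repr(probe_error(sample)))
```

Note that, as written in the post, the warning text shows the offset value (reading + 1) rather than the reading itself.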

however I get this on my 7600's:
#sho proc cpu sort | e 0.00 
CPU utilization for five seconds: 82%/16%; one minute: 39%; five minutes: 34%
PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process 
 196   2988263362440177117          0 60.43% 26.83% 23.45%   0 SNMP ENGINE      
 184   456210516  23618850      19315  2.23%  1.20%  1.02%   0 IP SNMP          
 191   4625225643723604616          0  1.75%  1.58%  1.45%   0 IP Input         
 185    761816562165763256          0  0.79%  0.26%  0.18%   0 PDU DISPATCHER   
Kinda unhealthy, eh? ;) These should be able to handle a little SNMP traffic.

CPU goes back to like 3-6% when I disable Dude server... so back to the reading room for me :(
It seems that it's enough to have the Dude server enabled and this device added to the map with an IP and SNMP profile. I will check all tooltips and refresh intervals now and (if everything else fails) rebuild the Dude from scratch :(


Andreas
 
lebowski
Forum Guru
Posts: 1619
Joined: Wed Aug 27, 2008 5:17 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Mon Apr 04, 2011 6:59 pm

I only monitor cpu on all my cisco devices.
 
gsandul
Member Candidate
Posts: 154
Joined: Mon Oct 19, 2009 1:42 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Tue Apr 05, 2011 9:40 am

CPU goes back to like 3-6% when I disable Dude server... so back to the reading room for me :(
As I said, in your case the second condition for high CPU is triggered.
2) You have a lot of routes on your device. This can happen, for example, on a router holding eBGP sessions.
You should just make a view for The Dude on your Cisco device and exclude the routing information in that view.
No need to go "back to the reading room" you can do it in 30 minutes. :)
Just read this.
http://www.cisco.com/en/US/tech/tk648/t ... 48e6.shtml
 
gsandul
Member Candidate
Posts: 154
Joined: Mon Oct 19, 2009 1:42 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Tue Apr 05, 2011 9:51 am

I only monitor cpu on all my cisco devices.
I do monitor CPU, Temperature, Traffic and memory on all my cisco routers. Also number of Active Calls, DSP resources, ASR (Answer Seizure Ratio), number of E1 in UP state on VoIP Access Servers.
access_server.png
No high CPU is caused by my probes.
 
blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Re: Dude 4.0 beta3: high CPU on cisco devices

Tue Apr 05, 2011 11:19 am

Interesting.. wouldn't have imagined monitoring call status using the Dude (Cisco must hate you for not using their fine software!) ;)

how are you polling this data in your interface/device labels? Just created a function "oid(1.2.3.4.5.6.7.....)" or did you create a function with conditions for each? How often do you refresh the labels?

I am monitoring CPU and free RAM to be able to see things like memory leaks and other issues that only become apparent when monitored for several weeks but I also need to monitor the interface status and interface description (needed for automatic alarms/notifications to our NOC for customer connections)
BTW: I like the pps in your labels.. somehow more useful than bits/second for big trunks!

Something else I wonder about: Just about how many devices are you guys monitoring and with how many probes for each device? Also how many links that are monitored do you have? (interface Rx/Tx is polled, too)

I am monitoring about 300 devices: 200 switches/routers, ~50 servers and some other devices that are monitored just because we can monitor them and I practiced writing probes for them (like firewalls, laser output levels or the paper level in a Xerox printer) ;)
I have 10-20 services monitored per device (depending on type), so about 2650 services in total, and about 400 interfaces with Rx/Tx polled and graphed
(1 minute polling interval, label refresh set to 30 seconds, 10 seconds for the important links).

I totally love the Dude by now and have not been able to reproduce many of its features in Cacti/Nagios, yet - and definitely not the ease of use and accessibility of it so I'd like to stick to it.
Some people have suggested that the Dude might not scale well, so I am thinking about setting up distributed polling with several Dude servers, as I have been getting too many timeouts lately (but that might also be related to SNMP, so I will try the Cisco SNMP view hacks gsandul sent me first).

Thanks!


Andreas
 
gsandul
Member Candidate
Posts: 154
Joined: Mon Oct 19, 2009 1:42 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Tue Apr 05, 2011 11:54 am

how are you polling this data in your interface/device labels? Just created a function "oid(1.2.3.4.5.6.7.....)" or did you create a function with conditions for each? How often do you refresh the labels?
It depends..... Any network and server group should be monitored depending on what you need to see.
My label for TxPPS/RxPPS code is:
Fe0/22
Rx: [Interface.InBitRate]
Tx: [Interface.OutBitRate]
RxPPS:[Interface.InUnicastPacketsRate]
TxPPS:[Interface.OutUnicastPacketsRate]
Stats on: [oid(concatenate("1.3.6.1.2.1.2.2.1.2.",link_index()))]
Something else I wonder about: Just about how many devices are you guys monitoring and with how many probes for each device? Also how many links that are monitored do you have?
I have 137 devices and 361 services and about 80 SNMP links. But I'm currently not working at an ISP :)
I am monitoring about 300 devices (200 switches/routers, ~50 servers and some other devices that are monitored just because we can monitor them and I exercised writing probes for them (like firewalls, laser output levels or the paper level in a xerox printer) ;)
You can count your devices, maps, and probes. Just install Python 2.7, run my script and give it your latest Dude backup :).
Follow the link http://forum.mikrotik.com/viewtopic.php?f=8&t=43761
I also need to monitor the interface status and interface description (needed for automatic alarms/notifications to our NOC for customer connections)
This is easy to do.
 
blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Re: Dude 4.0 beta3: high CPU on cisco devices

Tue Apr 05, 2011 2:50 pm

how are you polling this data in your interface/device labels? Just created a function "oid(1.2.3.4.5.6.7.....)" or did you create a function with conditions for each? How often do you refresh the labels?
It depends..... Any network and server group should be monitored depending on what you need to see.
My label for TxPPS/RxPPS code is:
Fe0/22
Rx: [Interface.InBitRate]
Tx: [Interface.OutBitRate]
RxPPS:[Interface.InUnicastPacketsRate]
TxPPS:[Interface.OutUnicastPacketsRate]
Stats on: [oid(concatenate("1.3.6.1.2.1.2.2.1.2.",link_index()))]
OK, this is basically the way I do it, too

I am monitoring about 300 devices (200 switches/routers, ~50 servers and some other devices that are monitored just because we can monitor them and I exercised writing probes for them (like firewalls, laser output levels or the paper level in a xerox printer) ;)
You can count your devices, maps, and probes. Just install Python 2.7, run my script and give it your latest Dude backup :).
Follow the link http://forum.mikrotik.com/viewtopic.php?f=8&t=43761
More black magic.. and just what I needed! Thanks! :)

I also need to monitor the interface status and interface description (needed for automatic alarms/notifications to our NOC for customer connections)
This is easy to do.
Well, reading the interface description is easy, but getting the interface description dynamically into a notification email is not. I had to create a lot of probes for that because I could not come up with a better solution than this:
Example ifindex: 10001 (interface FastEthernet1/0/1)
  <sys-name>if_10001_status</sys-name>
  <code>if(array_size(oid_column("1.3.6.1.2.1.2.2.1.8",10,29)), oid_raw("1.3.6.1.2.1.2.2.1.8.10001", 10, 29),"False")</code>
  <descr>polls the status of ifindex 10001 (1 means 'up', 2 means 'down')</descr>
  <functionAvailable>if_10001_status() = 1</functionAvailable>
  <functionError>if(if_10001_status()<>"False", if(if_10001_status() = 1, "", concatenate(oid("1.3.6.1.4.1.9.2.2.1.1.28.10001")," connected to interface: ", oid("1.3.6.1.2.1.2.2.1.2.10001") ) )  , "SNMP polling fault - most likely false alarm")</functionError>
I had to use a script to create probes for all the interfaces that I want autodiscovered. When an alarm is sent, the error status is no longer just "down" but the interface number, its description, and the location and name of the device. The NOC then knows what is connected there and can look at it (instead of calling me on my mobile at night), which was the whole purpose of this exercise ;)
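
The script itself is not part of this post. A minimal sketch of how such per-interface probes could be generated, assuming the XML fragment above is used as a fill-in template and that the list of ifIndex values is supplied by hand (`render_probe` and `render_probes` are hypothetical names, and how the result is imported into the Dude is not covered here):

```python
# Template copied from the probe fragment in this post; {idx} marks the ifIndex.
PROBE_TEMPLATE = """  <sys-name>if_{idx}_status</sys-name>
  <code>if(array_size(oid_column("1.3.6.1.2.1.2.2.1.8",10,29)), oid_raw("1.3.6.1.2.1.2.2.1.8.{idx}", 10, 29),"False")</code>
  <descr>polls the status of ifindex {idx} (1 means 'up', 2 means 'down')</descr>
  <functionAvailable>if_{idx}_status() = 1</functionAvailable>
  <functionError>if(if_{idx}_status()<>"False", if(if_{idx}_status() = 1, "", concatenate(oid("1.3.6.1.4.1.9.2.2.1.1.28.{idx}")," connected to interface: ", oid("1.3.6.1.2.1.2.2.1.2.{idx}") ) )  , "SNMP polling fault - most likely false alarm")</functionError>"""


def render_probe(if_index: int) -> str:
    """Fill the template for a single interface index."""
    return PROBE_TEMPLATE.format(idx=if_index)


def render_probes(if_indexes) -> str:
    """Render one probe fragment per ifIndex, separated by blank lines."""
    return "\n\n".join(render_probe(i) for i in if_indexes)


if __name__ == "__main__":
    # Example indexes only; a real list would come from the device's ifTable.
    print(render_probes([10001, 10002]))
```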

There are probably easier/smarter ways to get the interface description into an email (filling an array, using a database, modifying the XML file, etc.) but I wanted to use as few external tools (none so far) as possible.


Andreas
Last edited by blackpaw on Mon Apr 11, 2011 11:33 pm, edited 1 time in total.
 
nanet
just joined
Posts: 7
Joined: Mon Apr 04, 2011 1:31 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Tue Apr 05, 2011 2:52 pm

I am switching from the topic http://forum.mikrotik.com/viewtopic.php?f=8&t=26627 to this one, to keep all answers in one place.

Making views for the SNMP community is a very good idea. It sped up a complete snmpwalk by 48 seconds; it now takes "only" 1 minute and 10 seconds.

Nevertheless, it would be a good feature to be able to edit the interval of the SNMP queries (as suggested in the other topic).


Best regards
 
blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Re: Dude 4.0 beta3: high CPU on cisco devices

Wed Apr 06, 2011 6:22 pm

ok. this is now officially sick.

Test Patient: Cisco 7606 router/switch with SUP720 supervisor
created a new community for testing purposes:
snmp-server community ********* view TheDude RO 6
I am allowing only the CPU and Memory OIDs (for testing)
snmp-server view TheDude lsystem.57 included
snmp-server view TheDude ciscoMemoryPoolEntry.5 included
snmp-server view TheDude ciscoMemoryPoolEntry.6 included
the ACL allows only one IP. The dude creates this in less than 5 minutes:
Standard IP access list 6
    10 permit 10.16.147.0, wildcard bits 0.0.0.255 (24614 matches)
    20 deny   any log
And the CPU skyrockets:
DIST76-01.AMS4(config)#do sho proc cpu sort | i SNMP ENGINE
 196   3601375082453772088          0 86.62% 80.99% 46.39%   0 SNMP ENGINE   
There is one more test I have to run: Start from scratch... try it all again but I am starting to lose confidence...

Anything I am doing wrong? Why is the Dude polling so much?
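
To put the ACL counter above into perspective, here is the arithmetic as a small Python sketch. It assumes the 24614 matches accumulated over the full five minutes, so the result is a lower bound (the post says "less than 5 minutes"); `min_packet_rate` is just an illustrative helper name.

```python
ACL_MATCHES = 24614        # "24614 matches" on ACL line 10
WINDOW_SECONDS = 5 * 60    # "less than 5 minutes", so this is the upper bound


def min_packet_rate(matches: int, window_seconds: int) -> float:
    """Lower bound on packets per second if all matches fell within the window."""
    return matches / window_seconds


rate = min_packet_rate(ACL_MATCHES, WINDOW_SECONDS)
print(f"at least {rate:.0f} packets/s from the permitted subnet")
```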

I found out that when you open a device and go to the SNMP tab, it creates about 20000 extra packets on the ACL - is this a feature? Half of the packets are UDP, the other half TCP.


Andreas
 
gsandul
Member Candidate
Posts: 154
Joined: Mon Oct 19, 2009 1:42 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Thu Apr 07, 2011 9:03 am

Anything I am doing wrong? Why is the Dude polling so much?
The Dude polls by default.
Yes, there is a mistake in your Cisco config.
You have only included some subtrees, but you also have to exclude all the others. As it stands, the whole SNMP tree is included in your config.
The best practice is to include everything by default and then exclude whatever you do not want to be accessible.
In your case the config should be:
snmp-server view TheDude iso included
snmp-server view TheDude iproute excluded
snmp-server community ********* view TheDude ro 6
First - include the root (and all the subtrees).
Second - exclude routing information.
That is all.

The Dude will try to get your routing information, and the Cisco will return nothing for that SNMP request, so The Dude will not send more SNMP packets.
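
The "include the root, then exclude subtrees" idea can be illustrated with a small model. This is only a sketch in Python under two assumptions: that the device resolves a view by the longest matching subtree (the rule used by the standard SNMP view-based access control model), and that `iproute` refers to ipRouteTable, i.e. 1.3.6.1.2.1.4.21. The names `parse_oid`, `VIEW` and `in_view` are made up for the example.

```python
from typing import List, Tuple

Oid = Tuple[int, ...]


def parse_oid(text: str) -> Oid:
    """Turn a dotted OID string into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


# (subtree, included?) pairs mirroring the two view lines suggested above.
VIEW: List[Tuple[Oid, bool]] = [
    (parse_oid("1"), True),                  # iso included
    (parse_oid("1.3.6.1.2.1.4.21"), False),  # ipRouteTable excluded (assumed meaning of "iproute")
]


def in_view(oid: str, view: List[Tuple[Oid, bool]] = VIEW) -> bool:
    """Return True if the OID is visible; the longest matching subtree decides."""
    target = parse_oid(oid)
    best_len = -1
    allowed = False  # nothing matches: not in view
    for subtree, included in view:
        if target[:len(subtree)] == subtree and len(subtree) > best_len:
            best_len = len(subtree)
            allowed = included
    return allowed


if __name__ == "__main__":
    print(in_view("1.3.6.1.4.1.9.2.1.57.0"))         # CPU OID used earlier in this thread
    print(in_view("1.3.6.1.2.1.4.21.1.1.10.0.0.0"))  # an entry inside ipRouteTable
```

In this model the CPU OID stays readable while anything under the routing table is hidden, which is the behaviour the view is meant to produce.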
 
blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Re: Dude 4.0 beta3: high CPU on cisco devices

Mon Apr 11, 2011 11:45 pm

Ok, did it "textbook style" applying the snmp views
access-list 6 permit 10.xxx.yyy.zzz 0.0.0.255
access-list 6 permit x.x.x.x 0.0.1.255
access-list 6 deny   any log
access-list 6 remark SNMP view for The Dude - snmp polling system

snmp-server view TheDude iso included
snmp-server view TheDude at excluded
snmp-server view TheDude snmpUsmMIB excluded
snmp-server view TheDude snmpVacmMIB excluded
snmp-server view TheDude snmpCommunityMIB excluded
snmp-server view TheDude ip.21 excluded
snmp-server view TheDude ip.22 excluded

snmp-server community ******** view TheDude RO 6
and guess what:

CPU usage dropped 20-40% all over the network, gaps in graphs have disappeared and there are no more false alarms. The Dude now accounts for 2-3% CPU total (instead of 10 %)
You were right somehow about the routing table / arp table - the devices that were hit the hardest were either aggregation switches for a lot of customers or routers carrying the internet routing table.

I can poll now every 10 seconds and still no outages, do entire snmpwalks to find interesting OIDs and only the occasional timeout.. like it was many years ago when the network was small and easy to maintain ;)
I will play with remote polling now once I can get my hands on some dedicated servers.

You guys are now officially awesome. Thank you! :)

cheers



Andreas
 
lebowski
Forum Guru
Posts: 1619
Joined: Wed Aug 27, 2008 5:17 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Tue Apr 12, 2011 5:09 pm

My Cisco CPUs have never been that high. Even with the Dude polling every 30 seconds I have routers with CPU so low that 0% would be returned for the average and the Dude would raise a false positive on it.

I have been watching this thread, and the only idea I have as to why you're seeing such high CPU is that maybe you modified a global device label. But I like your solution and would consider it if I needed it.

Great work indeed :)
 
otgooneo
Trainer
Posts: 581
Joined: Tue Dec 01, 2009 3:24 am
Location: Mongolia

Re: Dude 4.0 beta3: high CPU on cisco devices

Wed May 18, 2011 7:32 am

Hey guys, the CPU usage of my RB1000U increases when using Dude monitoring. The profiler shows the management process at 20%-28%. I'm monitoring only CPU usage, PPPoE active users and 2 queue rules using OIDs. How can I disable other SNMP sources (like interfaces, routes, IP addresses...) on my RB1000?
 
wildbill442
Forum Guru
Posts: 1055
Joined: Wed Dec 08, 2004 7:29 am
Location: Sacramento, CA

Re: Dude 4.0 beta3: high CPU on cisco devices

Wed Aug 17, 2011 5:14 pm

Hey guys, the CPU usage of my RB1000U increases when using Dude monitoring. The profiler shows the management process at 20%-28%. I'm monitoring only CPU usage, PPPoE active users and 2 queue rules using OIDs. How can I disable other SNMP sources (like interfaces, routes, IP addresses...) on my RB1000?
I have the same issue: CPU is through the roof due to SNMP. I don't think we can limit the viewable OIDs like on a Cisco device...

CPU usage drops from 100% to 40-50% when disabling SNMP on the router. The router I'm monitoring is a RB1200 running RouterOS 5.5 monitored by Dude 4.0beta3

I have 450+ PPPoE interfaces on this device, and each PPPoE interface has its own routes and queues.
 
blackpaw
newbie
Topic Author
Posts: 37
Joined: Thu Jan 28, 2010 7:19 pm
Location: Amsterdam, NL

Re: Dude 4.0 beta3: high CPU on cisco devices

Wed Feb 29, 2012 11:33 am

bump - seems I forgot to exclude the "inetCidrRouteTable" or "ipForward" as Cisco calls it.
So if you wanted to disallow reading of ARP entries and - say - prevent the Dude from polling the Internet routing table (if you have it) .21 and .24 would be enough to exclude.

access-list 6 permit 10.xxx.yyy.zzz 0.0.0.255
access-list 6 permit x.x.x.x 0.0.1.255
access-list 6 deny   any log
access-list 6 remark SNMP view for The Dude - snmp polling system

snmp-server view TheDude iso included
snmp-server view TheDude system included
snmp-server view TheDude at excluded
snmp-server view TheDude snmpUsmMIB excluded
snmp-server view TheDude snmpVacmMIB excluded
snmp-server view TheDude snmpCommunityMIB excluded
snmp-server view TheDude ip.21 excluded
snmp-server view TheDude ip.22 excluded
snmp-server view TheDude ipForward excluded

snmp-server community ******** view TheDude RO 6
 
changeip
Forum Guru
Posts: 3829
Joined: Fri May 28, 2004 5:22 pm

Re: Dude 4.0 beta3: high CPU on cisco devices

Sun Aug 24, 2014 1:25 am

All this talk about how to disable the MIB tree on Cisco... What about a MikroTik network with 1000 devices? Same issue. I tried writing a layer 7 rule on the router right after the Dude traffic to filter it, but it's not quite working. Can someone help fix it?

/ip firewall filter
add action=drop chain=forward disabled=yes dst-port=161 layer7-protocol=\
snmp-route-table out-interface=1-wan protocol=udp

/ip firewall layer7-protocol
add name=snmp-route-table regexp="\\+\\x06\\x01\\x02\\x01\\x04\\x15"

I am trying to use a layer 7 rule to block the routeTable tree. It seems hit or miss: I thought I had it working, but not 100%.
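
For anyone wondering where the bytes in that regexp come from: they match the BER encoding of the OID 1.3.6.1.2.1.4.21 (ipRouteTable) as it appears inside an SNMP packet. The sketch below, in Python, shows the encoding; `encode_subid` and `encode_oid` are illustrative helper names and not part of RouterOS or the Dude.

```python
def encode_subid(value: int) -> bytes:
    """Base-128 encode one sub-identifier; every byte but the last has the high bit set."""
    chunks = [value & 0x7F]
    value >>= 7
    while value:
        chunks.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(chunks))


def encode_oid(dotted: str) -> bytes:
    """BER-encode the value part of an OID (without the type and length bytes)."""
    arcs = [int(part) for part in dotted.split(".")]
    # The first two arcs share one sub-identifier: 40 * first + second.
    out = encode_subid(40 * arcs[0] + arcs[1])
    for arc in arcs[2:]:
        out += encode_subid(arc)
    return out


if __name__ == "__main__":
    print(encode_oid("1.3.6.1.2.1.4.21").hex(" "))
```

The first byte is 0x2b (40 * 1 + 3), which is the ASCII "+" that the regexp escapes. One possible reason for the hit-or-miss behaviour, offered only as a guess, is that a request which starts walking from a shorter OID such as 1.3.6.1.2.1.4 never contains this byte sequence, so the rule would not match it.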
