dsdee
newbie
Topic Author
Posts: 43
Joined: Thu Dec 08, 2005 2:32 am
Location: Denver, CO

Load Average heads towards 100% after about 6 days

Mon Jul 07, 2008 3:55 pm

Hello;

I'm on a RB 532, running RouterOS 3.10. I have been experiencing my problem over the past several months, on the past few ROS versions.

After the router has been up about 5 days, the load average slowly starts to climb, and then it spikes and hangs the router sometime late on the 6th day or into the 7th day. Sometimes it's been closer to 8 days, sometimes closer to 6.

Without a PS command, I am unable to tell what is taking the CPU time.

When the load spikes, the router slows down on passing traffic, and will slowly begin to refuse to answer queries from the console, or via winbox. So, I am unable to generate a supout at the time when I need it the most.

I don't believe I'm doing anything funky in my configuration, and I don't see any other changes in my traffic graphs to indicate that the router is getting overworked with traffic at these ~6 day points.

Is anyone else experiencing anything similar?

Any ideas on how to generate the supout if I can't get the CPU's attention long enough to generate it? I got close last week and let the supout generate for 30 minutes, but then the console timed out and kicked me out, and no supout file was left behind.

I am about two days away from this happening again, so I would appreciate any thoughts before I go thru the next cycle...

Thanks in advance,
 
jwcn
Forum Guru
Posts: 1501
Joined: Sun Aug 27, 2006 6:49 am
Location: Maryland, USA

Re: Load Average heads towards 100% after about 6 days

Mon Jul 07, 2008 4:01 pm

Upgrade the firmware on the 532
 
dsdee
newbie
Topic Author
Posts: 43
Joined: Thu Dec 08, 2005 2:32 am
Location: Denver, CO

Re: Load Average heads towards 100% after about 6 days

Mon Jul 07, 2008 4:20 pm

Hmmm, current is 2.12. I see now that there is a 2.15 (there wasn't one last time I looked).

I will upgrade it now, which will buy me another week.

I checked the changelog for 2.15, but there doesn't appear to be anything related to my problem. Was there a discussion on this problem somewhere else??

Thanks,
 
che
Frequent Visitor
Posts: 94
Joined: Fri Oct 07, 2005 1:04 pm

Re: Load Average heads towards 100% after about 6 days

Mon Jul 07, 2008 5:32 pm

Did you notice progressive memory consumption over the uptime period, not just CPU usage? In my case, less and less RAM was available to the router, and eventually, at about 20% available memory, it slowed down so much that I couldn't even log in on the console. IMHO that leak is what drove CPU usage to 100%. I had that experience on one router, and an upgrade solved the problem. I'm not sure it is directly related to your problem, since in my case it was an RB112. I replied because they use the same RB500 packages.

Good luck with upgrade.

Che
 
dsdee
newbie
Topic Author
Posts: 43
Joined: Thu Dec 08, 2005 2:32 am
Location: Denver, CO

Re: Load Average heads towards 100% after about 6 days

Mon Jul 07, 2008 6:28 pm

There was indeed progressive memory consumption, but free memory never dropped to the point where the router should have been unusable. There is usually 10-12 MB remaining free on my 32 MB RB532.

I'm attaching RRD graphs of the past month, where you can see the trends. The load average ramp-up usually starts 1.5-2 days before I reboot (or, in some cases, power-cycle).
sentry_uptime---MONTH-large.gif.png
sentry_load---MONTH-large.gif.png
sentry_mem---MONTH-large.gif.png
 
JJCinAZ
Member
Posts: 473
Joined: Fri Oct 22, 2004 8:03 am
Location: Tucson, AZ

Re: Load Average heads towards 100% after about 6 days

Tue Jul 08, 2008 7:40 am

I'm seeing this on some routers as well. One is an RB232 running a hotspot. I don't see memory leaking, just the CPU running up to 100%. I'm waiting to see if we see this on other routers not running hotspots.
 
dsdee
newbie
Topic Author
Posts: 43
Joined: Thu Dec 08, 2005 2:32 am
Location: Denver, CO

Re: Load Average heads towards 100% after about 6 days

Tue Jul 08, 2008 11:12 am

Glad to hear it's not just me.

I am not running hotspot.

I do have a PPTP client connected to my RB532 most of the time. But I haven't correlated that connection being up with the CPU load. I'm not seeing an influx of traffic around my high-load times either.
 
dsdee
newbie
Topic Author
Posts: 43
Joined: Thu Dec 08, 2005 2:32 am
Location: Denver, CO

Re: Load Average heads towards 100% after about 6 days

Mon Jul 14, 2008 7:12 am

Six days later, and my load average has slowly risen throughout the day today, to be nailed near 100% right now.

I have started a "/system sup-output" but it has not returned yet. Last time it timed out and logged me out before creating the .rif file.

So, the firmware upgrade obviously is not the answer. Anything else to try?
 
JJCinAZ
Member
Posts: 473
Joined: Fri Oct 22, 2004 8:03 am
Location: Tucson, AZ

Re: Load Average heads towards 100% after about 6 days

Mon Jul 14, 2008 7:54 am

Okay, here's another thing to try, but it's a long-shot. Export your config with an /export file=xxx1 then do a /system reset. Edit the dumped config to just put back what is changed from the default and paste it back in. See if that clears up the problem. The theory here is that something under the covers is whacked due to the 2.9 to 3.0 upgrade. The /system reset should get it all back to known values and then program back only what you need.
 
dsdee
newbie
Topic Author
Posts: 43
Joined: Thu Dec 08, 2005 2:32 am
Location: Denver, CO

Re: Load Average heads towards 100% after about 6 days

Mon Jul 14, 2008 2:46 pm

But I've already built the 3.0 config from scratch, effectively moving over, section by section, only what I needed from the base system config.

So in effect, that's how it is already configured.

Any ideas on how to capture a sup-out.rif if the machine is too busy to be writing one?
 
JJCinAZ
Member
Posts: 473
Joined: Fri Oct 22, 2004 8:03 am
Location: Tucson, AZ

Re: Load Average heads towards 100% after about 6 days

Mon Jul 14, 2008 8:08 pm

Write a script which periodically creates a supout file, maybe every couple of hours, with an initial delay after startup. You can have a startup script which disables the periodic script and enables another script to start at t+1 hours. This second script will re-enable the periodic dump script. This gives you a fairly large window to download the last supout file after a reboot. When the router gets to the 100% state, reboot it and copy off the last supout file.
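
A minimal sketch of that scheme as RouterOS scheduler entries follows. It is untested and rests on several assumptions: the entry names `supout-periodic` and `supout-holdoff` are made up, the 2h and 1h timings are only the rough figures mentioned above, and it assumes your RouterOS version supports `start-time=startup` in the scheduler and `:delay` in scripts (check on 3.10 before relying on it). It also folds the two startup scripts into a single entry that disables the periodic dump, waits, and re-enables it.

```
/system scheduler
# Periodic dump: regenerate the supout file every 2 hours
# (each run is expected to overwrite the previous file).
add name=supout-periodic interval=2h on-event="/system sup-output"
# Hold-off at boot: pause the periodic dump for 1 hour so the last
# supout file written before the hang can be downloaded first.
add name=supout-holdoff start-time=startup interval=0s on-event="/system scheduler disable [find name=supout-periodic]; :delay 3600s; /system scheduler enable [find name=supout-periodic]"
```

After rebooting out of the 100% state, copy the supout file off the router within that first hour, before the periodic entry runs again and replaces it.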
 
dsdee
newbie
Topic Author
Posts: 43
Joined: Thu Dec 08, 2005 2:32 am
Location: Denver, CO

Re: Load Average heads towards 100% after about 6 days

Mon Jul 14, 2008 9:33 pm

Yeah, that's along the lines of what I had churning in my mind this AM when I wrote the email and headed out to work.

I'll work on that this evening.

Thanks!
