CPU stuck at 100%

I’m having a problem with the RB532 (RouterOS 4.5) at my base station. The CPU will sometimes jump to 100% and might stay that way for a few seconds or for minutes on end.

It happens at irregular intervals as well. There was not a lot of traffic when I took this screenshot at 6:45 AM today. Only 17 out of 57 users were logged in and only 3 were actually active. I have no scripts running, and I checked the log file but found nothing out of the ordinary.

There also appear to be a lot of unreplied established connections. Before anyone asks, we haven’t had any power cuts and haven’t been disconnected from the Internet in days. I actually deleted all of the unreplied established connections about 8 hours before this screenshot.

Does anyone know what the problem might be, or how to go about finding it?

Many thanks, Gareth

It just happened again, only this time it carried on for about 30 minutes. No one, including me, was able to log in, and in the end I had to pull the plug to get it to restart.

This is very frustrating and I’m losing customers as a result. Does anyone have an idea what it could be, or where I should start looking for the problem?

:frowning:

You might try setting up a few test scripts to monitor different activity on the router and log the CPU usage at the same moment.
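
A minimal sketch of such a monitor, assuming a scheduler entry is acceptable on this router. The name cpu-log and the 1-minute interval are arbitrary choices of mine, not anything from this thread, so adjust them to taste:

```
# Hypothetical scheduler entry: writes the current CPU load to the log once a minute.
/system scheduler add name=cpu-log interval=1m \
    on-event=":log info (\"cpu-load=\" . [/system resource get cpu-load])"
```

The timestamps on those log lines can then be compared with whatever else the router logged around each spike.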

Also, try lowering some of the connection tracking timeouts, for example:

/ip firewall connection tracking set tcp-established-timeout=6h

This should reduce the number of connections hanging around in the connection table (if you think this may be the problem).

Looks like P2P. Do you run a filter? Disable any packages you aren’t using.

If you can manage to get in during an event, torch the client-side interface and see whether any of your customers has a large number of active connections. If so, try disabling service to that individual and see if performance clears up.
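
As a rough cross-check from the terminal, something like the following should count the tracked connections for one source address. This is only a sketch: 172.23.0.10 is a placeholder customer address, not one from this thread, and it assumes this RouterOS version accepts the regex operator in a print query.

```
# Placeholder address; replace it with the customer IP you suspect.
# src-address in the connection table is shown as ip:port, hence the regex match.
/ip firewall connection print count-only where src-address~"^172.23.0.10:"
```

Running it for a few suspect addresses during an event gives a quick idea of who is holding the most connections.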

And the lame answer: upgrade to a 433AH or 800. =)

Good luck!

-Nelson-

Thanks guys.

I’ve set the timeout to 6 hours as recommended and have scheduled the calea and gps packages to be disabled on the next reboot. I think I’ll keep the others, especially the ups one, since power outages are a frequent occurrence here.

I’m sure it’s P2P though. I’ve been logged in during two of these events and have deleted the user with the most connections. CPU usage fell by up to 90%.

I’ve set the firewall to block users with more than 50 TCP connections for 24 hours, but I can’t figure out how to do the same for users with sky-high numbers of UDP connections. Any suggestions?
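
One possible approach, offered as a sketch rather than a tested answer: as far as I know the connection-limit matcher only applies to TCP, so for UDP the closest tool is dst-limit, which limits the rate of new connections per source rather than the number of concurrent ones. The figures below (20 per second, burst 40, 1 minute expiry) are illustrative guesses, not tuned values, and the rules would need to sit in the right place relative to the existing forward rules.

```
# Sketch only: accept new UDP connections while a source address stays
# under roughly 20 per second (burst of 40)...
/ip firewall filter add chain=forward protocol=udp connection-state=new \
    dst-limit=20,40,src-address/1m action=accept
# ...and drop new UDP connections from sources that exceed that rate.
/ip firewall filter add chain=forward protocol=udp connection-state=new action=drop
```

Established UDP flows are not touched by these rules, so a rate that is too low mainly shows up as new lookups or sessions failing for a busy user.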

As for the 433AH, I’ll be replacing the RB532 in about a month.

It happened again at almost exactly the same time as yesterday. I wasn’t able to log in and see what was going on, but about a minute after it started, the firewall on my PC alerted me to a port scan coming from the RB532:

Somebody is scanning your computer.
Your computer’s UDP ports:
4099, 4102, 4105, and 59609 have been scanned from 172.23.0.254.

Configure a proper firewall, and if that doesn’t help, ask your upstream provider to take care of it.