Syonyk
Member Candidate
Topic Author
Posts: 109
Joined: Mon Feb 14, 2005 6:32 pm
Location: Coralville, IA

Hotspot/Winbox crashed, SSH & routing still work?

Thu May 05, 2005 12:29 am

*twitches*

Unhashing a RADIUS database after a Mikrotik reboot is not on my list of "fun things to do."

One of our Mikrotiks just crashed. It's running 2.8.26 and handling a good number of Hotspot clients (95 or so). Hotspot died, Winbox couldn't connect, the admin web page couldn't connect, but SSH connections & general routing were still working.
[admin@ROUTER] ip> service print
Flags: X - disabled, I - invalid 
 #   NAME                                  PORT  ADDRESS            CERTIFICATE
 0 X telnet                                23    0.0.0.0/0                     
 1 X ftp                                   21    0.0.0.0/0                     
 2 I www                                   8080  0.0.0.0/0                     
 3 I hotspot                               80    0.0.0.0/0                     
 4   ssh                                   22    0.0.0.0/0                     
 5 I hotspot-ssl                           443   0.0.0.0/0          none       
The "I" markings are what concern me. Those were the specific functions that failed. Rebooting the router fixed it, but having to reboot the router is not a good thing. Disabling and re-enabling the services didn't fix it either - they came straight back as Invalid.
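For anyone who wants to catch this state from a script instead of noticing it by hand, here is a minimal sketch that picks the invalid services out of text shaped like the "service print" output above. It assumes you already have that text as a string (for example, captured over the SSH session that still works); how you fetch it is up to you. The names ROW and invalid_services are made up for this example and are not part of RouterOS.

```python
import re

# Sample text copied from the "service print" output in this post.
SAMPLE = """\
Flags: X - disabled, I - invalid
 #   NAME                                  PORT  ADDRESS            CERTIFICATE
 0 X telnet                                23    0.0.0.0/0
 1 X ftp                                   21    0.0.0.0/0
 2 I www                                   8080  0.0.0.0/0
 3 I hotspot                               80    0.0.0.0/0
 4   ssh                                   22    0.0.0.0/0
 5 I hotspot-ssl                           443   0.0.0.0/0          none
"""

# Row layout assumed from the sample: index, optional flag (X or I), name, port.
ROW = re.compile(r"^\s*(\d+)\s+(?:([XI])\s+)?([\w-]+)\s+(\d+)\s")


def invalid_services(text):
    """Return the names of services flagged 'I' (invalid), in listed order."""
    names = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match and match.group(2) == "I":
            names.append(match.group(3))
    return names


if __name__ == "__main__":
    print(invalid_services(SAMPLE))
```

Run against the output above, it should report www, hotspot and hotspot-ssl. The column layout is taken only from this one sample, so other RouterOS versions may need a different pattern.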

What would cause these services to go offline? It's rather concerning, because this router serves a lot of people.

-=Russ=-
 
normis
MikroTik Support
Posts: 26322
Joined: Fri May 28, 2004 11:04 am
Location: Riga, Latvia

Thu May 05, 2005 1:03 pm

Please make a supout.rif file while the router is in this 'crashed' state, and send it to support@mikrotik.com with the same description.
 
Syonyk
Member Candidate
Topic Author
Posts: 109
Joined: Mon Feb 14, 2005 6:32 pm
Location: Coralville, IA

Thu May 05, 2005 4:01 pm

How does one make one of those supout.rif files again?

-=Russ=-
 
normis
MikroTik Support
Posts: 26322
Joined: Fri May 28, 2004 11:04 am
Location: Riga, Latvia

Thu May 05, 2005 4:44 pm

 
Syonyk
Member Candidate
Topic Author
Posts: 109
Joined: Mon Feb 14, 2005 6:32 pm
Location: Coralville, IA

Mon Jun 13, 2005 10:34 pm

We just had the exact same issue: same symptoms, same fix (reboot the router). A supout.rif file has been sent to support, so hopefully there will be something useful in it.

-=Russ=-
 
jarosoup
Long time Member
Posts: 596
Joined: Sun Aug 22, 2004 9:02 am

Fri Jun 24, 2005 4:32 am

FWIW, we are seeing this exact same problem... I haven't had a chance to generate a supout before the router crashes and reboots itself, but the Hotspot completely freaks out right before it crashes.

At first I thought something else was causing it, but it's the hotspot, as only our routers with hotspot enabled do this. The log shows tons of hotspot logins and logouts, but fails to list the second line that shows the IP address. The active user count also becomes inaccurate. It seems to take almost exactly 2 weeks for this all to happen. I'll try to send support a supout file if I can catch this one early next time. Versions tried are 2.8.26 and 2.8.27.
 
Syonyk
Member Candidate
Topic Author
Posts: 109
Joined: Mon Feb 14, 2005 6:32 pm
Location: Coralville, IA

Fri Jun 24, 2005 6:48 am

How many users is your router running, and what kind of hardware?

With around 120 users on at any given point, and 64 MB of RAM, we usually see about a month between crashes (so far).

I talked to some of the support guys, and they said that the issue would be fixed in 2.9.

-=Russ=-
 
jarosoup
Long time Member
Posts: 596
Joined: Sun Aug 22, 2004 9:02 am

Fri Jun 24, 2005 8:06 am

One of them currently has at least 20 at any given time - but a few months ago we were averaging 40 concurrently all the time (it's a seasonal location). This is on a VIA C3 533MHz thinrouter from FEN running with just 2 ethernet ports. Traffic is rather low even with 40 people - and there's no P2P at all (we love this network). We'll call this "Router A".

The other is an RB230 equiv (Geode 266) with a peak of 20 users during the day (a public hotspot). Traffic can be high, but abuse is low and nothing is ever out of hand - the CPU never spikes to more than 20 percent and averages 5-10% with 20 users. We'll call this "Router B".

Both devices are running with only 2 ethernet ports. One is running with the "transparent proxy for hotspot" enabled, the other isn't. I know this wasn't a problem with 2.8.13, as that's what we used to run, but we had to upgrade a few months ago due to the msn/hotmail issue with .13.

Router A actually lasts up to 4 weeks. Just checked my graphs for that one, and it's lasting anywhere between 1 1/2 weeks and 4 weeks (lasting much longer now than 2 months ago). Router B lasts about 2 weeks with less variation. When this happens on Router A, we lose all active hotspot users, but the mangle rules and queues are all still in place, and the hotspot itself is broken. All it needs is a manual reboot to correct. When this happens on Router B, the device reboots itself (it does have watchdog enabled and is watchdog-capable) and so we never see much of what happened. However, both show issues in the log, as they only show part of the hotspot login process - but this problem doesn't show up on the actual syslog server the logs get sent to, just in the log window in Winbox (we've set these to write logs to disk).

My theory is that it's some memory allocation issue tied to the total number of times hotspot users have logged in. I've now got both graphing with SNMP and notice that the memory usage slowly crawls up to some given peak (somewhere around 28MB on both routers) and then the router crashes. Here's the interesting part. I mention the total number of hotspot logins because the public hotspot (Router B) has a lot during the day, but none at night. Still, that's at least twice as many total hotspot logins in one day as Router A. Router A's network is a private Hotspot with 5-day keepalive timeouts, so the logins happen much less frequently. Even more, this is a seasonal location where the summer is "slow season", so it's being used much less than it was a few months ago. A few months ago, this occurred every 1 1/2 to 2 1/2 weeks - now it hasn't happened in almost 4 weeks. These have to be related. Honestly, I never really looked at it this way until tonight when I typed this all out.
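If the memory really does climb steadily to a ceiling before the crash, the SNMP samples can be turned into a rough "time left" estimate. Below is a minimal sketch, assuming the growth is roughly linear and that the ceiling is the ~28MB figure mentioned above. The function names (fit_line, hours_until) and the sample numbers are made up for illustration; they are not real readings from either router.

```python
def fit_line(samples):
    """Least-squares fit of (hours, megabytes) samples; returns (slope, intercept)."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    slope = cov / var_t
    return slope, mean_m - slope * mean_t


def hours_until(samples, ceiling_mb=28.0):
    """Hours from the last sample until the fitted line reaches ceiling_mb.

    Returns None if memory is not growing, and 0.0 if the fitted line is
    already at or above the ceiling.
    """
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None
    last_t = samples[-1][0]
    return max(0.0, (ceiling_mb - intercept) / slope - last_t)


if __name__ == "__main__":
    # Hypothetical readings: (hours since boot, memory used in MB).
    readings = [(0, 16.0), (24, 17.0), (48, 18.0), (72, 19.0)]
    print(hours_until(readings))
```

With those invented readings (1MB per day starting from 16MB), the estimate comes out at about 216 hours, i.e. 9 days after the last sample. If the leak follows login count rather than uptime, plotting memory against total logins instead of hours would be the better test of the theory.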
