dancuofzhills
newbie
Topic Author
Posts: 49
Joined: Sun Apr 02, 2006 5:13 am

Still locking up all the time!!!

Fri Jun 30, 2006 7:45 pm

I have many (about 50) MikroTik systems deployed on my wireless network. About 15 of them are PC-based, and they always work very well.
The two main systems at my main office, however, lock up constantly and are causing me a great deal of grief. Those two systems have to be watched 24/7 to be sure they haven't failed. I have replaced all the hardware three times and reinstalled MT twice. I have tried it on a flash card and on a hard drive. I have tried several different hardware types (an AMD Athlon 64 at first, then an AMD Athlon XP 1700, now an AMD Duron 800).
When a system is about to lock up, the console becomes very slow to respond, and keyboard input takes longer and longer to appear on screen; then the whole system locks solid (Num Lock doesn't respond, activity indicators stop blinking, etc.). Right before the slowdown, all connectivity through the network interfaces is lost.
I have been troubleshooting this for months, and I have lost a lot of sleep and customers over it. Please let me know if anyone has heard of anything like this before or knows of a solution. I do not want to abandon MikroTik on these systems, but I absolutely have to get them working all the time. I am willing to try just about anything at this point.

I have other systems on towers with the same config, and they perform flawlessly for weeks or longer. The only difference is that the two systems that lock up are at the heart of my operation.

Two other points may be relevant. First, the faster the system I put in place, the more frequently it locks up (or so it seems). Second, one of my primary remote towers was down for a couple of days, which put about one third of my network offline, and during that time I had NO lockups. As soon as I got that equipment replaced, the server locked up again. This makes me think that something on that part of the network is sending bad packets or data of some kind that is causing my problem.

I can post any config or other information that may shed some light on this upon request.

Thanks!
 
changeip
Forum Guru
Posts: 3819
Joined: Fri May 28, 2004 5:22 pm

Fri Jun 30, 2006 7:57 pm

Regarding those core routers...

What version of MT?

What NICs are you using?

What mobo and chipset are you using?

Have you run a '/tool memory-test' and let it complete a pass with no errors?

I have had our production router do the same thing, although I'm not at the console to see what the problem is. My guess is that it's 2.9.6 and BGP was making it crap out. Last night we implemented 2 new border routers on 2.9.26, and we'll see what happens with the one that was locking up now that it's not running BGP anymore.

Sam
 
dancuofzhills
newbie
Topic Author
Posts: 49
Joined: Sun Apr 02, 2006 5:13 am

further info..

Fri Jun 30, 2006 11:31 pm

One of the systems has 2.9.19 and the other has 2.9.26. The 2.9.19 server has an expired license and cannot be upgraded.
The NICs are VIA VT6122 Gigabit Ethernet cards. The systems also have onboard VT6102 Rhine Ethernet, but I do not have it enabled or performing any function. The motherboards are Biostar M7VIG-400 with the KM400 chipset (VT8237/VT8378). This is the third different motherboard/chipset combination I have had on these systems, though. I think the only commonality between the different hardware I've used is VIA chipsets on the motherboards. I was about to buy SiS this last time and opted not to... I wish I had now, though.

I have not run the memory test yet, but I will tonight.
 
changeip
Forum Guru
Posts: 3819
Joined: Fri May 28, 2004 5:22 pm

Fri Jun 30, 2006 11:34 pm

I was going to say that VIA has shown issues in the past under heavy loads... not sure if it's just network related or what. Maybe you can try Intel chipset boards next time; they seem to be the most reliable when it comes to network routing. You should be able to take the drive (HD or flash) and just move it to another machine. If you take the NICs with you, there is less reconfiguration.

Try that memory test. It puts a heavy CPU load on the system as well as testing the RAM modules. It's a good test to run on any router before it goes into production. We try to run it for 1-2 days to really stress it.

Sam
 
dancuofzhills
newbie
Topic Author
Posts: 49
Joined: Sun Apr 02, 2006 5:13 am

Tests

Fri Jun 30, 2006 11:44 pm

I am running the CPU test now because it doesn't require rebooting. Tonight, when there is not much traffic, I will run the memory-test.

I did run the program called memtest from a bootable CD on this RAM in a different computer and it passed, but I will run the MT version anyway.

I'll post my results tomorrow.
 
NZLamb
newbie
Posts: 46
Joined: Wed Oct 19, 2005 6:10 am
Location: New Zealand

Re: further info..

Sat Jul 01, 2006 1:32 pm

i was about to buy SiS this last time...
You have got to be joking, right? :lol:
 
dancuofzhills
newbie
Topic Author
Posts: 49
Joined: Sun Apr 02, 2006 5:13 am

I wish...

Sat Jul 01, 2006 7:14 pm

I would never usually buy SiS, but with the support nightmare I've been going through for the last few months I am about ready to try anything! I never use Intel for anything either, but I am now considering purchasing another system based on Intel. If Intel made boards for AMD processors I probably would have gotten one already.
 
csickles
Forum Guru
Posts: 1257
Joined: Fri May 28, 2004 8:46 pm
Location: Phoenix, AZ

Sun Jul 02, 2006 2:15 am

Use intel based boards...

I have several of them and NONE of them lock up or act up.

I have used PII thru Xeon and NO issues from the system..

Use Intel CPU and chipset.

I have never had anything but trouble from AMD products.
(I reserve judgement on Opteron as I have not used one yet.)

Just my opinion..

Craig
Things that make you go "Hmmmmmmmm"...
 
NZLamb
newbie
Posts: 46
Joined: Wed Oct 19, 2005 6:10 am
Location: New Zealand

Sun Jul 02, 2006 2:29 am

I have never had anything but trouble from AMD products.
Ditto, though I think this is due to a lack of any decent chipsets rather than the quality of the CPU itself. I have never had one glitch running MT on Intel stuff.
 
jo2jo
Forum Veteran
Posts: 972
Joined: Fri May 26, 2006 1:25 am

Sun Jul 02, 2006 8:11 pm

one other thing possibly to consider:

On the problem machine / problem CPU: do you have a good heat sink and fan, and is it properly seated?

Given that it crashes under load, maybe the CPU is overheating, or maybe the chipset.

I saw this a while ago on a 2003 server. I changed everything out except the CPU, then one day in the BIOS I noticed the CPU was at 48C IDLE! I reseated the heat sink and applied some new thermal grease, and there were no more problems.
 
changeip
Forum Guru
Posts: 3819
Joined: Fri May 28, 2004 5:22 pm

Sun Jul 02, 2006 8:20 pm

changed everything out except the cpu then one day in the bios i noticed the cpu was at 48c IDLE!
This is why we ask that MT add lm_sensors ability. A P4 will throttle itself almost to a crawl when overheating, which affects the stability of RouterOS and routing. At least support the 3-4 most common chips.

Sam
 
dancuofzhills
newbie
Topic Author
Posts: 49
Joined: Sun Apr 02, 2006 5:13 am

Intel

Mon Jul 03, 2006 4:11 pm

I run lots of AMD servers without trouble, but I think I'm going to start finding a way to set up these two systems on Intel platforms and see if they work any better. I have already invested so much money into re-replacing these that it may be a couple of weeks yet before I can switch them out again...
In the meantime I am still looking for someone else with this problem, or any other suggestions.
My systems pass RAM and CPU tests, both the ones under /tool and third-party ones like memtest and PC-Doctor. I have double-checked my cooling and everything is in order there; I actually have an AC duct routed to my server rack for extra cooling.

I actually have only one MikroTik system running on AMD that works. All my other AMD servers are running Microsoft 2003, and they work very well.
The one MT system that runs successfully on AMD does not have any wireless interfaces, so I think that has something to do with it.

On my towers, all the PC systems run an integrated motherboard with an integrated VIA C3 processor (800 MHz). These are very stable and I've never had a problem with them, but I needed something a little higher end for the border systems.
 
digus
just joined
Posts: 23
Joined: Mon Sep 11, 2006 5:47 pm

Thu Nov 09, 2006 5:44 pm

We are having lots of problems with some P4 systems running Biostar motherboards with VIA chipsets. We are about to try an Intel board and will post results...
 
jarosoup
Long time Member
Posts: 600
Joined: Sun Aug 22, 2004 9:02 am

Fri Nov 10, 2006 6:17 am

Have you tried tweaking the BIOS of any of these boards? Try backing off the memory timings and chipset settings (depending on what is available)...this might help.
 
Vadim
newbie
Posts: 27
Joined: Sat May 29, 2004 9:58 pm
Location: Liepaja, Latvia

Mon Nov 13, 2006 4:28 pm

I have installed some MTs on Intel-based systems with P4 CPUs, and I had trouble with the system hanging under average CPU load. The solution was to disable Enhanced C1 control, Intel SpeedStep Technology and Hyper-Threading Technology in the BIOS. Now everything runs perfectly.
 
pekr
Member Candidate
Posts: 138
Joined: Tue Feb 22, 2005 9:05 pm
Location: Czech Republic

Mon Nov 13, 2006 8:36 pm

Hi,

We have some occasional problems too. As for me, I don't believe in the AMD myth. We completely changed two PCs and various parts, and the system still sometimes locked up on us. It is always the same scenario: refusing Winbox, refusing SSH, while the Internet connection keeps running for some time.

As for me, I would not run ANY important site without an external GSM or ping watchdog.
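
For readers unfamiliar with the idea, an external ping watchdog can be sketched roughly as below. This is only a minimal illustration, not anything from the thread: the names `ping_once`, `PingWatchdog` and `run_forever` are hypothetical, the `ping -c 1 -W` flags assume a Linux monitoring host, and the recovery action (power-cycling the router, sending an SMS, and so on) is left as a callback you would have to supply yourself.

```python
import subprocess
import time
from typing import Callable


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request; True if the host answered.

    Assumes the Linux `ping` utility (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class PingWatchdog:
    """Fire a recovery action after several consecutive missed probes."""

    def __init__(self, probe: Callable[[], bool],
                 on_failure: Callable[[], None],
                 max_misses: int = 3) -> None:
        self.probe = probe            # returns True while the router answers
        self.on_failure = on_failure  # hypothetical action, e.g. power-cycle
        self.max_misses = max_misses
        self.misses = 0

    def check(self) -> bool:
        """Run one probe; return True if the recovery action was fired."""
        if self.probe():
            self.misses = 0           # any success resets the counter
            return False
        self.misses += 1
        if self.misses >= self.max_misses:
            self.misses = 0           # start counting again after recovery
            self.on_failure()
            return True
        return False


def run_forever(watchdog: PingWatchdog, interval_s: float = 10.0) -> None:
    """Probe at a fixed interval until the process is stopped."""
    while True:
        watchdog.check()
        time.sleep(interval_s)
```

To use it, you would build something like `PingWatchdog(lambda: ping_once("192.0.2.1"), on_failure=my_reset_action)` (the address and `my_reset_action` are placeholders) and pass it to `run_forever`. Requiring several consecutive misses is a deliberate choice so that a single dropped packet on a busy wireless link does not trigger a reboot.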

PS1: On the Czech MT forum there is one interesting post. Some guy was exchanging PCs and various components too, still with no luck. Later on, with a consultant, they found a dependency. The faulty part was the Compact Flash card; maybe it becomes corrupt from time to time. When he used a binary backup, there was some trouble. So after your new HW installation, try NOT to use a binary backup, but import exported plain-text configs instead. It might help...

PS2: Two weeks ago I was at another MT training here, and the guys were suggesting SuperMicro boards. Those are reported as running absolutely flawlessly. SuperMicro boards are server class, with multiple PCI busmasters etc. A bit pricey, but maybe worth it. But I am not really sure the trouble lies in the HW. Isn't there Linux underneath after all? I doubt AMD processors are insufficiently supported, even with the old 2.4 kernel, which current ROS seems to be based upon...

Cheers,
Petr
 
normis
MikroTik Support
Posts: 24609
Joined: Fri May 28, 2004 11:04 am
Location: Riga, Latvia

Tue Nov 14, 2006 10:16 am

We have some Supermicros and they are good.
