marisspringis
just joined
Topic Author
Posts: 15
Joined: Wed Dec 12, 2018 2:17 pm

RB4011 and RB1100 AHx4 "bricks" randomly

Mon Jun 03, 2019 1:18 pm

Hi everyone,
Since the last week of May, something strange has been happening to the RouterBOARDs I manage.
Issue: the router simply "bricks". By that I mean you cannot connect to the router in any way (it stays on "logging in" and nothing more happens), the APs that are connected lose all config from Dude, SNMP stops working, and so on.
At the same time, internet still works from computers connected to the switch, and I can also ping the router.
This can be resolved only by a hard reset (unplugging and replugging the power cable).
The issue has happened only on the RB4011 and the RB1100AHx4 Dude Edition.
RouterOS: 6.44.3
This has never happened before.
These RBs are in different countries.

So far this has happened only once, but with every RB4011 we have and with one RB1100AHx4.
Has anyone else seen this?
 
ccardenas
Trainer
Posts: 5
Joined: Thu May 15, 2014 9:04 pm
Location: Spain

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Jun 03, 2019 1:48 pm

Hello! Do you have bridges in your network implementation? How many hosts are passing traffic through these "randomly bricking" devices? Could you give us more info, so we can understand your problem and help you better?

We have a similar issue, and we suspect the bridge host table size and a (possible) memory exhaustion problem. It only happens on the new ARM devices (RB4011 and RB1100AHx4).

Symptoms are loss of connectivity and manageability: it's impossible to access the device in any way, but it keeps working as a switch. After a reboot (unplugging and replugging the power cable) everything works fine again, and we see a lot of log lines like: snmp, warning timeout while waiting for program XX (where XX is a variable two-digit number)


Regards
 
marisspringis
just joined
Topic Author
Posts: 15
Joined: Wed Dec 12, 2018 2:17 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Jun 03, 2019 2:43 pm

Hi ccardenas,
Yes, we have 3 bridges on the RB4011 and on the RB1100AHx4.
The RB4011 has 30-40 hosts most of the time.
The RB1100 has no more than 5 hosts connected directly; it is used as a Dude server for monitoring.

One more thing that points to the problem being in the RB4011 itself: we have a lot of RB2011s with the same config, and they work perfectly, without any problems. The ROS version on all of them is also 6.44.3.

symptoms are identical to yours.
 
marcin21
Member Candidate
Posts: 194
Joined: Tue May 04, 2010 4:50 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Jun 03, 2019 4:32 pm

It seems that the CPU is getting exhausted over time.
This particular ARM-based WAP60g has been up for 149 days.
 
ccardenas
Trainer
Posts: 5
Joined: Thu May 15, 2014 9:04 pm
Location: Spain

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Jun 03, 2019 5:45 pm

marisspringis wrote:
> One more thing that points to the problem being in the RB4011 itself: we have a lot of RB2011s with the same config, and they work perfectly, without any problems. The ROS version on all of them is also 6.44.3.

Hello!! Yes, totally true. RB2011 and RB1100AHx2 units in the same place, in the same network, in the same situation, and nothing happens: they never lock up. We've opened a support ticket with MikroTik, and they told us to plug in a serial cable, wait for the device to lock up, then try to access it via console and make a supout. But we have a couple of them with cables attached, and now those never lock up! :cry:

Other devices within the network keep locking up randomly. In the meantime we have scheduled a reboot at night a couple of times a week (a lame solution, but it saves the day) until we find the real problem. It seems that some process inside the RouterBOARD hangs or exhausts the memory, causing the other processes to fail in cascade and block access to the device.

If someone is experiencing the same problem, please share it with us. Maybe we can find a hint in the meantime, until I can get a good supout file and send it to MikroTik support.

Regards!!
Last edited by ccardenas on Mon Jun 03, 2019 7:27 pm, edited 1 time in total.
 
Dude2048
Member Candidate
Posts: 101
Joined: Thu Sep 01, 2016 4:04 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Jun 03, 2019 6:38 pm

I have an RB1100AHx4 Dude Edition which shows the same behavior. What happens is that the memory fills up and the device becomes inaccessible. I have a script that reboots the device when 70% of memory is used. While it was inaccessible I tried to make a supout via console, but that didn't work.
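
For anyone who wants to try the same kind of workaround, here is a minimal sketch of the threshold check in Python. It is not the script mentioned above, only an illustration of the logic: the function names and the byte values are made up, and on a real router the check would have to be run periodically by the device's own scheduler, with the total and free memory read from the device.

```python
def memory_used_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return the fraction of memory in use, between 0.0 and 1.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if not 0 <= free_bytes <= total_bytes:
        raise ValueError("free_bytes must be between 0 and total_bytes")
    return (total_bytes - free_bytes) / total_bytes


def should_reboot(total_bytes: int, free_bytes: int, threshold: float = 0.70) -> bool:
    """Return True once memory usage reaches the threshold (70% by default)."""
    return memory_used_fraction(total_bytes, free_bytes) >= threshold
```

The 70% default only mirrors the figure from this post. The idea is to reboot while the router can still run scripts, since several reports in this thread say that nothing executes once the device is in the locked-up state.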
 
kehrlein
just joined
Posts: 8
Joined: Tue Jul 09, 2019 1:35 am
Location: Munich, Germany

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Thu Dec 05, 2019 12:24 pm

Same issue here with 3 different RB1100AHx4 units. The situation occurs with less than 7 days of uptime.
During the incident, the devices don't accept new SSH, Winbox, SNMP or PPTP connections to the router itself.
Logins via CLI aren't possible either (a serial connection is possible, but login doesn't work). We tried keeping a serial session open on the affected devices. When the issue occurs, we are able to type commands into the CLI, but generating the supout or performing real actions (e.g. initiating a reboot) doesn't work.
Other traffic goes through the router smoothly. The issue can be fixed temporarily by powering the device off and on. Our temporary solution is a scheduled reboot.
Sandro Kehrlein
MikroTik Certified Consultant
 
meshnet
Frequent Visitor
Posts: 56
Joined: Tue Jun 01, 2004 6:57 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Thu Dec 05, 2019 9:05 pm

Starting to see this issue on 4011s as well. The last one it happened to was on 6.44.1.
A power cycle restores all functionality.
SNMP polling becomes very intermittent right before this happens, which points to CPU issues.
The only bridge configured on the devices is an empty bridge used as a loopback.
 
Dude2048
Member Candidate
Posts: 101
Joined: Thu Sep 01, 2016 4:04 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Fri Dec 06, 2019 12:01 pm

kehrlein wrote:
> Same issue here with 3 different RB1100AHx4 units. The situation occurs with less than 7 days of uptime.
> During the incident, the devices don't accept new SSH, Winbox, SNMP or PPTP connections to the router itself.
> Logins via CLI aren't possible either (a serial connection is possible, but login doesn't work). We tried keeping a serial session open on the affected devices. When the issue occurs, we are able to type commands into the CLI, but generating the supout or performing real actions (e.g. initiating a reboot) doesn't work.
> Other traffic goes through the router smoothly. The issue can be fixed temporarily by powering the device off and on. Our temporary solution is a scheduled reboot.

What version are you using?
 
quackyo
Member Candidate
Posts: 116
Joined: Mon Nov 16, 2015 10:14 am

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Fri Dec 06, 2019 1:36 pm

Is everybody who experiences this running Dude on the device?
I had the exact same issue on my RB1100AHx4 shortly after setting up Dude. It happened 3 or 4 times over a month or two before I connected the dots. I disabled the Dude server and have had no hiccup since (6-8 months ago...).
 
kehrlein
just joined
Posts: 8
Joined: Tue Jul 09, 2019 1:35 am
Location: Munich, Germany

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Fri Dec 06, 2019 1:38 pm

Dude2048 wrote:
> What version are you using?

My latest bad experience was with 6.45.7.
I haven't run 6.46 long enough to judge.

After some emails with MikroTik support over the last month, I finally got this answer:
> Unfortunately, this problem seems to be caused by a hardware issue.
> Please contact the seller and return the router for warranty repairs, if the router is still covered by it. You can refer to this ticket number - SUP-3012.
So now I am talking to the reseller about a refund.
Sandro Kehrlein
MikroTik Certified Consultant
 
kehrlein
just joined
Posts: 8
Joined: Tue Jul 09, 2019 1:35 am
Location: Munich, Germany

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Fri Dec 06, 2019 1:40 pm

quackyo wrote:
> Is everybody who experiences this running Dude on the device?
> I had the exact same issue on my RB1100AHx4 shortly after setting up Dude. It happened 3 or 4 times over a month or two before I connected the dots. I disabled the Dude server and have had no hiccup since (6-8 months ago...).

I had the issue on several devices that were not running Dude.
Sandro Kehrlein
MikroTik Certified Consultant
 
Bolle
just joined
Posts: 2
Joined: Fri Aug 23, 2019 9:42 pm
Location: Germany

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Fri Dec 06, 2019 2:49 pm

Hi !

Same here with an RB4011iGS+5HacQ2HnD and an RB1100Dx4.

Both devices "freeze" several times a week.
The RB1100 sometimes freezes two or three times a day.

I tried with ROS 6.45.7 and 6.46beta59.
 
marcin21
Member Candidate
Posts: 194
Joined: Tue May 04, 2010 4:50 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Dec 16, 2019 1:38 pm

Has anyone found a solution to this problem?
Maybe disabling SNMP?

I've got a problematic 4011, and a few days ago it started to die.
 
aoakeley
newbie
Posts: 38
Joined: Mon May 21, 2012 11:45 am

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Dec 21, 2019 3:18 pm

I have two 1100AHx4 with this issue, and an open case with Mikrotik (Ticket#2019100722004559). I have just been able to connect in with a serial cable and generate a supout file.

I am not running Dude on them.
SNMP is enabled though.
 
FreeVoip
just joined
Posts: 4
Joined: Tue Oct 15, 2019 5:33 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Dec 21, 2019 4:49 pm

Hello!

Same problem here!!!!! viewtopic.php?f=2&t=154859&sid=9cf5b071 ... 23dad9a667

I don't know what else to do. It has already happened to me 4 times this week.
There is no high CPU usage, no high RAM usage, or anything else strange.

It is a very big problem for me.
 
FreeVoip
just joined
Posts: 4
Joined: Tue Oct 15, 2019 5:33 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Dec 21, 2019 4:57 pm

It is incredible that since June this was reported and nobody from Mikrotik did anything.

In my case they suggested connecting via serial cable, but I am 800 km from my node, so that is impossible for me. It is easier to restart it on a schedule, but it does not look very professional for an internet provider to cut the service every day at the same time. Needless to say, if it "hangs" at 10 am, I have to wait until the next day to get access again.
 
marcin21
Member Candidate
Posts: 194
Joined: Tue May 04, 2010 4:50 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Dec 21, 2019 8:31 pm

Since the last problem, I've kept a Winbox session open on screen, and this problematic 4011 has been up and running for 5d+.
I wonder if it has something in common with
viewtopic.php?f=3&t=142298
 
aoakeley
newbie
Posts: 38
Joined: Mon May 21, 2012 11:45 am

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Dec 23, 2019 3:34 am

FreeVoip wrote:
> It is incredible that since June this was reported and nobody from Mikrotik did anything.

I suspect that they did not have enough information to replicate the issue and develop a fix. Like you, I have many devices a long way away, and they asked for a serial cable to be plugged in. This was difficult, but I now have one connected and have sent a supout to support.

With a few more people reporting the issue now, hopefully they will be able to find a fix.
 
marcin21
Member Candidate
Posts: 194
Joined: Tue May 04, 2010 4:50 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Dec 23, 2019 11:59 pm

I wouldn't be so optimistic about the problem being tracked down quickly.
It seems that a very similar issue with the RB4011,
viewtopic.php?f=3&t=142298
was reported in Dec '18, and the first response from MikroTik staff, saying that they recognise the problem and are working on a solution, came in April '19,
and it seems that they haven't solved it so far.
Maybe there is some serious flaw in the hardware, as we saw above:
> Unfortunately, this problem seems to be caused by a hardware issue.
> Please contact the seller and return the router for warranty repairs, if the router is still covered by it. You can refer to this ticket number - SUP-3012.
Either way, I've kept Winbox open on my problematic 4011 and it now has 7d12h of uptime. Maybe keeping Winbox open does the trick? I doubt it, but whatever works.
Merry Xmas :)
 
luciansilviu
just joined
Posts: 1
Joined: Fri Dec 27, 2019 4:10 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Fri Dec 27, 2019 4:14 pm

Hi!

Just a quick "me too". I have had this issue with two RB4011iGS+ boards running 6.45.4.
 
reapster
just joined
Posts: 2
Joined: Mon Jun 10, 2013 8:01 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Dec 28, 2019 8:03 pm

Hey there,

We also see this on our RB1100AHx4 (currently on 6.45.7, and inaccessible); every few weeks it exhibits this behaviour. We cannot access it via Winbox/Telnet/SSH, but routing remains fine (thankfully!). This unit handles almost no traffic (< 10Mbps) and has no special configuration; it's not even doing SNMP. About 10 firewall rules, the usual masquerade NAT and, I think, a DHCP server: a very basic setup.

All functionality returns after a power cycle, sadly not a feasible long term solution though. We have quite a few Mikrotik units deployed, but will have to start moving away from Mikrotik if there's no real solution to this.
 
aoakeley
newbie
Posts: 38
Joined: Mon May 21, 2012 11:45 am

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Thu Jan 02, 2020 2:57 pm

reapster wrote:
> We have quite a few Mikrotik units deployed, but will have to start moving away from Mikrotik if there's no real solution to this.

We have hundreds (thousands??) of MikroTik units in the field, and only 2 exhibiting this issue. So don't throw the baby out with the bath water.

After obtaining supout.rif files by console cable, MikroTik have organised warranty replacement for the two 1100 units exhibiting this issue. Can't ask for more than that.
We have not seen this issue with any 4011 units.

Andy
 
allevot
just joined
Posts: 3
Joined: Fri Oct 19, 2018 2:44 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Fri Jan 03, 2020 3:18 pm

Hi,

We have several RB4011s and exactly the same thing happens to us. At random, one of the MikroTiks stops being accessible for management, and Dude shows the "memory, disk and cpu" services as down, although ping works fine.
The services keep working correctly, but there is no way to log in to the device, and the only way to regain management is to restart it.

For more info: we run VRRP, bridge, GRE tunnel, ACLs, BGP, queues and SNMP, and traffic is 20-50Mbps.

Regards,
 
aoakeley
newbie
Posts: 38
Joined: Mon May 21, 2012 11:45 am

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Jan 04, 2020 2:56 am

allevot wrote:
> We have several RB4011 and exactly the same happens to us. Randomly some MikroTik is no longer accessible by management, and in Dude appears the services "memory, disk and cpu" down,
>
> Regards,

- Use a console cable to get a supout.rif file while the router is in the non-responsive state (you might as well do this when you lodge the ticket, as they will ask you to do it anyway). To generate the supout.rif you will probably need to have the console cable connected and be logged on before the router craps out.
- Log a ticket with support@mikrotik.com.au
- If there are any autosupout.rif files in the file list, send these as well.

Good luck.
 
allevot
just joined
Posts: 3
Joined: Fri Oct 19, 2018 2:44 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Thu Jan 09, 2020 10:43 am

Hi,

We are unable to make a supout.rif because we are unable to log in to the device in any way.
Can someone from MikroTik support help us? It's a big problem for our company.

Thanks,
 
r00t
Member Candidate
Posts: 263
Joined: Tue Nov 28, 2017 2:14 am

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Thu Jan 09, 2020 6:51 pm

If the router is still running, you could try to generate a supout with a watchdog script that:
1) checks whether the router can ping some IP that should normally be reachable at all times (gateway, some other LAN device)
2) if not (interfaces are frozen or down, IP not accessible), generates a supout
3) reboots the router
Run it every minute and hope it works when the issue happens.
But if the router is not even accessible by console cable, it's likely completely frozen (CPU/ROS) and the above script would not run anyway...
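
The three steps above can be sketched as follows. This is only an illustration of the control flow in Python, not a RouterOS script: `watchdog_tick` and its three callables are hypothetical names, and the actual ping, supout and reboot actions would have to be implemented with whatever the router provides.

```python
from typing import Callable


def watchdog_tick(
    ping_ok: Callable[[], bool],
    make_supout: Callable[[], None],
    reboot: Callable[[], None],
) -> str:
    """Run one watchdog pass and return a label for the action taken.

    ping_ok, make_supout and reboot are placeholders supplied by the
    caller; they stand in for the real device actions.
    """
    # Step 1: check a target that should always be reachable.
    if ping_ok():
        return "ok"
    # Step 2: the target is unreachable, so capture diagnostics first.
    make_supout()
    # Step 3: restart only after the supout has been written.
    reboot()
    return "supout-and-reboot"
```

Something external (a scheduler firing once a minute, as suggested above) would call `watchdog_tick` repeatedly. Note that if generating the supout hangs, as some posts in this thread report, the reboot step is never reached, so this sketch has the same limitation as the plan it illustrates.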

Some more debugging options for ROS would be great, like redirecting kernel syslog to console port at all times.
 
inteq
Member Candidate
Posts: 126
Joined: Wed Feb 25, 2015 8:15 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Fri Jan 10, 2020 4:31 pm

Only one RB4011 (without WiFi) out of 12 crashed once with some process stuck.
None of RB1100AHx4 or RB1100AHx4 Dude Edition out of 19 crashed so far.
Also, bricking can happen to Mikrotiks, but it did not happen to me (yet) and if a power reset fixes it, it did not happen to you (yet).
 
jdejansb
Frequent Visitor
Posts: 67
Joined: Thu Jul 13, 2006 1:35 pm
Location: Srbija

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Jan 11, 2020 12:07 am

... and I was just about to buy a 4011 . . . should I do that or not :? .. it's for a main network gateway, PPPoE, Dude, . . .
 
aoakeley
newbie
Posts: 38
Joined: Mon May 21, 2012 11:45 am

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Jan 13, 2020 5:18 am

jdejansb wrote:
> ... and I was just about to buy a 4011 . . . should I do that or not :? .. it's for a main network gateway, PPPoE, Dude, . . .

I have plenty running that have no issue. I think the devices with this issue are very few.
 
tquibell
just joined
Posts: 2
Joined: Fri Jan 17, 2020 6:17 pm

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Jan 18, 2020 7:41 am

Same Problem as OP I think.

Not using any fancy routing. Only doing basic Firewall/NAT/Mangle and DHCP Server and Client. Some static routes, etc. Nothing else.

RouterOS Long Term 6.44.3 would cause the RB4011 to lock up randomly. The first few times a reboot did the trick; uptime averaged about 60 days between lockups. The router would remain visible in Winbox, but otherwise I could not communicate with it in any fashion. IP addresses would become 0.0.0.0, etc. The terminal was unresponsive. I have not confirmed whether the built-in serial console was also unresponsive. Finally, on the last go-round, the board bricked.

On boot up was able to see that the board was hanging on loading kernel. Used Netinstall to wipe the NAND and reflash the Firmware and RouterOS. Voila. Moved to Stable 6.46.2 to see if this thing will survive with a different version.

Will report back after some light stress testing and a few weeks of uptime (hopefully a lot more).
 
Mikhalich
just joined
Posts: 1
Joined: Mon Feb 05, 2018 10:16 am

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Mon Jan 20, 2020 7:05 pm

I have the same problem on an RB4011 without WiFi.
The system stops forwarding client packets, but I can still connect via Winbox on the WAN port.
A script tries to make a supout.rif every 2-3 minutes but cannot finish it.
The device did not execute any commands via console, but it did respond to commands via Winbox.
Tools > Profile shows 100% load on 1 of 4 cores and 77% unclassified load.
Uptime was about 60 days. ROS 6.46.1

After a reboot via Winbox the system runs normally.
Last edited by Mikhalich on Mon Jan 20, 2020 7:12 pm, edited 1 time in total.
 
Maggiore81
Member Candidate
Posts: 238
Joined: Sun Apr 15, 2012 12:10 pm
Location: Italy

Re: RB4011 and RB1100 AHx4 "bricks" randomly

Sat Jan 25, 2020 7:12 pm

Hello
I have a 4011 that showed the behaviour described here. I replaced it with another 4011: same problem!
The Ethernet ports simply stop responding, and the logs then show "waiting for program xx", sometimes 20, sometimes another number, as described above.
I tried using the watchdog against one IP to reboot automatically, but it did not help.

I solved it completely by scheduling a system reboot at 05:30:00 every day.
Then, with the latest 6.44.6, every few days.
Zero issues since then :)
Dott. Elia Spadoni
---
Network Administrator,
MTCNA, MTCRE, MTCTCE, MTCINE, MTCWE
Spadhausen Internet Provider
Ravenna, ITALY
http://www.spadhausen.com
