
 
MichaelMueller
just joined
Topic Author
Posts: 3
Joined: Tue Nov 01, 2016 12:25 am

RB911G-5HPnD with 6.35.2 suddenly becomes inaccessible

Tue Nov 01, 2016 1:21 am

Dear forum and Mikrotik guys,

we have encountered a serious problem with this combination:

After an unpredictable amount of time the device (see subject) becomes completely inaccessible and non-operational.
I do not know what exactly happens. What I can say is that Winbox (MAC discovery) cannot see the device anymore. Restarts or power cycles do not change that. Even the reset procedure did not help.

Because the device is an essential part of our customers' systems (including far-away locations)
and we can only reach it remotely, this kind of failure always means an emergency.

So far this has happened 3 times in total, all within the last 3 months.

But I'm afraid it will happen more often, because we have a lot more of these devices out there.
BTW: We are a small but growing company and use a lot of RouterBoards at our customer sites in the field. We use other models besides this one too.
But so far we have encountered this kind of critical failure only with the device mentioned above.

All 3 failures had these things in common:
  • Failed device: RB911G-5HPnD
  • RouterOS 6.35.2 installed (I read that this was a MikroTik factory-internal release only)
  • Directly before the failure, the customer system went through a complete power loss or a power-off/power-on cycle.
  • At least two of the devices had firmware 3.24 active
  • All were configured as access points in "P2P bridge mode" (only one client)
  • The power supply unit used is the recommended original one, "18POW" (24 V, 0.8 A).
I got one of these failed devices back and had the opportunity to analyse it myself.
It was indeed non-operational from a user's standpoint and inaccessible with any configuration tool (Winbox).
All LEDs seemed to behave more or less correctly, and I could see a clear reaction when performing the reset procedure.
But nothing helped to regain access to the device, except the hardcore method of "reset & netinstall".
I was able to revive the inaccessible device that way.
In addition, I updated the firmware and RouterOS using Winbox a few days ago.
This device is now under permanent testing.
But to be honest, my trust in the stability of this device is not very high anymore.

Because I really have no idea how to prevent this failure at our customers' sites, my questions are:
  • Is there a known problem with this combination that causes such a malfunction?
  • Or am I just unlucky, and is there a known bad production batch of these devices?
  • Is it possible to upgrade the fallback safe-boot firmware?
Our emergency plan is to send a spare device to each of the customers, but that is only fighting the symptoms. I need a stable solution...

I know that I can update all devices to the current RouterOS and firmware.
But this means a lot of work and always carries the risk of making a working WiFi connection non-operational and the connected clients unreachable.
And worst of all: I do not even know whether it will prevent the sudden failure from happening.

Can somebody shed some light on this?
I would really appreciate any help.

Kind regards,
Michael Müller
 
Quared
Trainer
Posts: 61
Joined: Tue Aug 13, 2013 8:29 am
Location: Central Europe

Re: RB911G-5HPnD with 6.35.2 suddenly becomes inaccessible

Tue Nov 01, 2016 1:50 pm

Hello,

currently we've got some RB911G (2HP, 5HP, also ac's/QRT) flawlessly running in projects for longer periods of time, most of them running 6.36, some already 6.37.1.
These PCBs are running stable, no issues like those mentioned.

Some hints:
Try to stick with the most recent RouterOS version and upgrade the firmware as well.
Also plan a maintenance upgrade of the units already deployed, once the upgrade has passed your local test scenario.
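One way to plan such a maintenance upgrade is to roll it out in waves rather than to all deployed units at once. The following is only a minimal sketch in Python; the function name `rollout_waves`, its parameters, and the wave sizes are assumptions made for illustration and not part of any MikroTik tool.

```python
def rollout_waves(devices, first_wave=2, growth=2):
    """Split devices into upgrade waves: a small first wave, then growing waves.

    `first_wave` and `growth` are illustrative defaults, not recommendations.
    """
    if first_wave < 1 or growth < 1:
        raise ValueError("first_wave and growth must be >= 1")
    waves = []
    start = 0
    size = first_wave
    while start < len(devices):
        # Slicing past the end is safe: the last wave simply gets the remainder.
        waves.append(devices[start:start + size])
        start += size
        size *= growth
    return waves
```

Each wave would only be upgraded after the previous one has run without problems for a while, so that a bad release affects a few units instead of the whole installed base.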

If needed, please contact me via forum private message to arrange an on-site checkup.

greets
 
dinclan
just joined
Posts: 2
Joined: Mon Nov 07, 2016 8:42 pm
Location: Puerto Rico

Re: RB911G-5HPnD with 6.35.2 suddenly becomes inaccessible

Mon Nov 07, 2016 8:51 pm

Hello:
Last Friday we power-cycled two RB911G QRT 5 units working as a P2P link. After that, they became inaccessible with WinBox too. RouterOS version 6.30.6, WinBox version 3.7.
Before that they were running without any issue.
 
MichaelMueller
just joined
Topic Author
Posts: 3
Joined: Tue Nov 01, 2016 12:25 am

Re: RB911G-5HPnD with 6.35.2 suddenly becomes inaccessible

Tue Nov 08, 2016 12:49 pm

Hello,

I have now received the device from the most recent failure, which happened last month.

According to my setup records, this device was running RouterOS 6.35.4. Because I do not want to touch it beyond trying Winbox, I can only guess that it was running firmware 3.24 too.

Taking the earlier post by user "dinclan" into account, the problem seems to be independent of the exact RouterOS version.
We now have four cases involving these RouterOS versions:
  • 6.30.6
  • 6.35.2
  • 6.35.4
My short analysis of this last device is as follows:
  • It shows exactly the same symptoms: no access via Winbox anymore, no active WLAN.
  • Of the startup sounds, only the "one beep" can be heard. I think that means RouterOS is not starting up correctly.
Because I hope you MikroTik guys have additional ways of analysing such devices in depth, I deliberately took no further steps with this device.
The intention was to keep the erroneous state in place as it is.
Maybe an RMA procedure would give you experts the ability to connect via a chip interface (JTAG, ISP, flash copy, whatever else ...).

We have more of these devices in stock, so we can easily start an RMA for this one if that will shed some light on it.

Or does all of this make no sense and I should try to revive the device with reset+netinstall?

Waiting for further hints,

Michael
 
dinclan
just joined
Posts: 2
Joined: Mon Nov 07, 2016 8:42 pm
Location: Puerto Rico

Re: RB911G-5HPnD with 6.35.2 suddenly becomes inaccessible

Wed Nov 09, 2016 10:02 pm

Hello Michael and dear Forum:

Because we need the QRT 5 in production, I decided to perform a Netinstall on the RB911G. (The QRT 5 is an antenna that uses the RB911G inside.) I used version 6.37.1 downloaded from http://www.mikrotik.com/download
The process ran fast and easily. After that I was able to connect using Winbox 3.7. Now I will set it up again for P2P and will also upgrade the other sister QRT 5.

I'm not sure whether it will happen again, but to add a hint about the situation: during the issue, I could ping the router's IP address from my laptop and saw that network packets were flowing. But when I tried Telnet, Winbox, or a browser, there was no answer from the RB911G. As far as I can tell, the whole issue began with a restart of the QRT 5.
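The symptom described here (the device answers ping, but the management services do not) can be checked with a small script. The following is a minimal sketch in Python, assuming the RouterOS default service ports (Telnet 23, SSH 22, HTTP 80, Winbox 8291) have not been changed on the device; the helper name `probe_services` is made up for illustration.

```python
import socket

# Assumed RouterOS default service ports; adjust if they were changed on the device.
DEFAULT_SERVICES = {"telnet": 23, "ssh": 22, "http": 80, "winbox": 8291}


def probe_services(host, services=DEFAULT_SERVICES, timeout=2.0,
                   connect=socket.create_connection):
    """Return {service_name: bool}, True if a TCP connection could be opened."""
    results = {}
    for name, port in services.items():
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Covers refused connections, timeouts and unreachable hosts.
            results[name] = False
        else:
            conn.close()
            results[name] = True
    return results
```

Calling `probe_services("192.168.88.1")` (the address is only a placeholder) returns a dictionary such as `{"telnet": False, ...}`. If ping works but every entry is `False`, the device is in the state described above: the network stack is alive, but no management service is listening.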
Thank you all.
Have a nice day.
Daniel Inclan
 
MichaelMueller
just joined
Topic Author
Posts: 3
Joined: Tue Nov 01, 2016 12:25 am

Re: RB911G-5HPnD with 6.35.2 suddenly becomes inaccessible

Wed Mar 22, 2017 11:57 am

Dear MikroTik guys and forum readers,

I have new information and open questions on this.

This is happening with newer RouterBoot+RouterOS combinations too, and it no longer seems to be limited to RB911 devices.

As of today I can add two more devices that became inaccessible with exactly the same symptoms:
  • RB951Ui-2HnD (2.4 GHz only device) with ROS 6.35.4 (I do not know the FW, but the SafeBoot was at 3.24, so the FW was the same or newer)
  • another RB911G-5HPnD with FW 3.33 and ROS 6.36
One thing common to all my earlier cases, and I think to all cases reported by other users, is that the devices went into that state after a hard power-off. I already mentioned this before, but I now think it is the main reason this happens.

Since then I have checked the ROS changelog frequently, and there were fixes to the filesystem in particular.

The newer releases now contain this entry:
*) filesystem - implemented procedures to verify and restore internal file structure integrity upon upgrading;
Because I followed this changelog closely, I noticed that this entry used to be two other entries, which mentioned that a configuration could get lost on a power cycle in rare cases. Is there a reason they can no longer be read? To me this looks like the exact issue our customers encounter when these devices become inaccessible, but that of course is speculation.

BTW: We use our own RTOS on embedded PCs with a FAT filesystem that also needs to be verified on each startup. Before we did that, we had some cases of unbootable systems too; they happened "rarely" but reliably.
So, as a software developer somewhat familiar with such things, I see a link between the topics "becoming inaccessible" and "configuration lost/filesystem verification".
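The general idea of verifying stored data on startup can be illustrated independently of RouterOS. The following is a minimal sketch in Python, assuming a hypothetical layout in which a configuration file is stored with a SHA-256 digest on its first line and the previous good version is kept as a `.bak` copy. It does not describe how RouterOS or the RTOS mentioned above work internally; all names are made up for illustration.

```python
import hashlib
import os
import tempfile


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def load_verified(path: str):
    """Return the payload if its stored checksum matches, otherwise None."""
    try:
        with open(path, "rb") as f:
            header, _, payload = f.read().partition(b"\n")
    except OSError:
        return None
    if header.decode(errors="replace") != _digest(payload):
        return None  # truncated or corrupted, e.g. by a power loss mid-write
    return payload


def save_config(path: str, data: bytes) -> None:
    """Write checksum + data via a temp file, keeping the last good copy as .bak."""
    if load_verified(path) is not None:
        os.replace(path, path + ".bak")  # current file is intact: keep it as backup
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(_digest(data).encode() + b"\n" + data)
        f.flush()
        os.fsync(f.fileno())  # push the data to storage before renaming
    os.replace(tmp, path)  # rename is atomic: readers see the old or the new file


def load_on_boot(path: str):
    """Startup check: use the main file, fall back to the backup if it is damaged."""
    data = load_verified(path)
    if data is None:
        data = load_verified(path + ".bak")
    return data
```

With a scheme like this, a write interrupted by a hard power-off leaves either the old file, the new file, or a damaged file that is detected at the next boot and replaced by the backup, instead of a device that silently fails to start.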

My question to you MikroTik guys is now: was there a fix for such symptoms that could explain the issue we have?
Is this problem limited to some specific RBxxx devices?

Which firmware/ROS version fixes this behaviour? I ask because it also happened with 6.36, and I have no clue when it was fixed, if at all. I cannot simply update hundreds of devices on a regular basis to the latest FW and ROS without knowing whether that fixes my issue and does not introduce new ones.

I would really appreciate any answer.

Best regards,
Michael Müller

P.S.: As a failsafe workaround, we have placed a redundant RB911G device at each customer that uses this device. It has already helped once, but that is not the way we intend to go. I hope there is a fix for this.
