We have encountered a serious problem with this combination:
After an unpredictable time, the device (see subject) becomes completely inaccessible and non-operational.
I do not know exactly what happens. I can say that Winbox (MAC discovery) can no longer see the device. Restarts and power cycles do not change that. Even the reset procedure did not help.
Because the device is an essential part of our customers' systems (including far-away locations)
and we can only reach it remotely, this kind of failure always means an emergency.
So far this has happened 3 times in total, all within the last 3 months.
But I'm afraid it will happen more often, because we have many more of these devices out there.
BTW: We are a small but growing company and use a lot of RouterBoards at our customer sites in the field. We also use models other than this one.
But so far we have encountered this kind of critical failure only with the mentioned device.
All 3 failure cases had these things in common:
- Failed device: RB911G-5HPnD
- RouterOS 6.35.2 installed (I read this was a MikroTik factory internal release only)
- Directly before the failure, the customer system underwent a complete power loss or a power-off/power-on cycle.
- At least two of the devices had firmware 3.24 active
- All were configured as an access point in "P2P bridge mode" (only one client)
- The power supply unit used is the recommended original one "18POW" (24V,0.8A).
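To find the other units in the field that share this combination, our device list could be filtered roughly as sketched below. This is only a sketch: the `Device` record, its field names, and the sample inventory are hypothetical (the sites and the non-matching versions are made up), and only the three constants come from the list above.

```python
from dataclasses import dataclass

# Combination shared by the failed units, taken from the list above.
RISK_MODEL = "RB911G-5HPnD"
RISK_ROUTEROS = "6.35.2"
RISK_FIRMWARE = "3.24"


@dataclass
class Device:
    # Hypothetical inventory record; field names are illustrative only.
    site: str
    model: str
    routeros: str
    firmware: str


def at_risk(devices):
    """Devices with the same model and RouterOS version as the failed units."""
    return [
        d for d in devices
        if d.model == RISK_MODEL and d.routeros == RISK_ROUTEROS
    ]


def same_firmware(devices):
    """Subset of at_risk() that also runs the firmware seen on at least two failed units."""
    return [d for d in at_risk(devices) if d.firmware == RISK_FIRMWARE]


# Made-up sample data, not our real installation.
inventory = [
    Device("site-a", "RB911G-5HPnD", "6.35.2", "3.24"),
    Device("site-b", "RB911G-5HPnD", "6.40", "3.33"),
    Device("site-c", "RB911G-5HPnD", "6.35.2", "3.22"),
    Device("site-d", "OTHER-MODEL", "6.35.2", "3.24"),
]

for device in at_risk(inventory):
    print(device.site, device.firmware)
```

Firmware is checked separately because it was confirmed on only two of the three failed devices.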
The failed device I examined was indeed non-operational from a user's standpoint and inaccessible by any configuration tool (Winbox).
All LEDs seemed to behave more or less correctly, and I could see a clear reaction when performing the reset procedure.
But nothing restored access to the device, except the hardcore method of "reset & netinstall".
I was able to revive the inaccessible device that way.
In addition, I updated its firmware and RouterOS using Winbox just a few days ago.
This device is now under continuous testing.
But to be honest, my trust in the stability of this device is not very high anymore.
Because I really have no idea how to prevent this failure at our customers' sites, my questions are:
- Is there a known problem with this combination that causes such a malfunction?
- Am I just unlucky, and is there a known bad production batch of these devices?
- Is it possible to upgrade the fallback safe-boot firmware?
I know that I can update all devices to the current RouterOS and firmware.
But this means a lot of work and always carries the risk of making a working WiFi connection non-operational and connected clients unreachable.
And worst of all: I do not even know whether it would prevent the sudden failure from happening.
Can somebody shed some light on this?
I would really appreciate any help.