Frequent reboots of hAP ac2

Hello,

I have several hAP ac2 units running in my house. One of them reboots quite frequently (approximately every 1-3 hours).

Here is what the log shows after a reboot:

 jan/21 09:18:59 system,error,critical router was rebooted without proper shutdown by watchdog timer
 jan/21 09:19:00 system,error,critical kernel failure in previous boot
 jan/21 09:19:05 interface,info vlan10 link up
 jan/21 09:19:05 interface,info vlan20 link up
 jan/21 09:19:05 interface,info vlan1 link up
 jan/21 09:19:05 interface,info vlan50 link up
 jan/21 09:19:05 interface,info vlan61 link up
 jan/21 09:19:05 interface,info vlan62 link up
 jan/21 09:19:05 interface,info vlan60 link up
 jan/21 09:19:07 interface,info ether1-Trunk link up (speed 1G, full duplex)
 jan/21 09:19:12 wireless,info 64:D1:54:5D:B1:4E@wlan2-wire established connection on 5500000, SSID EGLink
 jan/21 09:19:16 dhcp,info dhcp-client on vlan1 got IP address 192.168.178.8
 11:05:28 system,critical,info ntp change time Jan/21/2022 09:19:48 => Jan/22/2022 11:05:28
 11:34:50 wireless,info FA:4D:9D:44:98:30@wlan1-Heim: connected, signal strength -81
 11:35:06 wireless,info FA:4D:9D:44:98:30@wlan1-Heim: disconnected, extensive data loss
 11:35:23 system,info,account user admin logged in from 10.10.10.37 via winbox
 11:35:29 wireless,info 60:03:08:A8:A0:26@wlan1-Heim: connected, signal strength -54
 11:37:12 wireless,info 9E:85:68:0C:E7:E5@wlan1-ff-gast: connected, signal strength -64
 11:42:44 wireless,info 9E:85:68:0C:E7:E5@wlan1-ff-gast: disconnected, extensive data loss

Configuration is attached.

I can’t find an error in the configuration, but of course there might be a configuration issue.

What could I do to find the cause of the issue?

Thank you!
2022-01-22_garage_station.rsc (4.92 KB)

I haven't gone through your whole config line by line, but you have a watchdog watch address enabled, 192.168.178.1, so if your device can’t reach that address it will eventually reboot…
So, is there a chance it can’t reach the watch address?

Another way the watchdog can reboot the device is if the device is unresponsive for a minute…

What seems strange is the kernel failure in previous boot

Thank you for your reply. I activated the watchdog in order to regain access to the device without the need of a physical reboot at the device.

So, I’m quite sure that this is not the origin of the “kernel failure in previous boot”. I’m looking for a way to get more information from the device.
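One low-effort way to get a clearer picture is to save the log as text after each reboot and pull out only the entries tagged `critical`. Below is a minimal Python sketch, assuming the log has been copied into a text file in the same layout as the excerpt above (optional date, time, comma-separated topics, message). The function name `critical_entries` and the `sample` variable are hypothetical, not part of any MikroTik tool.

```python
import re

# One log line: optional "jan/21" style date, time, comma-separated topics, message.
LINE_RE = re.compile(
    r"^\s*(?:(?P<date>[a-z]{3}/\d{2})\s+)?"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<topics>\S+)\s+"
    r"(?P<message>.*)$"
)


def critical_entries(log_text):
    """Return (date, time, message) for every entry whose topics include 'critical'."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank or unrecognised lines
        topics = match.group("topics").split(",")
        if "critical" in topics:
            found.append(
                (match.group("date"), match.group("time"), match.group("message"))
            )
    return found


# Lines copied from the log excerpt earlier in this thread.
sample = """\
 jan/21 09:18:59 system,error,critical router was rebooted without proper shutdown by watchdog timer
 jan/21 09:19:00 system,error,critical kernel failure in previous boot
 jan/21 09:19:05 interface,info vlan10 link up
 11:05:28 system,critical,info ntp change time Jan/21/2022 09:19:48 => Jan/22/2022 11:05:28
"""

for date, time, message in critical_entries(sample):
    # Entries without a date column are printed with "-" in its place.
    print(date or "-", time, message)
```

Collecting these entries over several reboots makes it easier to see whether the watchdog message and the kernel failure message always appear together.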

Try setting the CPU frequency to 716MHz.

OK, I will try that and report back. Have you observed this before, or is it just trial and error?

Unfortunately, that didn’t help. I switched back to ‘auto’.

Another strange thing: After each reboot I get an ntp log entry like

13:04:26 system,critical,info ntp change time Jan/21/2022 09:19:33 => Jan/23/2022 13:04:26

How can the clock be that far off with just 1-3 hours between reboots?

That is logical if the time is not written to storage before the reboot happens.
It is strange, though, that it goes back several days when the reboots happen every couple of hours.
I’ve seen it happen as well on some previous version of ROS7.
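To put numbers on the clock jump, the two `ntp change time` entries quoted in this thread can be parsed and compared. This is only a sketch: the function name `ntp_jump` is hypothetical, and the date format string is inferred from the two log lines above rather than taken from MikroTik documentation.

```python
from datetime import datetime

# Format inferred from the log entries, e.g. "Jan/21/2022 09:19:48".
# Note: %b is locale-dependent; this assumes an English/C locale.
FMT = "%b/%d/%Y %H:%M:%S"


def ntp_jump(message):
    """Parse 'ntp change time OLD => NEW' and return (old, new, new - old)."""
    prefix = "ntp change time "
    if not message.startswith(prefix):
        raise ValueError("not an ntp change entry")
    old_text, new_text = message[len(prefix):].split(" => ")
    old = datetime.strptime(old_text, FMT)
    new = datetime.strptime(new_text, FMT)
    return old, new, new - old


# The two entries quoted in this thread.
entries = [
    "ntp change time Jan/21/2022 09:19:48 => Jan/22/2022 11:05:28",
    "ntp change time Jan/21/2022 09:19:33 => Jan/23/2022 13:04:26",
]

for entry in entries:
    old, new, jump = ntp_jump(entry)
    print("clock was", old, "| corrected to", new, "| jump:", jump)
```

In both entries the pre-correction time is within a minute of Jan/21 09:19, close to the first log line after boot (jan/21 09:18:59). That would fit a clock that restarts from the same stored value on every boot rather than one that drifts, although the log alone does not prove it.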

Adding:
Do you also have this problem on 7.1 or 7.2rc1?
Have you already tried a clean netinstall of 7.1.1 on that device?

As far as I can see there is no 7.1 for the hAP ac2. I’ll try 7.2rc1 later. I will also try to get Netinstall running on my Mac under Wine.

Thank you very much for your suggestions!

There definitely is. Download https://download.mikrotik.com/routeros/7.1.1/routeros-7.1.1-arm.npk, upload it to Files and reboot.
Don’t go to 7.2rc1 yet.

Well, isn’t this 7.1.1? I’m already running that version.

Have you tried a different power supply (swap the one with the bad device with one of the good ones) or maybe netinstalled the device?

It could be some sort of NAND/flash issue, and that is why I mentioned netinstall.

Does it reboot with the default config?
My guess is that some config or function is eating memory or other resources, and the router then reboots.

Do you need to use 7.x?

Thank you! I will try a different power supply and netinstall. I will report back!

No, I don’t need 7.x. Will try older version asap. Will also try default config.

@All: Thank you all for your suggestions. It will take a while to perform the tests (also because I have to work ;-) ). I will report…

I agree…

Also, did you have that configuration on v6 and then upgrade to v7?
If yes, did you have the same problems on v6?

I would suggest you downgrade to v6, reset to defaults and test the device for a few days…

Yes, will do this later.

BTW: Kind of a record right now. Up for more than 6 hours. I don’t think it’s a memory problem, judging by this:

                   uptime: 6h5m36s
                  version: 7.1.1 (stable)
               build-time: Dec/21/2021 11:53:05
         factory-software: 6.45.9
              free-memory: 54.3MiB
             total-memory: 128.0MiB
                      cpu: ARMv7
                cpu-count: 4
            cpu-frequency: 448MHz
                 cpu-load: 3%
           free-hdd-space: 2024.0KiB
          total-hdd-space: 15.2MiB
  write-sect-since-reboot: 15803
         write-sect-total: 37106
               bad-blocks: 0%
        architecture-name: arm
               board-name: hAP ac^2
                 platform: MikroTik

To spot a memory problem, write down the memory usage every 30 minutes or every hour (depending on how often it reboots).
If memory usage goes up and never comes back down, you may have a problem.
You can use an external program like SNMP tools or use Splunk from my signature to monitor the router.
I had a memory problem with DoH.
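If the readings are collected by hand, for example by pasting the resource output into a file every 30 minutes, a short Python sketch like the one below could check whether free memory only ever goes down. All function names are hypothetical, the 5 MiB threshold is an arbitrary assumption, and the declining sample readings are made up for illustration; only the `free-memory: 54.3MiB` line comes from this thread.

```python
import re


def parse_size_mib(text):
    """Convert a size string such as '54.3MiB' or '2024.0KiB' to MiB."""
    match = re.fullmatch(r"([\d.]+)(KiB|MiB|GiB)", text.strip())
    if not match:
        raise ValueError(f"unrecognised size: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    factor = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}[unit]
    return value * factor


def free_memory_mib(resource_output):
    """Extract the free-memory value (in MiB) from pasted resource output."""
    for line in resource_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "free-memory":
            return parse_size_mib(value)
    raise ValueError("free-memory not found")


def steadily_falling(samples, min_drop_mib=5.0):
    """True if free memory never recovers and drops by at least min_drop_mib overall.

    The 5 MiB default is an arbitrary assumption, not a MikroTik recommendation.
    """
    if len(samples) < 3:
        return False  # too few readings to call it a trend
    never_recovers = all(b <= a for a, b in zip(samples, samples[1:]))
    return never_recovers and (samples[0] - samples[-1]) >= min_drop_mib


# Two lines taken from the resource output posted above.
snapshot = """\
              free-memory: 54.3MiB
             total-memory: 128.0MiB
"""
print(free_memory_mib(snapshot))

# Hypothetical readings in MiB, one every 30 minutes (illustration only).
print(steadily_falling([54.3, 49.0, 41.2, 33.8]))
print(steadily_falling([54.3, 52.1, 55.0, 53.7]))
```

A series that falls without ever recovering would point towards a leak; a series that bounces around, like the second one, would not.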

Disable the watchdog… or configure its IP.
This is an old problem with watchdog reboots.