We have a MikroTik router in the core of our network that has rebooted several times, at random, over the last few weeks. I have been unable to identify the cause. Details of the device:
CCR1072-1G-8S+
6.43 (stable)
0% load
We have SNMP traps configured on the router, as well as logging to our syslog server, but nothing in those logs suggests why this is happening.
Can someone provide some guidance as to what I should be looking for in terms of a possible reason? There is always plenty of memory, and the CPU never goes above about 7%.
There have been discussions here about CCR power supplies sometimes failing. The cause is usually failed capacitors, and it is possible to replace them.
Try searching this forum for those posts (probably in the RouterBOARD hardware section).
It seems odd to me that this would be a hardware failure, only because the device reboots rather than simply “failing” and not coming back up again. Would a faulty power supply do that?
Do you suggest going back to the distributor here, and seeing if they would replace the unit? I have a spare here so I could swap this out and see how that one goes in the meantime but I find it a bit troubling that a device would randomly reboot like that…
A faulty power supply (or an old one, though that doesn’t seem to be your case; how long has it been running so far?) can quite easily cause unexplained reboots, and that behaviour is more likely than a complete stop. But a software bug cannot be excluded either.
First, when you installed 6.43, did you also upgrade the firmware, or only RouterOS itself? Also, 6.43.2 came out quite soon after 6.43; maybe an issue like this was one of the reasons?
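One way to answer the firmware question from the CLI is sketched below. It assumes terminal access to the router; the final two commands reboot the device, so on a core router they belong in a maintenance window.

```
# Show the installed RouterOS version (and the current uptime)
/system resource print

# Show the RouterBOOT firmware: compare current-firmware with upgrade-firmware
/system routerboard print

# Only if the two values differ: stage the firmware upgrade, then reboot to apply it
/system routerboard upgrade
/system reboot
```

If current-firmware already matches upgrade-firmware, the firmware was upgraded together with RouterOS and this can be ruled out.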
In any case, if you have a spare that was sitting idle while the production one was up and running, go ahead and try it first. A warning: do not be tempted to make a backup on the existing machine and restore it on the spare, as doing so is a recipe for a headache. Use an export instead. Run /export file=my-export on the original one, download the file from there, upload it to the spare, and then run /system reset-configuration keep-users run-after-reset=my-export.rsc on the spare.
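The export-and-restore steps above, laid out command by command. This is a sketch: the file name my-export is just the example name used above, keep-users is written here with an explicit =yes, and the file transfer between the two routers (Winbox, FTP, SCP or similar) is left to whatever you normally use.

```
# On the original (production) router: write the configuration out as a script
/export file=my-export

# Confirm the file exists; the export is saved with an .rsc extension
/file print where name~"my-export"

# Download my-export.rsc, upload it to the spare, then on the spare:
# reset the configuration, keep the spare's existing users, and run the export afterwards
/system reset-configuration keep-users=yes run-after-reset=my-export.rsc
```

The reset command asks for confirmation and reboots the spare, so run it only once the .rsc file is already uploaded. It is worth reading through the export before applying it, since interface names and other hardware-specific settings are only guaranteed to match on an identical model.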
This should tell you whether to talk to support or to the local seller. If both boxes behave the same way, a software bug is more likely, one that only shows up under some specific condition which occurs often in your network and never in many others. If one of two boxes with identical configuration fails while the other does just fine, the failing one probably has a hardware issue.
Yes, there are such things as bad production lots, but that’s less likely than the other two explanations unless you’ve really purchased two units with serial numbers differing by 1.