I have a small network with a central router (providing DHCP) that is using CAPSMAN to manage two access points. Everything was powered until recently by PoE (centrally provided from the router) and was shut down during the night. All worked perfectly.
Just a few days ago I had to replace a switch and now one of the APs is no longer PoE powered, but stays powered 24/7 - the remaining network (router, switch, other AP etc) still shuts down for the night.
However, when the network (central router with CAPSMAN etc.) comes back online in the morning, the always-on AP is no longer reachable and no longer provides WLAN. Only a reboot of the AP fixes that.
I would prefer to understand the issue and fix it elegantly (instead of just scheduling a regular reboot in the morning for the corresponding AP), so any advice would be greatly appreciated.
I created a shutdown script on the router, and a timed switch then cuts the power.
Why do you shut down every night?
Save power, reduce operating hours on the equipment, avoid wlan emissions when they’re not needed…
Worst thing for electronics is repetitive power off/power on
That is a curious statement. Most electronics power supplies are switching power supplies that switch on and off at high frequency, so technically they do precisely that, just up to 100,000 times a second. The pixels in old plasma TVs did the same, and some laptop LCD screens still do it today to regulate brightness, and much more…
There is a difference using frequency signals and not having ANY power AT ALL for some hours.
BIG difference.
Also consider the repetitive impact of cooling down, warming up again, cooling down, warming up again …
Repetitive thermal stress as a result of switching off power.
Most electronic device failures will happen when power is being applied again.
Most devices nowadays (and especially routers/wifi APs) are designed to be kept on continuously.
Common exception: everything with batteries and chargers. Those should be unplugged when charging is done. Also the charger.
Besides that, do you maybe have some advice on how to fix my connectivity issue after the outage?
Thank you, Hannes
There is a difference using frequency signals and not having ANY power AT ALL for some hours.
BIG difference.
I do have 10+ years background in electronics and some knowledge of semiconductor physics, but this is new to me. Also it is new to me that supposedly different parts are used for plugged electronics (suffer from shut off as you write) and battery powered electronics (should be shut off as you write). Weird that this is not mentioned in the manufacturer datasheets of the ICs, passive components, displays etc.
Thermal cycling is an issue for high power switching electronics, but not for low power gear.
You’d be surprised … it’s very much an issue for low power gear as well.
(PS my base education is electronics, worked 10 years as test engineer developing automated test processes in a factory with a worldwide supplier of communication equipment so I’ve seen my share of electronics testing and the results of thermal impact on solder joints (mainly on prototypes during initial engineering tests) and other components)
But we certainly can agree there
I guess the device that is not being power-cycled has some IP issue and therefore cannot reach the network.
Do you have long lease times from that router towards those APs ? Can you make those shorter ? REALLY short (as in max 15 or 30m or so) ?
Be sure not to make it too short for client devices connecting to those APs if they are on the same subnet (not all clients respond well to short lease times).
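To make the lease-time suggestion concrete, here is a minimal sketch for the router's terminal. The server name dhcp1 is only a placeholder (check the real name with the print command first), and 30m is just the upper end of the range suggested above.

```
# List the DHCP servers to find the real server name (dhcp1 below is a placeholder)
/ip dhcp-server print

# Shorten the lease time to 30 minutes on that server
/ip dhcp-server set [find name="dhcp1"] lease-time=30m
```

Clients renew at roughly half the lease time, so a CAP that lost its lease during the night should ask again fairly soon once the router is back.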
It might anyhow be better to use fixed IP addresses for those APs. If DHCP does not work on whatever end, you can always use the known IP address to reach the device.
That’s my personal preference for any device which is part of the infrastructure part of a network (switches, printers, routers, APs, …).
Just a guess based on the assumption you use DHCP for that device as well.
If not, it might help to describe a bit more the surrounding environment that AP operates in.
Worst case you can always schedule a reboot script on that AP some minutes after the other gear is supposed to be online again (provided its time settings remain more or less accurate enough during the outage of the other equipment).
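A scheduled reboot on the AP could look roughly like this. The entry name and the 06:10 start time are assumptions; pick a time a few minutes after the router is normally back online.

```
# Run on the always-on AP: reboot once a day at 06:10 (time and name are placeholders)
/system scheduler add name=morning-reboot start-time=06:10:00 interval=1d \
    on-event="/system reboot"
```

This only works if the AP's clock is still roughly correct in the morning, which is worth checking with /system clock print after a night without the router.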
Or a bit more fancy, a netwatch looking for the router, if it doesn’t see the router within some minutes, reboot.
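A netwatch variant might look like the sketch below. It is a slight variation on the idea: a plain "reboot when the router is down" script could keep rebooting the AP all night while the router is switched off, so this version reboots once when the router becomes reachable again and the AP itself has been up for more than 10 minutes. The router address 192.168.88.1 and the timings are assumptions.

```
# Run on the always-on AP: watch the router (address is a placeholder)
# up-script fires when the router becomes reachable again; the uptime check
# prevents a second reboot right after the AP has just restarted.
/tool netwatch add host=192.168.88.1 interval=1m timeout=2s \
    up-script=":if ([/system resource get uptime] > 10m) do={ /system reboot }"
```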
These are workarounds which do not address the main issue, but they may be an easier way to move forward.
Did you set fixed IP on that device as I already suggested two times before ? (probably will not help if MAC doesn’t work either)
Can you get on the device itself via direct connection when this happens (using ether2) ?
Anything visible in the logs ?
Or is your only way out of this situation power off and power on again ?
Is config of that device different from other cAP AC devices which are behaving nicely ?
What config does the device have ?
What packages are installed ? wifi-qcom-ac by any chance ?
Especially that last part and answer might be interesting to know…
Did you set fixed IP on that device as I already suggested two times before ?
From what I know I can only set the IP as static in the DHCP Server of the router (and that has always been the case). I’m not aware that I can give a static IP on the client side for mikrotik devices, but maybe I’m missing something here.
Can you get on the device itself via direct connection when this happens (using ether2) ?
Anything visible in the logs ?
I haven’t tried that yet.
Or is your only way out of this situation power off and power on again ?
Yes.
What config does the device have ?
What packages are installed ? wifi-qcom-ac by any chance ?
Just the regular stuff is installed - advanced tools, dhcp, hotspot, mpls, ppp, routeros-arm, routing, security, system and wireless are shown in the package list.
There is nothing peculiar about the config - regular setup, but wifi is managed by CAPSMAN. Both APs are set up in exactly the same way.
On the interface where your DHCP client is on that CAP, you can add the address manually.
And then disable DHCP client.
You should also add a route to your router then (since that’s something taken care of by DHCP client).
Easiest: copy the existing route before disabling DHCP client.
It will make a duplicate, but that gets resolved once you disable DHCP client (the dynamic route will disappear then).
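Put together, the steps above could look roughly like this on the CAP. All values are placeholders: the interface name bridge, the address 192.168.88.10/24 and the gateway 192.168.88.1 must be replaced with whatever the DHCP client currently shows. The DNS line is an extra that was not mentioned above; it is only needed if the CAP has to resolve names itself.

```
# Check which interface, address and gateway the DHCP client currently uses
/ip dhcp-client print detail

# Add the address manually on the same interface (placeholder values)
/ip address add address=192.168.88.10/24 interface=bridge

# Add the default route before disabling the DHCP client
/ip route add dst-address=0.0.0.0/0 gateway=192.168.88.1

# Optional assumption: DNS server, normally also handed out by DHCP
/ip dns set servers=192.168.88.1

# Finally disable the DHCP client; its dynamic route disappears
/ip dhcp-client disable [find interface=bridge]
```

Doing this over a connection that depends on the DHCP address can lock you out if the static address differs, so using the same address the device already has is the safer choice.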
Normally when you set a static IP on DHCP server, this should also be ok.
But clearly something is big time wrong on that cap, causing DHCP to stop working.
What ROS version is on that device ? Looking at your package list, I assume ROS6.49.something ?
Are all those packages needed on that cap ?
E.g.
mpls ?
ppp ?
security ? Are you really using those added features ?
Can you uninstall those ?
How much HDD space is left before uninstalling ? (check system / resources)
How much is left after uninstalling ?
Reason for asking:
cap AC is one of those devices with limited flash storage (only 16 MB).
And when that memory runs out (especially when logging quite a bit), the device will crash. Usually with a reboot, but sometimes not.
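Checking the free space and removing an unused package could look like this on the CAP. Only mpls is shown as an example; whether it is really unused is for you to confirm, and the uninstall only takes effect after a reboot.

```
# Show resources, including free-hdd-space and total-hdd-space
/system resource print

# List the installed packages
/system package print

# Mark an unused package for removal (mpls is only an example), then reboot
/system package uninstall [find name=mpls]
/system reboot
```

Running /system resource print again after the reboot shows how much space was gained.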
It might also be time to put the config on the table for that device …
/export file=anynameyouwish ( minus router serial number, any public WANIP information, keys etc.)
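Assuming the device is on RouterOS 6 as guessed above, the export can also be asked to leave out passwords and keys by itself. Serial number and public IP information still need to be removed by hand.

```
# RouterOS 6: export the config to a file without passwords and keys
/export hide-sensitive file=anynameyouwish
```

The resulting anynameyouwish.rsc file appears under Files and can be downloaded from there.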
After a long time not bothering with this, I finally looked into it again and I think I found the problem - idiotic user behaviour.
I’m sorry for not knowing my own scheduler scripts - although I mentioned in precisely this thread that all network devices shut down for the night, and so does the access point. I’m surprised that I was surprised that it is off in the morning, phew.