I’d submit a report to support, specifying the RouterOS and firmware versions of all devices and attaching supout files generated while the problem is happening, both on the CCR and on one of the CAPsMAN slaves.
Are you sure you’re not experiencing some sort of Layer2 glitch? (bad cable, switch…)
It seems CAPsMAN is a single-threaded application, so it degrades once the connect/disconnect rate saturates a single core.
Here are two ways to avoid CAP disconnects:
Raise the single-core frequency: use a 3011 or a CHR as the single controller.
Set up multiple controllers: a 2011 can serve 300-400 concurrent connections, and a CCR (only a single core is used, so any model) running at 1200 MHz can serve 300-700 connections, depending on the connect/disconnect rate.
Otherwise we wait until MikroTik provides multi-threaded support in the Manager (balancing CAP interface events between cores), or makes it possible to configure multiple Manager instances on a single board so the load can be spread over multiple cores :)
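As a rough illustration of the "multiple controllers" option, here is a minimal Python sketch that estimates how many controllers a given load would need and splits a list of CAPs between them. The capacity numbers are only the lower bounds of the estimates quoted in the post above, not official figures, and the names `CAPACITY`, `controllers_needed` and `split_evenly` are made up for this example.

```python
import math

# Per-controller limits (concurrent connections) taken from the post above.
# The lower bound of each quoted range is used to stay conservative.
# These are forum estimates, not vendor specifications.
CAPACITY = {
    "2011": 300,         # post quotes 300-400
    "ccr_1200mhz": 300,  # post quotes 300-700, depending on connect/disconnect rate
}


def controllers_needed(connections: int, per_controller: int) -> int:
    """Return how many controllers are needed to carry the given load."""
    if connections < 0 or per_controller <= 0:
        raise ValueError("connections must be >= 0 and per_controller > 0")
    return math.ceil(connections / per_controller)


def split_evenly(caps: list, groups: int) -> list:
    """Distribute CAPs round-robin over the given number of controllers."""
    if groups <= 0:
        raise ValueError("groups must be > 0")
    return [caps[i::groups] for i in range(groups)]


if __name__ == "__main__":
    n = controllers_needed(1000, CAPACITY["2011"])
    print(n)
    # Hypothetical CAP names, just to show the split.
    print(split_evenly(["cap1", "cap2", "cap3", "cap4", "cap5"], 2))
```

The real limit depends on how often clients connect and disconnect, so any number this produces is a starting point to verify under load, not a guarantee.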
I also had the same problem. I used a 1036 to control 220 CAPs, and CAPsMAN would crash and restart.
My temporary solution was to add a 1016 to control the other half of the CAPs, and things have been normal since.
If that’s true, x86 should be tried.
But hopefully MT can give me an explanation; this is hard to accept from products that I trust so much.
Maybe there’s a PC in your network that has its WLAN and cabled NICs bridged? When that user connects the laptop to the cabled network, it creates a loop that causes everything to stop for a while, including your CAPs’ connection to CAPsMAN.
Maybe? CAPsMAN is set up with user isolation, and isolation is also set in the bridge settings, so it is difficult to form a loop. I also have STP. Even if a loop did paralyze the switch connections, I have isolation on the switches as well. So…
I guess you could connect a device that will create a loop. Try it in multiple places in your network. If it causes the CAPs to disconnect, you may have a lead.
Which problem exactly do you have?
In the recent 6.42RC versions we fixed an issue where CAP interfaces were removed and added back when multiple CAPs lost connection and tried to reconnect at the same time.
Let’s look at the controller: this happens from time to time, sometimes after a few hours, again and again, for all of my access points (currently running 6.41).
The network is fine. My assumption is that the new bridge implementation in 6.41 on the access points is the problem and makes the controller kill the connection to all access points within seconds.
Looking at some of the access points I found strange settings:
Also, after killing that one and rebooting it, I got “ghost” settings, e.g. old “dynamic wlan” entries under interfaces although no client was connected, or “dynamic” entries with no interfaces under /interface list. Perhaps I downgraded/upgraded those access points too often between 6.39.* <=> 6.40.* <=> 6.41 in the past…
I had the same issue with a cAP Gi-5acD2nD and RouterOS 6.46.7 / 6.46.8
“CAP sent max keepalives without response”
“CAP failed to join CRS210 (XX:XX:XX:XX:XX:XX/10//0)”
In the bridge properties, under the STP tab, the “Protocol Mode” option was set to “None”. After a hard power-off I restarted the CAP and switched Protocol Mode to “STP”. => both CAPs are stable
My problem was also “ghost” entries: wlan entries left on the bridge from before I set up CAPsMAN.
CAPsMAN was creating new/additional wlan entries on the bridge, while “unknown” entries from before CAPsMAN were still left on the bridge. Removing the unknown entries from the bridge and then rebooting the hAPac (in my case) stopped the seemingly never-ending stream of “CAP sent max keepalives without response” alerts in the logs.
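To judge whether a change like the ones above actually helped, it can be useful to count the CAP error messages in a log saved as text before and after the change. Below is a minimal Python sketch of that idea. It only searches for the two message texts quoted in this thread; the timestamps and topic prefixes in the sample lines are invented for illustration, and the real log layout on your device may differ.

```python
from collections import Counter

# Message texts quoted in this thread. Matching is by plain substring, so
# the sketch does not depend on any particular log line layout.
PATTERNS = (
    "CAP sent max keepalives without response",
    "CAP failed to join",
)


def count_cap_errors(lines) -> Counter:
    """Count how many log lines contain each known CAP error message."""
    counts = Counter()
    for line in lines:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; the prefix format is an assumption.
    sample = [
        "10:01:02 caps,info CAP sent max keepalives without response",
        "10:01:05 caps,info CAP failed to join CRS210 (XX:XX:XX:XX:XX:XX/10//0)",
        "10:02:00 system,info unrelated message",
        "10:05:41 caps,info CAP sent max keepalives without response",
    ]
    for message, n in count_cap_errors(sample).items():
        print(f"{n:3d}  {message}")
```

To use it on a real log, pass an open text file instead of the sample list, e.g. `count_cap_errors(open("log.txt"))`, where `log.txt` is a placeholder name.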