CAPsMAN sporadically disconnects all CAP interfaces

I have a CAPsMAN controller running on a CCR1036-12G.

There are up to 324 clients on 74 wireless interfaces.

The majority are CRS desktop switches using the 2.4 GHz wireless band.

Sporadically, all CAP interfaces in the registration table disconnect all my clients.
Within a couple of minutes all clients reconnect.

This is a very unhappy situation.
I am using v6.33.5.

The logs only show the disconnection process.

Any advice on what this could be, please?

I’d submit a report to support, specifying the RouterOS and firmware versions of all devices and attaching supout files generated when that happens, both on the CCR and on one of the CAPsMAN slaves.

Are you sure you’re not experiencing some sort of Layer 2 glitch (bad cable, switch…)?

I have the same issue on a Layer 2 CAPs setup: many times per day all CAPs disconnect and reconnect again,
on a CCR1016 with 180 remote CAPs.

Any ideas?

Hi all, has anyone found a solution yet? We’re also experiencing the same problem here.

It seems CAPsMAN is a single-threaded application, so it degrades once the connect/disconnect rate saturates a single core.
Here are two ways to avoid CAP disconnects:

  1. Raise the single-core frequency: use a 3011 or CHR as the single controller.
  2. Set up multiple controllers: a 2011 can serve 300-400 concurrent connections, and a CCR running at 1200 MHz (only a single core is used, so any model) can serve 300-700 connections, depending on the connect/disconnect rate.
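As a rough illustration of option 2, here is a small Python sketch that turns the capacity ranges quoted above into a controller count. The numbers are only the estimates from this post, not official figures, and I am assuming "connections" means wireless client connections. The function name and the `pessimistic` switch are made up for this example.

```python
import math

# Capacity ranges (concurrent connections per controller) as quoted in the
# post above. These are forum estimates, not vendor specifications.
CAPACITY = {
    "2011": (300, 400),
    "CCR@1200MHz": (300, 700),
}


def controllers_needed(clients: int, model: str, pessimistic: bool = True) -> int:
    """Estimate how many controllers of `model` are needed for `clients`.

    pessimistic=True uses the low end of the range, i.e. it assumes a high
    connect/disconnect rate (hypothetical switch, for illustration only).
    """
    low, high = CAPACITY[model]
    per_controller = low if pessimistic else high
    # Always plan for at least one controller, even with zero clients.
    return max(1, math.ceil(clients / per_controller))


if __name__ == "__main__":
    # 324 clients is the figure from the first post in this thread.
    print(controllers_needed(324, "CCR@1200MHz"))
    print(controllers_needed(324, "CCR@1200MHz", pessimistic=False))
```

With the 324 clients from the first post, the pessimistic estimate already suggests splitting the CAPs over two controllers, which matches what others in this thread report.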

Still waiting for MikroTik to provide multi-threaded support for the Manager (to balance CAP interface events between cores), or to make it possible to configure multiple Manager instances on a single board so the load can be spread over multiple cores )))

I also had the same problem. I used a 1036 to control 220 CAPs, and CAPsMAN would crash and restart.
My temporary solution is to add a 1016 to control the other half of the CAPs, and since then it works normally.
If that’s true, x86 should be worth a try.
But hopefully MT can give me an explanation; it’s hard to accept this from products that I trust so much.

Maybe this is an (R)STP config issue?

Turning (R)STP off and then on again did not help, but with two CCRs as controllers it works OK.

Maybe there’s a PC in your network that has its WLAN and cabled NICs bridged? When that user connects his laptop to the cabled network, it creates a loop that causes everything to stop for a while, including your CAPs’ connection to CAPsMAN.

Maybe? But CAPsMAN is set up with client isolation, and isolation is also enabled in the bridge settings, so it is difficult for a loop to form, and STP is running too. Even if a loop did paralyze a switch connection, I also have isolation on the switches. So…

I guess you could connect a device that will create a loop. Try it in multiple places in your network. If it causes the CAPs to disconnect, you may have a lead.

I did a loop test myself, but nothing happened.

If it’s a loop, then why does everything work normally when I use two CCRs?

It seems I have similar problems. Did you find a solution?

Which problem exactly do you have?
In the recent 6.42rc versions we have fixed an issue where CAP interfaces were removed and added back when multiple CAPs lost connection and tried to reconnect at the same time.

Let’s look at the controller: This is happening from time to time, sometimes after a few hours, again and again, for all of my access points (currently running 6.41).

The network is fine. My assumption is that the new bridge implementation introduced in 6.41 on the access points is the problem and causes the controller to kill the connection to all access points within seconds.

Looking at some of the access points I found strange settings:

After killing that one and rebooting it, I also got “ghost” settings, e.g. old “dynamic wlan” entries under interfaces although no client was connected, or “dynamic” entries with no interfaces under /interface list. Perhaps I downgraded/upgraded those access points too often between 6.39.* <=> 6.40.* <=> 6.41 in the past…

I had the same issue with a cAP Gi-5acD2nD and RouterOS 6.46.7 / 6.46.8

“CAP sent max keepalives without response”
“CAP failed to join CRS210 (XX:XX:XX:XX:XX:XX/10//0)”

In the bridge properties, under the STP tab, the “Protocol Mode” option was set to “None”. After a hard power-off I restarted the CAP and switched Protocol Mode to “STP”. => Both CAPs are stable.
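To tell a controller-wide event (all CAPs dropping together, as in the first post) apart from one or two flapping CAPs (as here), it can help to group the disconnect events by time. Below is a minimal Python sketch of that idea. It assumes you have already extracted `(timestamp_in_seconds, cap_name)` pairs from the controller log yourself. The function name, window size and threshold are hypothetical, and log parsing is left out on purpose because the export format differs between setups.

```python
from collections import defaultdict


def mass_disconnects(events, window_s=60, min_caps=5):
    """Find time windows in which many distinct CAPs disconnected.

    events   -- iterable of (timestamp_seconds, cap_name) tuples (assumed input)
    window_s -- width of each fixed time window in seconds
    min_caps -- minimum number of distinct CAPs for a window to be reported

    Returns a dict mapping window start time to the sorted CAP names.
    """
    buckets = defaultdict(set)
    for ts, cap in events:
        # Integer division assigns each event to a fixed-width window.
        buckets[int(ts) // window_s].add(cap)
    return {
        bucket * window_s: sorted(caps)
        for bucket, caps in sorted(buckets.items())
        if len(caps) >= min_caps
    }


if __name__ == "__main__":
    sample = [(0, "cap-a"), (10, "cap-b"), (20, "cap-c"), (130, "cap-a"), (135, "cap-a")]
    print(mass_disconnects(sample, window_s=60, min_caps=3))
```

Fixed windows are the simplest choice, but a burst that straddles a window boundary gets split in two, so pick a window somewhat larger than the reconnect time you observe.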

Found this while trying to solve my own problem.

My problem was also “ghost” entries: wlan entries left on the bridge from before I set up CAPsMAN.

CAPsMAN was creating new/additional wlan entries on the bridge, while “unknown” entries from before CAPsMAN were still left over on the bridge. Removing the unknown entries from the bridge and then rebooting the hAPac (in my case) stopped the seemingly never-ending stream of “CAP sent max keepalives without response” alerts in the logs.

6.49.6