Hotspot VLAN issues

Hi,

We have remote sites with RB1100AHx4 routers plus MikroTik or UniFi switches and access points, providing a hotspot on a VLAN for wireless and wired users.

It mostly works, but recently some users have been complaining about frequent disconnections and being redirected to the login page every 5-10 minutes. The latest case is a user with a hAP ac3 in his room, running WifiWave2 and controlled by the CAPsMAN router. The wireless signal is pretty good and there is no problem on his phone, but on his macOS (iOS 13.2) he is disconnected every 5-10 minutes and has to log in to the hotspot again each time.

On the RB1100 (WIFIWAVE2 CAPSMAN controller), I can see logs like:

hotspot1: dhcp host 172.16.0.61 moved to vlan id <100> from <0>



hotspot1: dhcp host 172.16.0.61 moved to vlan id <0> from <100>

These 2 lines appeared at the same time.

To be precise: users are placed in VLAN 100 by the access points, and the hotspot is running on the VLAN100 interface on the RB1100.

Could these logs be related to the disconnection issues?

Hi, has nobody seen these hotspot/VLAN-related logs before?
As soon as I activate the “hotspot” logs in “system->logging”, I see a huge number of these messages continuously, and it seems this only happens on MikroTik/CAPsMAN-enabled sites.
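
For anyone who wants to reproduce this from the terminal, a minimal sketch of enabling the hotspot topic and filtering for the messages above. Logging to the in-memory log is my assumption here, not something confirmed for this router.

```routeros
# Assumption: log the "hotspot" topic to the in-memory log
/system logging add topics=hotspot action=memory
# Show only the VLAN move messages
/log print where message~"moved to vlan id"
```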

On sites where I have Unifi for wireless and Mikrotik only for router/hotspot, I don’t see these logs.

Any idea?

I have the same with ROS 7.12 on a hAP ax3.
The Hotspot is defined on VLAN50 of the overall LAN bridge; all LAN-side VLAN interfaces are defined as VLANs of that bridge.
These VLAN interfaces also have VRRP defined (it is set up for failover via VRRP, but that generates its own conflicts, for example in DHCP).
So to mitigate the conflicts, the “backup VRRP hAP ax3” currently has DHCP, User Manager and Hotspot all disabled.

LAN connection:
The Hotspot is not defined on a hotspot bridge (as most tutorials show in their setup), but is set up on the VLAN50 interface.
The VLAN50 interface, like all the other VLAN interfaces serviced by the hAP ax3, has a DHCP server defined on it, acting on what is effectively an untagged interface.
The overall bridge does NOT have VLAN filtering enabled (so VLAN tagging is left untouched). One physical Ethernet interface is connected to the bridge, transporting all the client VLANs.
[Those VLANs could have been defined directly on this LAN Ethernet interface, avoiding the bridge.]

WAN connection:
The WAN-side VLANs are not defined as VLANs of the bridge, but as VLANs of the WAN Ethernet interface (that Ethernet interface and its VLANs are not members of the bridge, obviously).

No CAPsMAN is used, and the Wi-Fi on the hAP ax3 is not used (it sits in a shielded cabinet and could not reach client devices).
The 3 LAN-side VLANs connect to 30 MT APs, where the VLANs are split out into different SSIDs.
The 4th VLAN added is VLAN50, for the Hotspot SSID.

And then I get this, only since I started the Hotspot, and only on the Hotspot VLAN. (I already stopped the backup Hotspot and its DHCP, but the messages keep coming.)
[Screenshot: Klembord2.jpg]
In my understanding of VLAN interfaces, the Hotspot should never see VLAN ID 50: traffic on the VLAN50 interface in ROS is already untagged.
Torch on VLAN50 does not show any VLAN ID. The VLAN50 interface is not defined as a port on the bridge.
(A test with an extra hotspot bridge, with the VLAN50 interface as a port of that bridge, did not help.)
(The LAN Ethernet interface, as a port on the global bridge, is hardware offloaded.)

The move to VLAN ID 50 happens the whole night, exactly every 1 minute (it is one hAP acting as a station on the Hotspot SSID).
The move to <0> happens in between, but not as steadily.
Why does Torch not show the VLAN ID in the packets? (Torch disables HW offloading, AFAIK? Something to test … and yes … start Torch and the whole thing STOPS.)
Where was VLAN ID 50 leaking from?
[Screenshot: Klembord2b.jpg]
As with all tests … stop Torch … and yes, it comes back!
One more test … turning off HW offloading on that LAN Ethernet bridge port … and the Hotspot VLAN ID changes keep coming.
So Torch must be disabling something else? Fast Forward? Is the bridge VLAN leaking? http://forum.mikrotik.com/t/bridged-vlan-leaking-discovery-question/156788/1

@mkx … Help !

Disabled Fast Forward on the bridge … and got a VRRP Master/Slave transition on all my VRRP instances on the VLANs (they are all coupled to one physical authority: the Ethernet interconnect between the hAP ax3 units, which is not part of the bridge). The Hotspot then discovered devices anew … but the VLAN ID flapping in the Hotspot is back again.
The “Idle time” counter is reset after ±30 seconds for that one host (the hAP functioning as a station). It is the only thing connected to VLAN50 apart from the APs distributing the SSID, and those have no VLAN interface for VLAN50; they are connected untagged. The newly discovered (???) hosts disappeared on idle timeout after 5 minutes. I don’t know what they are.
So the LAN Ethernet port is hybrid, allowing all LAN traffic to pass over it: untagged to the distribution PowerBoxes and SXTs, tagged VLANs to the APs with the client SSIDs and the Hotspot SSID.

Tried to check with Packet Sniffer. No VLAN ID is seen on VLAN50 … and the flapping (“moved to vlan”) in the Hotspot stopped while Packet Sniffer was running on VLAN50. It is back as soon as Packet Sniffer is stopped.
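
For reference, a rough sketch of how the two checks can be run from the terminal. The interface name vlan50 is taken from the export further down; whether it matches every setup described here is an assumption. Both tools are interactive, so run them one at a time.

```routeros
# Assumption: the VLAN interface is named "vlan50"
# Live traffic view on the VLAN interface
/tool torch interface=vlan50
# Quick packet capture on the same interface
/tool sniffer quick interface=vlan50
```

In the tests above, the “moved to vlan” messages stopped while either tool was running, so the log is worth watching in a second terminal.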

“moved to vlan” appears when the Idle Time in the Hosts tab of IP/Hotspot is reset from 30 seconds to 0 seconds. Cause or consequence?
Set the “Idle Timeout” in the Hotspot server settings (default is 00:05:00) lower than 30 seconds and the story is different: the host is removed and renewed before a “moved to vlan” happens.
And while looking at that Hosts table … it does show VLAN ID 50 for my test host. Where did it find this? Well, the WLAN driver on the wAP drops the client in VLAN50, but the data is not delivered to the Hotspot tagged with VLAN 50.

Now added bridge_hotspot again. The interface named VLAN50 (defined on bridge) was added as a port on bridge_hotspot. Moved the Hotspot to bridge_hotspot, and had to move VRRP50 to bridge_hotspot as well. Nothing changed: the Hotspot still sees VLAN50 and an unknown bridge port.
Then enabled VLAN filtering and disabled HW offload on “bridge_hotspot” and on its port, the VLAN50 interface.
Aha, the Hotspot changed. The VLAN ID is now “0” for that one host in the Hotspot.
And then 1 minute later: “dhcp host 10.5.51.140 moved to vlan id <0> from <1>”. It is going 0->1 and 1->0 every minute again.


Giving UP. This is a production site, with ±100 users, 20 active APs and 30 SXTs for wireless bridging to the APs.
It looks like the VLAN ID does not arrive as information with a data packet (I wanted to filter that out with “bridge_hotspot” to be certain); the Hotspot seems to be finding that VLAN50 information somewhere else.
dhcp host 10.5.51.140 moved to vlan id <0> from <1>

Some progress.
The connection chain is now: hotspot → bridge.hotspot → port VLAN50 (of bridge) → ether3 (hybrid port on bridge with VLAN50)

With VLAN filtering enabled (!) the Hotspot looks at “bridge.hotspot” for VLAN information.
The VLAN move messages kept coming for the existing connection.
After stopping that connection (= all connections) to the Hotspot, the message has not reappeared (so far) for new connections.
But this time it is a dynamic address, not a DHCP address, in the Hotspot.
VLAN ID 0 is still weird, as the PVID of bridge.hotspot is VLAN ID 2.

/interface bridge
add admin-mac=xxxxxxxxx auto-mac=no comment="hAP ax3 Slave" fast-forward=no name=bridge protocol-mode=none vlan-filtering=yes
add frame-types=admit-only-untagged-and-priority-tagged name=hotspot_bridge protocol-mode=none pvid=2 vlan-filtering=yes
/interface bridge port
add bridge=bridge comment=defconf disabled=yes interface="ether2 - hEX"
add bridge=bridge comment=defconf hw=no interface="ether3 - LAN BB"
add bridge=hotspot_bridge frame-types=admit-only-untagged-and-priority-tagged interface=vlan50 pvid=2
/interface bridge vlan
add bridge=hotspot_bridge untagged=vlan50,hotspot_bridge vlan-ids=2
add bridge=bridge tagged="ether3 - LAN BB,bridge" vlan-ids=10,20,30,50
add bridge=bridge untagged="ether3 - LAN BB,bridge" vlan-ids=1

[Screenshot: Klembord2.jpg]

The workaround so far is to set the Idle Timeout in the Hotspot server to 30 seconds (from the default 5 minutes).

In the Hosts entry of an idle connection, the values named “Idle Timeout” and “Host Dead Time” are reset to zero at around 30 seconds, so the host and its DHCP entry are never removed on timeout.
With a 30-second timeout in the server, the timeout does happen, and the Hosts entry is renewed (new host detected) somewhat later by a UDP 123 (NTP) request.
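
A minimal sketch of that workaround on the command line. The server name hotspot1 is an assumption borrowed from the log lines in the first post; substitute your own server name.

```routeros
# Assumption: the Hotspot server is named "hotspot1"
/ip hotspot set [find name="hotspot1"] idle-timeout=30s
# Inspect the per-host counters afterwards
/ip hotspot host print detail
```

Keep in mind this only hides the symptom: hosts that are genuinely idle for more than 30 seconds will also be dropped.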

The connection is a hAP ac2 acting as a station on the Wi-Fi link to a wAP ac, with no real client device connected to the hAP ac2.
The Hosts table shows the MAC address of the Wi-Fi “station” interface.
The SNTP client on the hAP ac2 has an interval of 16 seconds! And MNDP is running as well.

Pfffffffffffffffffff …

While checking: the main bridge “bridge” and the hotspot bridge “bridge.hotspot” have the same MAC address? How does that work?

That the VLANs defined on bridge “bridge” all have the same MAC address as “bridge” could be normal.
But the bridge “bridge.hotspot” has only one port connected to it (the port is VLAN50 of bridge “bridge”). That port is not a VLAN of “bridge.hotspot”.

Things are complicated by the VRRP for VLAN50, which had to move to become a VRRP interface on “bridge.hotspot”.

Stand-alone setup (no VRRP): bridge with VLANs, and the VLAN50 interface as a port on bridge “bridge.hotspot”.
Just connecting with a laptop on a second AP, with that WLAN interface on VLAN50.
The continuous swapping between <0> and <1> is still there.

There are indeed 2 entries in the Bridge Host table for the VLAN50 interface and for the “hotspot.bridge” interface: one for VID <1> and one for VID <> (=0?)
… when VLAN filtering is enabled on the bridges.
[Screenshot: Klembord2.jpg]
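
The same table can be listed from the terminal. A small sketch, assuming the bridge is named hotspot_bridge as in the export above (elsewhere in this thread it is written bridge.hotspot or bridge_hotspot):

```routeros
# Assumption: bridge name as in the export (hotspot_bridge)
# Learned MAC addresses per port, with the VID column
/interface bridge host print where bridge=hotspot_bridge
# VLAN table of the same bridge, to compare against the VIDs seen above
/interface bridge vlan print where bridge=hotspot_bridge
```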

Updated this test environment to 7.16.

Reboot log … the ROS reboot changed the “bridge.hotspot” MAC address to be the same as that of bridge “bridge” again.
Well yes, the VLAN50 interface has that same MAC address, and it is the first port of the “bridge.hotspot” bridge.
But it does not feel correct: “bridge” and “bridge.hotspot” are 2 different networks.

So I changed the “Admin MAC Address” on “bridge.hotspot” to the MAC of an unused Ethernet interface.
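
In CLI terms that change would look roughly like the sketch below. The bridge name follows the export above, and the MAC address is a made-up, locally administered placeholder, not the address actually used.

```routeros
# Assumption: bridge named "hotspot_bridge"; 02:00:00:00:00:50 is a placeholder MAC
/interface bridge set [find name=hotspot_bridge] admin-mac=02:00:00:00:00:50 auto-mac=no
```

With auto-mac=no the bridge keeps the configured address across reboots instead of taking the MAC of its first port.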