Using bridge NAT to achieve client isolation across BSSIDs and APs?

I have 3 cAP AXes connected to a CRS328 bridge. One of the cAP AXes is a CAPsMAN server. I have configured 2 SSIDs on different VLANs: one for trusted devices (VLAN 10) and one for untrusted/guest devices (VLAN 20). The goal is to isolate the wireless stations on the guest network so that they have Internet breakout but cannot see any traffic from other devices on the same network.

The CAPsMAN server has set datapath.client-isolation=yes on the guest network. In my experiments, this works pretty well for clients connected to the same BSSID, but not when the 2 wireless stations are connected to different BSSIDs (either on the same AP or a different AP). datapath.traffic-processing=on-capsman is currently not supported on devices with a Qualcomm driver (wifi-qcom package).

So I figured I need bridge firewall rules to isolate that traffic.

My Proposed Solution

To be able to break out to the Internet, the clients would need to use DHCPv4 to obtain network information and ARP to resolve the MAC address of the default gateway. Both these protocols use Ethernet/IPv4 broadcasts. I do not want these broadcasts to reach other nodes, so I decided to use bridge destination NAT to change the MAC address to the address of the gateway. This works great.

On IPv6, the clients would use SLAAC to obtain network information. This requires the clients to be able to send IPv6 NDP Router Solicitations to MAC 33:33:00:00:00:02 and IPv6 ff02::2. After that, they would use IPv6 NDP Neighbor Solicitation by sending a solicited-node multicast to the gateway at 33:33:FF:AA:AA:AA and ff02::1:ffaa:aaaa. I again use a destination NAT rule to make sure all solicited-node multicasts are directed to the gateway.

Finally, I use the bridge filter to:

  • allow traffic from and to the gateway MAC address
  • allow all IPv6 NDP multicast frames
  • drop all traffic between WiFi interfaces (on the same AP)
  • drop everything else

Bridge NAT

/interface bridge nat
add action=dst-nat chain=dstnat comment="From GUEST WLAN: redirect all IPv4 \
    broadcasts to GW" in-interface-list=GUEST_WLANS packet-type=broadcast \
    to-dst-mac-address=AA:AA:AA:AA:AA:AA
add action=dst-nat chain=dstnat comment="From GUEST WLAN: solicited-node \
    multicast only to GW" dst-mac-address=33:33:FF:00:00:00/FF:FF:FF:00:00:00 \
    in-interface-list=GUEST_WLANS to-dst-mac-address=33:33:FF:AA:AA:AA
  • This translates Ethernet broadcasts to the all-As MAC address of the gateway to take care of DHCPv4 and ARP broadcasts.
  • It redirects all IPv6 solicited-node multicasts to the gateway.

Bridge Filter

The bridge filter would then allow all traffic from and to the gateway, as well as IPv6 NDP router + neighbor solicitations and advertisements and drop everything else:

/interface bridge filter
add action=accept chain=forward comment="Allow all from GW" src-mac-address=\
    AA:AA:AA:AA:AA:AA/FF:FF:FF:FF:FF:FF
add action=accept chain=forward comment="Allow all to GW" dst-mac-address=\
    AA:AA:AA:AA:AA:AA/FF:FF:FF:FF:FF:FF
add action=drop chain=forward comment="Drop GUEST radio to GUEST radio" \
    in-interface-list=GUEST_WLANS out-interface-list=GUEST_WLANS
add action=accept chain=forward comment="Allow solicited-node multicast to GW" \
    dst-mac-address=33:33:FF:AA:AA:AA/FF:FF:FF:FF:FF:FF
add action=accept chain=forward comment=\
    "Allow GUEST WLANS to send IPv6 multicasts (RA&RS + NS&NA) on ether1" \
    dst-mac-address=33:33:00:00:00:00/FF:FF:FF:FF:FF:00 \
    in-interface-list=GUEST_WLANS out-interface=ether1
add action=drop chain=forward log=yes log-prefix=br-drop-forward-policy

The gateway itself has no filter rules: it is a relatively slow single-core CRS328, and I did not want to disturb its bridge fast path.

Caveats

Information leakage. Isolated clients will always see:

  • DHCPv4 broadcasts from the gateway. Clients thus learn MAC and IP address of other hosts in the network from the Offer and ACK DHCPv4 messages. But I noticed this happens for clients connected to the same BSSID even with datapath.client-isolation=yes set, so even MikroTik themselves are not able to filter this.
  • ARP broadcasts sent by the gateway. This happens when the gateway does its duplicate address detection (DAD) before offering a DHCPv4 address and periodically later.

Note that neither flooded unknown unicasts nor multicasts are a problem, because the bridge filter will drop any frames that do not originate from the gateway MAC address.

Feedback?

Any feedback on this approach? Could this setup be better/tighter?

One problem I can see is that bridge filters disable bridge FastPath and would be a bottleneck once you have more than ~10 devices on an AP (based on ~10% CPU usage when a single client runs a speedtest).

This is a common topology for switched networks and, appropriately enough, there are built-in mechanisms for it.

  • For bridging done in software, simply set port horizon to the same nonzero value for client-facing ports (this is called split horizon bridging).
  • For hardware offloaded bridges, simply use the port isolation feature (this is available in the switch menu).
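
As a rough sketch of the two options above: the interface names are hypothetical, and ether4 is assumed to be the uplink, so adjust both to your own ports. With CAPsMAN-provisioned wifi interfaces the bridge ports are created dynamically, so the horizon would probably have to be set through the datapath's bridge-horizon property instead (an assumption, not tested here).

```
# Software bridge (e.g. on an AP): give all client-facing ports the same
# non-zero horizon; ports sharing a horizon value do not forward to each other.
# wifi-guest1 / wifi-guest2 are hypothetical interface names.
/interface bridge port
set [find interface=wifi-guest1] horizon=1
set [find interface=wifi-guest2] horizon=1

# Hardware-offloaded bridge (e.g. a CRS3xx): let each client-facing port
# forward only to the uplink. ether1-ether3 face clients, ether4 is the
# uplink (assumed names).
/interface ethernet switch port-isolation
set ether1 forwarding-override=ether4
set ether2 forwarding-override=ether4
set ether3 forwarding-override=ether4
```

Both settings are per port, so they affect every VLAN carried on that port.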

Thanks for your reply, @lurker888!

If I understand you correctly, by setting the same bridge horizon on bridge ports, I can prevent traffic flowing between those ports. In my case, this prevents traffic flowing between guest SSID virtual interfaces on the same AP. And I do not need a dedicated bridge filter rule for this anymore.

But AFAICT, a bridge horizon is a device-local mechanism. So it would not prevent traffic flowing between guest SSID interfaces on different APs. (My testing confirms this.)

Did I get that right or is there a way to leverage bridge horizon on multiple devices so that the traffic from all clients on different BSSIDs are isolated?

Yes, and it’s quite usual to do so. It actually has a name (of Cisco origin): PVLAN, which means preventing switching between clients and forcing all packets to be routed by the router. Yours is a special case of this, where you do not want to selectively forward traffic between clients, but deny it altogether. (If you actually want to forward some traffic between clients, you also have to enable local-proxy-arp on the router.)

This mechanism works for entire hierarchies of switches, but it has to be set up on each switch. Identify the uplink ports; these get no horizon value. Then identify the ports facing client devices (directly, through a wifi interface, or through another switch with multiple clients behind it) and assign the same horizon value to all of them.

The limitation you may run into is that currently on Mikrotik this is a per-port setting and cannot be further restricted to selected vlans. This is unfortunate. It can be worked around in several ways, none ideal.

I suppose this is indeed the limitation I run into. The APs’ guest SSID traffic arrives over trunk ports onto the bridge. At that point, it would be useful if forwarding could be limited to only the bridge itself (IP services) and towards the Internet. Theoretically, we would be able to do that by setting the same horizon value on VLAN 20 on all AP-facing ports. But currently we cannot do that. We can only set horizons per port, affecting all VLANs on that port.

And so we have to use a workaround, my bridge NAT + filter solution being one of those less than ideal solutions.

If that accurately describes what you were trying to convey, then I think I learned something :sweat_smile:

As to the limitation, essentially yes.

If you have to resort to using bridge filters, I would however use a single rule of this form:

/interface bridge filter
add action=drop chain=forward in-interface-list=guestports out-interface-list=guestports mac-protocol=vlan vlan-id=200

I think it’s a bit more straightforward. As ever, such things are in the eye of the beholder.

You would configure that on the CRS328, I think? Maybe I should have made more explicit that the bridge filters mentioned above were configured on the APs. So: 3 APs, 3 bridge filters. A single rule on the CRS328 would indeed be better (more maintainable, definitely easier to understand), but I worry about losing FastPath.

If you have a proper switch, then switch rules should be used. The form is

add new-dst-ports=ether4 ports=ether1,ether2,ether3 switch=switch1 vlan-header=present vlan-id=200

(Here ether1,2,3 are the APs, ether4 is the uplink to the router.)

No, I use the CRS328 as a Layer-3 switch. It also functions as the gateway to my ISP. I have a 100/100 Mbps uplink, so no need for something more powerful yet. This was a compromise as I needed the PoE.

Switch rules are still appropriate.

After some experimentation, it looks like this switch rule on the CRS328 provides the same amount of isolation as the previous bridge filter rules:

/interface ethernet switch rule
add new-dst-ports=sfp-sfpplus1_WAN redirect-to-cpu=yes switch=switch1 vlan-header=present vlan-id=20

The redirect-to-cpu=yes is needed for IP services and source NAT/PAT to work. Either way, with or without this rule, the CPU is fully utilized at 100% whenever I run a speedtest :grimacing:

redirect-to-cpu shouldn’t be used in this case (it was buggy for a long time and has only just been fixed). Anyway, it’s not meant for this use case.

You have a pretty special setup. Correctly configured, 100% cpu will not occur. However having things set up this way will stress out the cpu for any sort of unexpected load or even a temporary failure or delay in flow offloading.

Do you really consider this a special setup? I have 3 APs with 2 SSIDs connected to a Layer-3 switch that does IP services, firewalling and NAT. I just want client isolation on the guest network, which to me does not seem that outlandish.

If redirect-to-cpu should not be used, do you have any suggestion on how to make the CRS328 do all this (IP services and NAT)? If I set that option to no, NAT is not performed.

Okay, I have to admit that I was a bit quick to judge. I thought you were doing NAT/flow offloading to the switch chip. On further inspection it seems that you aren’t doing that because the included switch chip doesn’t support it at all. Sorry.

Using the CRS328 as an L3 switch is of course not at all strange.

What is strange is that you would use this device for anything else, including NAT. The hEX refresh is roughly 2x more powerful, something like the ax2/ax3 are 4x. If you can push the CPU to the max, that means that you roughly have a 500-1000 Mbps Internet connection - the device is simply not suitable to handle these.

redirect-to-cpu, as I said, was only fixed in the latest 7.20beta5 version, so it was buggy before. In your case it’s not what you actually want to use.

Simply allowing traffic to the cpu via the new-dst-ports should fix your problem. Actually the WAN port should be removed from the list. Also, due to “certain things” the ports= matcher should be used.

I also retract my statement to the effect that properly configured 100% cpu shouldn’t occur. With no NAT offloading possible, it should and it will.

And yes, doing NAT for any significant traffic on these things is unusual.

I understand that this is an interim solution and I won’t scoff at your setup for not getting a nuclear power plant to go with your switch straight away, but I can’t sidestep the fact that this device is simply not meant for this, and while it does the job, you simply have to live with some compromises.

Thanks for your detailed answer. The CRS328 was definitely a compromise performance-wise, but it has a big PoE budget and SFP cages to connect over fiber to my ISP.

When trying to set new-dst-ports to the CPU, I get the following error:

/interface/ethernet/switch/rule> set 0 new-dst-ports=switch1-cpu 
failure: switch1-cpu not supported

That’s why I thought I had to use redirect-to-cpu=yes.

On other chips it works. Actually, since you have no flow offloading, redirect-to-cpu will have the same effect. BUT it’s buggy.

Use the ports= matcher. It wasn’t meant as a joke.
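
For illustration only, a rule using the ports= matcher might look like the sketch below. It assumes ether1, ether2 and ether3 are the AP-facing ports (as in the earlier example) and that VLAN 20 is the guest VLAN. The WAN port is left out of the rule, and redirect-to-cpu=yes is kept because new-dst-ports=switch1-cpu is rejected on this chip.

```
# Sketch: match guest-VLAN frames arriving from the AP-facing ports only
# and hand them to the CPU. Port names are assumptions.
/interface ethernet switch rule
add ports=ether1,ether2,ether3 redirect-to-cpu=yes switch=switch1 \
    vlan-header=present vlan-id=20
```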

It seems to work, but like you said, feels buggy. This is what shows in print and export (notice the *FFFFFFFF instead of the actual ports I put in):

/interface ethernet switch rule
add ports=*FFFFFFFF redirect-to-cpu=yes switch=switch1 vlan-header=present vlan-id=20

Also, I cannot disable the rule anymore:

/interface/ethernet/switch/rule> disable 0
failure: all not supported