VLAN over EoIP

anserk · March 4, 2022, 4:57pm

I have two hAP ac2 routers in different geographical locations connected with WireGuard VPN. Each router has multiple VLANs at corresponding sites. The VLANs are configured per official documentation - Ethernet ports are bridged, no bridge VLAN filtering, the VLANs are configured on the switch chip in order to leverage hardware offload. Both VLANs have the switch-cpu port added since they need routing. VLAN interfaces are created on top of the bridge1 interface. Everything works as expected.

Now I have a requirement to connect two small VLANs via layer 2: VLAN10 at site 1 and VLAN20 at site 2, though I can change the VLAN IDs as needed. I’m aware of all the downsides of bridging remote networks, so I would prefer the discussion to focus on the technical possibilities rather than the design.

EoIP or VXLAN look like perfect candidates for the task. I’ve had experience with both in a small test lab that didn’t involve VLANs. I read MikroTik documentation and some examples online, but haven’t found a clear answer to my particular situation. I’ve been thinking about this:

Create EoIP interface (tunnel over WireGuard).
Create another bridge - bridge2.
Add EoIP and vlan10 interface to the bridge.
Do the same on the second router.

However, I anticipate several issues with this approach. First, it looks awfully similar to a misconfiguration described here: https://help.mikrotik.com/docs/display/ROS/Layer2+misconfiguration#Layer2misconfiguration-VLANonabridgeinabridge. Even though I am not bridging with an Ethernet interface (ether3 in that example), EoIP behaves essentially like one. The solution suggested there - using bridge VLAN filtering - is not appropriate for my router model. Second, the vlan10 interface becomes a slave, but I have a DHCP server running on it. That’s another common misconfiguration.

Another approach:

Create EoIP interface.
Create VLAN interface vlan10-eoip on top of EoIP and use VLAN ID 10.
Create bridge2.
Add vlan10 and vlan10-eoip interfaces to bridge2.

However, this again presents the issue with the DHCP server now running on a slave interface. I suppose I can move it to run on bridge2, if that’s what it takes.
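
A minimal, untested sketch of this second approach on one router, including the move of the DHCP server and IP address to bridge2. The tunnel endpoint addresses (10.255.255.1 and 10.255.255.2, standing in for the WireGuard addresses), the tunnel-id and the name eoip-tunnel1 are assumptions for illustration; vlan10, vlan10-eoip and bridge2 follow the names used above.

```routeros
# Assumed WireGuard tunnel addresses and tunnel-id; replace with the real ones
/interface/eoip add name=eoip-tunnel1 local-address=10.255.255.1 remote-address=10.255.255.2 tunnel-id=10
# VLAN interface on top of EoIP, so frames cross the tunnel tagged with VID 10
/interface/vlan add name=vlan10-eoip interface=eoip-tunnel1 vlan-id=10
/interface/bridge add name=bridge2
/interface/bridge/port add bridge=bridge2 interface=vlan10
/interface/bridge/port add bridge=bridge2 interface=vlan10-eoip
# vlan10 is now a bridge port (slave), so the L3 services move to bridge2
/ip/address set [find interface=vlan10] interface=bridge2
/ip/dhcp-server set [find interface=vlan10] interface=bridge2
```

The second router would mirror this with the local and remote addresses swapped.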

A seemingly easier solution:

Create EoIP interface.
Add it to the main bridge1.

What I fail to understand is how VLANs are presented in the bridge with switch chip VLAN configuration. Looking at https://help.mikrotik.com/docs/display/ROS/Layer2+misconfiguration#Layer2misconfiguration-PacketflowwithhardwareoffloadingandMAClearning, it sounds like broadcast packets will be flooded to the CPU port, which is the bridge1 interface. But what happens next? How does the bridge process such a packet? VLAN filtering is not enabled in my scenario, so does the bridge transmit the packet to all bridged ports and let the switch chip check the VLAN ID and allow or block it at the Ethernet port level?
If that’s correct, then adding the EoIP interface to bridge1 will tap into the traffic flow “above” VLAN filtering and would therefore send tagged traffic from all VLANs to the other end of the EoIP tunnel. Essentially, it would be another trunk. The remote router would then need to deal with the VLAN tags based on the VLAN table on its switch chip.
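
For reference, a minimal sketch of this third option. The tunnel addresses, tunnel-id and the name eoip-tunnel1 are assumptions; bridge1 is the existing main bridge. Sniffing on the tunnel interface is one way to check which VLAN tags actually cross it.

```routeros
# Assumed WireGuard tunnel addresses and tunnel-id; replace with the real ones
/interface/eoip add name=eoip-tunnel1 local-address=10.255.255.1 remote-address=10.255.255.2 tunnel-id=10
# Make the tunnel just another port of the existing main bridge
/interface/bridge/port add bridge=bridge1 interface=eoip-tunnel1
# Watch what is sent into the tunnel, including any VLAN tags
/tool/sniffer quick interface=eoip-tunnel1
```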

I appreciate comments and input.

anserk · March 11, 2022, 3:26am

After reading http://forum.mikrotik.com/t/routeros-bridge-mysteries-explained/147832/1 by @sindy and doing some tests and sniffing, I came to the conclusion that my thinking about how a bridge deals with tagged packets was correct: they all pass through the bridge as long as they reached it from the switch chip below. That makes sense, as I don’t have VLAN filtering enabled.

So I tested the easiest solution and just added the EoIP interface to the existing bridge. As I expected, packets from all VLANs (the ones that made it to the bridge/router from the switch chip) were transmitted to the remote router. Sniffing even showed the VLAN IDs belonging to the other side. The EoIP tunnel was serving as a trunk.

When I changed one VLAN ID to match on both routers, I immediately got IP and mDNS connectivity over EoIP. Packets from other VLANs with mismatching IDs were properly dropped by the switch chip since I have secure VLAN mode. So at this point things were working as desired.

However, I didn’t want to clutter the VPN link with unwanted packets that would be dropped at the destination anyway. So my next step was to use bridge filter rules to block some VLAN IDs from leaving via the EoIP interface. Unfortunately, I got stuck there. I tried several variations, but the rule never matched.

/interface/bridge/filter
add chain=forward action=drop out-interface=eoip-tunnel1 mac-protocol=vlan vlan-id=20
add chain=forward action=drop out-interface=eoip-tunnel1 mac-protocol=vlan vlan-id=20 vlan-encap=vlan

I must be missing something here. If I can get the blocking done, this solution would be perfect for my scenario. This way I don’t have to deal with complicated configuration with multiple bridges bridging interfaces that already sit on top of another bridge.

sindy · March 12, 2022, 12:42pm

It has been reported by multiple users here that bridge filter rules do not work in the current releases of ROS 7.

Second, you’ve said you wanted to link VLAN 10 on one site with VLAN 20 on the other one, which cannot be done without placing an auxiliary bridge and an /interface vlan between the EoIP tunnel and the main bridge with all the VLANs. You have to receive a frame tagged with VID 20, strip the tag, send the packet via the EoIP, and before injecting it into the main bridge there, tag it with VID 10. So you need an /interface vlan with the main bridge as its underlying interface, and you need to bridge the tagless end of the /interface vlan with the EoIP tunnel using the auxiliary bridge. No way around that.
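
A rough sketch of the layout described above, with VLAN 10 at site 1 and VLAN 20 at site 2. The tunnel addresses, tunnel-id and the names eoip-tunnel1 and bridge2 are assumptions for illustration, and the /interface vlan entries are assumed to exist already on bridge1 under the names vlan10 and vlan20. Untested.

```routeros
# --- Site 1 (VLAN 10) ---
# If missing: /interface/vlan add name=vlan10 interface=bridge1 vlan-id=10
/interface/eoip add name=eoip-tunnel1 local-address=10.255.255.1 remote-address=10.255.255.2 tunnel-id=10
/interface/bridge add name=bridge2
# Tagless end of vlan10 and the tunnel share the auxiliary bridge
/interface/bridge/port add bridge=bridge2 interface=vlan10
/interface/bridge/port add bridge=bridge2 interface=eoip-tunnel1

# --- Site 2 (VLAN 20) ---
# If missing: /interface/vlan add name=vlan20 interface=bridge1 vlan-id=20
/interface/eoip add name=eoip-tunnel1 local-address=10.255.255.2 remote-address=10.255.255.1 tunnel-id=10
/interface/bridge add name=bridge2
/interface/bridge/port add bridge=bridge2 interface=vlan20
/interface/bridge/port add bridge=bridge2 interface=eoip-tunnel1
```

Frames travel untagged through the tunnel, so each site applies its own VID when the frame re-enters bridge1 through its /interface vlan.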

anserk · March 12, 2022, 3:01pm

Linking two different VLAN IDs isn’t a requirement, I can easily make them match. I just thought if we are untagging and sending frames untagged over EoIP anyway, then it wouldn’t make any difference.

Bridging /interface vlan and EoIP was what I had in mind at first. Something like this (even with matching VLAN IDs):

       bridge2                 bridge2
       |      |                |     |
       |      |                |     |
       |      |                |     |
   vlan10    eoip --> wg --> eoip   vlan10
     |                                  |
     |                                  |
     |                                  |
  bridge1                            bridge1
   |   |                              |   |
 eth1 eth2                          eth1 eth2

What worries me is the unexpected behavior described in https://help.mikrotik.com/docs/display/ROS/Layer2+misconfiguration#Layer2misconfiguration-VLANonabridgeinabridge
Quote (I substituted ether3 in the example to eoip):

But since MAC learning is only possible between bridge ports and not on interfaces that are created on top of the bridge interface, packets sent from ether2 to ether3 (eoip) will be flooded in bridge1.
Also if a device behind ether3 (eoip) is using (R)STP, then ether1 and ether2 will send out tagged BPDUs which violates the IEEE 802.1W standard. Because of the broken MAC learning functionality and broken (R)STP this setup and configuration must be avoided.

Would disabling RSTP on both bridge2 bridges avoid the issues? The documentation doesn’t offer that as a solution and suggests avoiding this configuration altogether. I suppose the flooding will still take place.

sindy · March 12, 2022, 4:36pm

Yes, disable RSTP on bridge2s, as there is no risk of looping (at least until you add another site and connect it to both). RSTP should live tagless on bridge1, so it will not leak to bridge2 via /interface vlan vlan-id=10, but it would leak from bridge2 to bridge1 and get tagged on the way.

Other than that, the “router facing port of bridge1” is a port like any other, and so is the EoIP port of bridge2, so MAC address learning will work on both (all 4), and only traffic for the remote devices on vlan10 will be tunneled (plus broadcast/multicast, of course).
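
As a sketch, assuming the auxiliary bridge is named bridge2 on both routers, disabling RSTP and checking what the bridge has learned could look like this:

```routeros
# Turn off (R)STP on the auxiliary bridge only; bridge1 is left untouched
/interface/bridge set bridge2 protocol-mode=none
# List learned MAC addresses; remote devices should appear on the EoIP port
/interface/bridge/host print where bridge=bridge2
```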

anserk · March 15, 2022, 2:24am

Thank you for confirming.

I also figured out the issue with the bridge rules. It was my own logical error during testing: I only had the rule in the forward chain, but generated traffic by pinging from the router itself. After I looked at the bridging diagram in the documentation, I realized I needed the same rule in both the forward and output chains in order to block packets from other devices (forward) and packets originating from the router itself (output). With both rules in place the behaviour was exactly as I had expected.
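
A sketch of the resulting rule pair, reusing the interface name and VLAN ID from the earlier attempt (the exact rules used in the final test were not posted, so treat these as an assumption):

```routeros
/interface/bridge/filter
# Frames from other devices that are bridged towards the tunnel
add chain=forward action=drop out-interface=eoip-tunnel1 mac-protocol=vlan vlan-id=20
# Frames originating from the router itself (e.g. a ping from the router)
add chain=output action=drop out-interface=eoip-tunnel1 mac-protocol=vlan vlan-id=20
# Check the packet counters to see whether the rules match
print stats
```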

So now I have two possible solutions.