Struggling with VLANs on MikroTik CRS305

Hi folks,

I’m having a bit of trouble getting VLANs to work on my MikroTik CRS305-1G-4S+ after swapping it into my network.

A little bit of background first - my current setup is pretty straightforward (see the attachment). I’ve got an AP that tags its wireless networks with VLANs 30 (trusted), 40 (guest), and 50 (IoT), while its physical ports remain untagged. The AP’s trunk port (wan) connects to the switch trunk port (eth1), and then traffic is aggregated through another trunk (eth8) to my pfSense router (eth1), which handles IP Services and all that. All other ports on the switch (eth2-eth7) are access ports for VLAN 30, with untagged traffic just for management (yeah, I know it’s bad practice, but I kept locking myself out and decided to deal with a MGMT VLAN later). It’s working flawlessly.

I wanted to swap in the CRS305 for a fiber backbone, so I disabled all VLAN configs and started simple: AP to ether1, router to sfp-sfpplus1, workstations to sfp-sfpplus2 and sfp-sfpplus3. With VLANs disabled, untagged traffic worked right out of the box, so all good so far.

Things got tricky when I tried re-enabling VLANs. In WinBox, I added all 3 VLANs under Bridge/VLANs, setting sfp-sfpplus1 and ether1 as tagged for each. For VLAN 30, I also set sfp-sfpplus2, sfp-sfpplus3, and sfp-sfpplus4 as untagged and changed the PVID to 30 under Bridge/Ports for each of those three ports. I then enabled VLAN filtering on the bridge, and everything broke.
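
For anyone who prefers the CLI, the WinBox steps above should correspond roughly to the sketch below. It assumes the bridge is named `bridge` (the name isn't stated above, so adjust it to your own) and that all five ports are already members of it.

```
# Assumption: the bridge is called "bridge" and already contains all ports.

# Access ports: untagged frames arriving here are placed into VLAN 30
/interface bridge port
set [find interface=sfp-sfpplus2] pvid=30
set [find interface=sfp-sfpplus3] pvid=30
set [find interface=sfp-sfpplus4] pvid=30

# VLAN table: trunks (AP on ether1, router on sfp-sfpplus1) tagged for every VLAN,
# access ports untagged members of VLAN 30 only
/interface bridge vlan
add bridge=bridge vlan-ids=30 tagged=ether1,sfp-sfpplus1 untagged=sfp-sfpplus2,sfp-sfpplus3,sfp-sfpplus4
add bridge=bridge vlan-ids=40 tagged=ether1,sfp-sfpplus1
add bridge=bridge vlan-ids=50 tagged=ether1,sfp-sfpplus1

# Enable filtering last, so a mistake in the table doesn't cut access halfway through
/interface bridge
set bridge vlan-filtering=yes
```

The trunk ports are left at their default PVID here because untagged traffic is still used for management in this setup.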

I can get the physical ports (sfp-sfpplus2 etc.) to tag correctly, but the traffic from the AP on ether1 either doesn’t get through at all (no IP assignment from pfSense) or works intermittently, probably dropping packets like crazy somewhere along the way. Weirdly, if I turn off VLAN filtering on the bridge, the VLANs from the AP somehow reach pfSense and clients get their respective addresses, but then the physical ports on the switch no longer tag traffic. I’ve tried all the sensible, and a few insensible, combinations of tagged/untagged for ether1, sfp-sfpplus1, and the bridge under the VLANs, but I just can’t figure it out. I’ve attached the ‘current’ setup (WinBox only for now :sweat_smile:). With that one, AP traffic doesn’t work at all: pfSense does hand out the appropriate IPs for the respective VLANs, but no traffic passes afterwards. The workstations, on the other hand, are properly tagged and operating normally as far as I can tell.

I’m clearly missing something about how MikroTik handles trunking compared to my old TP-Link switch. Any help would be appreciated!

Thanks!
bridge_vlans.png
bridge_ports.png
current_setup.png

You may want to read, review and digest this excellent post, considered to be the DE FACTO VLAN bible for ROS:
http://forum.mikrotik.com/t/using-routeros-to-vlan-your-network/126489/1

Thanks for the suggestion, I already looked at it, and digested as much as my limited networking knowledge would allow me :sweat_smile:

That’s where I got the info on bridge VLANs; prior to that I was trying to set them up as interfaces. :see_no_evil_monkey: The closest thing to what I’m looking for is the ‘Switch with a separate router’ example, and I think I pretty much mimicked what was done there (practically the switch.rsc example), sans setting the management VLAN on it, and something is still very wrong :face_with_diagonal_mouth: What baffles me is that the devices on my AP get proper IPs and all that, so their tagged frames must be passing both ways, yet once they have an address something seems to start blocking traffic (for example, they can’t ping anything, incl. their gateways). I’ve triple-checked the FW settings on my pfSense and they are identical to what I use with the TP-Link switch in the middle in the same configuration.

So I feel I’m either missing something glaringly obvious or there is something wrong with some of my equipment - before I go down that route, I’d like to rule out the MikroTik side of things, as I’ve been using the others in the same setting for quite some time without any issues.

Why do you have “tag-stacking” enabled on sfp-sfpplus1?

An accidental leftover from attempting anything, sensible or insensible, with the idea that if I tweak all the knobs in all combinations something is bound to start working :sweat_smile:. I did notice it afterwards and removed it, still no difference.

And did you perform a cold reboot of the switch after setting everything up? In my experience, the actual running switch chip config sometimes diverges from what it’s supposed to be, and a good cold boot (i.e. cut the power) sorts this out.
Not necessarily a solution to your problem though.

Did several reboots with just the switch and across the whole chain (the router, the switch and the AP) similarly suspecting something doesn’t get initialized properly. Unfortunately, it didn’t help one iota. :frowning: I also left the setup to ‘cook’ for a good hour once, still wouldn’t budge. With the extended period of testing I did make some reliable observations, tho:

  • Physical ports on the MikroTik switch work flawlessly, get on the proper subnet, no packet loss internally or on the interwebs etc. every time!
  • There seem to be bursts of properly working VLANs on the AP as well - as in, suddenly the traffic will appear to flow normally, both internally (at least pinging; the ‘working bursts’ don’t last long enough to run iperf or something of the sort) and towards the internet at large, and then just as fast it will go into 100% packet loss mode.
  • Even when internal traffic starts flowing, devices on the VLAN pinging their own gateway take ~5ms, and my apartment is not large enough to warrant such latencies.
    • Devices on the CRS’s physical ports exhibit no such issues

Assuming my setup does not have a glaring omission (which somebody would hopefully have called out by now :sweat_smile:), I’m starting to think that there might be something wrong with the way my AP handles VLANs, something that happens to work with my TP-Link switch but doesn’t play well with MikroTik :thinking: Not sure how I could check that, tho, any ideas? I’m suspecting the AP (Asus RT-AC68U) because I remember it was notoriously difficult to get it to tag its guest SSIDs, but then again I really haven’t had any issues with it ever since, that is until I decided to upgrade my home network to fiber :face_with_diagonal_mouth:

First thing which comes to mind:
Usage of untagged vlan 1.

Some brands accept it. Some don’t.

Best to avoid vlan1 completely.
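
If it helps, one way to see whether untagged VLAN 1 is actually in play on the switch is to look at the bridge tables. This is only a sketch using the stock print commands; what to look for is my assumption about a default setup, where ports keep pvid=1 unless you changed it.

```
# Each bridge port with its PVID (the VLAN that untagged ingress frames are placed into)
/interface bridge port print

# The VLAN table, including any dynamically added entries
/interface bridge vlan print

# Learned MAC addresses; with VLAN filtering on, entries should carry the VLAN ID they were learned in
/interface bridge host print
```

If the AP's clients show up under a different VLAN than expected, or only as untagged, that would point at how the AP is tagging rather than at the switch.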

Hey folks, just to close the thread - it was the AP all along, as I suspected. For some reason, the ‘hack’ to make it properly handle VLANs for different SSIDs was working well with TP-Link, but MikroTik didn’t like it. Replaced the AP with a UniFi U7-Pro and everything started working automagically.

Now on to set a proper MGMT VLAN, maybe my previous dozens of failed attempts could also be attributed to the misbehaving AP :sweat_smile:

Thanks for the help and suggestions!

And to close everything on a lighter note:

MikroTik admin rules:

  1. You do not use VLAN1
  2. You DO NOT use VLAN1
  3. You do not use Quickset
  4. You do not use detect internet
  5. …