RouterOS 7 Bridge VLAN/DHCP client issue after upgrade

I have a fairly simple config which works great on RouterOS 6.

Upgrading to RouterOS 7 breaks this config: no IP address is obtained through the trunk VLAN port and nothing communicates (devices connected to the access ports cannot communicate either).

What might be wrong, and what needs to be changed?

# feb/24/2022 00:00:00 by RouterOS 6.49.2
# software id = 84GZ-779U
#
# model = RB760iGS
# serial number = E1F10EB756FC
/interface bridge
add comment="VLAN filtered Bridge" name=bridge pvid=10 vlan-filtering=yes
/interface ethernet
set [ find default-name=ether1 ] comment=Uplink
set [ find default-name=ether2 ] comment="Guest VLAN"
set [ find default-name=ether3 ] comment="IoT VLAN"
set [ find default-name=ether4 ] comment="MacBook - Private VLAN"
set [ find default-name=ether5 ] comment="COMP1 - Private VLAN"
set [ find default-name=sfp1 ] disabled=yes
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/interface bridge port
add bridge=bridge interface=ether1
add bridge=bridge interface=ether2 pvid=20
add bridge=bridge interface=ether3 pvid=30
add bridge=bridge interface=ether4 pvid=10
add bridge=bridge interface=ether5 pvid=10
/interface bridge vlan
add bridge=bridge tagged=ether1 untagged=ether4,ether5 vlan-ids=10
add bridge=bridge tagged=ether1 untagged=ether2 vlan-ids=20
add bridge=bridge tagged=ether1 untagged=ether3 vlan-ids=30
/ip dhcp-client
add disabled=no interface=bridge
/system clock
set time-zone-name=Europe/Budapest
/system identity
set name=hexS
/system package update
set channel=upgrade
/system routerboard settings
set auto-upgrade=yes

Thanks!

I would never use a pvid other than the default for the bridge; not sure why you do that.
In any case, you need to tag the bridge in the /interface bridge vlan settings.

/interface bridge vlan
add bridge=bridge tagged=ether1,bridge untagged=ether4,ether5 vlan-ids=10
add bridge=bridge tagged=ether1,bridge untagged=ether2 vlan-ids=20
add bridge=bridge tagged=ether1,bridge untagged=ether3 vlan-ids=30

I used a pvid other than the default on the bridge because that seemed to work as a default PVID (for example, for the DHCP client).

Anyway, if I set a DHCP client on that bridge, how would it know which VLAN ID to use for the DHCP packets?

Hard to say because you provided an incomplete config posting.
All the rules work together so if you leave some out, I would only be guessing…

The default PVID is 1, and it should be left that way.
If you do use vlans, then use vlans and the bridge should do nothing else but be the bridge.

You have not defined the vlans
You have not defined the IP pools
You have not defined the DHCP server
You have not defined the DHCP server network…

You have made your bridge a WAN client to boot, which is in most cases WRONGO!
The WAN is usually set via IP DHCP CLIENT, or other specific menus such as for instances of pppoe.

Take a long hard look at this example I made for another post, similar to yours except they have added wifi on top.
http://forum.mikrotik.com/t/no-internet-on-lan-port-with-vlan20/156184/1

suggest you copy it down and then see what you can do…
As for that person's firewall rules, they are factory default and work.
Much better is the Novice setup described below, very similar but easier to understand and better security.
See Item B here: - https://forum.mikrotik.com/viewtopic.php?t=182373

This router should only be a “switch” which has multiple access-ports, each access-port providing access to a different VLAN.

The router (which handles the VLANs) is a different router, already set-up.

This is the complete config, which runs fine on 6.49.3

# feb/27/2022 21:47:09 by RouterOS 6.49.3
# software id = 84GZ-779U
#
# model = RB760iGS
# serial number = E1F10EB756FC
/interface bridge
add comment="VLAN filtered Bridge" name=bridge pvid=10 vlan-filtering=yes
/interface ethernet
set [ find default-name=ether1 ] comment=Uplink
set [ find default-name=ether2 ] comment="Guest VLAN"
set [ find default-name=ether3 ] comment="IoT VLAN"
set [ find default-name=ether4 ] comment="MacBook - Private VLAN"
set [ find default-name=ether5 ] comment="COMP1 - Private VLAN"
set [ find default-name=sfp1 ] disabled=yes
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/interface bridge port
add bridge=bridge interface=ether1
add bridge=bridge interface=ether2 pvid=20
add bridge=bridge interface=ether3 pvid=30
add bridge=bridge interface=ether4 pvid=10
add bridge=bridge interface=ether5 pvid=10
/interface bridge vlan
add bridge=bridge tagged=ether1 untagged=ether4,ether5 vlan-ids=10
add bridge=bridge tagged=ether1 untagged=ether2 vlan-ids=20
add bridge=bridge tagged=ether1 untagged=ether3 vlan-ids=30
/ip dhcp-client
add disabled=no interface=bridge
/system clock
set time-zone-name=Europe/Budapest
/system identity
set name=hexS
/system package update
set channel=upgrade
/system routerboard settings
set auto-upgrade=yes
/system scheduler
add interval=1d name=backup_conf on-event="/system script run BackupConf;" \
    policy=ftp,reboot,read,write,policy,test,password,sniff,sensitive,romon \
    start-date=jan/01/2021 start-time=00:00:00
/system script
add dont-require-permissions=no name=BackupConf owner=admin policy=\
    ftp,reboot,read,write,policy,test,password,sniff,sensitive,romon source="/ex\
    port file=export.rsc\r\
    \n/tool fetch mode=ftp address=10.0.0.4 port=21 user=admin pass=XXXXX\
    2 upload=yes src-path=export.rsc dst-path=\"backup/mikrotik-hex/\$[/system i\
    dentity get name].rsc\"\r\
    \n\r\
    \n/system backup save name=router.backup\r\
    \n/tool fetch mode=ftp address=10.0.0.4 port=21 user=admin pass=XXXXX\
    2 upload=yes src-path=router.backup dst-path=\"backup/mikrotik-hex/\$[/syste\
    m identity get name].backup\""

Is the “trunk link” connected to ether1 using a native vlan 10? i.e. a “hybrid” in MikroTik terminology.
If so, make a backup (because I have not tried this myself) and copy it to a pc you can restore from.
Also export and save.
Go into safe mode, just in case you lose connection to the hEX.
Then try setting the following for vlan-id 10:
/interface bridge port
# "set" needs an item selector; pick the ether1 bridge port
set [find interface=ether1] pvid=10
/interface bridge vlan
# ether1 becomes untagged for VLAN 10, so it is cleared from tagged=
set [find vlan-ids=10] tagged="" untagged=bridge,ether1,ether4,ether5

Here is my thought process, please let me know where it is incorrect.
You have specified that you want the bridge interface to get an IP address from a DHCP server on the native (untagged) vlan 10 connected to ether1. Since the bridge is untagged and you have set the bridge pvid to 10, when the hEX S wants an IP address it sends a DHCP discover broadcast on the bridge device. Because the bridge has pvid 10, the broadcast goes out the vlan 10 ports; ether1 also has pvid 10 and is untagged on egress, so the DHCP server on the untagged vlan will receive the broadcast.
I don’t think the bridge would have to have pvid 10, unless ether1 was tagged for vlan 10 and the DHCP server was on vlan 10 (perhaps connected to an access port for vlan 10 on a switch connected via a trunk port to the trunk port on hEX ether1).
I also don’t understand why the bridge needs to be connected to vlans other than 10, as long as the hEX is just acting as a switch with a management interface on vlan 10. But I am ROS noob, so I am willing to learn why it would be needed. At least that’s the way I understand the VLAN Example - Trunk and Access Ports section of Bridge VLAN Filtering

@rkrisi starting from v7.1rc5 bridge vlan filtering is done using the switch chip in MT7621 devices, from the 7.1rc5 changelog:
*) bridge - added HW offload support for vlan-filtering on MT7621 switch chip (hEX, hEX S, RBM33G, RBM11G, LtAP);
Prior to that release (including RouterOS 6) bridge vlan filtering was done by the CPU.
So the same config might not work anymore.
I have not messed with VLANs on these devices yet, but the notes from above, plus some extra info from when support for RTL8367 was added:
http://forum.mikrotik.com/t/v7-1rc1-development-is-released/151250/1
& these guides:
https://help.mikrotik.com/docs/display/ROS/Bridge#Bridge-VLANExample-TrunkandAccessPorts
https://help.mikrotik.com/docs/display/ROS/Bridge#Bridge-Managementaccessconfiguration
Should help you figure it out.

Thanks! I will try this!

Anyway, safe mode does not seem to help much in these cases, because the connection is usually interrupted, which causes the router to revert the changes even though they are correct (you just need to wait a few seconds).

Thank you!

Thanks! This explains why the config breaks after updating. I will look into this soon.

Thanks for your help!

The original configuration, which sets a PVID on the bridge-to-CPU interface to make it an access port, is absolutely fine. Others may not have realised that this is not your router and so does not require all of the VLANs trunked to the CPU.

You can test the hardware-offloaded switching hypothesis by setting hw=no on the bridge ports which will force the bridging to be handled by the CPU, as it is in v6.
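
A minimal sketch of that test, assuming the bridge is named bridge as in the export earlier in the thread. It has not been run on an RB760iGS here, so treat it as a starting point and keep MAC WinBox access available:

```
# Force software (CPU) bridging on every port of the bridge
/interface bridge port set [find bridge=bridge] hw=no
# Ports that are still hardware-offloaded show the H flag in this listing
/interface bridge port print
# To return to hardware offloading afterwards
/interface bridge port set [find bridge=bridge] hw=yes
```

If the behaviour is identical with and without the H flag, hardware offloading is probably not the cause.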

I suspect it may be a v7 bug; the latest 7.2rc4 has a changelog entry: *) bridge - fixed IP address on untagged bridge interface when vlan-filtering is enabled (introduced in v7.2rc2);

I also thought about this when scrolling through the changelog, but it says ‘introduced in 7.2rc2’, so I assume that the 7.1.3 that I tried does not have this issue?!

If it was working in v6, I would first back up, then try tdw’s suggestion to turn off hw bridging and see if that changes the behavior. If that does work, and you are not currently running the testing version 7.2rc4, you could set hw back on (or restore your backup), try updating to v7.2rc4, and see if that works.
If this worked with v6, and nothing else changed but the upgrade to v7, then I don’t think my suggestion about changing pvid to 10 on ether1 should be needed. My guess is that when you set your bridge pvid to 10, it would set the ports on router and switch to also use untagged pvid 10 for the CPU/switch RGMII link.
Since your config appears to be using tagged vlan 10 on the ether1 trunk, unless you know for sure you are using an untagged “native” vlan 10, I now don’t think you should set pvid 10 on bridge port ether1.

But thinking a bit more about this and the CPU/integrated switch ASIC RGMII link: if all tagging/untagging is being done by the CPU, then ROS must be creating a trunk link between the switch ASIC and the CPU “under the covers”. I am trying to figure out how this is done. Since it is being done by the CPU, it seems every frame received would have to be processed by the CPU even when not routing, at least all frames when vlan-filtering is on. It seems to me the only way this would be possible is by using vlans on the ASIC, one for each port, and then “double tagging” the frames sent out of the CPU for any “tagged” frame egressing a specific ethernet port.

That’s my hypothesis about how the Ubiquiti ER-X (based on the same SoC as the hEX S, the MediaTek MT7621A) “removes a port” from the switch0 interface (the rough analog of the bridge interface), but in the ER-X they have reserved vlans for each physical port. I haven’t established a hypothesis for how the hEX S provides direct access to ports when vlan-filtering is off. The ER-X has supported vlan-aware in hardware for at least the last 5 years, longer than I have used the ER-X. I am not sure how the ER-X handles the case where someone has removed a port from the switch0 interface (to “work off the bridge” in MikroTik dialect) and then created a vlan subinterface on the removed member (for example to support a trunk (aka hybrid in MikroTik dialect) connection to an ISP if IPTV was being delivered on a separate vlan).

To me “trunk” means a port carrying multiple vlans, and if one is an untagged “native” vlan, I still consider that a trunk. Terminology is a problem in networking, because each vendor has their own dialect. What HP Procurve calls a hybrid is not the same thing as what MikroTik means by hybrid. Even the term trunk is ambiguous, some vendors use this in reference to bonded “LAG”. “Link aggregation is called trunking on HP E-Series switches.” These differences in terminology can make conversations between different tribes confusing.

Certainly for the Atheros/Qualcomm switch chips & SoCs, the CPU - switch link can be programmed with an extra header that is added on egress from the switch and processed on ingress to it. Amongst other things, this indicates the port a packet was received on or is to be sent to. This can be used to multiplex port-based VLANs (not to be confused with 802.1Q VLANs), so the user is presented with ether1-5 for physical ports 1-5; it is possible to mix both port-based and 802.1Q VLANs with appropriate configuration.

If you load OpenWrt onto a supported Mikrotik device you only get one ethernet interface, for the CPU - switch link, you have no choice but to configure the switch and VLANs to separate traffic on individual ports.

Documentation for all the switch chips is difficult to find, as it is typically under NDA. I suspect that the reason the MediaTek switch does not support VLANs or filters under RouterOS v6 is the lack of support in drivers for elderly kernels and the scarcity of examples, as the chip itself does support them.

I had a little time to experiment with this.

So I turned off hw offload for all bridge ports first.

Then I tried to update to 7.1.3. Same as before; the only minor change is that the device got an IP address through the DHCP client!
But nothing else worked. Even after this, I was unable to ping the router (from where the IP address was received…).

I restarted it for the firmware to be updated. Now I was unable to get an IP address as well.

Then I updated to 7.2rc4… Same as before, nothing worked.

Reverting back to 6.49.4, everything works correctly.

The only thing which I haven’t tried (I forgot) is to change the bridge port pvid to 10.

Maybe this gives any of you some clue what might be the problem (so this might not be a hw offload problem?!)

Is the extract still accurate in this post?
I have an RB760iGS in a lab environment with v7.2rc4, and I could try loading from a clean state to see if I can reproduce what you see.
I would use my ER-X as the "router" with dhcp server.

# mar/06/2022 20:54:33 by RouterOS 7.2rc4
# model = RB760iGS

But I do need to know, is ether1 tagged on vlan 10? That is what the config has.
With 6.49.4 you say everything works correctly. What does that mean? That the bridge device gets an ip address from the dhcp server connected to vlan 10?
And the devices connected to ether4 and ether5 also get IP addresses from the DHCP server on vlan 10, and they can see each other without the need for routing? I.e. if you ping from the PC on ether5, the TTL reported by ping to the bridge is the same as the TTL when pinging the MacBook on ether4 (the TTL is set by the remote device, and depending on the OS of the device you are pinging, you will get different values: a Windows system sends ping responses with TTL 128, Linux normally sends ping responses with TTL 64), and when pinging the device on ether3 or ether2 the TTL is less than when pinging the bridge (assuming your router allows traffic from vlan10 to vlan20 and vlan30). The TTL gets decremented with each hop over a router.
When running 7.2rc4 you state: "Then I updated to 7.2rc4... Same as before, nothing worked." What does that mean? Do any of the pings work?
Edit: when you disabled hw, did it change the behavior or not? Your post doesn't say.
If we can get a reproducer, then it is much more likely to be fixed soon.
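
A sketch of output that could be collected for such a reproducer, once on 6.49.x and once on 7.x, so the two can be compared. These are standard print commands; which of them turns out to be relevant is an assumption:

```
# Configured vs. effective VLAN membership (current-tagged / current-untagged)
/interface bridge vlan print
# Per-port PVID and hardware-offload state
/interface bridge port print
# MAC addresses the bridge has learned, per port and VLAN
/interface bridge host print
# State of the DHCP client on the bridge interface
/ip dhcp-client print detail
```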

In a VLAN setup the bridge interface should never get an IP address directly. That's a no-go! IP addresses should only be assigned to the VLAN IP interfaces.
See https://administrator.de/contentid/367186 for a detailed example. Unfortunately it is in German, but the accompanying screenshots are pretty self-explanatory.



Is the extract still accurate in this post?

Yes, the same config is used. I didn’t change anything even after the upgrade; I let RouterOS migrate the config.

But I do need to know, is ether1 tagged on vlan 10? That is what the config has.

Yes. VLAN10 is tagged on ether1. I consider the link connected to ether1 (which is a direct link to my router) a trunk link. Every vlan I have on the router is sent to this switch (hEX S) tagged. The hEX S then forwards each vlan to the appropriate port and removes the vlan header, making that port an access port (I don’t really know all the MikroTik terms here; I hope what I say is correct and understandable).

And the devices connected to ether4 and ether5 also get IP addresses from the DHCP server on vlan 10, and they can see each other without the need for routing? I.e. if you ping from the PC on ether5, the TTL reported by ping to the bridge is the same as the TTL when pinging the MacBook on ether4 (the TTL is set by the remote device, and depending on the OS of the device you are pinging, you will get different values: a Windows system sends ping responses with TTL 128, Linux normally sends ping responses with TTL 64), and when pinging the device on ether3 or ether2 the TTL is less than when pinging the bridge (assuming your router allows traffic from vlan10 to vlan20 and vlan30). The TTL gets decremented with each hop over a router.

Yes, this is exactly how it works. I had not tested the TTL before, but that is also correct; I just tested it in both directions. So everything you stated in the quote above is how it works in reality (at least in v6).

When running 7.2rc4 you state: “Then I updated to 7.2rc4… Same as before, nothing worked.” What does that mean? Do any of the pings work?
Edit: when you disabled hw, did it change the behavior or not? Your post doesn’t say.
If we can get a reproducer, then it is much more likely to be fixed soon.

As far as I tested this, really nothing. Once (on the first startup of 7.1.3) the switch got an IP address from the DHCP server and I saw the default route added by the DHCP client, but then nothing else worked. I tried pinging the router from the clients and also from the hEX; no response was received (I think it showed a timeout, but I can’t remember for sure).
Disabling hw did not change anything. I tried toggling it a few times and the behaviour stayed the same; even pings did not work.
Also, after upgrading I can only access the hEX through MAC WinBox, because it does not have an IP address (even the first time, when it showed an IP address, I was unable to reach it that way).

Thanks for your help!

I can’t find any example on the linked thread where a dhcp client is used on a Mikrotik VLAN enabled switch.

What is the reason for this restriction? What won’t work if you do put an address on the base bridge interface?

In other words, is it because you are less likely to run into other problems if you don’t understand it at a low level, or because it really doesn’t work? Is it like saying “never use a red wire for ground” or “never use a black wire for VCC”, not because it won’t work, but because it goes against convention and can lead to mistakes that could be costly?

I am still learning RouterOS on my RB760iGS currently using 7.2rc4.

I haven’t tried this on the RB760iGS, but from what I have read, including the link you just posted, I haven’t seen anywhere that explains why, or what doing so breaks.

For reference, this is the part of the https://administrator.de/contentid/367186 document that says not to put an address on the base interface (I let Google translate the page for me and fixed a few spacing issues).

ATTENTION !! Important notes on IP addressing in general:

  • The dedicated routing port ether1 must NOT be a (port) member of the VLAN bridge! This port is routed directly, has the IP directly on the port and is therefore not a member of the bridge ports! That’s exactly why it doesn’t appear in the above mentioned Bridgeport member list, because it has nothing to do with the VLAN config per se!


  • Furthermore, the bridge interface itself must NOT be assigned a direct IP address. IP addresses are only to be set on ether1 and the VLAN interfaces. This is often set up in the default config or “quick set” and must be removed if you adapt an existing default config for the VLAN setup.

I wouldn’t put it in such definite words as @lfoerster did … because it is possible to use the bridge interface directly … the problem is that the configuration needed to make things work correctly lives in different places than it does for the rest of the bridge ports.

In ROS (and Linux in general, for that matter), L3 can only work over untagged (logical or physical) interfaces. So for tagged traffic, one needs a vlan interface (from /interface vlan) anchored to a tagged interface. So most IP addresses on an inter-VLAN router will be set on vlan interfaces.
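
As a sketch of what that looks like for the management address of a switch-only device: the bridge is added as a tagged member of the VLAN, and the DHCP client runs on a vlan interface on top of the bridge. The interface name vlan10-mgmt is made up for illustration; the VLAN ID and port names are taken from the export earlier in the thread. This is untested on the hEX S, so apply it over MAC WinBox or with a backup at hand:

```
# Bridge interface back to the default pvid; it no longer acts as an access port
/interface bridge set bridge pvid=1
# The bridge (CPU side) becomes a tagged member of VLAN 10
/interface bridge vlan set [find vlan-ids=10] tagged=ether1,bridge untagged=ether4,ether5
# Untagged L3 interface for VLAN 10, anchored to the bridge (name is hypothetical)
/interface vlan add interface=bridge name=vlan10-mgmt vlan-id=10
# Move the DHCP client from the bridge to the vlan interface
/ip dhcp-client remove [find interface=bridge]
/ip dhcp-client add interface=vlan10-mgmt disabled=no
```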

We all know, that we can have untagged port (e.g. etherX) configured as access port for certain VLAN … which is defined by setting property pvid= on /interface bridge port elements.

The complication comes from the fact that the bridge has two personalities (this explanation lists even more, but personally I see two principal ones): 1) a switch-like entity moving frames between member ports and 2) an interface which allows ROS software to interact with traffic passing through the bridge's switch-like entity. Entity #2 is created implicitly for every bridge we create. And most properties are configured in /interface bridge, even properties which logically belong to the interface entity. Examples of such properties are: mtu, admin-mac, pvid, frame-types, ingress-filtering, etc. (in the same configuration stanza, we define a few properties which clearly belong to the switch-like entity, such as protocol-mode, priority, igmp-snooping, ether-type, vlan-filtering, etc.).
Let me re-emphasize it: quite a few settings of items in /interface bridge are not affecting bridges as switch-like entity, they are not default settings for member ports, those settings are about bridge interface only.

And then there’s /interface bridge vlan which has to properly reference the bridge interface in tagged= and untagged= properties in exactly the same manner as other bridge ports.
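
A sketch of the untagged variant, where the bridge interface itself behaves like an access port for VLAN 10 (values taken from the export earlier in the thread). The pvid under /interface bridge applies only to the bridge interface, and the bridge is then listed under untagged= like any other access port. Whether this behaves the same on v7 as on v6 is exactly what this thread is trying to establish:

```
# pvid here affects only the bridge (CPU-facing) interface, not the member ports
/interface bridge set bridge pvid=10
# List the bridge explicitly as an untagged member of VLAN 10
/interface bridge vlan set [find vlan-ids=10] tagged=ether1 untagged=bridge,ether4,ether5
# The DHCP client stays on the bridge interface itself, as in the original export
/ip dhcp-client print
```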

Lastly, MT default configuration sets all bridge ports (bridge interface included) with pvid=1 … which, without further configuration change, makes bridge interface member of untagged LAN (as configured or rather not-configured for the rest of bridge ports, e.g. etherX).

I guess it’s the confusion about the multi-personality of bridge that makes many people think that bridge interface should never be used as untagged interface after vlan-filtering is enabled.

And now the time to contradict myself: I always advise using the bridge interface as a fully tagged interface … simply because it makes life easier when all the config is done uniformly. Personally I follow my own advice :wink: