Possible bridge leak problem

Hi folks,

I’m looking for some advice and thoughts. Recently I’ve been having a problem with this bridge setup: multiple VLANs on a couple of interfaces plus one plain ethernet interface are ports of a single bridge. Those VLANs terminate at Access Points, where they are bridged to wlan interfaces. The bridge ports belong to the same horizon, so traffic between them is not passed at L2. Most of the time everything works fine, but recently this started happening from time to time: all bridge ports start to transmit exactly the same traffic (~6-8 Mbps), saturating the ether interface that the VLANs belong to. I then disable a few bridge ports for a couple of seconds and everything returns to normal.

Here is the graphical representation of problem happening:
Clipboard020555.png
The VLANs on ether1, plain ether2 and the VLANs on ether3 are bridge1 ports.

Torching various interfaces/bridge ports doesn’t reveal anything obvious, only lots of ordinary connections/traffic, so it’s difficult to spot any oddities. There are a few VLANs that aren’t used at the moment, so there should be no traffic on them, IMO. But torching these unused VLANs (which are ports of bridge1) shows connections to/from IPs that clearly should originate from other ports, and they change constantly, which suggests some sort of leaked traffic. I then tried to torch a plain VLAN (not bridged to anything) on the AP side, and it still receives traffic it shouldn’t.

Here’s another screenshot:
Clipboard016996.png
Above is a screenshot of the normal state, when everything works OK. Notice how the selected line is sending 220 kbps somewhere. It’s not broadcast traffic. According to torch, the amount seen in the screenshot above is the sum of ordinary connections such as http/https to facebook or youtube, etc.

Can anyone explain why that port emits traffic when it shouldn’t?

Sorry if I was unclear at certain spots. Please ask, and I shall fill the missing info. Thanks!

I have observed similar behavior several times. A few days ago I observed an RB2011iLS with 6.29.1 which was flooding all the bridge interfaces with an equal amount of traffic. I discovered that some customers’ MAC addresses were missing from the bridge MAC address table. I was not able to find these MAC addresses in Winbox’s Bridge/Hosts listing. Since in my experience Winbox sometimes shows an incomplete MAC address table, I tried '/int bridge host print where mac-address="xx:yy:…"', but the addresses simply were not in the bridge host table.
When the bridge has to send a packet to a MAC address it knows nothing about, it floods the packet to all interfaces in the bridge (except the one the packet came from). Usually the host with that MAC address then sends a packet back and the bridge adds a record to the MAC address table, so next time it knows which port the MAC is behind. But in this case ROS is not able to add the MAC to the MAC address table (because of some bug), so it keeps flooding all the ports.
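
To make the learn-then-forward behaviour concrete, here is a minimal Python sketch of a learning bridge. It is only a toy model, not how RouterOS implements its bridge: the class name, the port names and the `learning` flag (used to imitate the suspected bug where the MAC never gets recorded) are all made up for illustration, and split horizon is ignored.

```python
class LearningBridge:
    """Toy learning bridge: learn source MACs, flood unknown destinations."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac, learning=True):
        """Return the list of ports the frame is sent out of."""
        if learning:
            # Normal case: remember which port the sender lives behind.
            self.fdb[src_mac] = in_port
        out_port = self.fdb.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress one.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the same port the frame came from: drop it.
            return []
        return [out_port]


# Healthy bridge: the reply from bb:bb teaches the bridge where it is.
healthy = LearningBridge(["vlan10", "vlan20", "ether2"])
first = healthy.receive("ether2", "aa:aa", "bb:bb")
healthy.receive("vlan10", "bb:bb", "aa:aa")
second = healthy.receive("ether2", "aa:aa", "bb:bb")

# Imitated bug: the reply is seen but never recorded (learning=False),
# so every later frame to bb:bb is flooded again.
buggy = LearningBridge(["vlan10", "vlan20", "ether2"])
buggy.receive("ether2", "aa:aa", "bb:bb")
buggy.receive("vlan10", "bb:bb", "aa:aa", learning=False)
still_flooded = buggy.receive("ether2", "aa:aa", "bb:bb")
```

In the healthy case only the first frame is flooded and the second goes out of a single port. In the imitated bug case `still_flooded` contains every port except the ingress one, which matches the "same traffic on all ports" symptom described above.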

I rebooted the device and problem disappeared. Later I upgraded to 6.30.4. The problem can be something related to FastPath so you can try to disable FastPath first…

Thanks, dada! It seems this is indeed the issue. I’ve checked the hosts toward which the traffic is sent and they are indeed the ones that recently went offline: the ARP entry still exists in the router, but there is no such MAC address in the bridge hosts table. And this doesn’t only apply to offline hosts. I saw a few IPs that were actually online, but their MAC addresses weren’t in the FDB table, so traffic to these IPs was flooded to all bridge ports.
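
For anyone who wants to repeat this check on a larger table, the comparison is just a set difference between the ARP table and the bridge hosts table. Below is a rough Python sketch. It assumes you have already exported both tables somehow (that part is not shown), and the function name and the sample addresses are invented for the example.

```python
def macs_missing_from_fdb(arp_table, fdb_macs):
    """Return {ip: mac} for ARP entries whose MAC is absent from the bridge host table.

    arp_table: dict mapping IP address -> MAC address (from the router's ARP table)
    fdb_macs:  iterable of MAC addresses present in the bridge hosts table
    """
    # Compare case-insensitively, since exports may differ in letter case.
    known = {mac.lower() for mac in fdb_macs}
    return {ip: mac for ip, mac in arp_table.items() if mac.lower() not in known}


# Made-up sample data, only to show the shape of the inputs.
arp_table = {
    "192.0.2.10": "AA:BB:CC:00:00:01",
    "192.0.2.11": "AA:BB:CC:00:00:02",
}
fdb_macs = ["aa:bb:cc:00:00:01"]

suspects = macs_missing_from_fdb(arp_table, fdb_macs)
```

Every IP that ends up in `suspects` is a destination the router can still resolve through ARP but the bridge cannot place on a port, so traffic to it is a flooding candidate.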

What are my options to prevent such flooding to all ports when the destination MAC address is not found in the hosts table? I suppose this is the way it works? It first has to send the traffic to all the ports to actually find the host, and the entry is then added to the hosts table for further forwarding decisions?

In my case the ARP entries were present too. Changing the ARP timeout will not help, IMHO. The bridge just ignores the MAC and doesn’t add it to the table. It must be a bug in the bridge code…

What version of ROS and what device type are you using? I have a supout file, but because I knew that support’s first response would be to upgrade to the latest ROS, I haven’t bothered to message support yet.

I wonder if this is related to the routing slowdown when the port is in a bridge.

This is happening on CCR1036. I’ve experienced this in v6.27 and then upgraded to v6.32.1. I’m not sure, but I think I’ve seen an entry saying something about “fixed bridge port leak” in some recent ROS changelog. But I cannot find it anymore. Still, it didn’t fix my problem.

Currently I’m thinking of reconfiguring the network setup and offloading VLAN segmentation to switches, but that would involve quite a lot of work and take away the nice overview of traffic flow to the APs in the ROS interface window. I would like an easier solution for this.

I can confirm that this “ghost traffic” is indeed directed to IPs whose MACs have no corresponding bridge FDB entry. Since my ARP entries are added when a DHCP lease is issued, they expire only when the DHCP lease expires. Manually removing the ARP entry from the ARP table effectively kills the flooded traffic, so I managed to fix this problem by setting the FDB entry timeout higher than the DHCP lease time (which is the same as the ARP timeout).
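
The timing argument behind this workaround can be written down as a small calculation. The Python sketch below is only an illustration under stated assumptions: the FDB entry and the ARP entry are last refreshed at the same moment, the host then goes silent, and the one-hour lease is an example value I picked. The 5-minute FDB ageing figure is the one mentioned elsewhere in this thread.

```python
def flood_window(fdb_ageing_s, arp_timeout_s):
    """Seconds during which traffic to a silent host would be flooded.

    Assumes the FDB entry and the ARP entry were last refreshed at the same
    time and that the host sends nothing afterwards. While the ARP entry
    outlives the FDB entry, the router keeps addressing frames to a MAC the
    bridge no longer knows, so those frames go out of every port.
    """
    return max(0, arp_timeout_s - fdb_ageing_s)


# FDB ageing of 5 minutes against an (assumed) 1 hour lease / ARP timeout.
short_ageing = flood_window(fdb_ageing_s=5 * 60, arp_timeout_s=60 * 60)

# FDB ageing raised above the lease time, as in the workaround above.
long_ageing = flood_window(fdb_ageing_s=2 * 60 * 60, arp_timeout_s=60 * 60)
```

With the short ageing time there is a 55-minute window in which traffic to a host that went offline can be flooded. With the FDB timeout above the lease time the window is zero. Note that this only covers hosts that went silent. It does not explain MACs of online hosts missing from the table.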

I bet 100% this is the same bug. I experienced the same thing, but I fixed it:

http://forum.mikrotik.com/t/serious-bug-casuing-network-ddos-in-routeros-v5-20-and-maybe-others-didnt-tested-yet/89336/1

Maybe check it out, there’s something wrong with the MAC mechanisms.

Well, unfortunately this is how L2 switching works and not much can be done about that, IMHO. And it’s where network engineering starts to become more difficult. You should consider reworking your network topology to be more routed than bridged if possible.

Hi,

I am wondering if anyone is seeing the problem of not properly populated mac address table (the bridge is not learning and caching for 5 minutes all the mac addresses it sees) causing traffic flood on all bridge ports (a consequence of not having a mac address to port binding)? I have a RB2011iLS (mentioned in this thread) that has this symptom on every single version I tested (e.g. 6.30.4, 6.32.4, 6.34.5, 6.35.2).

It’s using the simplest configuration possible, i.e. a single bridge with all ports added to it, plus 5 simple bridge firewall rules that filter based on ports (their presence should disable fast-path, but even with fast-path disabled manually I still see flooding).

I’m wondering if this is a hardware issue on this particular model, since I have a bunch of RB750s that do not have this symptom.

I have experienced this issue on my 2011UiAS-2HnD. In my case I observed the bridge flooded with traffic coming from WAN and intended for one of the computers connected to a LAN ethernet interface. As a result, all such traffic was mirrored to wlan.

First entry in ROS changelog for 6.39rc7 says:

*) bridge - fixed MAC address learning from switch master-port;

I’m about to upgrade and will let you know if 6.39rc7 fixes the issue.

EDIT: seems like it does.