Hi!
What can cause traffic in an L2 network to stop working after a number of bridge hops? I have a network that looks like this:
device <-ethernet-> hEX S <-fiber/VLAN-> CRS106 <-wireless bridge1/VLAN-> RB960PGS <-wireless bridge2/VLAN-> managed switch <-ethernet-> router/gw to WAN
All devices can ping their two nearest neighbors without any problems. I can also ping with large packet sizes such as 5k without problems, and latency is fine: at most 30 ms even with 5k pings.
Now if I try to ping the managed switch from my hEX S, about 99% of the packets are dropped. I see ICMP going out on the correct port on the RB960PGS but nothing comes back from the managed switch. At the same time, the CRS106 can ping the managed switch without any problems. I also tried pinging from the managed switch: it works fine to the CRS106, but the hEX S is hard to reach. I have no idea what can cause this. It also feels unsteady and not completely consistent; it can work better for a short time and then go back to dropping almost all pings. But I never have any problems between a device and its two nearest neighbors, in any combination.
I don’t have a clue where to start looking.
It is as if the signals were analog and had traveled too far and been amplified too many times to be interpreted anymore… But I know that is not how digital signals work, so that is not the answer :)
Start with the configurations of each of the devices. Export each configuration and post it here so we have a clue what you have done to break it.
To export and paste your configuration (and I’m assuming you are using WebFig or Winbox), open a terminal window, and type (without the quotes) “/export hide-sensitive file=any-filename-you-wish”. Then open the files section and right click on the filename you created and select download in order to download the file to your computer. It will be a text file with whatever name you saved to with an extension of .rsc. Suggest you then open the .rsc file in your favorite text editor and redact any sensitive information. Then in your message here, click the code display icon in the toolbar above the text entry (the code display icon is the 7th one from the left and looks like a square with a blob in the middle). Then paste the text from the file in between the two code words in brackets.
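As a rough sketch of the export step described above, this is what would be typed in the terminal. The file name hex-s-config is only a placeholder. As far as I know, newer RouterOS v7 releases hide sensitive values by default, so treat the hide-sensitive flag as something that may or may not be needed depending on your version.

```
# Export the configuration without passwords/keys to a file named hex-s-config.rsc
# (hex-s-config is a placeholder name, pick anything you like)
/export hide-sensitive file=hex-s-config
# Confirm the file was created before downloading it from the Files section
/file print
```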
Hi!
I can certainly try to “clean” the configs and post them here, but I thought I would first try to get some hints about what I should look at more closely.
I'm using UBNT GigaBeam and airMAX 5AC radios for the bridges; maybe not my favorites, but UBNT was selected by our IT department long before I got involved, hehe.
I had tried to understand the ARP table and did not get it right away, but now I think I have a clue about what is going on…
I run several tagged networks and one untagged network from the CRS106 to the hEX S. I do not use bridge VLAN filtering; instead I use VLAN interfaces attached to separate bridges, so each VLAN is bridged into its own network. I do this because I think it is the easiest way, but I know it is neither the fastest nor the recommended way…
Normally I do not use an untagged network when I do this, but the CRS106 gave me trouble: it dropped traffic from my primary network when I carried it as a tagged network this way, so I switched that network over to untagged to get HW offloading to the switch chip. (It dropped traffic even at almost no load; I have an earlier post here about that. I guess it might not have the horsepower to do it correctly, or data between the switch chip and the CPU is not prioritized correctly, because this is a “switch” and should not have much data going that way…)
But now back to the problem. The network that is giving me trouble now is a tagged one, and when I examined the ARP table I saw that IPs from this network are not only on the VLAN interface bridge port, where I expected them, but also directly on the fiber bridge port that is the hybrid trunk port between the switches. This must be because the bridge that I want to handle the untagged network has no idea it should not handle the tagged frames, since VLAN filtering is not enabled. The tagged IPs then probably “spill” into this untagged network bridge and confuse the devices. That is my theory right now.
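For anyone wanting to check the same thing, a minimal sketch of how the ARP and bridge host tables could be inspected. The names sfp1, vlan601-bridge and bridge-untagged are hypothetical placeholders, not taken from the actual config; substitute the real interface and bridge names.

```
# ARP entries grouped by the interface they were learned on
# (interface names below are placeholders)
/ip arp print where interface=bridge-untagged
/ip arp print where interface=vlan601-bridge
# MAC addresses learned by the untagged bridge, including the port each was seen on
/interface bridge host print where bridge=bridge-untagged
```

If addresses from the tagged subnet show up under the untagged bridge, that would be consistent with the leak theory above, though it does not prove it on its own.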
I wonder if I can just enable VLAN filtering, maybe with ingress filtering too, on this bridge used for the untagged network, so that it stops caring about tagged frames and lets the VLAN interfaces alone handle them?
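A rough sketch of what that change might look like, assuming the untagged bridge is called bridge-untagged and the trunk port is sfp1 (both names are placeholders). Whether the VLAN interfaces sitting on the same physical port keep receiving their tagged frames afterwards is something to verify on the device, and it is probably wise to try this in Safe Mode so a mistake does not lock you out.

```
# Make the untagged bridge VLAN-aware (bridge-untagged is a placeholder name)
/interface bridge set [find name=bridge-untagged] vlan-filtering=yes
# On the trunk-side bridge port, check ingress frames and accept only untagged ones
# (sfp1 is a placeholder; adjust to the real port)
/interface bridge port set [find interface=sfp1] ingress-filtering=yes frame-types=admit-only-untagged-and-priority-tagged
```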
For example this:
I don’t think I should find this IP subnet directly on sfp1 since it should only be sent tagged on 601…
And I can’t explain everything fully with this yet. I did see ICMP pings from two hosts going out the correct interface towards the managed switch, but it only replied to one of the hosts, the closest one. I don’t think I would see them going out on that correct interface if the ARP tables were so messed up that packets don’t make it through.
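One way to repeat that observation is to run a ping in one terminal and watch ICMP on the outgoing interface in another. This is only a sketch: 192.0.2.10 is a documentation address standing in for the managed switch, and ether1 is a placeholder for the port facing the next hop.

```
# Terminal 1: send a fixed number of pings towards the managed switch
# (192.0.2.10 is a placeholder address)
/ping 192.0.2.10 count=100
# Terminal 2: show ICMP packets passing the interface towards the next hop
# (ether1 is a placeholder interface name)
/tool sniffer quick interface=ether1 ip-protocol=icmp
```

Seeing echo requests leave but no echo replies arrive on the same interface points at the return path rather than the forward path.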
The airMAX radios in bridge mode will by default pass all tagged VLANs and provide management access to the radios untagged. If you are seeing unexpected VLANs that will be down to your switch configurations.
There is a long-standing bug in airMAX radios operating in point-to-multipoint (even if there is only a single station) when VRRP is used on the network which will cause very similar symptoms to what you are seeing.
Yes, I know (you can activate tagged management too, without going into the advanced network setup and manually creating bridges for that).
Aha, lucky I don’t use VRRP then. Their software development is not the best; see how they handled their DFS issues with later firmware (where they introduced a hypersensitive DFS algorithm because authorities in different EU countries had remarks on how insensitive the old one was, and they even got a short sales ban in the UK).
I activated VLAN filtering, and ingress filtering too just to be sure, on the hEX S bridge that I want the untagged traffic on. That seems to work; I no longer see those IPs on the untagged/physical interface. When I do the same on the CRS106, the SFP ports of the untagged bridge lose their H status and I immediately start to get lots of packet drops. I guess that is because it does not support hardware-accelerated bridge VLAN filtering, so the switch VLAN config has to be used.
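The H status mentioned above can be checked before and after the change with something like the following sketch (sfp1 is a placeholder port name). As far as I understand the MikroTik documentation, the CRS1xx/2xx series does not offload bridge VLAN filtering to the switch chip, which would match the flag disappearing.

```
# Hardware-offloaded ports are listed with the H flag
/interface bridge port print
# More detail for a single port (sfp1 is a placeholder)
/interface bridge port print detail where interface=sfp1
```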
I will do some more testing on another CRS106 to see if I can figure out why it dislikes handling traffic through the CPU so much…
I did some testing, converting the config to bridge VLANs, but this device can’t handle that in hardware either, so the problem with dropped packets persisted. I know the docs say to use the switch config, but I just wanted to know if I could still do it using bridge VLAN handling instead of bridging VLAN interfaces. I have no problem doing things without hardware acceleration, since total throughput is very low, but I can’t have huge amounts of dropped packets even at kbit/s rates. So I guess I have to learn how to use the switch config now to get everything done in hardware.
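For readers unfamiliar with the bridge VLAN approach mentioned above, here is a minimal sketch of one VLAN-aware bridge replacing the per-VLAN bridges. It is not the poster's actual config: bridge1, sfp1 and ether2 are placeholder names, and only VLAN 601 comes from the thread. On a CRS106 this would presumably still be handled by the CPU, as described above.

```
# One VLAN-aware bridge; filtering stays off until the VLAN table is complete
/interface bridge add name=bridge1 vlan-filtering=no
# sfp1 is the trunk port, ether2 an access port placed in VLAN 601
/interface bridge port add bridge=bridge1 interface=sfp1
/interface bridge port add bridge=bridge1 interface=ether2 pvid=601
# VLAN 601 is tagged on the trunk and untagged on the access port
/interface bridge vlan add bridge=bridge1 vlan-ids=601 tagged=sfp1 untagged=ether2
# Enable filtering last, ideally in Safe Mode
/interface bridge set bridge1 vlan-filtering=yes
```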