Hi folks. On at least one of my devices, I’m seeing very odd behavior from the QCA8337 switch chip. Intermittently, traffic entering the MT router via one of the switch ports is not forwarded. It appears that the switch intermittently enters a state in which it fails to apply the proper 802.1Q tag before passing the frame to the switch-cpu port. Even in this state, the issue only affects certain frames.
With a laptop connected directly to ether5 on the MT and sending ICMP pings in untagged frames, I see the following in an MT capture (Packet Sniffer):
No. Time VLAN Source Destination Protocol Length Info
1 0.000000 172.17.30.149 172.17.20.10 ICMP 98 Echo (ping) request id=0x767a, seq=109/27904, ttl=64 (no response found!)
2 5.001437 172.17.30.149 172.17.20.10 ICMP 98 Echo (ping) request id=0x767a, seq=110/28160, ttl=64 (no response found!)
3 5.887692 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xed7a, seq=0/0, ttl=64 (no response found!)
4 6.887910 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xed7a, seq=1/256, ttl=64 (no response found!)
5 7.891628 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xed7a, seq=2/512, ttl=64 (no response found!)
6 8.896095 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xed7a, seq=3/768, ttl=64 (no response found!)
7 9.900500 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xed7a, seq=4/1024, ttl=64 (no response found!)
8 10.005374 172.17.30.149 172.17.20.10 ICMP 98 Echo (ping) request id=0x767a, seq=111/28416, ttl=64 (no response found!)
9 10.904655 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xed7a, seq=5/1280, ttl=64 (no response found!)
10 15.005530 172.17.30.149 172.17.20.10 ICMP 98 Echo (ping) request id=0x767a, seq=112/28672, ttl=64 (no response found!)
Note that the packets without 802.1Q tags (the pings to 172.17.20.10, length 98 rather than 102) are the ones being dropped. This is confirmed by a similar capture on the vlan30 interface during that time, which shows only the pings to 8.8.8.8:
No. Time VLAN Source Destination Protocol Length Info
1 0.000000 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xf17a, seq=0/0, ttl=64 (reply in 2)
2 0.006498 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xf17a, seq=0/0, ttl=112 (request in 1)
3 1.004280 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xf17a, seq=1/256, ttl=64 (reply in 4)
4 1.010528 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xf17a, seq=1/256, ttl=112 (request in 3)
5 2.009309 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xf17a, seq=2/512, ttl=64 (reply in 6)
6 2.015495 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xf17a, seq=2/512, ttl=112 (request in 5)
7 3.011402 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xf17a, seq=3/768, ttl=64 (reply in 8)
8 3.017432 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xf17a, seq=3/768, ttl=112 (request in 7)
9 4.015589 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xf17a, seq=4/1024, ttl=64 (reply in 10)
10 4.021430 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xf17a, seq=4/1024, ttl=112 (request in 9)
11 5.019150 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xf17a, seq=5/1280, ttl=64 (reply in 12)
12 5.024961 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xf17a, seq=5/1280, ttl=112 (request in 11)
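To make the tagged/untagged split in the ether5 capture easier to check, here is a rough Python sketch that parses rows in the text layout used above and groups them by whether the VLAN column is present. The function names (`parse_row`, `split_by_tag`) are my own and purely illustrative. The parsing assumes that the VLAN column, when present, is a bare integer sitting between the time and source columns.

```python
from collections import defaultdict

def parse_row(line):
    """Parse one capture row; return (vlan or None, destination, frame length)."""
    tokens = line.split()
    # tokens[0] is the packet number, tokens[1] the timestamp.
    # A bare integer in position 2 is the VLAN ID; an IP address there means untagged.
    if tokens[2].isdigit():
        vlan, rest = int(tokens[2]), tokens[3:]
    else:
        vlan, rest = None, tokens[2:]
    # rest = [source, destination, protocol, length, ...info]
    return vlan, rest[1], int(rest[3])

def split_by_tag(lines):
    """Group (destination, length) pairs by VLAN ID (None = untagged)."""
    groups = defaultdict(set)
    for line in lines:
        vlan, dst, length = parse_row(line)
        groups[vlan].add((dst, length))
    return dict(groups)

# Two rows copied from the failing ether5 capture.
rows = [
    "1 0.000000 172.17.30.149 172.17.20.10 ICMP 98 Echo (ping) request id=0x767a, seq=109/27904, ttl=64 (no response found!)",
    "3 5.887692 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xed7a, seq=0/0, ttl=64 (no response found!)",
]
result = split_by_tag(rows)
print(result)
```

For the two sample rows, the ping to 172.17.20.10 lands under `None` at length 98 and the ping to 8.8.8.8 lands under VLAN 30 at length 102, which matches what the full failing capture shows.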
However, during a period when the loss is not occurring, the ether5 traffic carries the correct tags:
No. Time VLAN Source Destination Protocol Length Info
1 0.000000 30 172.17.30.149 172.17.20.10 ICMP 102 Echo (ping) request id=0x767a, seq=690/45570, ttl=64 (no response found!)
2 5.003249 30 172.17.30.149 172.17.20.10 ICMP 102 Echo (ping) request id=0x767a, seq=691/45826, ttl=64 (no response found!)
3 7.181256 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xce7b, seq=0/0, ttl=64 (no response found!)
4 8.183193 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xce7b, seq=1/256, ttl=64 (no response found!)
5 9.185224 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xce7b, seq=2/512, ttl=64 (no response found!)
6 10.007561 30 172.17.30.149 172.17.20.10 ICMP 102 Echo (ping) request id=0x767a, seq=692/46082, ttl=64 (no response found!)
7 10.187223 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xce7b, seq=3/768, ttl=64 (no response found!)
8 11.190535 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xce7b, seq=4/1024, ttl=64 (no response found!)
9 12.191940 30 172.17.30.149 8.8.8.8 ICMP 102 Echo (ping) request id=0xce7b, seq=5/1280, ttl=64 (no response found!)
10 15.009029 30 172.17.30.149 172.17.20.10 ICMP 102 Echo (ping) request id=0x767a, seq=693/46338, ttl=64 (no response found!)
11 20.014121 30 172.17.30.149 172.17.20.10 ICMP 102 Echo (ping) request id=0x767a, seq=694/46594, ttl=64 (no response found!)
12 25.017964 30 172.17.30.149 172.17.20.10 ICMP 102 Echo (ping) request id=0x767a, seq=695/46850, ttl=64 (no response found!)
13 30.022025 30 172.17.30.149 172.17.20.10 ICMP 102 Echo (ping) request id=0x767a, seq=696/47106, ttl=64 (no response found!)
And, likewise, we see both ping destinations showing up correctly on a capture of vlan30 traffic:
No. Time VLAN Source Destination Protocol Length Info
1 0.000000 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xd27b, seq=0/0, ttl=64 (reply in 2)
2 0.006248 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xd27b, seq=0/0, ttl=112 (request in 1)
3 1.005186 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xd27b, seq=1/256, ttl=64 (reply in 4)
4 1.011153 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xd27b, seq=1/256, ttl=112 (request in 3)
5 2.009529 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xd27b, seq=2/512, ttl=64 (reply in 6)
6 2.015652 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xd27b, seq=2/512, ttl=112 (request in 5)
7 2.383880 172.17.30.149 172.17.20.10 ICMP 98 Echo (ping) request id=0x767a, seq=701/48386, ttl=64 (reply in 8)
8 2.384287 172.17.20.10 172.17.30.149 ICMP 98 Echo (ping) reply id=0x767a, seq=701/48386, ttl=63 (request in 7)
9 3.014246 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xd27b, seq=3/768, ttl=64 (reply in 10)
10 3.020182 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xd27b, seq=3/768, ttl=112 (request in 9)
11 4.018495 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xd27b, seq=4/1024, ttl=64 (reply in 12)
12 4.024680 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xd27b, seq=4/1024, ttl=112 (request in 11)
13 5.021400 172.17.30.149 8.8.8.8 ICMP 98 Echo (ping) request id=0xd27b, seq=5/1280, ttl=64 (reply in 14)
14 5.027179 8.8.8.8 172.17.30.149 ICMP 98 Echo (ping) reply id=0xd27b, seq=5/1280, ttl=112 (request in 13)
15 7.383943 172.17.30.149 172.17.20.10 ICMP 98 Echo (ping) request id=0x767a, seq=702/48642, ttl=64 (reply in 16)
16 7.384349 172.17.20.10 172.17.30.149 ICMP 98 Echo (ping) reply id=0x767a, seq=702/48642, ttl=63 (request in 15)
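For reference, the 4-byte difference between the 98-byte and 102-byte frames in the ether5 captures is exactly the size of an 802.1Q tag. The Python sketch below shows what I would expect to happen to an untagged frame arriving on a port with default-vlan-id=30: a TPID of 0x8100 plus a 16-bit TCI is inserted after the two MAC addresses. The function name `add_dot1q_tag` and the dummy frame contents are illustrative only; this is a model of the tag format, not of how the QCA8337 is actually implemented.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag

def add_dot1q_tag(frame: bytes, vlan_id: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Return a copy of an untagged Ethernet frame with an 802.1Q tag inserted."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    # TCI layout: 3 bits priority, 1 bit drop-eligible, 12 bits VLAN ID.
    tci = (pcp << 13) | (dei << 12) | vlan_id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    # The tag goes right after the destination and source MACs (12 bytes).
    return frame[:12] + tag + frame[12:]

# Dummy 98-byte frame: zeroed MACs, IPv4 EtherType, zero padding for the payload.
untagged = bytes(12) + b"\x08\x00" + bytes(84)
tagged = add_dot1q_tag(untagged, vlan_id=30)
print(len(untagged), len(tagged))   # 98 102
print(tagged[12:16].hex())          # 8100001e
```

In the failing state, the frames to 172.17.20.10 look like `untagged` here when they reach the CPU port, while the frames to 8.8.8.8 look like `tagged`.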
Relevant configuration is as follows:
# jul/12/2021 00:12:34 by RouterOS 6.47.10
# software id = KQ9B-0VZ3
#
# model = RB3011UiAS
/interface ethernet switch port
set 0 vlan-mode=secure
set 1 vlan-mode=secure
set 2 default-vlan-id=20 vlan-mode=secure
set 3 vlan-mode=secure
set 4 default-vlan-id=30 vlan-mode=secure
set 5 vlan-mode=secure
set 6 vlan-mode=secure
set 7 vlan-mode=secure
set 8 vlan-mode=secure
set 9 vlan-mode=secure
set 10 vlan-mode=secure
set 11 vlan-mode=secure
/interface ethernet switch vlan
add independent-learning=yes ports=switch1-cpu,ether1,ether2 switch=switch1 vlan-id=1
add independent-learning=yes ports=switch1-cpu,ether1,ether2,ether3 switch=switch1 vlan-id=20
add independent-learning=yes ports=switch1-cpu,ether1,ether2,ether5 switch=switch1 vlan-id=30
add independent-learning=yes ports=switch1-cpu,ether1,ether2 switch=switch1 vlan-id=40
add independent-learning=yes ports=switch1-cpu,ether1,ether2 switch=switch1 vlan-id=41
add independent-learning=yes ports=switch1-cpu,ether1,ether2 switch=switch1 vlan-id=50
add independent-learning=yes ports=switch1-cpu,ether1,ether2 switch=switch1 vlan-id=2
add independent-learning=yes ports=switch2-cpu,ether6,ether7,ether8,ether9,ether10 switch=switch2 vlan-id=1
/interface bridge
add name=lan priority=0x2000
add arp=disabled fast-forward=no name=loopback protocol-mode=none
add name=wan protocol-mode=none
/interface bridge port
add bridge=wan interface=wan-ports
add bridge=lan interface=lan-ports internal-path-cost=1000000 path-cost=1000000
/interface vlan
add interface=lan name=vlan2 vlan-id=2
add interface=lan name=vlan20 vlan-id=20
add interface=lan name=vlan30 vlan-id=30
add interface=lan name=vlan40 vlan-id=40
add interface=lan name=vlan41 vlan-id=41
add interface=lan name=vlan50 vlan-id=50
/ip address
add address=172.17.10.10/24 interface=wan network=172.17.10.0
add address=172.17.2.2/24 interface=vlan2 network=172.17.2.0
add address=172.17.0.1 interface=loopback network=172.17.0.1
add address=172.17.1.2/24 interface=lan network=172.17.1.0
add address=172.17.1.1 interface=vrrp1 network=172.17.1.1
add address=172.17.20.1 interface=vrrp20 network=172.17.20.1
add address=172.17.20.2/24 interface=vlan20 network=172.17.20.0
add address=172.17.30.2/24 interface=vlan30 network=172.17.30.0
add address=172.17.40.2/24 interface=vlan40 network=172.17.40.0
add address=172.17.41.2/24 interface=vlan41 network=172.17.41.0
add address=172.17.50.2/24 interface=vlan50 network=172.17.50.0
add address=172.17.30.1 interface=vrrp30 network=172.17.30.1
add address=172.17.40.1 interface=vrrp40 network=172.17.40.1
add address=172.17.41.1 interface=vrrp41 network=172.17.41.1
add address=172.17.50.1 interface=vrrp50 network=172.17.50.1
Any ideas? Please let me know if other captures or output would be useful in debugging this.