Hello,
Feel free to ask questions about the issue if I have forgotten to provide any important information.
Thank you in advance for your help and suggestions.
Infrastructure:
I have two switches configured with two MLAG bonds.
- Switch A (SA): CRS312-4C+8XG - version: 7.19.1 (stable)
- Switch B (SB): CRS326-24S+2Q+ - version: 7.20.4 (stable)
Two Ubuntu servers act as client A (CA) and client B (CB).
Peer link: SA ether1 <-> SB sfp-sfpplus1
CA is connected to both switches: SA ether3 and SB sfp-sfpplus2 (client-bond)
CB is connected to both switches: SA ether4 and SB sfp-sfpplus10 (client-bond2)
I disabled hardware offload on the LACP ports of the switches (ether3, ether4, sfp-sfpplus2, sfp-sfpplus10) so that I can see the ARP requests.
What I expect:
When I disconnect the cable between CA and SA, which is the cable used by my ping from CB to CA, the ping should continue seamlessly.
What I get:
When I disconnect the cable between CA and SA, my ping is interrupted and I get a timeout. The ARP Requests sent by CB for CA do reach CA, but the ARP Reply cannot find a path back.
With reference to the diagram in the infrastructure section, CB sends an ARP Request on the CB→SA link. SA then forwards this request over the peer link SA→SB. SB broadcasts the request on its bridge and it enters the SB→CA bond. The CA server replies with an ARP Reply over the CA→SB link. In its Forwarding Database, SB sees the MAC address of CB’s LACP interface on its local port. Therefore, it forwards the ARP Reply from the CA–SB bond to the local CB–SB bond, but SB drops the packet before sending it out on the physical interface of the CB–SB bond.
On the servers CA and CB, I use tcpdump to inspect the traffic.
On the switches SA and SB, I use the sniffer tool: /tool/sniffer/quick mac-protocol=arp interface=ether1,client-bond,client-bond2,ether3,ether4,bridge1
I have already tried setting the Spanning Tree protocol to None, but nothing changed.
Here is the output I get from the sniffer on SB. We can see that the ARP Reply is dropped after the client-bond2 logical interface and never reaches the physical interface sfp-sfpplus10:
INTERFACE     TIME   NUM  DIR  SRC-MAC            DST-MAC            VLAN  SRC-ADDRESS                          PROTOCOL  SIZE  CPU
client-bond   0.361    3  ->   D4:AE:52:92:ED:B7  FF:FF:FF:FF:FF:FF        192.168.42.2: who has 192.168.42.1?  arp         60    0
sfp-sfpplus2  0.361    5  ->   D4:AE:52:92:ED:B7  FF:FF:FF:FF:FF:FF        192.168.42.2: who has 192.168.42.1?  arp         60    0
bridge1       0.361    1  <-   D4:AE:52:92:ED:B7  FF:FF:FF:FF:FF:FF  42    192.168.42.2: who has 192.168.42.1?  arp         64    0
sfp-sfpplus2  0.361    6  <-   D4:AE:52:92:ED:BA  D4:AE:52:92:ED:B7        192.168.42.1: at D4:AE:52:92:ED:BA   arp         60    0
client-bond   0.361    4  <-   D4:AE:52:92:ED:BA  D4:AE:52:92:ED:B7        192.168.42.1: at D4:AE:52:92:ED:BA   arp         60    0
client-bond2  0.361    2  ->   D4:AE:52:92:ED:BA  D4:AE:52:92:ED:B7        192.168.42.1: at D4:AE:52:92:ED:BA   arp         60    0
sfp-sfpplus1  1.385   13  <-   D4:AE:52:92:ED:B7  FF:FF:FF:FF:FF:FF  42    192.168.42.2: who has 192.168.42.1?  arp         64    0
client-bond   1.385    9  ->   D4:AE:52:92:ED:B7  FF:FF:FF:FF:FF:FF        192.168.42.2: who has 192.168.42.1?  arp         60    0
sfp-sfpplus2  1.385   11  ->   D4:AE:52:92:ED:B7  FF:FF:FF:FF:FF:FF        192.168.42.2: who has 192.168.42.1?  arp         60    0
bridge1       1.385    8  <-   D4:AE:52:92:ED:B7  FF:FF:FF:FF:FF:FF  42    192.168.42.2: who has 192.168.42.1?  arp         64    0
sfp-sfpplus2  1.385   12  <-   D4:AE:52:92:ED:BA  D4:AE:52:92:ED:B7        192.168.42.1: at D4:AE:52:92:ED:BA   arp         60    0
client-bond   1.385   10  <-   D4:AE:52:92:ED:BA  D4:AE:52:92:ED:B7        192.168.42.1: at D4:AE:52:92:ED:BA   arp         60    0
client-bond2  1.385   14  ->   D4:AE:52:92:ED:BA  D4:AE:52:92:ED:B7        192.168.42.1: at D4:AE:52:92:ED:BA   arp         60    0
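To make the observation above easier to check, here is a minimal Python sketch that lists where the ARP Reply was seen leaving SB. The rows are copied from the 1.385 capture above, and the expected egress path (client-bond2, then its member port sfp-sfpplus10) comes from the bonding configuration in this post. It assumes sfp-sfpplus10 was among the interfaces being sniffed on SB, and the function names are only illustrative.

```python
# Rows from the 1.385 capture, reduced to (interface, direction, src_mac, dst_mac).
CA_MAC = "D4:AE:52:92:ED:BA"  # 192.168.42.1
CB_MAC = "D4:AE:52:92:ED:B7"  # 192.168.42.2
BROADCAST = "FF:FF:FF:FF:FF:FF"

CAPTURE = [
    ("sfp-sfpplus1", "<-", CB_MAC, BROADCAST),
    ("client-bond", "->", CB_MAC, BROADCAST),
    ("sfp-sfpplus2", "->", CB_MAC, BROADCAST),
    ("bridge1", "<-", CB_MAC, BROADCAST),
    ("sfp-sfpplus2", "<-", CA_MAC, CB_MAC),
    ("client-bond", "<-", CA_MAC, CB_MAC),
    ("client-bond2", "->", CA_MAC, CB_MAC),
]

# Assumption: the reply should leave through the bond and then its member port
# (client-bond2 has slaves=sfp-sfpplus10 on SB).
EXPECTED_TX = ["client-bond2", "sfp-sfpplus10"]


def tx_interfaces(capture, src, dst):
    """Interfaces on which a frame from src to dst was seen being transmitted."""
    return [iface for iface, direction, s, d in capture
            if direction == "->" and s == src and d == dst]


def missing_hops(capture, src, dst, expected):
    """Expected egress interfaces on which the frame was never seen."""
    seen = set(tx_interfaces(capture, src, dst))
    return [iface for iface in expected if iface not in seen]


if __name__ == "__main__":
    print("reply transmitted on:", tx_interfaces(CAPTURE, CA_MAC, CB_MAC))
    print("reply never seen on:", missing_hops(CAPTURE, CA_MAC, CB_MAC, EXPECTED_TX))
```

If sfp-sfpplus10 was not in the sniffer's interface list, the missing hop only means the port was not captured, not that the frame was dropped there.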
The forwarding database of each bridge after I disconnected the physical interface CA→SA (ether3):
SA:
[admin@peer1] > /interface/bridge/host/print
Flags: D - DYNAMIC; L - LOCAL; E - EXTERNAL
Columns: MAC-ADDRESS, VID, ON-INTERFACE, BRIDGE
#      MAC-ADDRESS        VID  ON-INTERFACE  BRIDGE
0 DL   2C:C8:1B:9B:15:A5       client-bond2  bridge1
1 D E  F4:1E:57:88:74:91       ether1        bridge1
2 D E  F4:1E:57:88:74:99       client-bond2  bridge1
3 D E  F4:1E:57:88:74:91    1  ether1        bridge1
4 DL   2C:C8:1B:9B:15:A2   42  ether1        bridge1
5 DL   2C:C8:1B:9B:15:A5   42  client-bond2  bridge1
6 D E  D4:AE:52:92:ED:B7   42  client-bond2  bridge1
7 D E  D4:AE:52:92:ED:BA   42  ether1        bridge1
8 D E  F4:1E:57:88:74:91   42  ether1        bridge1
9 D E  F4:1E:57:88:74:99   42  client-bond2  bridge1
SB:
[admin@peer2] > /interface/bridge/host/print
Flags: D - DYNAMIC; L - LOCAL; E - EXTERNAL
Columns: MAC-ADDRESS, VID, ON-INTERFACE, BRIDGE
#      MAC-ADDRESS        VID  ON-INTERFACE  BRIDGE
0 DL   F4:1E:57:88:74:91       client-bond   bridge1
1 DL   F4:1E:57:88:74:99       client-bond2  bridge1
2 D E  D4:AE:52:92:ED:B7   42  client-bond2  bridge1
3 D E  D4:AE:52:92:ED:BA   42  client-bond   bridge1
4 DL   F4:1E:57:88:74:90   42  sfp-sfpplus1  bridge1
5 DL   F4:1E:57:88:74:91   42  client-bond   bridge1
6 DL   F4:1E:57:88:74:99   42  client-bond2  bridge1
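Read as a lookup keyed by (VLAN, MAC), the two tables above give the picture sketched below. This is a simplified Python model of a plain learning-bridge lookup that uses only the VLAN 42 client entries. It does not model any MLAG-specific forwarding rules in RouterOS, and the names are illustrative.

```python
CA_MAC = "D4:AE:52:92:ED:BA"  # 192.168.42.1
CB_MAC = "D4:AE:52:92:ED:B7"  # 192.168.42.2

# Client entries copied from the forwarding tables of SA and SB (VID 42).
SA_FDB = {
    (42, CB_MAC): "client-bond2",
    (42, CA_MAC): "ether1",  # peer link, since ether3 is disconnected
}
SB_FDB = {
    (42, CB_MAC): "client-bond2",
    (42, CA_MAC): "client-bond",
}


def egress(fdb, vlan, dst_mac):
    """Port a known unicast frame is sent to, or None if it would be flooded."""
    return fdb.get((vlan, dst_mac.upper()))


if __name__ == "__main__":
    # The ARP Reply is a unicast frame from CA to CB, received by SB on client-bond.
    print("SB sends the reply for CB to:", egress(SB_FDB, 42, CB_MAC))
    # For comparison, unicast traffic for CA arriving on SA goes over the peer link.
    print("SA sends frames for CA to:", egress(SA_FDB, 42, CA_MAC))
```

Under this simplified model, SB sends the ARP Reply for CB out of client-bond2, which matches what the sniffer shows on the logical interface.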
My configuration:
SA:
[admin@peer1] > /interface/bridge/print
Flags: D - dynamic; X - disabled, R - running
0 R name="bridge1" mtu=auto actual-mtu=1500 l2mtu=1584 arp=enabled arp-timeout=auto mac-address=2C:C8:1B:9B:15:A2 protocol-mode=rstp fast-forward=yes igmp-snooping=no auto-mac=yes
ageing-time=5m priority=0x9000 max-message-age=20s forward-delay=15s transmit-hold-count=6 vlan-filtering=yes ether-type=0x8100 pvid=1 frame-types=admit-all ingress-filtering=yes
dhcp-snooping=no port-cost-mode=long mvrp=no max-learned-entries=auto
[admin@peer1] > /interface/bridge/port/print detail
Flags: X - disabled, I - inactive; D - dynamic; H - hw-offload
0 H interface=ether1 bridge=bridge1 priority=0x80 edge=auto point-to-point=auto learn=auto horizon=none hw=yes auto-isolate=no restricted-role=no restricted-tcn=no pvid=1
frame-types=admit-all ingress-filtering=yes unknown-unicast-flood=yes unknown-multicast-flood=yes broadcast-flood=yes tag-stacking=no bpdu-guard=no trusted=no
mvrp-registrar-state=normal mvrp-applicant-state=normal-participant multicast-router=temporary-query fast-leave=no
1 interface=client-bond bridge=bridge1 priority=0x80 edge=auto point-to-point=auto learn=auto horizon=none hw=no auto-isolate=no restricted-role=no restricted-tcn=no pvid=42
frame-types=admit-all ingress-filtering=yes unknown-unicast-flood=yes unknown-multicast-flood=yes broadcast-flood=yes tag-stacking=no bpdu-guard=no trusted=no
mvrp-registrar-state=normal mvrp-applicant-state=normal-participant multicast-router=temporary-query fast-leave=no
2 interface=client-bond2 bridge=bridge1 priority=0x80 edge=auto point-to-point=auto learn=auto horizon=none hw=no auto-isolate=no restricted-role=no restricted-tcn=no pvid=42
frame-types=admit-all ingress-filtering=yes unknown-unicast-flood=yes unknown-multicast-flood=yes broadcast-flood=yes tag-stacking=no bpdu-guard=no trusted=no
mvrp-registrar-state=normal mvrp-applicant-state=normal-participant multicast-router=temporary-query fast-leave=no
[admin@peer1] > /interface/bridge/vlan/print detail
Flags: X - disabled, D - dynamic
0 bridge=bridge1 vlan-ids=42 tagged=ether1 untagged=client-bond,client-bond2 mvrp-forbidden="" current-tagged=ether1 current-untagged=client-bond2,client-bond
1 D ;;; added by pvid
bridge=bridge1 vlan-ids=1 tagged="" untagged=bridge1,ether1 mvrp-forbidden="" current-tagged="" current-untagged=bridge1,ether1
[admin@peer1] > /interface/bridge/mlag monitor
status: connected
system-id: 2C:C8:1B:9B:15:A2
active-role: primary
[admin@peer1] > /interface/bonding/print detail
Flags: X - disabled; R - running
0 name="client-bond" mtu=1500 mac-address=2C:C8:1B:9B:15:A4 arp=enabled arp-timeout=auto slaves=ether3 mode=802.3ad primary=none link-monitoring=mii arp-interval=100ms arp-ip-targets=">
mii-interval=100ms down-delay=0ms up-delay=0ms lacp-rate=1sec transmit-hash-policy=layer-2 min-links=0 mlag-id=10 lacp-mode=active
1 R name="client-bond2" mtu=1500 mac-address=2C:C8:1B:9B:15:A5 arp=enabled arp-timeout=auto slaves=ether4 mode=802.3ad primary=none link-monitoring=mii arp-interval=100ms
arp-ip-targets="" mii-interval=100ms down-delay=0ms up-delay=0ms lacp-rate=1sec transmit-hash-policy=layer-2 min-links=0 mlag-id=20 lacp-mode=active
SB:
[admin@peer2] > /interface/bridge/print
Flags: D - dynamic; X - disabled, R - running
0 R name="bridge1" mtu=auto actual-mtu=1500 l2mtu=1584 arp=enabled arp-timeout=auto mac-address=F4:1E:57:88:74:90 protocol-mode=rstp fast-forward=yes igmp-snooping=no auto-mac=yes
ageing-time=5m priority=0x9000 max-message-age=20s forward-delay=15s transmit-hold-count=6 vlan-filtering=yes ether-type=0x8100 pvid=1 frame-types=admit-all ingress-filtering=yes
dhcp-snooping=no port-cost-mode=long mvrp=no max-learned-entries=auto
[admin@peer2] > /interface/bridge/port/print detail
Flags: X - disabled, I - inactive; D - dynamic; H - hw-offload
0 H interface=sfp-sfpplus1 bridge=bridge1 priority=0x80 edge=auto point-to-point=auto learn=auto horizon=none hw=yes auto-isolate=no restricted-role=no restricted-tcn=no pvid=1
frame-types=admit-all ingress-filtering=yes unknown-unicast-flood=yes unknown-multicast-flood=yes broadcast-flood=yes tag-stacking=no bpdu-guard=no trusted=no
mvrp-registrar-state=normal mvrp-applicant-state=normal-participant multicast-router=temporary-query fast-leave=no
1 interface=client-bond bridge=bridge1 priority=0x80 edge=auto point-to-point=auto learn=auto horizon=none hw=no auto-isolate=no restricted-role=no restricted-tcn=no pvid=42
frame-types=admit-all ingress-filtering=no unknown-unicast-flood=yes unknown-multicast-flood=yes broadcast-flood=yes tag-stacking=no bpdu-guard=no trusted=no
mvrp-registrar-state=normal mvrp-applicant-state=normal-participant multicast-router=temporary-query fast-leave=no
2 interface=client-bond2 bridge=bridge1 priority=0x80 edge=auto point-to-point=auto learn=auto horizon=none hw=no auto-isolate=no restricted-role=no restricted-tcn=no pvid=42
frame-types=admit-all ingress-filtering=no unknown-unicast-flood=yes unknown-multicast-flood=yes broadcast-flood=yes tag-stacking=no bpdu-guard=no trusted=no
mvrp-registrar-state=normal mvrp-applicant-state=normal-participant multicast-router=temporary-query fast-leave=no
[admin@peer2] > /interface/bridge/vlan/print detail
Flags: X - disabled, D - dynamic
0 bridge=bridge1 vlan-ids=42 tagged=sfp-sfpplus1 untagged=client-bond,client-bond2 mvrp-forbidden="" current-tagged=sfp-sfpplus1 current-untagged=client-bond,client-bond2
1 D ;;; added by pvid
bridge=bridge1 vlan-ids=1 tagged="" untagged=bridge1,sfp-sfpplus1 mvrp-forbidden="" current-tagged="" current-untagged=bridge1,sfp-sfpplus1
2 D ;;; added by switch-cpu
bridge=bridge1 vlan-ids=42 tagged=bridge1 untagged="" mvrp-forbidden="" current-tagged=bridge1 current-untagged=""
[admin@peer2] > /interface/bridge/mlag monitor
status: connected
system-id: 2C:C8:1B:9B:15:A2
active-role: secondary
[admin@peer2] > /interface/bonding/print detail
Flags: X - disabled; R - running
0 R name="client-bond" mtu=1500 mac-address=F4:1E:57:88:74:91 arp=enabled arp-timeout=auto slaves=sfp-sfpplus2 mode=802.3ad primary=none link-monitoring=mii arp-interval=100ms
arp-ip-targets="" mii-interval=100ms down-delay=0ms up-delay=0ms lacp-rate=1sec transmit-hash-policy=layer-2 min-links=0 mlag-id=10 lacp-mode=active
1 R name="client-bond2" mtu=1500 mac-address=F4:1E:57:88:74:99 arp=enabled arp-timeout=auto slaves=sfp-sfpplus10 mode=802.3ad primary=none link-monitoring=mii arp-interval=100ms
arp-ip-targets="" mii-interval=100ms down-delay=0ms up-delay=0ms lacp-rate=1sec transmit-hash-policy=layer-2 min-links=0 mlag-id=20 lacp-mode=active
Steps to reproduce the problem:
Configure an MLAG peer link between two switches.
On each switch, create two LACP bonds.
Initiate a ping from CB to CA. Make sure that the ping passes over the CB → SA and SA → CA links.
Disconnect or disable the physical interface for SA → CA.
You will see the ARP Reply dropped on SB between the logical bond interface and the physical interface.
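To measure how long the interruption lasts in the last step, one option is to save the ping output on CB and look for gaps in icmp_seq. The sketch below assumes the usual Linux (iputils) ping output format, and the log file name in the usage note is chosen only for illustration.

```python
import re
import sys

# Matches only successful reply lines, e.g.
# "64 bytes from 192.168.42.1: icmp_seq=7 ttl=64 time=0.21 ms".
# "Destination Host Unreachable" lines are deliberately not matched.
REPLY_RE = re.compile(r"bytes from .*icmp_seq=(\d+)")


def missing_sequences(lines):
    """Return the icmp_seq numbers between the first and last reply that got no reply."""
    seen = set()
    for line in lines:
        match = REPLY_RE.search(line)
        if match:
            seen.add(int(match.group(1)))
    if not seen:
        return []
    # Losses after the last received reply cannot be detected this way.
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as log:
        lost = missing_sequences(log)
    print("lost icmp_seq:", lost if lost else "none")
```

For example, run `ping 192.168.42.1 | tee ping.log` on CB, pull the cable, reconnect it, stop the ping, and then run the script with ping.log as its argument.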
Thank you for taking the time to read this,
Erwan DUFOUR, Withings
