To whom it may concern.
I have a straightforward setup: a CSS610 connects the home network WAPs and has an upstream LAGG towards the router.
With two Ethernet links between the switch and the router, the LACP group breaks up at random intervals, many times a day.
With the LACP group reduced to a single member, the breakup becomes almost deterministic: every 5 hours, in bursts of 3 flaps.
This happens regardless of the active/passive setting on the SwOS side. SwOS does not appear to offer any other configurable options.
On the router side, the configuration is:
No fast timeout
No flowid
Hash layers: L2 only, no l3, no l4
Use strict: yes
Here is the LACP breakup behaviour as observed from the router.
# ix1 is single member of the LACP lagg group (PCIe Intel X520-DA2 NIC, 1G SFP, MM fiber cable)
2024-03-24 23:16:30 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-24 23:16:31 notice kernel: <6>lagg0: link state changed to UP
2024-03-24 23:18:00 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-24 23:18:01 notice kernel: <6>lagg0: link state changed to UP
2024-03-24 23:19:32 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-24 23:19:33 notice kernel: <6>lagg0: link state changed to UP
2024-03-25 04:16:34 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 04:16:35 notice kernel: <6>lagg0: link state changed to UP
2024-03-25 04:18:05 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 04:18:06 notice kernel: <6>lagg0: link state changed to UP
2024-03-25 04:19:35 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 04:19:36 notice kernel: <6>lagg0: link state changed to UP
2024-03-25 09:17:09 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 09:17:10 notice kernel: <6>lagg0: link state changed to UP
2024-03-25 09:18:39 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 09:18:40 notice kernel: <6>lagg0: link state changed to UP
2024-03-25 09:20:10 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 09:20:11 notice kernel: <6>lagg0: link state changed to UP
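To put numbers on the pattern, here is a minimal Python sketch that takes the nine "stopped DISTRIBUTING" lines from the log above and measures the gaps between them. The helper names (`flap_times`, `group_bursts`) and the 10-minute threshold used to separate bursts are my own assumptions for illustration, not anything FreeBSD provides.

```python
from datetime import datetime

# Flap lines copied from the router log above ("link state changed" lines omitted).
LOG = """\
2024-03-24 23:16:30 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-24 23:18:00 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-24 23:19:32 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 04:16:34 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 04:18:05 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 04:19:35 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 09:17:09 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 09:18:39 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
2024-03-25 09:20:10 notice kernel: ix1: Interface stopped DISTRIBUTING, possible flapping
"""


def flap_times(log):
    """Return the timestamps of all 'stopped DISTRIBUTING' events."""
    times = []
    for line in log.splitlines():
        if "stopped DISTRIBUTING" in line:
            # The first 19 characters hold 'YYYY-MM-DD HH:MM:SS'.
            times.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
    return times


def group_bursts(times, max_gap_s=600):
    """Group events into bursts; a gap above max_gap_s starts a new burst."""
    bursts = []
    for t in times:
        if bursts and (t - bursts[-1][-1]).total_seconds() <= max_gap_s:
            bursts[-1].append(t)
        else:
            bursts.append([t])
    return bursts


bursts = group_bursts(flap_times(LOG))

# Gaps between consecutive flaps inside each burst, in seconds.
within = [
    int((b[i + 1] - b[i]).total_seconds())
    for b in bursts
    for i in range(len(b) - 1)
]
# Spacing from the start of one burst to the start of the next, in seconds.
between = [
    int((bursts[i + 1][0] - bursts[i][0]).total_seconds())
    for i in range(len(bursts) - 1)
]

print("flaps per burst:", [len(b) for b in bursts])
print("gaps within bursts (s):", within)
print("burst-to-burst spacing (s):", between)
```

On this data it reports three bursts of three flaps, gaps of 90 to 92 s inside a burst, and a little over 18000 s (5 hours) from one burst to the next. The roughly 90 s spacing would fit three missed 30 s intervals, but that is an interpretation, not something the log states.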
Could anybody please confirm that:
SwOS Lite uses a 30 s heartbeat (HB) interval. I found nothing about this in the documentation.
SwOS Lite does NOT use LACP FAST. Again, nothing in the documentation.
Can any logs be retrieved from the SwOS about what it is busy with every 5 hours?
Can anything be done about getting LACP up and running between these two devices?
Can somebody confirm that a similar setup (SwOS lite ↔ FreeBSD) is working?
I have now investigated this further by increasing the log level on the FreeBSD side.
There is a clear pattern which seems to point toward a problem at the MikroTik end.
Looking at the LACPDU transmits, two cycles can be observed: the previously reported 5-hour cycle in which the connection breaks, and a 27-minute cycle with a near-miss break.
During these 27-minute cycles, the MikroTik box takes an ever longer time to answer a lacpdu_transmit, starting at 0-1 seconds and growing until the delay reaches 30 seconds. 30 seconds is the heartbeat interval, i.e. a delay of this duration is considered a HB miss by FreeBSD.
The 27-minute cycles do NOT break the LACP connection, as the retransmissions manage to re-sync the two devices within the 3 attempts.
Every 5 hours, however, the 3rd attempt is also missed, and FreeBSD drops and restarts the LACP session.
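A quick sanity check of these figures, as a sketch that uses only the numbers quoted above (30 s interval, 27-minute cycle, 5-hour break spacing). It assumes the delay grows roughly linearly within a cycle, which is my simplification; the variable names are made up for the example.

```python
# Figures quoted in the post.
heartbeat_s = 30       # heartbeat interval as observed on the FreeBSD side
cycle_min = 27         # length of one near-miss cycle
break_every_h = 5      # spacing of the actual LAG breaks

# Number of heartbeats that fit into one 27-minute cycle.
heartbeats_per_cycle = cycle_min * 60 / heartbeat_s

# If the delay grows from 0 s to one full interval over a cycle,
# this is the extra delay added per heartbeat (linear growth assumed).
slip_per_heartbeat_s = heartbeat_s / heartbeats_per_cycle

# Number of 27-minute cycles between two real breaks.
cycles_per_break = break_every_h * 60 / cycle_min

print(f"heartbeats per cycle: {heartbeats_per_cycle:.0f}")
print(f"extra delay per heartbeat: {slip_per_heartbeat_s:.2f} s")
print(f"cycles per break: {cycles_per_break:.1f}")
```

So the answer would have to slip by roughly half a second per heartbeat, and a real break would fall on about every 11th cycle.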
As the MikroTik end is not observable (no logs, no nothing), I have no idea what could cause the ever longer delay in answering that is observed. But until proven otherwise, I hold the switch to be responsible for the break in the protocol.
Here are the cleaned-up and rearranged logs captured today. The last column shows the ever-increasing time that the MikroTik switch takes to answer the LACPDU transmit.
The switch gets fresh again at 09:29:35 AM after an initial retransmission, but then its answering time degrades again. These 27-minute cycles go on and on, with the re-sync sometimes using all 3 attempts. Eventually, after 11 cycles, the third attempt is also missed and the LAG is destroyed.
2024-03-26 09:11:00 AM transmt 1 00:00:27
2024-03-26 09:11:13 AM receive 1 00:00:31 00:00:13
2024-03-26 09:11:30 AM transmt 1 00:00:30
2024-03-26 09:11:44 AM receive 1 00:00:31 00:00:14
2024-03-26 09:12:00 AM transmt 1 00:00:30
2024-03-26 09:12:14 AM receive 1 00:00:30 00:00:14
2024-03-26 09:12:30 AM transmt 1 00:00:30
2024-03-26 09:12:45 AM receive 1 00:00:31 00:00:15
2024-03-26 09:13:01 AM transmt 1 00:00:31
2024-03-26 09:13:15 AM receive 1 00:00:30 00:00:14
2024-03-26 09:13:31 AM transmt 1 00:00:30
2024-03-26 09:13:46 AM receive 1 00:00:31 00:00:15
2024-03-26 09:14:01 AM transmt 1 00:00:30
2024-03-26 09:14:17 AM receive 1 00:00:31 00:00:16
2024-03-26 09:14:31 AM transmt 1 00:00:30
2024-03-26 09:14:47 AM receive 1 00:00:30 00:00:16
2024-03-26 09:15:01 AM transmt 1 00:00:30
2024-03-26 09:15:18 AM receive 1 00:00:31 00:00:17
2024-03-26 09:15:31 AM transmt 1 00:00:30
2024-03-26 09:15:48 AM receive 1 00:00:30 00:00:17
2024-03-26 09:16:02 AM transmt 1 00:00:31
2024-03-26 09:16:19 AM receive 1 00:00:31 00:00:17
2024-03-26 09:16:32 AM transmt 1 00:00:30
2024-03-26 09:16:50 AM receive 1 00:00:31 00:00:18
2024-03-26 09:17:02 AM transmt 1 00:00:30
2024-03-26 09:17:20 AM receive 1 00:00:30 00:00:18
2024-03-26 09:17:32 AM transmt 1 00:00:30
2024-03-26 09:17:51 AM receive 1 00:00:31 00:00:19
2024-03-26 09:18:02 AM transmt 1 00:00:30
2024-03-26 09:18:21 AM receive 1 00:00:30 00:00:19
2024-03-26 09:18:32 AM transmt 1 00:00:30
2024-03-26 09:18:52 AM receive 1 00:00:31 00:00:20
2024-03-26 09:19:02 AM transmt 1 00:00:30
2024-03-26 09:19:22 AM receive 1 00:00:30 00:00:20
2024-03-26 09:19:32 AM transmt 1 00:00:30
2024-03-26 09:19:53 AM receive 1 00:00:31 00:00:21
2024-03-26 09:20:03 AM transmt 1 00:00:31
2024-03-26 09:20:24 AM receive 1 00:00:31 00:00:21
2024-03-26 09:20:33 AM transmt 1 00:00:30
2024-03-26 09:20:54 AM receive 1 00:00:30 00:00:21
2024-03-26 09:21:03 AM transmt 1 00:00:30
2024-03-26 09:21:25 AM receive 1 00:00:31 00:00:22
2024-03-26 09:21:33 AM transmt 1 00:00:30
2024-03-26 09:21:55 AM receive 1 00:00:30 00:00:22
2024-03-26 09:22:03 AM transmt 1 00:00:30
2024-03-26 09:22:26 AM receive 1 00:00:31 00:00:23
2024-03-26 09:22:33 AM transmt 1 00:00:30
2024-03-26 09:22:57 AM receive 1 00:00:31 00:00:24
2024-03-26 09:23:03 AM transmt 1 00:00:30
2024-03-26 09:23:27 AM receive 1 00:00:30 00:00:24
2024-03-26 09:23:33 AM transmt 1 00:00:30
2024-03-26 09:23:58 AM receive 1 00:00:31 00:00:25
2024-03-26 09:24:03 AM transmt 1 00:00:30
2024-03-26 09:24:29 AM receive 1 00:00:31 00:00:26
2024-03-26 09:24:33 AM transmt 1 00:00:30
2024-03-26 09:24:59 AM receive 1 00:00:30 00:00:26
2024-03-26 09:25:04 AM transmt 1 00:00:31
2024-03-26 09:25:30 AM receive 1 00:00:31 00:00:26
2024-03-26 09:25:34 AM transmt 1 00:00:30
2024-03-26 09:26:00 AM receive 1 00:00:30 00:00:26
2024-03-26 09:26:04 AM transmt 1 00:00:30
2024-03-26 09:26:31 AM receive 1 00:00:31 00:00:27
2024-03-26 09:26:34 AM transmt 1 00:00:30
2024-03-26 09:27:01 AM receive 1 00:00:30 00:00:27
2024-03-26 09:27:04 AM transmt 1 00:00:30
2024-03-26 09:27:32 AM receive 1 00:00:31 00:00:28
2024-03-26 09:27:34 AM transmt 1 00:00:30
2024-03-26 09:28:03 AM receive 1 00:00:31 00:00:29
2024-03-26 09:28:04 AM transmt 1 00:00:30
2024-03-26 09:28:33 AM receive 1 00:00:30 00:00:29
2024-03-26 09:28:34 AM transmt 1 00:00:30
2024-03-26 09:29:04 AM transmt 0 00:00:31
2024-03-26 09:29:04 AM receive 1 00:00:30 00:00:00
2024-03-26 09:29:35 AM transmt 1 00:00:31
2024-03-26 09:29:35 AM receive 1 00:00:31 00:00:00
2024-03-26 09:30:05 AM transmt 1 00:00:30
2024-03-26 09:30:05 AM receive 1 00:00:30 00:00:00
2024-03-26 09:30:35 AM transmt 1 00:00:30
2024-03-26 09:30:36 AM receive 1 00:00:31 00:00:01
2024-03-26 09:31:05 AM transmt 1 00:00:30
2024-03-26 09:31:07 AM receive 1 00:00:31 00:00:02
2024-03-26 09:31:35 AM transmt 1 00:00:30
2024-03-26 09:31:37 AM receive 1 00:00:30 00:00:02
2024-03-26 09:32:05 AM transmt 1 00:00:30
2024-03-26 09:32:08 AM receive 1 00:00:31 00:00:03
2024-03-26 09:32:35 AM transmt 1 00:00:30
2024-03-26 09:32:39 AM receive 1 00:00:31 00:00:04
2024-03-26 09:33:05 AM transmt 1 00:00:30
2024-03-26 09:33:10 AM receive 1 00:00:31 00:00:05
2024-03-26 09:33:36 AM transmt 1 00:00:31
2024-03-26 09:33:40 AM receive 1 00:00:30 00:00:04
2024-03-26 09:34:06 AM transmt 1 00:00:30
2024-03-26 09:34:11 AM receive 1 00:00:31 00:00:05
2024-03-26 09:34:36 AM transmt 1 00:00:30
2024-03-26 09:34:42 AM receive 1 00:00:31 00:00:06
2024-03-26 09:35:06 AM transmt 1 00:00:30
2024-03-26 09:35:12 AM receive 1 00:00:30 00:00:06
2024-03-26 09:35:36 AM transmt 1 00:00:30
2024-03-26 09:35:43 AM receive 1 00:00:31 00:00:07
2024-03-26 09:36:06 AM transmt 1 00:00:30
2024-03-26 09:36:14 AM receive 1 00:00:31 00:00:08
2024-03-26 09:36:36 AM transmt 1 00:00:30
2024-03-26 09:36:45 AM receive 1 00:00:31 00:00:09
2024-03-26 09:37:06 AM transmt 1 00:00:30
2024-03-26 09:37:15 AM receive 1 00:00:30 00:00:09
2024-03-26 09:37:36 AM transmt 1 00:00:30
2024-03-26 09:37:46 AM receive 1 00:00:31 00:00:10
2024-03-26 09:38:07 AM transmt 1 00:00:31
2024-03-26 09:38:17 AM receive 1 00:00:31 00:00:10
2024-03-26 09:38:36 AM transmt 1 00:00:29
2024-03-26 09:38:47 AM receive 1 00:00:30 00:00:11
2024-03-26 09:39:07 AM transmt 1 00:00:31
2024-03-26 09:39:18 AM receive 1 00:00:31 00:00:11
2024-03-26 09:39:37 AM transmt 1 00:00:30
2024-03-26 09:39:49 AM receive 1 00:00:31 00:00:12
2024-03-26 09:40:07 AM transmt 1 00:00:30
2024-03-26 09:40:20 AM receive 1 00:00:31 00:00:13
2024-03-26 09:40:37 AM transmt 1 00:00:30
2024-03-26 09:40:50 AM receive 1 00:00:30 00:00:13
2024-03-26 09:41:07 AM transmt 1 00:00:30
2024-03-26 09:41:21 AM receive 1 00:00:31 00:00:14
2024-03-26 09:41:37 AM transmt 1 00:00:30
2024-03-26 09:41:52 AM receive 1 00:00:31 00:00:15
2024-03-26 09:42:07 AM transmt 1 00:00:30
2024-03-26 09:42:22 AM receive 1 00:00:30 00:00:15
2024-03-26 09:42:37 AM transmt 1 00:00:30
2024-03-26 09:42:53 AM receive 1 00:00:31 00:00:16
2024-03-26 09:43:07 AM transmt 1 00:00:30
2024-03-26 09:43:24 AM receive 1 00:00:31 00:00:17
2024-03-26 09:43:38 AM transmt 1 00:00:31
2024-03-26 09:43:54 AM receive 1 00:00:30 00:00:16
2024-03-26 09:44:08 AM transmt 1 00:00:30
2024-03-26 09:44:25 AM receive 1 00:00:31 00:00:17
2024-03-26 09:44:38 AM transmt 1 00:00:30
2024-03-26 09:44:56 AM receive 1 00:00:31 00:00:18
2024-03-26 09:45:08 AM transmt 1 00:00:30
2024-03-26 09:45:26 AM receive 1 00:00:30 00:00:18
2024-03-26 09:45:38 AM transmt 1 00:00:30
2024-03-26 09:45:57 AM receive 1 00:00:31 00:00:19
2024-03-26 09:46:08 AM transmt 1 00:00:30
2024-03-26 09:46:28 AM receive 1 00:00:31 00:00:20
2024-03-26 09:46:39 AM transmt 1 00:00:31
2024-03-26 09:46:58 AM receive 1 00:00:30 00:00:19
2024-03-26 09:47:09 AM transmt 1 00:00:30
2024-03-26 09:47:29 AM receive 1 00:00:31 00:00:20
2024-03-26 09:47:39 AM transmt 1 00:00:30
2024-03-26 09:48:00 AM receive 1 00:00:31 00:00:21
2024-03-26 09:48:09 AM transmt 1 00:00:30
2024-03-26 09:48:30 AM receive 1 00:00:30 00:00:21
2024-03-26 09:48:39 AM transmt 1 00:00:30
2024-03-26 09:49:01 AM receive 1 00:00:31 00:00:22
2024-03-26 09:49:09 AM transmt 1 00:00:30
2024-03-26 09:49:32 AM receive 1 00:00:31 00:00:23
2024-03-26 09:49:39 AM transmt 1 00:00:30
2024-03-26 09:50:02 AM receive 1 00:00:30 00:00:23
2024-03-26 09:50:09 AM transmt 1 00:00:30
2024-03-26 09:50:33 AM receive 1 00:00:31 00:00:24
2024-03-26 09:50:39 AM transmt 1 00:00:30
2024-03-26 09:51:04 AM receive 1 00:00:31 00:00:25
2024-03-26 09:51:09 AM transmt 1 00:00:30
2024-03-26 09:51:34 AM receive 1 00:00:30 00:00:25
2024-03-26 09:51:39 AM transmt 1 00:00:30
2024-03-26 09:52:05 AM receive 1 00:00:31 00:00:26
2024-03-26 09:52:10 AM transmt 1 00:00:31
2024-03-26 09:52:36 AM receive 1 00:00:31 00:00:26
2024-03-26 09:52:40 AM transmt 1 00:00:30
2024-03-26 09:53:06 AM receive 1 00:00:30 00:00:26
2024-03-26 09:53:10 AM transmt 1 00:00:30
2024-03-26 09:53:37 AM receive 1 00:00:31 00:00:27
2024-03-26 09:53:40 AM transmt 1 00:00:30
2024-03-26 09:54:07 AM receive 1 00:00:30 00:00:27
2024-03-26 09:54:10 AM transmt 1 00:00:30
2024-03-26 09:54:38 AM receive 1 00:00:31 00:00:28
2024-03-26 09:54:40 AM transmt 1 00:00:30
2024-03-26 09:55:09 AM receive 1 00:00:31 00:00:29
2024-03-26 09:55:10 AM transmt 1 00:00:30
2024-03-26 09:55:39 AM receive 1 00:00:30 00:00:29
2024-03-26 09:55:40 AM transmt 1 00:00:30
2024-03-26 09:56:10 AM receive 1 00:00:31 00:00:30
2024-03-26 09:56:11 AM transmt 1 00:00:31
2024-03-26 09:56:41 AM transmt 0 00:00:31
2024-03-26 09:56:41 AM receive 1 00:00:30 00:00:00
2024-03-26 09:57:11 AM transmt 1 00:00:30
2024-03-26 09:57:11 AM receive 1 00:00:30 00:00:00
2024-03-26 09:57:41 AM transmt 1 00:00:30
2024-03-26 09:57:42 AM receive 1 00:00:31 00:00:01
2024-03-26 09:58:11 AM transmt 1 00:00:30
2024-03-26 09:58:12 AM receive 1 00:00:30 00:00:01
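For anyone who wants to re-check the last column, here is a small Python sketch that recomputes the transmit-to-receive delay from the timestamps and flags transmits that got no answer before the next transmit. The sample rows are the final rows of the log above, copied verbatim; the function names are hypothetical, and the third column is ignored because its meaning is not stated here.

```python
from datetime import datetime

# Final rows of the log above, copied verbatim.
ROWS = """\
2024-03-26 09:54:40 AM transmt 1 00:00:30
2024-03-26 09:55:09 AM receive 1 00:00:31 00:00:29
2024-03-26 09:55:10 AM transmt 1 00:00:30
2024-03-26 09:55:39 AM receive 1 00:00:30 00:00:29
2024-03-26 09:55:40 AM transmt 1 00:00:30
2024-03-26 09:56:10 AM receive 1 00:00:31 00:00:30
2024-03-26 09:56:11 AM transmt 1 00:00:31
2024-03-26 09:56:41 AM transmt 0 00:00:31
2024-03-26 09:56:41 AM receive 1 00:00:30 00:00:00
2024-03-26 09:57:11 AM transmt 1 00:00:30
2024-03-26 09:57:11 AM receive 1 00:00:30 00:00:00
2024-03-26 09:57:41 AM transmt 1 00:00:30
2024-03-26 09:57:42 AM receive 1 00:00:31 00:00:01
2024-03-26 09:58:11 AM transmt 1 00:00:30
2024-03-26 09:58:12 AM receive 1 00:00:30 00:00:01
"""


def parse_rows(text):
    """Yield (timestamp, kind) per row; kind is 'transmt' or 'receive'."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        stamp = datetime.strptime(" ".join(parts[:3]), "%Y-%m-%d %I:%M:%S %p")
        yield stamp, parts[3]


def analyse(text):
    """Return (delays, unanswered) for the given log rows."""
    delays = []      # seconds from the latest transmit to each receive
    unanswered = []  # transmits followed by another transmit, no receive between
    last_tx = None
    answered = True
    for stamp, kind in parse_rows(text):
        if kind == "transmt":
            if last_tx is not None and not answered:
                unanswered.append(last_tx)
            last_tx, answered = stamp, False
        elif kind == "receive" and last_tx is not None:
            delays.append(int((stamp - last_tx).total_seconds()))
            answered = True
    return delays, unanswered


delays, unanswered = analyse(ROWS)
print("delays (s):", delays)
print("unanswered transmits:", [t.strftime("%H:%M:%S") for t in unanswered])
```

On these rows the recomputed delays match the last column (29, 29, 30, 0, 0, 1, 1), and the only transmit without an answer is the one at 09:56:11, right where the cycle wraps. The same functions should work on the whole log if it is pasted into `ROWS`.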
Firmware in use is SwOS Lite 2.18.
Unless somebody from MikroTik can elaborate on this here, I am tempted to treat this as a support problem.
I can confirm this behaviour with a CSS-610 and pfSense 2.72 & 2.8.
Version 2.14 works; 2.17, 2.18 and 2.19 behave exactly as described above.
Had to go back to v2.14 last night because v2.19 doesn’t work either.