joan10
just joined
Topic Author
Posts: 4
Joined: Sun Oct 06, 2019 11:57 am

Short but periodic packet loss on 2xSXTSq AC link

Thu May 27, 2021 1:15 pm

Hello all,

We set up a link with a pair of SXT Sq AC units (RBSXTsqG-5acD) covering a distance of 330 m to serve some clients. After aligning both units and selecting suitable channels and frequencies, we put the link into production with an overall CCQ above 90% (a graph of the CCQ over the last 7 days is attached). TX power on both units is below the maximum recommended for AC at MCS9, and the RX signal is -58/-59 dBm. We monitored the spectrum and we appear to be the only ones on this channel, so the link should work without problems. A bandwidth test shows a throughput of about 200 Mbps. We are not very worried about that.

CCQ graph:
mktik1.jpg

However, after some days in production we realised that the link loses packets under moderate traffic (around 30 Mbit/s down and 4-5 Mbit/s up). These losses happen approximately every 2 minutes and last only about 2-3 seconds. When there is no traffic, this doesn't happen. What surprises me most is that not only ICMP packets are lost but all packets (even the ones making up the moderate traffic), and with a suspiciously stable periodicity and duration. Reproducing the problem is not easy, but yesterday I managed to capture a screenshot of this behaviour:
mktik2.jpg

The characteristics of this incident (its periodicity and the complete loss of the link's throughput), together with the fact that all wireless metrics seem OK, make me think that it is more likely a hardware/software problem.

Sure, it's not going to ruin the business, but for real-time applications like video calls or online gaming it is a bit of a nuisance.

I'm also attaching a /export of the whole configuration of both antennas and a "print advanced" output of both sides.

Some facts:
  • Link distance: 330 m
  • Device: RBSXTsqG-5acD
  • Protocol used: 802.11
  • Frequency: 5805 MHz
  • RX signal: -59/-59 dBm
  • TX power: 16 dBm on both
  • TX and RX rate: 400Mbps-40MHz/2S/SGI on both
  • RouterOS version: 6.48.1 on both
  • SNR: 48 dB
  • CPU usage: around 10%

Things we tried that fixed nothing:
  • Use only 20 MHz channels
  • Use different channels
  • Disable station-roaming (fixed other problems but not this one)*
  • Transmit at more power
  • Transmit at less power
* Station-roaming was causing a similar problem, but even more constant and predictable. Disabling it gave us more stability, but apparently that was not enough.

Things we didn't try because we don't think they would fix anything or they make no sense (but we may be wrong):
  • Upgrade to 6.48.3 (no improvements in this area)
  • Remove wireless security
  • Try different protocols
  • Change antennas and radios
  • Disable SGI
So, some questions I would like to ask you:
  • Is this expected behaviour, and are we exaggerating?
  • Do you believe, like us, that it's a hardware/software problem rather than a link configuration issue?
  • Has anybody here experienced the same problem before?
  • If so, does anybody have a clue about how to fix it (apart from trying different devices)?
Thanks all,
Joan

EDIT: I also checked CPU usage and it seems normal.
 
bpwl
Forum Guru
Posts: 2978
Joined: Mon Apr 08, 2019 1:16 am

Re: Short but periodic packet loss on 2xSXTSq AC link

Thu May 27, 2021 7:07 pm

What's in the LOG about those hiccups? (/system logging topics=wireless). E.g. disconnect, reassociation, sending station leaving reason (1,3,8), excessive data loss, received deauth, group key exchange, etc.

If it is not the wireless, the next thing to check is spanning tree (STP, RSTP, ...). The timing could correspond to a spanning tree transition, even one triggered by some other device in the same broadcast domain.
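
A minimal sketch of how both checks could be run from the RouterOS 6.x terminal. The bridge name br0 is an assumption here, so substitute the real bridge name. How much detail the `wireless` and `stp` logging topics report depends on the RouterOS version.

```routeros
# Log wireless debug messages and spanning tree events to memory.
/system logging add topics=wireless,debug action=memory
/system logging add topics=stp action=memory

# After a hiccup, look for disconnect, deauth or group key exchange messages.
/log print where topics~"wireless"

# Show the currently elected root bridge as seen by this bridge
# (br0 is an assumed bridge name).
/interface bridge monitor br0 once

# Show the spanning tree role of each port on that bridge.
/interface bridge port monitor [find bridge=br0] once
```

If the reported root bridge or a port role differs between two runs taken around a loss event, that would point towards spanning tree rather than the radio link.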
 
joan10
just joined
Topic Author
Posts: 4
Joined: Sun Oct 06, 2019 11:57 am

Re: Short but periodic packet loss on 2xSXTSq AC link

Fri May 28, 2021 12:05 pm

Thanks for your answer.

Yesterday I managed to replicate the event. I ran a bandwidth test in the upload direction, in the download direction, and in both at once. Things get interesting, because it seems to happen only when there is traffic on both TX and RX.
Replying inline:
bpwl wrote: What's in the LOG about those hiccups? (/system logging topics=wireless). E.g. disconnect, reassociation, sending station leaving reason (1,3,8), excessive data loss, received deauth, group key exchange, etc.
Nothing. Activated debug logs with:
topics=!ssh,!snmp,debug action=disk 
I managed to replicate the event and only one message was shown. In addition, this message did not appear every time:
19:51:58 route,debug,event Interface change 
19:51:58 route,debug,event     interface=wlan1 
19:51:58 route,debug,event     status=UP,RUNNING 
19:51:58 route,debug,event     mtu=1500 
19:51:59 route,debug,calc Begin calculation 
19:51:59 route,debug,calc End calculation 
19:52:02 route,debug,event Interface change 
19:52:02 route,debug,event     interface=wlan1 
19:52:02 route,debug,event     status=UP,RUNNING 
19:52:02 route,debug,event     mtu=1500 
19:52:02 route,debug,calc Begin calculation 
19:52:02 route,debug,calc End calculation 
I doubt this message is the cause; it looks more like a consequence of the error.
bpwl wrote: If it is not the wireless, the next thing to check is spanning tree (STP, RSTP, ...). The timing could correspond to a spanning tree transition, even one triggered by some other device in the same broadcast domain.
If it were spanning tree or L2/L3 related, wouldn't it happen even without traffic on the wireless link?
 
bpwl
Forum Guru
Posts: 2978
Joined: Mon Apr 08, 2019 1:16 am

Re: Short but periodic packet loss on 2xSXTSq AC link

Fri May 28, 2021 2:36 pm

"Interface change" as cause ?
Maybe other forum entries might help with ideas ... like: viewtopic.php?t=105456
 
tedroco
just joined
Posts: 1
Joined: Sat May 29, 2021 9:23 pm

Re: Short but periodic packet loss on 2xSXTSq AC link

Sat May 29, 2021 9:25 pm

bpwl wrote: "Interface change" as cause ? Maybe other forum entries might help with ideas ... like: viewtopic.php?t=105456
Thanks!
 
AmirFarro
newbie
Posts: 34
Joined: Tue Feb 09, 2021 5:22 pm

Re: Short but periodic packet loss on 2xSXTSq AC link

Mon Jun 28, 2021 11:35 am

I have the same problem. Were you able to solve it?
 
AmirFarro
newbie
Posts: 34
Joined: Tue Feb 09, 2021 5:22 pm

Re: Short but periodic packet loss on 2xSXTSq AC link

Sat Jul 03, 2021 8:47 pm

Bump!
 
joan10
just joined
Topic Author
Posts: 4
Joined: Sun Oct 06, 2019 11:57 am

Re: Short but periodic packet loss on 2xSXTSq AC link

Thu Sep 16, 2021 3:42 pm

Hello all,

I managed to solve it, at least with a workaround of sorts. I confirmed a few hours ago that I no longer have this problem. As bpwl said, it was related to the spanning tree protocol. To solve it I simply disabled RSTP on the bridges of both SXTSq units:
/interface bridge set br0 protocol-mode=none
However, I still don't know the root cause of the problem. I think it's a bad L2 setup caused by a misconfiguration of the CRS112P connected to one of the SXTs, but I am not sure. I tried to reproduce it on other links with no luck so far, so I am worried that the same problem could appear on other links in my network. I think I'll need to dig out my degree notes.

L2 topology is the following:
RB3011 --- SXTSq 1 (- - - - -) SXTSq 2 --- CRS112 --- RB2011

The link between the CRS112 and the RB2011 carries VLAN tags (both ports are trunk ports). I'm attaching the configuration of both devices.
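
As a hedged sketch only, and not a confirmed fix for this topology: one way to look for the root cause would be to check which device wins the root bridge election and, if needed, pin it. The bridge name br0 on the other devices and the choice of the RB3011 as root are assumptions made for illustration.

```routeros
# On every bridge along the path: see which bridge is currently elected root.
/interface bridge monitor br0 once

# On the device that should be root (assumed here to be the RB3011),
# lower the bridge priority so it wins the election.
# 0x8000 is the RouterOS default priority.
/interface bridge set br0 priority=0x1000

# On the two SXTs, the workaround described in this post:
# no spanning tree on the bridge at all.
/interface bridge set br0 protocol-mode=none
```

With protocol-mode=none there is no loop protection on that bridge, so this only makes sense where the SXT bridge cannot become part of a loop.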

Thanks
Joan
Last edited by joan10 on Fri Sep 17, 2021 10:53 am, edited 1 time in total.
 
cdemers
Member Candidate
Posts: 224
Joined: Sun Feb 26, 2006 3:32 pm
Location: Canada

Re: Short but periodic packet loss on 2xSXTSq AC link

Thu Sep 16, 2021 8:43 pm

I found that out too; in my experience it happens on many devices. Spanning tree kept causing problems with dropped packets, mostly OSPF, on PTP wireless links. I now enable spanning tree only on bridges that require it and haven't really had issues since. But if you forget and accidentally create a loop, it will cause problems. I run a fully routed network, with redundant links using spanning tree for fast failover (mostly 60 GHz links with 5 GHz backup).
 
bpwl
Forum Guru
Posts: 2978
Joined: Mon Apr 08, 2019 1:16 am

Re: Short but periodic packet loss on 2xSXTSq AC link

Thu Sep 16, 2021 11:23 pm

cdemers wrote: with redundant links using spanning tree for fast fail over (mostly 60ghz links with 5ghz backup).

Heh ... the Cube60 runs with "bonded interfaces for 60 and 5 GHz" as fastest failover (viewtopic.php?t=165681#p831792)

AFAIK with bonded interfaces there is no need for spanning tree protection in a pure tree network using bonded links, but you still have the redundancy based on the bonding.
No spanning tree needed if you don't create redundant loops with the bonded links.

Bonding versus redundant loops: Most interesting with stacked switches or virtual switches (one logical switch over multiple hardware switches), when the physical links run between different hardware for one bonded interface.

If one combines bonding, redundant loops and VLANs, use MSTP, not (R)STP. MSTP can recognize that, although the loop exists, it is contained by the VLAN settings, and does not disable the looped link.
 
bpwl
Forum Guru
Posts: 2978
Joined: Mon Apr 08, 2019 1:16 am

Re: Short but periodic packet loss on 2xSXTSq AC link

Thu Sep 16, 2021 11:36 pm

joan10 wrote: If it was spanning tree or L2/L3 related, wouldn't it happen even without traffic on the wireless link?
I don't know the details of this setup, but note that a WLAN interface on an AP is DOWN when no client is connected.
In our large industrial network I discovered that the (R)STP root bridge election was often triggered by some engineer's device that started sending BPDU messages when it was connected. Since then, all open ports have been set with "BPDU guard", requiring manual intervention from the IT department to re-enable a port that trips.
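
On RouterOS, something similar could look like the sketch below. The port name ether2 is hypothetical, and the bpdu-guard bridge port option is only available in the newer 6.x bridge implementation, so check the documentation for the installed version first.

```routeros
# Treat an access port as an edge port and put it in a disabled role
# as soon as a BPDU is received on it (ether2 is a hypothetical port name).
/interface bridge port set [find interface=ether2] edge=yes bpdu-guard=yes

# A port that was shut down by BPDU guard stays down until it is
# manually disabled and enabled again.
/interface bridge port disable [find interface=ether2]
/interface bridge port enable [find interface=ether2]
```

This only makes sense on ports where no other bridge or switch is expected; on a trunk towards another bridge it would block legitimate BPDUs.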
 
AmirFarro
newbie
Posts: 34
Joined: Tue Feb 09, 2021 5:22 pm

Re: Short but periodic packet loss on 2xSXTSq AC link

Tue Jul 12, 2022 11:14 am

I had been suffering from this problem for a year and a half, and it was solved by simply changing RSTP to STP on my AP. Thank you so much!
I get some weird, illogical ping spikes; I wonder if you have any idea about those as well.
