3 km Link Disconnecting Intermittently

Hello,

I have deployed several MikroTik point-to-point links ranging from 1 km to 28 km and all are working fine; some have been running for over 4 years. I recently deployed a 3 km link, though, and have been battling to get it to carry even 1 Mbps of data, all to no avail.

The Details:
RB433AH at both ends
R53Hn wireless card at both sides
27 dBi grid antenna at both ends
1 m low-loss jumper cable between radio and antenna
Installation carried out at 100 ft above water level at both sides
Signal strength is -56/-58
Configuration:
Just a simple bridge of all the interfaces.
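
For anyone who wants to sanity-check the reported signal against the hardware listed above, here is a minimal link-budget sketch using the standard free-space path loss formula. The transmit power (17 dBm), the 5500 MHz frequency and the 2 dB total cable loss are placeholder assumptions, not values from this link; only the 3 km distance and the 27 dBi antenna gains come from the details above.

```python
import math


def fspl_db(distance_km, freq_mhz):
    """Free-space path loss in dB for a distance in km and a frequency in MHz."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44


def expected_rx_dbm(tx_dbm, tx_gain_dbi, rx_gain_dbi,
                    distance_km, freq_mhz, cable_loss_db=0.0):
    """Rough received level: TX power plus antenna gains minus path and cable loss."""
    return (tx_dbm + tx_gain_dbi + rx_gain_dbi
            - fspl_db(distance_km, freq_mhz) - cable_loss_db)


if __name__ == "__main__":
    # ASSUMED values: 17 dBm TX power, 5500 MHz, 2 dB total cable loss.
    # From the post: 3 km distance, 27 dBi antenna at each end.
    rx = expected_rx_dbm(17, 27, 27, 3, 5500, cable_loss_db=2.0)
    print(f"Path loss: {fspl_db(3, 5500):.1f} dB, expected RX: {rx:.1f} dBm")
```

Swap in the real transmit power of the card to compare against the measured -56/-58. A gap of many dB would point at alignment or cabling rather than at the protocol.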

The link works, but it disconnects at will with the following error messages (see attachment).
Screenshot.png
The problem is that the link works well for 30 minutes or less, and then a cycle of disconnection and reconnection starts, during which users' connections time out.

I have tried the following:
I have changed the board to an RB411 and the cards to R52, with no improvement.
I have tried all frequencies from 5180 to 5805, all with the same result. The screenshot above was taken yesterday.

Help is needed urgently, please.

There seem to be several users experiencing this problem. Some users get
lost connection: no beacons received
or
lost connection: not polled for too long
or
lost connection: medium access timeout

These errors appear to be associated with the beacon packets (or lack of beacon packets), not with a loss of signal or noise. I am working with Mikrotik support to correct the beacon packet problem now.

You might want to generate a supout.rif and email it to support(at)mikrotik.com with an explanation of the problem. So far, they are having a bit of a challenge believing me. :confused:

SurferTim

I will send the supout.rif to MikroTik right away. It is just that they hardly ever respond to issues like these on time, especially if they know that the problem is on their side. Until they have a fix, they will not respond.

But this problem does not exist on my other existing links in use, some of which run the same RouterOS version as the problematic ones (v4.11). I have also tried v4.10 and it is still the same.


Thanks, though!

Au contraire, mon frère! They do not get away from me that easily! It sometimes takes them a while to see the problem, but if you stay on top of it, they will find it. They did with the PayPal problem. :smiley:

Any news on this issue yet? Since I migrated to 4.16 with NV2 I have been seeing the same disconnects, with ALL the messages that you mention, for no apparent reason. Even when a CPE has an SNR of 30-40 dB and signal levels of -50 to -70, it makes no difference; the AP just seems to pick on random CPEs regardless of signal status.

7 clients on the AP.

OK, let's stay on top of it. I have some PtP links suffering the same with the NV2 package. My log tells me “lost connection, control frame timeout” on an otherwise very strong, stable link that worked fine for years in 802.11a.
Will send an e-mail to MT too…

OK, then I won't start a new thread.
Same problem here, waiting for some advice…

@pmarton: Can you post your log from the AP and the station? If we are going to get MT to fix this, we need all the documentation we can get. A section of both the AP and station log would help, along with signal strengths and snr for both.

Creating a supout.rif file and emailing it to support(at)mikrotik.com would help too.

There is another thread about this too.
http://forum.mikrotik.com/t/log-reassociating-disconnected-ok-connected/44784/1

ADD: I am interested in what times of day these happen. Is it all the time, or limited to certain times of day?

And what was the estimated traffic volume during the disconnects? My disconnects happened mostly in the evenings during heavy station uploads (transmissions). I am at the beach, so in the evenings, the tourists email pictures home to friends and family.
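
To answer the time-of-day question from a log export, a small sketch like the following can bucket disconnect timestamps by hour. It assumes you have already pulled the `HH:MM:SS` stamps of the disconnect lines out of the log; the function name and the sample list are illustrative only.

```python
from collections import Counter


def disconnects_per_hour(timestamps):
    """Return a 24-element list: number of disconnects seen in each hour of day.

    `timestamps` is an iterable of 'HH:MM:SS' strings taken from log lines.
    """
    counts = Counter(int(stamp.split(":")[0]) for stamp in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


if __name__ == "__main__":
    # Illustrative stamps only; replace with the ones from your own log.
    sample = ["22:53:41", "20:28:43", "22:10:00"]
    for hour, count in enumerate(disconnects_per_hour(sample)):
        if count:
            print(f"{hour:02d}:00  {'#' * count}  ({count})")
```

If the bars pile up in the evening, that lines up with heavy station uploads; a flat spread suggests the disconnects are not load related.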

For me, anytime night or day, even at 4 AM… There appears to be NO relationship regarding whether CPE’s are downloading either… TOTALLY at random!

Sometimes, once a day, sometimes twice… Before I migrated to NV2 I had CPE’s registered on this AP for 42 days without 1 single interruption in radio service, and the ONLY break in service after this time was when I installed 4.16 etc etc.

Please install RouterOS v4.17 or RouterOS v5.0rc11, as they contain a lot of fixes for the Nv2 protocol that should increase its stability.

I presume you mean both AP and CPE? The last time I upgraded the AP to v5.0rc10 I lost two CPE RB433s. One is still dead on my desk and the other only came alive after a manual hardware reset (screwdriver!).

So until someone can tell me that the irregular but common disconnects disappear after upgrading to one of these two versions, I am not gambling any more on my network. First, please, someone tell me they found the solution…

I have one RB333 with 3 R52 cards running 4.16NV2. Two of these cards each serve two RB433AHs and have the nv2 protocol enabled. One card serves only one other RB433 board and is running the old 802.11.
These 5 remote RB433s are all APs themselves (so they have two radio cards, one for backhaul and one as AP) and some of them (4) run the nv2 protocol.
On each of the dual-serving cards it is only one link that is disconnecting, at random, regularly or irregularly.
I tried all kinds of frequencies to stay away from interference, I set some APs to 10 MHz channels instead of the default, and I kept channel spacing at least 40 MHz. All with no result…

Sometimes, like this Saturday, I thought I had it under control: links stayed stable for almost 24 hours, under high traffic or idle. This Sunday one link started playing up again, so I decided to go for 10 MHz channels on the APs to further diminish possible interference issues, and yes, since Sunday afternoon no more disconnects. :smiley:
Woke up this morning to see the disconnects are back again! :confused: On two links!
Sometimes it happens after 15 minutes, sometimes only after 3 hours… but it went on all day. One of the links is worse than the other. And it affects only one link on each of the radios serving two links…

My suspicion lies with the RB333 as AP. Since for the two dual-serving radios the processor has to switch between the two remote radios all the time, it may be that some error comes into play here.
But the contradiction is that only one link out of each of the two dual ones is giving problems. The other link from the same radio is stable as a rock.
I also noticed last week, when I tweaked one remote radio of one of these dual-serving radios, that suddenly the other link became unstable while the unstable one became rock steady! So the issue jumped from one ´client´ link to the other ´client´ link, both on the same radio! So then my suspicion went to these ´client´ boards. (Also because I lost 3 of these while upgrading the RB333.) On these ´client´ boards I used 4.16NV2 and 5.0rc10, but that made no difference.

Error message I see in the rb333 is “disconnected, control frame timeout” or “disconnected, not responding”.

In ´client´ I see “lost connection, medium-access timeout” or “lost connection, control frame timeout”

Usually the ´clients´ reconnect within 5 seconds. Signal levels for all links are between -40 (best link) and -65 (worst link), and data rates are set to 48 Mbps only, both supported and basic.
Link distances are 6-8 km.
Cell Radius is set to 10 on the AP for all nv2 links and TDMA Period Size to 2 (but I tried 1 and 3 as well. No difference..)

All links with the same boards and setup ran fine for 2 years under normal 802.11a with RTS/CTS. Uptimes of 40 days or longer… It was only to keep the upcoming competition away (spectrum ´war´), together with the recent growth in traffic demand, the good results I have with 4.16nv2 on APs and the hurray stories I have been reading, that I decided to use it on these backhauls.
(Going back to 802.11 is not an option. I tried it, but my throughput fell back to no more than 4-6 Mb, while the links could easily carry 12 Mb some months ago! With nv2 my throughput is at least still in the 12-15 Mb region…)

Lost board issue:
The last time I updated this RB333 AP to 5.0rc10 I lost one RB433 ´client´ board completely. It is dead and the only option left is Netinstall, which I haven't found time for yet.
The next day I tried the same upgrade of the RB333 again, only to find that now TWO of the ´client´ RB433AHs had become unreachable. (Even the Ethernet ports stayed dead! Even after a power cycle!)
Only after taking them to my workbench and completely stripping them of their radios did one work normally again (after a screwdriver reset). The other one is still in the box waiting for forensics. But the radios of all the failing boards are still fine! Some already work in other boards without any problems!

So, here we all go, plenty to digest. Questions? Remarks?

We see this problem with RouterOS v5.0rc11.
This is just a test link (1 km), so I can try anything. I will post logs as I can.

Logs (clocks not in sync):

Cli:
22:53:41 wireless,info 00:0C:42:**:**:41@bg02 cli: lost connection, control frame timeout
22:53:46 wireless,debug bg02 cli: must select network
22:53:46 wireless,debug 00:0C:42:**:**:8A: on 5240 AP: yes SSID Ap******01 caps 0x421 rates 0xff00 basic 0x100 MT: yes
(...)
22:53:46 wireless,debug 00:0C:42:**:**:40: on 5680 AP: no SSID bg04 caps 0x0 rates 0x0 basic 0x0 MT: no
22:53:46 wireless,debug bg02 cli: no network that satisfies connect-list,  by default choose with strongest signal
22:53:46 wireless,debug bg02 cli: must select network
22:53:46 wireless,debug 00:0C:42:**:**:8A: on 5240 AP: yes SSID Ap******01 caps 0x421 rates 0xff00 basic 0x100 MT: yes
(...)
22:53:46 wireless,debug 00:0C:42:**:**:40: on 5680 AP: no SSID bg04 caps 0x0 rates 0x0 basic 0x0 MT: no
22:53:46 wireless,debug bg02 cli: no network that satisfies connect-list,  by default choose with strongest signal
22:53:46 wireless,debug bg02 cli: failed to select network
22:53:51 wireless,debug bg02 cli: must select network
22:53:51 wireless,debug 00:0C:42:**:**:8A: on 5240 AP: yes SSID Ap******01 caps 0x421 rates 0xff00 basic 0x100 MT: yes
(...)
22:53:51 wireless,debug 00:0C:42:**:**:40: on 5680 AP: no SSID bg04 caps 0x0 rates 0x0 basic 0x0 MT: no
22:53:51 wireless,debug 00:0C:42:**:**:41: on 5580 AP: no SSID bg02 caps 0x0 rates 0x0 basic 0x0 MT: no
22:53:51 wireless,debug bg02 cli: no network that satisfies connect-list,  by default choose with strongest signal
22:53:52 wireless,info 00:0C:42:**:**:41@bg02 cli established connection on 5580, SSID bg02

AP:
20:28:43 wireless,info 00:0C:42:**:**:77@bg02_ap: disconnected, not responding
20:29:02 wireless,debug bg02_ap: 00:0C:42:**:**:77 not in local ACL, by default accept
20:29:02 wireless,info 00:0C:42:**:**:77@bg02_ap: connected
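
For collecting documentation like this across several links, a rough Python sketch can tally the disconnect reasons and measure how long each outage lasted. It only looks at `wireless,info` lines and assumes the message wording shown in the logs above; other RouterOS versions may phrase them differently, and the function name is just for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Only "wireless,info" lines carry the connect/disconnect events.
INFO_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) wireless,info (.*)$")
# "lost connection, <reason>" on the station, "disconnected, <reason>" on the AP.
DOWN_RE = re.compile(r"(?:lost connection|disconnected), (.+)$")
# "established connection ..." on the station, "...: connected" on the AP.
UP_RE = re.compile(r"established connection|: connected$")


def summarize(lines):
    """Return (Counter of disconnect reasons, list of outage lengths in seconds).

    Assumes all lines come from one device and do not cross midnight.
    """
    reasons = Counter()
    outages = []
    down_at = None
    for line in lines:
        match = INFO_RE.match(line.strip())
        if not match:
            continue  # debug lines and anything unrecognised are skipped
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        text = match.group(2)
        down = DOWN_RE.search(text)
        if down:
            reasons[down.group(1)] += 1
            down_at = stamp
        elif UP_RE.search(text) and down_at is not None:
            outages.append(int((stamp - down_at).total_seconds()))
            down_at = None
    return reasons, outages


if __name__ == "__main__":
    ap_log = [
        "20:28:43 wireless,info 00:0C:42:**:**:77@bg02_ap: disconnected, not responding",
        "20:29:02 wireless,debug bg02_ap: 00:0C:42:**:**:77 not in local ACL, by default accept",
        "20:29:02 wireless,info 00:0C:42:**:**:77@bg02_ap: connected",
    ]
    print(summarize(ap_log))
```

Run it separately on the AP log and the station log, since the two clocks are not in sync.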

What I noticed and what I will do:

  • install NTP :slight_smile:
  • firmware on cli is strange (current-firmware: “”), will try to upgrade
  • signal on Ch0 is asymmetric, crew has to climb and check cables and alignment
  • WDS is a leftover from the standard 802.11n setup and is unnecessary, have to try with station mode

Will post progress report.
cli.JPG
ap.JPG
config.txt (6 KB)

It’s just a test link, so it carries traffic only when we want it to. (I can switch the traffic over from our standard 802.11a link.)
To answer your question: Yes, the disconnects only happen when “high” (more than 10 Mbps) traffic goes through.

Yes, I can feel your pain for being at the beach. :laughing:

@pmarton: I was getting “no beacons received” messages rather than your “lost connection, control frame timeout”. It appears to be the connection management packets, not data packets, that are being dropped. Are you getting any “lost connection: extensive data loss” disconnects? By your signal and snr, I would guess the answer is “no”.

When I was troubleshooting mine, I put the mac address of the AP in the station (client) wireless connect-list (connect=yes). Then this part of the message went away:

22:53:46 wireless,debug bg02 cli: no network that satisfies connect-list, by default choose with strongest signal

Then, encouraged by that, I set the wireless “default-authentication=no”. The disconnects stopped. I am not sure if it was just a coincidence, but after weeks of disconnects, they stopped and I can’t get them to come back. OH NO!! :laughing:

If I were at Mikrotik troubleshooting this, I would check the section of code in the access point at this decision:
“What if the station is transmitting data packets when it is time for me to send a connection management packet? What do I do?”

And surfing is not easy if ya don’t live at the beach! :smiley:

I have tried all the suggestions except “default-authentication”, which I have not yet set to no. I will try that today and see the result.

However, it would be good if MikroTik looked into some of these issues critically and in a timely manner, as this type of experience can discourage a newcomer to the world of MikroTik. To keep the customer, I have had to acquire four Teletronics radios for the point-to-point links while we look for a solution to the issue.

Thanks for all your efforts.

@JP_Wireless: If the “default-authentication=no” in the station works, I believe that somehow it is just ignoring the “no connection management packets” error, somehow keeping the station from roaming. Please let me know how it works, one way or the other.

At first I thought it was a problem with the station. But if you read the post at the following link, mxfull has tried several different routers with DD-WRT as a station, and they all have disconnect problems when the access point is Mikrotik. No disconnects with a DD-WRT access point.
http://forum.mikrotik.com/t/log-reassociating-disconnected-ok-connected/44784/1

The link remains stable as long as no significant traffic is passed, and the disconnects get more frequent as traffic increases. I did a lot of traffic tests on the links: at 2M/2M the link disconnects more often than at 512k/512k or less.

This stuff is now spreading like a virus! I have a link that had been working fine for over two years; all of a sudden it started behaving like the new ones, disconnecting and reconnecting the moment 500k of traffic is put on it. The board there is one of the old boards (133) and the card is an R52 (65 mW); signal is -65 to -70 with an SNR of 20-25 dB. Distance is 28 km, with the antenna installed 220 ft above sea level. So right now I have this problem on a 3 km link, an 18 km link and a 28 km link. Big problem, isn’t it?

Honestly, the whole thing is getting frustrating, as one has to keep monitoring or trying one thing or another on certain links without reasonable success. Try one thing and it looks like the problem is over, then all of a sudden it surfaces again in no time. This might be an online bug from a MikroTik synchronous server. (Pardon me, I am not a programmer, just suggesting.)

It may be of little interest, or it may be of great significance.

The situation continues on one of my APs. However, this time all but one of the CPEs that lost their connections (4 out of 7) reported in their logs that the AP was “not a MT” and that the AP “uses TDMA skip”.

1st time ever for these weird reports, still it makes a change from ALL the other error messages that the cpe’s have been saying in the last few weeks since using NV2!!

But what was incredibly interesting was that one CPE actually rebooted in the same minute that the other 3 disconnected. From studying the logs and the timings, it seems that the reboot (logged as “router was rebooted without proper power shutdown”) happened less than a minute before the others disconnected.

Could it be that, with NV2 on some specific RBs (a WRAP 2E in this case), the radio card does not restart cleanly, i.e. it sends a continuous burst of data without any framing, causing other CPEs to miss their time slots and momentarily lose connection?

That said, 3 other cpe’s that were connected to same AP, were unaffected!
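
To check whether a reboot on one CPE really lines up with disconnects on the others, one option is to merge the event times from all the logs and group those that fall within a short window of each other. This is only a sketch: the device names and times are made up for illustration, the 60-second window is an assumption, and the result is only meaningful if the device clocks are in sync (NTP).

```python
def cluster_events(events, window_s=60):
    """Group (device, time_s) events whose gap to the previous event is <= window_s.

    `time_s` is seconds since midnight on a clock shared by all devices.
    Returns a list of clusters, each a time-ordered list of (device, time_s).
    """
    clusters = []
    for device, time_s in sorted(events, key=lambda event: event[1]):
        if clusters and time_s - clusters[-1][-1][1] <= window_s:
            clusters[-1].append((device, time_s))
        else:
            clusters.append([(device, time_s)])
    return clusters


if __name__ == "__main__":
    # Hypothetical events: one reboot followed by disconnects, then a stray event.
    events = [
        ("cpe-a reboot", 1000),
        ("cpe-b disconnect", 1040),
        ("cpe-c disconnect", 1055),
        ("cpe-d disconnect", 5000),
    ]
    for cluster in cluster_events(events):
        print([device for device, _ in cluster])
```

A cluster that always starts with the same CPE would support the idea that its radio is upsetting the others.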

MT, are you on top of this situation? I see nothing in the changelog of 4.17 that would suggest any significant radio changes, so I am somewhat loath to upgrade again. If anything, I am inclined to disable NV2 and return to 802.11 with RTS/CTS, under which this AP had customers connected for 40 days without a single break in wireless service.

Currently there are some issues with the Nv2 protocol on the x86 architecture. For now we suggest using RouterBOARD hardware until we find a solution for x86 hardware running the Nv2 protocol.