911G-5HPacD crash/reboot when sending 1M+ over wlan1

Hello all,
I need your help with an issue I noticed today, although it could have started yesterday. FYI, there was a power outage yesterday.

I have a bridged link between two 911G-5HPacD boards, using nv2. I use the bridge to share an Internet connection.

On my end I have the station. On the far end there is the AP.

[ My house LAN ] ------ [Station] ~~~~~~~~~[AP] ------- [telco modem/router]

Classic deployment. All bridged. It was working perfectly until yesterday. No config changes were made.

Today I was running a speed test from my computer at home and noticed that the Station rebooted every time the speed test website was testing the upload speed. It was consistently reproducible. So I logged into the Station and ran a btest from there, sending data towards the AP. Same issue: crash and reboot.
I have run multiple tests and it looks like the issue occurs when sending more than 1M of traffic over the wlan1 connection. Receiving is fine.

I need help troubleshooting the issue. The logs only show that there was an unexpected reboot, nothing more. How can I see what caused the crash/reboot?

/interface ethernet
set [ find default-name=ether1 ] advertise=100M-full
/interface bridge
add fast-forward=no name=bridge1
/interface wireless security-profiles
set [ find default=yes ] eap-methods="" supplicant-identity=MikroTik
add authentication-types=wpa2-psk eap-methods="" mode=dynamic-keys name=WPA \
    supplicant-identity="" wpa2-pre-shared-key=XXXX
/interface wireless
set [ find default-name=wlan1 ] adaptive-noise-immunity=ap-and-client-mode \
    antenna-gain=23 band=5ghz-onlyn basic-rates-a/g=54Mbps country=italy \
    disabled=no frequency=5620 frequency-mode=superchannel hw-retries=10 \
    mode=station-bridge nv2-preshared-key=XXXX nv2-security=enabled \
    radio-name=XXXX rate-set=configured rx-chains=0,1 scan-list=5620 \
    security-profile=WPA ssid=XXX supported-rates-a/g=54Mbps tx-chains=0,1 \
    tx-power=8 tx-power-mode=all-rates-fixed vht-supported-mcs=mcs0-9 \
    wireless-protocol=nv2-nstreme-802.11
/interface bridge filter
add action=accept chain=forward disabled=yes in-interface=wlan1 \
    out-interface=ether1 src-mac-address=B4:A5:EF:79:43:00/FF:FF:FF:FF:FF:FF
add action=drop chain=forward disabled=yes in-interface=wlan1 \
    out-interface=ether1
/interface bridge port
add bridge=bridge1 interface=wlan1
add bridge=bridge1 interface=ether1
/interface wireless access-list
add interface=wlan1 mac-address=E4:8D:8C:21:E4:07 vlan-mode=no-tag
/interface wireless align
set audio-min=-125 audio-monitor=E4:8D:8C:21:E4:07 receive-all=yes \
    ssid-all=yes
/interface wireless connect-list
add interface=wlan1 mac-address=E4:8D:8C:21:E4:07 security-profile=default
/ip address
add address=192.168.2.254/24 interface=wlan1 network=192.168.2.0
/ip dns
set servers=192.168.2.1
/ip route
add distance=1 gateway=192.168.2.1
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set api-ssl disabled=yes
/system clock
set time-zone-name=Europe/Rome
/system identity
set name=XXXX
/system routerboard settings
set init-delay=0s
/tool sniffer
set file-limit=10000KiB only-headers=yes

Following up on my previous post.

I have investigated further and found that the issue is related to the TX chains in the Station radio.
If I enable both TX chains (tx-chains=0,1) the issue is reproducible.
If I only use 0 or 1, all is fine.

Once again, it was working fine until yesterday.

Do you think this is a hardware issue?
Is there anything else I could try, other than buying a new card?

thanks
gianrico

Going crazy.

I was trying to troubleshoot further, so I thought I would give it a try and set tx-chains=0,1 with rx-chains=0 (or 1).

No issues.

After that I wanted to reproduce the issue to have a look at the disk logs, so I set tx-chains=0,1 and rx-chains=0,1.
Surprise: the issue was gone.

what the hell …

gianrico

Today the issue resurfaced.

Is there any way to find out what is causing the issue? I logged to disk but nothing is shown there.

Is there a core dump that I can send to MT for analysis?

I need to know whether this is a hardware issue or whether I can work around it in some way.

Could anyone from MT answer, please?

thanks

gianrico

How do you power the unit?
How long is the cable run?
Did you crimp the cable yourself?

How do you power the unit?
PoE
How long is the cable run?
The PoE part is about 20 meters.

Did you crimp the cable yourself?
Yes.

All has worked with no flaws for 5 months.

A very interesting thing I found out is that the issue is only reproducible during the day (I am still working out a more accurate time frame).
Let me elaborate a bit more.

Yesterday after 6 pm I noticed that the issue was gone.
During the day the issue is there. I must disable one of the two TX chains to avoid the crash.

This is a SICE CPE with a panel antenna built like this : http://www.sicetelecom.it/wp-content/uploads/galleries/83/SICE_ATRH0591_CPE_5GHz_MIMO_TDMA_3.jpg (<-- website not reachable at the moment but should be soon).
Considering the position of the Ethernet connector, the RB911G-5HPacD should be mounted face up when looking at the back of the panel.

I am wondering if this is a heat issue, maybe leading to thermal expansion.

I live in central Italy. From 6 am the sun starts hitting the front of the antenna panel. When the sun is shining, the temperature can range from 30 to 36 degrees Celsius, depending on the time of day.
Today at 8 am the issue was there.
Yesterday at 6 pm the issue was gone.
At 6 pm the sun is behind the antenna, hitting the back of the CPE.

I will run more thorough tests today to understand at what time exactly the issue disappears.

If you have any ideas/thoughts please let me know.

thanks
gianrico

Set all the options in System Logging to disk, then post the log here when it happens.

Thank you. I do not see anything interesting in that file, though.
log.0.txt (8.61 KB)

Is your POE injector supplying enough amps to handle full load? Try a higher power adapter.

Yes, that was my first idea too: change the power supply.

Thank you, guys.

Currently the unit is powered by an 18 V 1 A power supply, but it could also be damaged.
I will give it a try with a power supply with more amps.
I also have a 24 V 0.8 A one that I could try, but I guess what matters most is the amps.


gianrico

Over PoE I’d prefer higher voltage over current (assuming current meets minimum requirement).

What is the output of /system health print before and after change of power supply?
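
To put rough numbers on the voltage-versus-current point above, here is a small Python sketch of the drop across the 20 m PoE run. The 20 m length and the 18 V and 24 V supplies come from this thread. Everything else is an assumption: 24 AWG copper at about 0.0842 ohm per metre, passive PoE carrying each polarity on two conductors in parallel, and a hypothetical 10 W constant-power load for the board. The function names are made up for illustration.

```python
import math

# Assumption: 24 AWG copper conductor, about 84.2 ohm/km at 20 C.
OHMS_PER_M_24AWG = 0.0842


def loop_resistance(length_m, conductors_per_leg=2, ohms_per_m=OHMS_PER_M_24AWG):
    """Round-trip resistance of the PoE run.

    Assumes each polarity is carried on `conductors_per_leg` wires in parallel.
    """
    one_leg = length_m * ohms_per_m / conductors_per_leg
    return 2 * one_leg  # out on one leg, back on the other


def voltage_at_device(supply_v, load_w, loop_ohms):
    """Voltage seen by a constant-power load at the far end of the cable.

    Solves supply_v = v + (load_w / v) * loop_ohms for the higher root v.
    """
    disc = supply_v ** 2 - 4 * load_w * loop_ohms
    if disc < 0:
        raise ValueError("supply cannot deliver this load over this cable")
    return (supply_v + math.sqrt(disc)) / 2


def poe_report(supply_v, load_w, length_m):
    """Device voltage, line current and cable loss for one supply choice."""
    r = loop_resistance(length_m)
    v_dev = voltage_at_device(supply_v, load_w, r)
    current = load_w / v_dev
    return {"device_v": v_dev, "current_a": current, "cable_loss_w": current ** 2 * r}


if __name__ == "__main__":
    # 10 W is a hypothetical load, not a measured figure for this board.
    for supply in (18.0, 24.0):
        rep = poe_report(supply, load_w=10.0, length_m=20.0)
        print(f"{supply:.0f} V supply: {rep['device_v']:.2f} V at device, "
              f"{rep['current_a']:.2f} A, {rep['cable_loss_w']:.2f} W lost in cable")
```

Under these assumptions the higher-voltage supply needs less current for the same load, so less is lost in the cable. The real figures depend on the actual cable, the crimped connectors and what the board draws while transmitting.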

Thanks again to all of you guys.
I am not an expert and did not know about the "/system health" command. It looks like the power supply was indeed damaged.

Failed unit (station):

fan-switch: on
fan-on-threshold: 40C
voltage: 7.5V
temperature: 41C


Working unit (peer AP):

fan-switch: on
fan-on-threshold: 40C
voltage: 17.2V
temperature: 46C

After changing the power supply I have:

Failed unit, now working (station):

fan-switch: on
fan-on-threshold: 40C
voltage: 23.2V
temperature: 48C

thanks
gianrico
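
The readings above can also be checked mechanically. Below is a minimal Python sketch that parses `/system health print` output in the format pasted in this thread and flags a low input voltage. The 8 V threshold is an assumption standing in for the board's minimum input voltage, so take the real value from the datasheet. The function names are hypothetical.

```python
import re

# Assumption: stand-in for the board's minimum supported input voltage.
MIN_INPUT_V = 8.0


def parse_health(text):
    """Turn 'key: value' lines from the health output into a dict of strings."""
    readings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and anything without a colon
            readings[key.strip()] = value.strip()
    return readings


def input_voltage(text):
    """Return the 'voltage' reading as a float, e.g. '7.5V' -> 7.5."""
    value = parse_health(text).get("voltage")
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*V", value or "")
    if match is None:
        raise ValueError("no usable voltage reading in health output")
    return float(match.group(1))


def is_undervolted(text, min_v=MIN_INPUT_V):
    """True when the reported input voltage is below the assumed minimum."""
    return input_voltage(text) < min_v


if __name__ == "__main__":
    # Reading copied from the failed station earlier in this thread.
    station_before = """fan-switch: on
fan-on-threshold: 40C
voltage: 7.5V
temperature: 41C"""
    print(input_voltage(station_before), is_undervolted(station_before))
```

A reading taken while idle may look healthier than one taken under transmit load, so it is worth capturing the value during a btest as well.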