RB411 Ethernet Freezes (No receive)

I have an installation with several wireless bridges.
All of them are built on RB411 routerboards running Mikrotik RouterOS v3.10.
In one of them, the Ethernet interface freezes (there is no receive on that interface).
The only way to solve the problem is to reboot the routerboard. Disabling and enabling the interface doesn’t help.

At first I thought it could be a hardware problem, so I replaced the RB411 with a new one, but the same problem persists!

The configuration is as follows:
The routerboard has 1 wireless connection configured as “station wds”, 1 ethernet connection, and one bridge interface that bridges the 2 previous interfaces. There is also a “VLAN” that is tied to the Bridge interface.

The problem appears randomly, but it happens quite often, several times per day.
Temporarily I wrote a script to reboot the routerboard when the problem appears.
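
The script itself is not shown in the thread. As a rough illustration of the kind of check such a script could make, here is a minimal Python sketch. It assumes the interface byte counters are sampled periodically by some other means; the function name, the sample format and the threshold are all hypothetical and not taken from RouterOS.

```python
from typing import Sequence, Tuple

# Each sample is (rx_bytes, tx_bytes) read from the interface counters.
# How the counters are collected is outside this sketch (assumption).
Sample = Tuple[int, int]


def rx_stalled(samples: Sequence[Sample], min_tx_delta: int = 1) -> bool:
    """Return True if Rx did not advance over the window while Tx did."""
    if len(samples) < 2:
        return False  # not enough data to judge
    first_rx, first_tx = samples[0]
    last_rx, last_tx = samples[-1]
    rx_delta = last_rx - first_rx
    tx_delta = last_tx - first_tx
    # "No receive" symptom: transmit keeps counting, receive is frozen.
    return rx_delta == 0 and tx_delta >= min_tx_delta
```

A caller would reboot only when this returns True for a window long enough that an idle link is unlikely, since a quiet network can also show no received traffic for a while.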

Does anyone have any idea why this is happening?
Is this a known bug or problem?

I think you have not upgraded the routerboard firmware. Download it from http://www.routerboard.com

You are right. I thought upgrading RouterOS would be enough, but it wasn’t.
I was using firmware 2.12 and have now upgraded to 2.15.
I will monitor to see if the problem persists.

Thank you very much for your help!!

I think ROS 3.10 puts the firmware on the routerboard but doesn’t activate it. You have to run the routerboard upgrade over telnet.

You can see the version in routerboard print.

good luck!

Yes that was exactly the case.
RouterOS 3.10 installed firmware 2.15 but didn’t activate it until I ran the command:
system routerboard upgrade

I have bad news.
I upgraded the firmware to 2.16 (latest) and the problem still exists!
Any other ideas?

I had problems like this with 411s and fixed them by upgrading the routerboard firmware to 2.15 and later ROS 3.10.

I have disabled the Nstream Framer Policy, which was previously set to best fit, and I am optimistic that this will solve the problem.

With extensive debugging I found out the following:

Net2 - Bridge 2 <-------> Bridge 1 - Net1

Bridge 2 is the point that has the problem.

From Net1, pinging Bridge 2 with big packets (4096 bytes), there was no problem.
From Net1, pinging Net2 with big packets (4096 bytes), I had very high packet loss.
So I concluded that the problem has something to do with the ethernet connection.
Also, it only happens with big packets.
My first thought was that Nstream was combining packets into bigger frames, and that this was somehow creating a problem. After disabling Nstream, the packet loss disappeared completely!

I hope that disabling Nstream will also solve the ethernet freezing problem, but I’ll just have to wait and see!

Update:

It appears that disabling Nstream does solve the ethernet freeze problem.
But my wireless connection is experiencing problems without Nstream (very high roundtrip times). So I am forced to use Nstream and unfortunately the problem reappeared.
The ethernet freezing problem also appeared at another routerboard that has the same configuration but at another site.

I am able to reproduce the problem easily, by sending big packets (bigger than the MTU of the ethernet interface which is 1500). After a few packets, the ethernet interface freezes.
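
For reference, here is a small Python sketch of how an oversized packet is split when it crosses a 1500-byte MTU link. It assumes plain IPv4 with a 20-byte header and no options, and it assumes that a "4096 byte" ping means a 4096-byte ICMP payload plus the 8-byte ICMP header (ping tools differ on what the size covers). The function name is made up for illustration, and the thread does not establish whether fragmentation itself is what triggers the freeze.

```python
from typing import List


def fragment_payload_sizes(ip_payload_len: int, mtu: int = 1500,
                           ip_header_len: int = 20) -> List[int]:
    """Return the IP payload bytes carried by each IPv4 fragment."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - ip_header_len) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    sizes = []
    remaining = ip_payload_len
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)  # last (or only) fragment
    return sizes


# 4096-byte ICMP payload + 8-byte ICMP header = 4104 bytes of IP payload
print(fragment_payload_sizes(4096 + 8))  # [1480, 1480, 1144]
```

Under these assumptions each such ping arrives at the ethernet side as three frames rather than one, while a packet that fits in the MTU is sent unfragmented.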

Does anyone have any idea why this is happening and how it can be solved?

PS. I have the latest firmware installed (2.16) at the routerboards.

Any way to solve this? I have this problem on my RB433 interface.

Yes, disable auto-negotiation on the ethernet interface on both the Mikrotik and the switch (or other device) and set static values, for example 100Mbps full duplex.
This solved my problem.

It is a problem with autonegotiation between the Routerboard(s) and the switch(es). I have experienced the same problems with (many) RB 493 AH and RB 433 AH. After some time and without serious traffic load, suddenly only the Tx seems to work, while Rx is stuck at zero (0). It makes no difference if you disable/enable the ethernet port on the Routerboard (I have repeated this more than 50 times on each rb). It makes no difference if you change the port on the switch. It makes no difference if you connect the “crashed” ethernet port to ANYTHING else (another routerboard, switch, ethernet card etc). The specific Routerboard ethernet port is simply dead (!!!). All the other ethernet ports keep working, unless you connect them to the switch, in which case they also “crash” after a few hours. ONLY if you restart the Routerboard does the “crashed” ethernet port come back alive, until the next crash. The crash seems to occur ONLY when you connect an ethernet port (any port) of a Routerboard to a switch. It does not occur with every switch make on the market; so far I have experienced this problem with some 3Com switches. It has not occurred with any other appliances in my experience (like pc-boxes with ethernet, routerboards etc).
A workaround for this problem seems to be to manually set the speed on the ethernet port AND on the switch (only possible when you have a managed switch). Try 100 Mbps Full Duplex on both devices. In some cases (3com baseline switches like the 2948, 2924, 2916) there is NO solution so far. Even with Routerboot 2.19 and RouterOs 3.22 THE PROBLEM STILL REMAINS!! Of course you see NOTHING in the logs, and the supout is practically useless since it also shows NOTHING. It is not a matter of “complex” configuration: even after adding just 2 IPs to the routerboard (one on each ethernet) and a gateway and putting it to work, the ethernet port that is connected to the switch crashes again after a few hours.
You can “solve” this problem 100% by replacing the Routerboard with a pc box. As far as I can tell many Routerboard models are affected by this issue, but the RB 600 seems unaffected so far. I cannot recommend any other solution for your problem. THIS is a SERIOUS problem…only Mikrotik can find a solution.

Just wanted to add our experiences…

We are having a very similar issue. We have been using RB493AHs since it seems we cannot get daughterboards for the RB600s (which have had NO problems for us on probably 30 towers or so). We are experiencing the port lockups, but even worse, about 75% of the time that one of the ports locks up, the whole router becomes “locked” and a reboot is the ONLY way we can get it stable again. Here is what I have found to cause these lockups:

Plugging in a new piece of equipment (even my laptop to any port)
Rebooting an existing piece of equipment that is plugged into the router (trango back hauls/canopy APs)
Re-inserting a plug (any piece of equipment)

Now, these things do not always cause a lockup on the port or router, but it happens more often than not, and any lockup at all is too many in the field. Sometimes we can reboot equipment with no problem; the next day or week, rebooting the same equipment causes either the port or the whole router to lock up. Funny thing is, I cannot get them to lock up this way in a lab environment.

Some have reported that using lower voltage PSUs (other than the 24V we are using) solved the lockup problem, however, this did not help us even when we used 12V PSUs. I set all the ports on the mikrotik to 100 full at one of our troublesome towers (though, the trango BHs cannot be changed from auto as far as I know). That router locked up last night again…

Hopefully this problem is solved quickly as we basically cannot use these at all with this problem, but they seem to be the only boards available with enough ports.

Hi, I have the same problem.

I have a network of 60 AP-routers. I use RB500 and RB600, and I have over 20 RB433AH. ALL the RB433AH have this issue: the ethernet stops working and I must go to the tower to restart the devices manually… This is a very urgent problem. We have a lot of outages and our customers are furious. We need more radios to expand our network, but we can’t buy more 433s if this problem is not solved.

It’s ridiculous that a WISP company has problems with an ethernet connection :(

Please stop enhancing the wireless protocol and spend two cents on fixing this trouble, because this is very urgent :(

Bye FR4

P.S. Disabling autonegotiation does not solve the problem.

I think this is related to software, and not just auto negotiation. We had the ethernet problem, and then another 433AH had the same problem on a WLAN interface.

WLAN1 at 5.8 GHz was not communicating in either direction, while WLAN2 at 2.4 GHz was communicating.

We had to replace the board with a Ubiquiti RouterStation to get rid of it.

Yes, but this problem seems to affect only the 433AH, 411H and 493AH, which I think are based on the same architecture :(
Do you know if it is possible to install another OS on these boards? Then I could test whether it is a hardware problem…

I think these boards are very sensitive to radio frequency, because on some towers the problem is more frequent.

Bye

WHAT??? No more posts after the 21st of April!

Is this problem now solved? I have seen the same problem, in all its variants, on several of my rb411’s and some 433’s.
My only two 493AH also suffered from it, and a few weeks ago my 2 rb1000’s developed the same problems.

It happened in ROS 3.30 and 4.10 as well as in 5b2 and 5b4.

Boards always come back fully operational after a power cycle, but not always after a soft reboot (no power down).
The rb1000’s have no radios near them and are in a basement where even my mobile doesn’t work any more due to lack of signal, so it is not an RF problem.

Failing ports DO speak to a laptop or switch but not to other routers.
I always upgrade the firmware after a software upgrade.
All wireless units have MT cards in them, but since it even happens on units without radios, that should not make a difference either.

On the 493A, first one port failed; I plugged the cable into the next port and that worked for some days before failing again. I lost up to 3 ports that way. After a power cycle only two came back. The other stayed dead, even for a switch or laptop.
The next day, without any reboot, ALL ports were functioning again!

In my opinion this has something to do with the Ethernet ports. Whether it is software or hardware, I don’t know.
One vague idea I have is in the direction of bad grounding together with the mixed use of shielded/unshielded utp cables.
But the rb1000’s can’t even be grounded! According to the CE norm the 220V power adapter should also be connected to ground, but it is not! On several boards we measured 3V at low current between the 12/24V power input of the boards and ground.
Can this have any influence?

But in the ‘old’ days of rb112’s, 133c’s, 532A’s, 333’s and ROS 2.x I never saw these problems.

MT should really start working on a solution, since this has reached a level I am not willing to cope with any longer…

Switching off auto negotiation and setting the speed is not the solution; it just suppresses the fever, it does not cure the disease!

Sounds very similar to the Ethernet lockups on the UBNT Rocket when that came out.

Same Atheros processor, isn’t it?

Hi…

We have deployed a few hundred RB433s out there, and we found that this issue can only be solved by rebooting.

Our crude workaround is to use the watchdog to monitor the next IP behind the routerboard.
We have had no solution from Mikrotik either…

I believe they are reading this forum, but there is nothing they can do as they have no solution yet.
Btw, maybe we can look for an alternative way to solve this issue permanently.
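
The decision logic behind that kind of watchdog can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the RouterOS watchdog is implemented: `probe` is a hypothetical stand-in for "ping the next IP behind the routerboard", and the function name and thresholds are made up.

```python
from typing import Callable


def needs_reboot(probe: Callable[[], bool], max_failures: int,
                 max_probes: int) -> bool:
    """Return True once `max_failures` probes in a row have failed.

    `probe` returns True when the monitored address answered.
    Gives up and returns False after `max_probes` attempts.
    """
    failures = 0
    for _ in range(max_probes):
        if probe():
            failures = 0  # any success resets the streak
        else:
            failures += 1
            if failures >= max_failures:
                return True
    return False
```

Requiring several consecutive failures avoids rebooting on a single lost ping, which matters on wireless links where occasional loss is normal.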

“deployed few hundred RB433”

Do you have any that never lock up?

You have so many deployed that you must have many different types of installation.

If you can find anything different between the ones that lock up and the ones that do not, maybe you can help Mikrotik find the reason.

Unfortunately Atheros do not release their firmware source code or details of their chips to the public, and make vendors sign an NDA, so Mikrotik cannot give out any Atheros details.

Personally I suspect that there is either an issue with the Atheros silicon, or bugs in the proprietary Atheros code, or both, that MT either have to work around, or pay Atheros to fix.