951G-2HnD problem with 6.xx version of RouterOS

Posted: Wed Feb 05, 2014 3:33 pm
by vpsanych
Hello.

I am using Mikrotik RB951G-2HnD.

This is interface configuration:

[admin@MikroTik] > interface print
Flags: D - dynamic, X - disabled, R - running, S - slave
 #     NAME                  TYPE     MTU   L2MTU  MAX-L2MTU
 0  R  ether1-gateway        ether    1500  1598   4074
 1  R  ether2-master-local   ether    1500  1598   4074
 2     ether3-slave-local    ether    1500  1598   4074
 3     ether4-slave-local    ether    1500  1598   4074
 4     ether5-slave-local    ether    1500  1598   4074
 5     wlan1                 wlan     1500  2290
 6  R  bridge-local          bridge   1500  65535

Where ether1 and ether2 are master and ether3/4/5 are slave of ether2.
ether2 and wlan1 are bridged with bridge-local.
ether1 - interface to provider.
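
For readers who want to reproduce this layout, the description above corresponds roughly to the following sketch. It assumes the interface names from the listing and the `master-port` switch-group property used by RouterOS 6.x releases of that period; it is not the poster's exported configuration.

```routeros
# Assumption: ether3-5 are switched together with ether2 via master-port
/interface ethernet
set ether3-slave-local master-port=ether2-master-local
set ether4-slave-local master-port=ether2-master-local
set ether5-slave-local master-port=ether2-master-local

# Bridge the switch group (through its master port) with the wireless interface
/interface bridge
add name=bridge-local
/interface bridge port
add bridge=bridge-local interface=ether2-master-local
add bridge=bridge-local interface=wlan1
```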

This configuration works fine with RouterOS 5.XX (5.20-5.26).
But after upgrading to 6.x (I tried 6.1, 6.2, 6.7 and 6.9), the system stops working properly under high traffic between ether1 and ether2-ether5, e.g.:

A PC connected to ether2-5 can only ping the RouterBOARD and other local PCs.
From the RouterBOARD I can ping PCs on ether2-ether5 or wlan1.
External resources are not available.

After 5-10 minutes, or after disabling and re-enabling ether1 (or rebooting the router), the system works fine again until traffic between ether1 and ether2-5 or wlan1 gets high once more.

Can you please help me find where the problem is?

Thank you.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Wed Feb 05, 2014 4:44 pm
by pjhb34fg
I've got exactly the same problem: under high traffic, WAN port rx/tx dies. It occurs only on 6.x versions, so I'm sitting on 5.26 atm.
Same RouterBoard model.
Resetting the configuration and rebuilding it manually doesn't help at all.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Fri Feb 07, 2014 12:33 am
by jandafields
I've been using this model for months, never had a problem. Did you also update the firmware?

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Fri Feb 07, 2014 5:12 pm
by vpsanych
I have friends who do not have such problems with this router model (951G-2HnD).
But I do have problems with performance and operability when I upgrade the router to a 6.XX version of RouterOS.
Maybe more detailed information is needed to determine what the problem is on 6.XX versions of RouterOS?
Thanks.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Sun Aug 03, 2014 11:17 am
by Wintermute
Hello guys,

I have exactly the same problem and it's driving me mad. The router loses the ability to route on random occasions. Disabling and re-enabling the ether1-gateway interface makes the problem go away temporarily.

Some observations:
  • General:
      • Happens across the whole RouterOS 6.x range. Currently at 6.18. Besides RouterOS, the firmware is kept updated as well.
      • The RouterOS setup is very simple, nothing fancy, just NAT to the ISP for both ethernet and WiFi.
      • All 5 eth ports are populated. There is an ovpn interface that's permanently disabled.
      • Occurs at unpredictable and unexpected times. Sometimes it doesn't show up for weeks or a month, and sometimes, like yesterday, it occurs 10 times or so.
  • After the incident:
      • Any attempt to get through WAN fails with "timeout". That affects clients connected to ether2-5, wifi and also the router itself (ping from the terminal fails). The router can't even ping the default gateway.
      • The routes list seems normal. All interfaces are listed as reachable, including ether1-gateway.
      • The router is able to see neighbour Mikrotik routers no problem.
      • The router is able to perform an IP Scan on ether1-gateway no problem.
      • It's just the routing itself on ether1-gateway that's broken.
      • CPU load and memory usage are low.
  • Vent:
      • With each and every new release of RouterOS I have hopes of the issue being addressed, but no. Often negligible or silly stuff is noted in the changelog but this particular awful "bug" remains.
      • Temperamental b!tch. What else can I say? For comparison: in the office I have been running an RB750 for years, under bigger stress, nice uptime, different setup, same RouterOS versions. Never experienced this problem on it. Never even had a reason to reboot it in order to recover from an incident.
Has anyone figured out what may be the root cause of the problem and how to avoid it? I am not convinced it's related exclusively to load/traffic, as it has occurred for me in idle periods as well.

Is there anything particular I can do to:
1. Diagnose the problem further. Pick up evidence, set up logging in a certain way.
2. Help the issue get fixed by Mikrotik officials.
3. Get rid of the terrible issue once and for all?

Shame there's not a single word from Mikrotik officials on this subject.

Cheers,
Tomas

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Mon Aug 04, 2014 9:05 am
by sanitycheck
I've been fighting a problem that's only somewhat similar, but fixing the eth1 speed (turning off auto-negotiate) seems to have addressed it. I wonder if that fix would help here.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Mon Aug 04, 2014 10:54 am
by Wintermute
Thanks for the hint. I've unplugged unnecessary ports to relieve the router, left connected just the bare minimum (1xWAN, 1xLAN) and disabled auto-negotiate on the WAN interface. I wonder how long it will take for the issue to resurface. Hopefully it never will :)

Reading your topic, there might be a pattern. My ISP doesn't support 1gbit link speed, so the router, even though 1gbit capable, connects at 100mbit full-duplex (I am well aware of and okay with that). Still, I am inclined to believe it's more of a routing problem than a link transport problem. The router succeeds in performing an IP-Scan on WAN at the time of the incident, hence it is capable of sending/receiving (certain) packets over the link.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Wed Aug 06, 2014 9:25 pm
by Wintermute
No fails since the last report.

However I have no other choice than to enable auto-negotiation again. Something fishy is going on here.

With Auto-negotiation on
(auto set) Rate 100Mbps
(auto set) Full Duplex On
Actual transfer speeds:
Down 95Mbps / Up 95Mbps

With Auto-negotiation off
(hand set) Rate 100Mbps
(hand set) Full Duplex On
Actual transfer speeds:
Down 15Mbps / Up 75Mbps

UPDATE: situation gets even more interesting.

With Auto-negotiation off
(hand set) Rate 100Mbps
(hand set) Full Duplex Off
Actual transfer speeds:
Down 85Mbps / Up 75Mbps

Is there any logical explanation for such behavior? Does the auto-negotiation set parameters besides link rate and duplex?
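
One plausible explanation, offered as a hedged note rather than a diagnosis: when only one end of a link has auto-negotiation disabled, the other end can still detect the speed but typically falls back to half duplex. Forcing 100Mbps full duplex on the router would then create a duplex mismatch (poor throughput), while forcing half duplex would happen to match the peer. This assumes the ISP side is left on auto-negotiation, which the thread does not confirm. The settings involved can be inspected and changed as sketched below, using the interface name from this thread:

```routeros
# Show the currently negotiated rate and duplex
/interface ethernet monitor ether1-gateway once

# Force 100Mbps full duplex (only sensible if the peer is forced to the same values)
/interface ethernet set ether1-gateway auto-negotiation=no speed=100Mbps full-duplex=yes

# Return to auto-negotiation
/interface ethernet set ether1-gateway auto-negotiation=yes
```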

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Tue Sep 16, 2014 2:28 pm
by Wintermute
The problem strikes every now and then, about twice a month on average. It's very, very irritating and I am quite sure it's related to some kind of bug / vulnerability / fragility in RouterOS itself. What else would make RouterOS behave like that repeatedly?

The most annoying part is that from the moment of failure the unit becomes inaccessible via WAN, so it can't be brought back to normal remotely. No response to ping, not connectable from Winbox. That's insane. It really is. The device is not dead, locked, frozen or anything. It can be switched back to normal by disabling and re-enabling the WAN interface (ether1-gateway) from LAN. So it doesn't even need to be rebooted or power cycled at that point. A pure software condition, then.

Affected unit has RouterOS 6.19 installed.

This wall of ignorant silence from Mikrotik staff is deafening.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Tue Sep 16, 2014 6:10 pm
by jarda
Have you checked the ports, cables, opposite devices? Have you sent a ticket to support and they ignored it? Post the number here...

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Tue Sep 16, 2014 6:51 pm
by Wintermute
jarda: other than unplugging unnecessary cables and replugging the two remaining cables, no. Not sure what I am supposed to check in particular. The device seems to stop routing between LAN and WAN. There doesn't seem to be a problem with cables or anything physical, except perhaps power spikes. The device is able to see neighbour Mikrotik devices on the WAN network (=> that kind of communication works). The device is able to see computers in my LAN (=> that kind of communication works as well). It just, under undefined and rare circumstances, fails to route between those two interfaces and fails to accept connections from WAN.

Again, the device can be made fully operational by switching the ether1-gateway interface off and on again. Reliably, on every attempt. If there were a problem with cables/ports/you name it, how could a mere software operation (that's what disabling/enabling an interface via RouterOS commands is) fix it instantly and reliably, as it actually does? I don't need to physically approach the device, touch cables and so on to revive it. A software command issued from Winbox@LAN is enough to do the trick.

According to this statement I am not eligible for support (that 30 days limit). http://www.mikrotik.com/support.html. You're right though, I missed the fact this is a user / community forum with no Mikrotik staff obligation to respond.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Tue Sep 16, 2014 8:52 pm
by Wintermute
Alright. As expected, the aforementioned ether1-gateway toggle trick solved the problem again. Since I can't prevent the situation from happening in future, I can at least attempt to make the router recover by itself.

The idea is to use tools/netwatch facility to keep checking availability of ISP's gateway and when / if it goes down (in fact RouterOS fails to reach it), execute a recovery script like:
# Log the failure and signal it audibly
/log warning message="ISP gateway inaccessible"
:beep
# Toggle the WAN interface, pausing one second in between
/interface disable ether1-gateway
:delay 1s
/interface enable ether1-gateway
:beep
/log warning message="ether1-gateway toggled"
This is the first RouterOS script (or rather command sequence) I've quickly put together, so bear with me.
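
A minimal sketch of how such a command sequence could be attached to netwatch. The host address 192.0.2.1 is a placeholder for the ISP gateway, and the interval and timeout values are arbitrary assumptions; none of these come from the thread.

```routeros
# 192.0.2.1 is a placeholder; replace it with the real ISP gateway address
/tool netwatch
add host=192.0.2.1 interval=30s timeout=2s comment="ISP gateway watchdog" \
    down-script="/log warning \"ISP gateway inaccessible\"; /interface disable ether1-gateway; :delay 1s; /interface enable ether1-gateway; /log warning \"ether1-gateway toggled\""
```

Note that netwatch runs the down-script once, when the host changes from up to down. If a single toggle does not restore connectivity, this sketch will not retry.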

Despite the rant above, there is still a chance that it's my ISP who is causing all the trouble, as they're using (iirc) the dreaded Alot system to control the flow in their network infrastructure. Either way, both RouterOS internals and ISP quirks are beyond my control.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Tue Sep 16, 2014 9:02 pm
by jarda
Even if your support period has passed, MikroTik staff want to hear about any errors that occur. Normally they are helpful. In my experience such scripts help to work around the problem; I have needed them sometimes too. But rewiring or using different ports has also helped. Sometimes netinstall solves problems you would not even expect. Just make a supout file when you see the problem and send it to support with as good a description as possible. Be patient, they will respond.
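
For reference, a supout file can be generated from the terminal roughly as follows. The file name is an arbitrary choice; ideally run this while the router is in the failed state.

```routeros
# Generate the support output file (this can take a while)
/system sup-output name=supout.rif

# Confirm the file exists before downloading it via Winbox or FTP
/file print where name~"supout"
```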

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Wed Sep 17, 2014 7:40 am
by lz1dsb
I configured an RB951G-2HnD at my brother's apartment half a year ago. Up until recently I had a similar issue, though in my case the outbound connection was dying once the device hit high load. It took me a while to figure it out, but I realized that the problem was that the RB was losing its ARP entry for the ISP's gateway, so the default gateway was effectively becoming inactive. I had countless calls with the ISP, as they didn't want to acknowledge the problem. But three weeks ago I was very persistent and spent a few hours on the phone. Finally they fixed their settings and now the connection is running flawlessly.
The only thing I did was to remove the MAC address spoofing on my WAN port and I asked the ISP to take the MAC address of the router.
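
A hedged sketch of how the missing ARP entry could be checked during an incident. The interface name is taken from earlier in the thread and 192.0.2.1 stands in for the ISP gateway address; both are assumptions about the reader's setup.

```routeros
# Is the default route still active?
/ip route print where dst-address=0.0.0.0/0

# Is there an ARP entry for the ISP gateway on the WAN interface?
/ip arp print where interface=ether1-gateway

# Try to reach the gateway directly (placeholder address)
/ping 192.0.2.1 count=5
```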

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Wed Sep 24, 2014 1:08 pm
by vpsanych
My ISP binds devices by MAC address.
Every time I configured or reconfigured the MikroTik, I changed the default MAC address on the interface connected to the ISP to the one registered with the ISP.
After the latest reset of the MikroTik to default settings, I instead asked the provider to change the MAC address in its database to the real MAC address of my MikroTik.
Since then the MikroTik has worked for over a month and the problem has not recurred.
My problem is solved.
Thanks to everyone for the tips
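
For anyone wanting to try the same change, the cloned MAC address can be removed roughly like this (interface name assumed from earlier in the thread). The ISP must then register the router's real MAC address, as described above.

```routeros
# Show the MAC address currently in use on each ethernet port
/interface ethernet print

# Restore the factory MAC address on the WAN port
/interface ethernet reset-mac-address ether1-gateway
```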

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Wed Sep 24, 2014 7:41 pm
by jarda
It seems that the MAC cloning feature is broken somehow. Report it to MikroTik anyway; maybe they will find something and correct it in the future.

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Thu Sep 25, 2014 1:21 am
by lz1dsb
jarda wrote:
It seems that the MAC cloning feature is broken somehow. Report it to MikroTik anyway; maybe they will find something and correct it in the future.

I was thinking the same thing, but while I was testing I was using the sniffer embedded in RouterOS. It always showed that the MikroTik router was using the correct source MAC address. But still, there was something wrong. Unfortunately a deeper analysis would have had to involve the ISP, and in my case that was not an option.
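
A sketch of the kind of check described here, using the built-in sniffer in quick mode. The interface name and the ARP filter are assumptions; the poster's exact sniffer settings are not given in the thread.

```routeros
# Watch ARP traffic on the WAN port and compare the source MAC shown
# with the address the ISP has registered (stop with Ctrl-C)
/tool sniffer quick interface=ether1-gateway mac-protocol=arp
```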

Re: 951G-2HnD problem with 6.xx version of RouterOS

Posted: Sat Sep 27, 2014 10:15 am
by pjhb34fg
vpsanych wrote:
My ISP binds devices by MAC address.
Every time I configured or reconfigured the MikroTik, I changed the default MAC address on the interface connected to the ISP to the one registered with the ISP.
After the latest reset of the MikroTik to default settings, I instead asked the provider to change the MAC address in its database to the real MAC address of my MikroTik.
Since then the MikroTik has worked for over a month and the problem has not recurred.
My problem is solved.
Thanks to everyone for the tips

I confirm this behavior. I have the same model, and after following this advice I got RouterOS 6.xx working with no problems at all.
There is definitely a problem in MAC address cloning, or somewhere near it.