hypernik
just joined
Topic Author
Posts: 21
Joined: Thu Mar 28, 2013 3:01 am

Faulty RB1100AHx2?

Sat Mar 07, 2015 3:32 am

I started this thread on 3/24/2013. Less than 24 months later, I'm back, and more convinced than ever that 1) the RB1100AHx2 I purchased was bad from the start and/or 2) RB hardware is of inferior quality and my experience is simply par for the course.
The problems I had with the 1100AHx2 early on were so severe that I switched to another router (not MikroTik/RouterOS) for the time being. Once or twice I fired up the 1100 to "play" with it and become more accustomed to RouterOS. I even updated RouterOS once or twice. But the 1100 spent most of its time powered off in a climate-controlled closet. In late 2014, I decommissioned a router and temporarily put the 1100 in service. It mostly performed OK, but at least twice I needed to power cycle the 1100 to restore a lost Internet connection. Then, on 3/4/2015, our Internet went down and power cycling the 1100 didn't help. When I looked into it, there were no link lights on port 12 (used for WAN). I removed the 1100 from service and started investigating. I connected to the serial port and powered up. Sometimes the 1100 would boot successfully, but often it would appear to hang at "jumping to kernel code." Sometimes it would reboot ("crash" is probably more accurate) on its own. I also noticed that when no devices (laptop, etc.) are connected to any of the 1100's Ethernet ports, its fan spins up. Connect a device to one of the Ethernet ports, and the fan spins down... Weird.
Using the serial interface, I performed a memory diagnostic and started seeing screens full of memory errors.

Image

Image

I eventually managed to have it booted long enough to do a RouterOS and firmware upgrade (6.20/3.18 -> 6.27/3.22).

Port 12 still doesn't work but the 1100 at least seems to be booting consistently again.

I ran another memory diagnostic after upgrading:

Image
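
For readers unfamiliar with what a bootloader memory diagnostic is looking for, here is a minimal sketch of a generic pattern test run against simulated RAM with one stuck bit. This is not RouterBOOT's actual test, whose internals are not described in this thread; `StuckBitMemory`, `pattern_test`, and the chosen patterns are hypothetical names and values picked purely for illustration.

```python
class StuckBitMemory:
    """Simulated byte-wide RAM in which one bit of one cell always reads as 0.

    Hypothetical model for illustration only; pass bad_addr=None for healthy RAM.
    """

    def __init__(self, size, bad_addr, bad_bit):
        self.cells = [0] * size
        self.bad_addr = bad_addr
        self.bad_mask = 1 << bad_bit

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        value = self.cells[addr]
        if addr == self.bad_addr:
            # The faulty bit is forced low no matter what was written.
            value &= ~self.bad_mask & 0xFF
        return value


def pattern_test(mem, size, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern to every cell, read it back, and collect mismatches.

    Returns a list of (address, expected, actual) tuples.
    """
    errors = []
    for pattern in patterns:
        for addr in range(size):
            mem.write(addr, pattern)
        for addr in range(size):
            got = mem.read(addr)
            if got != pattern:
                errors.append((addr, pattern, got))
    return errors


if __name__ == "__main__":
    faulty = StuckBitMemory(size=16, bad_addr=5, bad_bit=0)
    for addr, expected, got in pattern_test(faulty, 16):
        print(f"addr {addr}: wrote {expected:#04x}, read {got:#04x}")
```

Only patterns that set the faulty bit expose the defect, which is why such tests typically cycle through several complementary patterns rather than one.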

My understanding is the RB1100AHx2 is considered more high end and targets the enterprise market?! I feel like I threw away money on the router as it's only seen a few months of serious duty of any kind, and has performed unreliably at that. I feel gypped.

After re-configuring the router to use Port 10 for my WAN interface, the router is working again, for now, at least.

MikroTik states that out-of-warranty service can be had for a fee: "2. MikroTik does not offer repairs for products that are not covered by warranty. Exceptions can be made for: CCR1016-12G, CCR1016-12G-BU, CCR1036-12G-4S, RB1100, RB1100AH, RB1100AHx2, RB1200, RB600, RB600A and RB800 as a paid service (fees apply)." If I wanted to have this router serviced, how would I go about it, and what cost could I expect?
 
NathanA
Forum Veteran
Posts: 829
Joined: Tue Aug 03, 2004 9:01 am

Re: Faulty RB1100AHx2?

Sat Mar 07, 2015 4:48 am

So after you had this initial experience, you started a thread on a user forum, but then didn't follow up either with official MikroTik support channels (who even responded to your original thread with an invitation to contact them via e-mail) or with your distributor about the possibility of getting the router repaired or replaced under warranty, and you sat on it for 2 years, and now that the warranty has long been expired you are back to gripe about "inferior quality"?

In any case, memory errors could point to a defective board, but could also just as easily point to a bad memory module. Unlike the lower-end RouterBoard offerings, the RB1000-series has user-swappable DDR SO-DIMM modules. Perhaps the one that was installed on your board from either the factory or the distributor is bad (there have been known instances of distributors buying RB1xxx boards and changing or upgrading the RAM before the sale without properly testing the modules they are using for compatibility). It might be worth trying to swap the DIMM for another one of equivalent specs.

-- Nathan
 
hypernik
just joined
Topic Author
Posts: 21
Joined: Thu Mar 28, 2013 3:01 am

Re: Faulty RB1100AHx2?

Sat Mar 07, 2015 6:37 am

you started a thread on a user forum, but then didn't follow up either with official MikroTik support channels (who even responded to your original thread with an invitation to contact them via e-mail) or with your distributor about the possibility of getting the router repaired or replaced under warranty
The router was purchased from EuroDK on 3/12/2013 and shipped from Latvia to the Eastern USA. On 4/4/2013 (obviously shortly after the router arrived), I sent an email with a support output file to support@mikrotik.com, saying, "I recently purchased an RB1100AHx2 from EURO DK. Before upgrading to 5.24, this router was giving me a lot of problems. After upgrading to 5.24, it seems more stable, but still needs to power cycle every few days...
Please let me know if something can be done to improve stability or if the router needs to be repaired or replaced through RMA." I received the boiler-plate automated response almost immediately, followed by a "live" response a day later, "Before jumping to conclusion, log into the BIOs of the bard via serial console and run memory test. Currently we can see some kind of Kernel crash, but at this point it is unclear what exactly does wrong. We need at least few more panics to determine the problem precisely."

At this point I was underwhelmed with the flaky-out-of-the-box hardware/software/firmware and any possibility of getting honest-to-goodness support.

I have experience with various switches, routers and other network appliances of consumer, SOHO and business class. In my experience, Ethernet port failure is extremely rare, and only shows up on old, heavily-used equipment. A port failure on supposedly business-grade equipment that is less than two years from purchase and saw little action? Inexcusable, and suggests inferior hardware. Memory problems out of the box? Anyone hear of quality control?
 
NathanA
Forum Veteran
Posts: 829
Joined: Tue Aug 03, 2004 9:01 am

Re: Faulty RB1100AHx2?

Sat Mar 07, 2015 7:45 am

...followed by a "live" response a day later, "Before jumping to conclusion, log into the BIOs of the bard via serial console and run memory test. Currently we can see some kind of Kernel crash, but at this point it is unclear what exactly does wrong. We need at least few more panics to determine the problem precisely."
I will admit that English is not their strong suit, and that I occasionally run into language-barrier issues when communicating with them. But that doesn't make their engineers worse (or better) than anyone else's. And I don't think any less of them for not having perfect English, just as I hope that they would not think less of me for not knowing ANY Latvian!
At this point I was underwhelmed with the flaky-out-of-the-box hardware/software/firmware and any possibility of getting honest-to-goodness support.
And so you just quit trying after one e-mail, and let a USD ~$400 piece of hardware sit around and collect dust? If I suspected faulty hardware and I couldn't get satisfaction from the manufacturer, I would have called up the distributor and told them in no uncertain terms that they are taking this thing back and refunding me my money.

As far as honest-to-goodness support goes, where I come from, a 24-hour turnaround time on responses for hardware that doesn't even have any kind of support contract attached to it is pretty darn good. And as it turns out, MikroTik's suggestion -- regardless of how clumsily-worded it was -- that you try to run a RAM test from the bootloader was spot-on, and if you had bothered to run that test and report the results back to them, this would have confirmed to the support agent you were exchanging e-mail with that the problem you were fighting was faulty hardware and not a software bug (which is, I think, what he was getting at when he said "before jumping to conclusions"). It's not their fault that it took you nearly 2 years to act on their request for further information.
In my experience, Ethernet port failure is extremely rare, and only shows up on old, heavily-used equipment. A port failure on supposedly business-grade equipment less than two years from purchase and saw little action? Inexcusable...
I had forgotten about this part of the story, and yes, I agree that this is very odd. I would also suggest that it is very unusual. Let's see...at a quick, rough count, it looks like we have 20 RB1100-class routers (with a pretty even split between 1100, 1100AH, and 1100AHx2 models) in production on our network. Many of these are not in climate-controlled rooms and are deployed in much harsher environments than your standard NOC. It looks like one of these is currently reporting an uptime of 448 days! We have never had a single failed ethernet port on an RB1000-series router, at least that I can remember...and on RouterBoards where we have had ethernet port failure, I cannot think of an instance that was not explainable by an electrical event at a site where it was discovered that there was improper grounding (e.g., unshielded ethernet cable strung between the RouterBoard and an integrated ethernet-based wireless bridge + antenna mounted outside).
Memory problems out of the box? Anyone hear of quality control?
Right, because I've never, ever been delivered DOA solid-state electronics... :roll:

Since I now see that you have an AHx2 and not an earlier model, I doubt this is the issue since the timing doesn't line up, but as I said before, if your distributor upgraded the memory in the unit before shipping it to you, your ire should be directed at EuroDK anyway and not at MikroTik. (In spring 2013, there were other U.S.-based distributors that had stock of the AHx2 anyway, so I'm not sure why you made your purchase from them?)

In any case, if you really care about trying to get the router operational again and aren't just here to vent steam, try swapping the memory module. If that doesn't do the trick, I'm sorry to say that you are going to need to contact either MikroTik support by e-mail or a MikroTik distributor in your region to find out what your out-of-warranty repair options are and how to exercise them. This forum isn't an official channel for support from the manufacturer.

Good luck,

-- Nathan
 
InoX
Forum Guru
Posts: 1966
Joined: Tue Jan 09, 2007 6:44 pm

Re: Faulty RB1100AHx2?

Sat Mar 07, 2015 7:18 pm

It might be worth trying to swap the DIMM for another one of equivalent specs.
This should be the first step in troubleshooting.
 
hypernik
just joined
Topic Author
Posts: 21
Joined: Thu Mar 28, 2013 3:01 am

Re: Faulty RB1100AHx2?

Sun Mar 08, 2015 5:29 am

Swapped the RAM out with a scavenged Samsung-branded 2GB DDR2 SODIMM. All memory tests came back clear. For kicks, I decided to try port 12 again, and it works now. Maybe the port problem was related to memory address layout and how ports access/utilize memory :?:
Thanks for the responses and the fairly gentle "kick" in the right direction. :)
I think the problem could have been caught before shipment through proper burn-in. At the least, I think MikroTik has an image problem: the chassis material and weight/feel, design decisions such as not offering an access panel to the RAM and SD interface or exposing an SD slot, the odd internally-mounted power adapter. Put everything together and it looks/feels like something designed and built in a hobbyist's shop or garage. And the stability problems upon arrival...altogether, it adds up to a bad first impression. Hopefully my first impressions are proven wrong.
 
NathanA
Forum Veteran
Posts: 829
Joined: Tue Aug 03, 2004 9:01 am

Re: Faulty RB1100AHx2?

Sun Mar 08, 2015 8:48 am

For kicks, I decided to try port 12 again, and it works now. Maybe the port problem was related to memory address layout and how ports access/utilize memory :?:
Perhaps. Another thought I had was that if you look at a block diagram for an RB1100, you will see that different ports are "lit" in different ways...some are directly attached to the SoC, many have a switch chip between them and the SoC, and a couple of ports are provided by means of additional PCIe-bus-based ethernet chips, including the one that you were having problems with. It could be that as the memory deteriorated, the order in which the system booted up and loaded drivers for various hardware components increased the odds that the code that drove those particular ethernet ports ended up being loaded into a memory address covered by the bad part of the RAM module.
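
A rough way to see why a growing bad region would make this more likely: the sketch below counts, for a toy memory of numbered pages, what fraction of possible placements of a driver-sized block would touch the bad range. It assumes every placement is equally likely, which is a simplification and not how any real kernel allocates memory; `overlap_probability` and all of the numbers are hypothetical.

```python
def overlap_probability(total_pages, bad_start, bad_len, alloc_len):
    """Fraction of possible block placements that touch the bad page range.

    Toy model: a block of alloc_len contiguous pages may start at any page
    where it still fits, and every start position is assumed equally likely.
    """
    bad_end = bad_start + bad_len  # exclusive end of the bad range
    starts = range(0, total_pages - alloc_len + 1)
    hits = sum(
        1 for s in starts
        if s < bad_end and s + alloc_len > bad_start  # half-open intervals overlap
    )
    return hits / len(starts)


if __name__ == "__main__":
    # Hypothetical numbers: 100 pages of RAM, a 10-page driver,
    # and a bad region that starts at page 50 and grows over time.
    for bad_len in (1, 5, 10, 20):
        p = overlap_probability(100, 50, bad_len, 10)
        print(f"bad pages: {bad_len:2d}  chance driver lands on them: {p:.1%}")
```

In this toy model the chance of a given driver landing on bad memory rises steadily as the bad range widens, which fits the pattern of a port that fails intermittently and then recovers once the module is replaced.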
At least, I think MikroTik has an image problem: the chassis material and weight/feel, design decisions [...]
MikroTik as a company has been evolving at a pretty rapid pace, and while it doesn't excuse the decisions you allude to, knowing their history might help put some of those decisions into context. They started out just doing software, selling licenses for RouterOS that you would install on commodity x86 hardware. (They still do this, and other hardware vendors make their own x86-based RouterOS solutions.) The hardware side is relatively recent, and the all-in-one, "no assembly required" hardware even more recent still. If you browse routerboard.com even today, you will see that a huge percentage of their offerings are single-board computers *without* a case or power supply included. System integrators were responsible for putting the boards in a case (which many times was a custom third-party enclosure), sourcing compatible power supplies, and so on.

All this to say that I think they are still learning, though in my opinion their hardware quality is pretty good. They used to use aluminum for their cases, which made them extremely light but which might contribute to the feeling that you interpret as being "cheap." The CCR series cases feel much more weighty and solid, and many of them have an exposed microSD slot on the front panel, etc. Also, I get the feeling you are trying to compare their gear at $300-400 to equivalent gear that you might get from a different vendor but which might cost 10x as much, so that isn't exactly an apples-to-apples comparison. Instead of comparing an RB1100 to, say, a Juniper SRX240 and concluding, "man, this MikroTik router strikes me as cheap junk," perhaps it would be more fair to conclude, "whoa, look how much functionality I get for the price!"

-- Nathan
 
hypernik
just joined
Topic Author
Posts: 21
Joined: Thu Mar 28, 2013 3:01 am

Re: Faulty RB1100AHx2?

Mon Mar 09, 2015 5:53 pm

It could be that as the memory deteriorated, the order in which the system booted up and loaded drivers for various hardware components increased the odds that the code that drove those particular ethernet ports ended up being loaded into a memory address covered by the bad part of the RAM module...
Sounds like a logical explanation. Really appreciate your thorough response!
 
xezen
Long time Member
Posts: 628
Joined: Fri May 30, 2008 10:23 am
Location: South Africa

Re: Faulty RB1100AHx2?

Tue Dec 13, 2016 7:37 pm

Could this be the reason that ports 1-5 and 6-10 don't work, while 11 and 12 are the only working ports?
 
sadoon
just joined
Posts: 1
Joined: Mon Jan 02, 2017 8:08 pm

Re: Faulty RB1100AHx2?

Mon Jan 02, 2017 8:39 pm

I am having a problem with my MikroTik 1100: when it reboots, the startup beep does not sound for the second run, and it stays that way. What should I do?
