Netinstall on RB3011 fails, need help (technical questions)

I have an RB3011 UiAS-RM and cannot get a successful netinstall to work.
RouterOS is 6.49.11 and is working on this unit.

The router works, and I am able to upgrade/change firmware via the Files functionality.
I cannot get the router to perform a netinstall.
I’m very familiar with the process and how it should work.
I can also netinstall without any problems on other RB3011 routers; they work.
Do I have a bad router? A corrupt or non-working bootloader?
And is there any way to fix this?
I’m really puzzled and stuck on this one, and I have tried a lot of things:
updating RouterOS via Files, upgrading RouterBOOT via Winbox, etc.
I seem to be able to update the router just fine this way, but cannot get a successful netinstall for any version of RouterOS/netinstall.
I DO want to format/wipe the filesystem.
If I try to netinstall 6.49.11, it says “Installing” for about 5 seconds, then switches to “READY”. It never formats the drive or installs RouterOS.

If I try to netinstall 7.18.2 (using the 7.18.2 netinstall software), it does a TFTP transfer at the beginning and fails with “malformed packet”, and the router never “shows up” in netinstall.


7.18.2 netinstall:

No.	Time	Source	Destination	Protocol	Length	Info
52063	925.877110	10.73.73.3	10.73.73.99	TFTP	60	Acknowledgement, Block: 2794
52064	925.877123	10.73.73.99	10.73.73.3	TFTP	1498	Data Packet, Block: 2795
52065	925.877863	10.73.73.3	10.73.73.99	TFTP	60	Acknowledgement, Block: 2795
52066	925.877876	10.73.73.99	10.73.73.3	TFTP	1498	Data Packet, Block: 2796
52067	925.878565	10.73.73.3	10.73.73.99	TFTP	60	Acknowledgement, Block: 2796
52068	925.878591	10.73.73.99	10.73.73.3	TFTP	1498	Data Packet, Block: 2797
52069	925.879239	10.73.73.3	10.73.73.99	TFTP	60	Acknowledgement, Block: 2797
52070	925.879251	10.73.73.99	10.73.73.3	TFTP	330	Data Packet, Block: 2798 (last)
52071	925.879494	10.73.73.3	10.73.73.99	TFTP	60	Acknowledgement, Block: 2798
52072	925.879571	10.73.73.99	10.73.73.3	BOOTP	46	Unknown BOOTP message type (112)[Malformed Packet]
52073	926.436643	ASUSTekCOMPU_09:b8:5e	Broadcast	ARP	42	Who has 10.73.73.31? Tell 10.73.73.99
52074	926.468835	10.73.73.99	10.73.73.3	UDP	46	5000 → 5001 Len=4[Malformed Packet]
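
A rough arithmetic check on the capture above, as a sketch in Python. It assumes Wireshark's Length column is the full Ethernet frame (14-byte Ethernet header, 20-byte IPv4 header with no options, 8-byte UDP header, 4-byte TFTP data header) and that every block before the one marked "(last)" was full-size, as TFTP requires. The function names are made up for illustration.

```python
# Assumptions (not stated in the thread): the "Length" column is the whole
# Ethernet frame, with no VLAN tag and no IPv4 options.
ETH_HEADER = 14
IPV4_HEADER = 20
UDP_HEADER = 8
TFTP_DATA_HEADER = 4  # 2-byte opcode + 2-byte block number

OVERHEAD = ETH_HEADER + IPV4_HEADER + UDP_HEADER + TFTP_DATA_HEADER


def tftp_payload(frame_length: int) -> int:
    """Bytes of file data carried by one TFTP data frame."""
    return frame_length - OVERHEAD


def transfer_size(full_frame: int, last_frame: int, last_block: int) -> int:
    """Total file size when all blocks except the final one are full-size."""
    full_blocks = last_block - 1  # TFTP block numbers start at 1
    return full_blocks * tftp_payload(full_frame) + tftp_payload(last_frame)


# Values read from the capture: 1498-byte data frames, a 330-byte final
# frame, and 2798 as the number of the final block.
block_size = tftp_payload(1498)
total = transfer_size(1498, 330, 2798)
print(block_size, total)  # 1452 4061528
```

Under those assumptions each block carries 1452 bytes and the file is roughly 4.06 million bytes. Since the short final block 2798 is acknowledged, the capture suggests the TFTP download itself ran to completion, and that whatever goes wrong happens in the exchange that follows it.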


7.1 netinstall: fails; the router exits etherboot and boots the OS.

6.49.11 netinstall:

Try to install RouterOS 6.49.11 (using the 6.49.11 netinstall software)

Click "Install"
Says "INSTALLING" for 5 seconds then switches to "READY"  
Never completes.  

Small packet flow forever:  
UDP	197	5000 → 5000 Len=155   
About two packets per second.  
Netinstall never happens.
I tried a number of other things (different RouterOS versions, different versions of netinstall, etc.)
and got similar (but not identical) results, with netinstall not working for this unit.
I am trying to keep this initial post as short as possible, so I am not including all of the other things I tried that also did not work.

Looking for help and any other things I might try.


6.49.11 netinstall attempt: (router does show up, but the install never happens and this small packet flow repeats forever)

No.	Time	Source	Destination	Protocol	Length	Info
7867	67.624170	0.0.0.0	255.255.255.255	UDP	197	5000 → 5000 Len=155
7868	68.495311	ASUSTekCOMPU_09:b8:5e	Broadcast	ARP	42	Who has 10.73.73.31? Tell 10.73.73.99
7869	68.625474	0.0.0.0	255.255.255.255	UDP	197	5000 → 5000 Len=155
7870	69.500095	ASUSTekCOMPU_09:b8:5e	Broadcast	ARP	42	Who has 10.73.73.31? Tell 10.73.73.99
7871	69.626489	0.0.0.0	255.255.255.255	UDP	197	5000 → 5000 Len=155
7872	70.500616	ASUSTekCOMPU_09:b8:5e	Broadcast	ARP	42	Who has 10.73.73.31? Tell 10.73.73.99
7873	70.627937	0.0.0.0	255.255.255.255	UDP	197	5000 → 5000 Len=155
7874	71.495265	ASUSTekCOMPU_09:b8:5e	Broadcast	ARP	42	Who has 10.73.73.31? Tell 10.73.73.99
7875	71.629074	0.0.0.0	255.255.255.255	UDP	197	5000 → 5000 Len=155
7876	72.499828	ASUSTekCOMPU_09:b8:5e	Broadcast	ARP	42	Who has 10.73.73.31? Tell 10.73.73.99
7877	72.630377	0.0.0.0	255.255.255.255	UDP	197	5000 → 5000 Len=155
7878	73.004844	ASUSTekCOMPU_09:b8:5e	Broadcast	ARP	42	Who has 10.73.73.1? Tell 10.73.73.99
7879	73.631555	0.0.0.0	255.255.255.255	UDP	197	5000 → 5000 Len=155
7880	73.998166	ASUSTekCOMPU_09:b8:5e	Broadcast	ARP	42	Who has 10.73.73.1? Tell 10.73.73.99
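
To put numbers on the "about two packets per second" observation, here is a small Python sketch that parses the rows of the capture above. It assumes the columns are whitespace-separated in the order No., Time, Source, Destination, Protocol, Length, Info; the helper names are hypothetical.

```python
from collections import Counter

# Rows copied from the capture above.
CAPTURE = """\
7867 67.624170 0.0.0.0 255.255.255.255 UDP 197 5000 → 5000 Len=155
7868 68.495311 ASUSTekCOMPU_09:b8:5e Broadcast ARP 42 Who has 10.73.73.31? Tell 10.73.73.99
7869 68.625474 0.0.0.0 255.255.255.255 UDP 197 5000 → 5000 Len=155
7870 69.500095 ASUSTekCOMPU_09:b8:5e Broadcast ARP 42 Who has 10.73.73.31? Tell 10.73.73.99
7871 69.626489 0.0.0.0 255.255.255.255 UDP 197 5000 → 5000 Len=155
7872 70.500616 ASUSTekCOMPU_09:b8:5e Broadcast ARP 42 Who has 10.73.73.31? Tell 10.73.73.99
7873 70.627937 0.0.0.0 255.255.255.255 UDP 197 5000 → 5000 Len=155
7874 71.495265 ASUSTekCOMPU_09:b8:5e Broadcast ARP 42 Who has 10.73.73.31? Tell 10.73.73.99
7875 71.629074 0.0.0.0 255.255.255.255 UDP 197 5000 → 5000 Len=155
7876 72.499828 ASUSTekCOMPU_09:b8:5e Broadcast ARP 42 Who has 10.73.73.31? Tell 10.73.73.99
7877 72.630377 0.0.0.0 255.255.255.255 UDP 197 5000 → 5000 Len=155
7878 73.004844 ASUSTekCOMPU_09:b8:5e Broadcast ARP 42 Who has 10.73.73.1? Tell 10.73.73.99
7879 73.631555 0.0.0.0 255.255.255.255 UDP 197 5000 → 5000 Len=155
7880 73.998166 ASUSTekCOMPU_09:b8:5e Broadcast ARP 42 Who has 10.73.73.1? Tell 10.73.73.99
"""


def parse_rows(text):
    """Split each capture row into time, protocol and info fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 6)  # Info may contain spaces, keep it whole
        if len(parts) < 7 or not parts[0].isdigit():
            continue  # skip headers and blank lines
        rows.append({"time": float(parts[1]), "proto": parts[4], "info": parts[6]})
    return rows


def udp_intervals(rows):
    """Seconds between consecutive UDP packets."""
    times = [r["time"] for r in rows if r["proto"] == "UDP"]
    return [round(later - earlier, 6) for earlier, later in zip(times, times[1:])]


def arp_targets(rows):
    """Count how often each IP address is asked for in ARP 'Who has' requests."""
    targets = Counter()
    for r in rows:
        if r["proto"] == "ARP" and r["info"].startswith("Who has "):
            targets[r["info"].split()[2].rstrip("?")] += 1
    return targets


rows = parse_rows(CAPTURE)
print(udp_intervals(rows))
print(arp_targets(rows))
```

On these rows the 5000 → 5000 broadcast from 0.0.0.0 repeats roughly once per second, and the PC's ARP requests (five for 10.73.73.31, two for 10.73.73.1) make up the rest of the traffic. Nothing in this excerpt is addressed to the router by the PC other than those ARP broadcasts.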

7.1 netinstall attempt:
Router never shows up:
etherboot exits and boots RouterOS on its own soon after netinstall for 7.1 is run.

I’m able to upgrade to RouterOS 7.18.2 using the Files menu and /system/routerboard/upgrade.
But I am still unable to do a netinstall on this router using netinstall 7.18.2.
The router never shows up, the data flow (TFTP) stops, and I get the malformed packet displayed in my packet capture when it stops.

This process works fine on a different RB3011 router, and I do not get any “malformed packet” when I try this on that router.

When you say Netinstall works to one RB3011 but not to this other one, are you actually doing exact apples-to-apples comparison/testing? In other words, you are plugging eth1 of both RB3011s directly into the exact same ethernet port of the exact same PC? It’s not like you are physically hooking one of them up to the PC differently than the other, or that you are using 2 different PCs in your tests, etc?

Assuming this is a fair comparison where the ONLY variable that is changing is the RB3011 itself, the only thing that occurs to me to ask is, are the working and non-working RB3011s both running the same version of RouterBOOT or not, and is RouterBOOT configured identically on both? (Have you tried resetting RouterBOOT settings to defaults on the non-working one?)

Thank you for responding and thank you for your help!

Yes, I have done this very carefully, and everything was exactly identical in the ways you asked about, with no variances whatsoever.
Just for perfect clarification: what exactly do you mean by “Have you tried resetting RouterBOOT settings to defaults”?
Yes, I believe I have (I’ve reset everything to defaults), multiple times for various attempts,
both via the reset button/power-up method and via Winbox reset configuration.
I’m pretty sure you are referring to the reset button reset factory defaults method.
And I will try it yet again just to make absolutely sure I tried that, or tried it “enough times”.
I know in some rare cases certain hardware needs to be “reset twice” to get everything.
But I have indeed tried it at least once with the RouterOS and RouterBOOT that are on it now (7.18.2).
"
Plug in the power: While still holding the reset button, plug the power cord back into the router.
Wait for the LED to blink: Continue holding the reset button until the ACT (activity) LED starts flashing.
Release the reset button: Once the LED starts flashing, release the reset button.
Wait for the reset process to complete: The router will now reset to its factory default configuration"

Take a look at the docs on this page, to see how you reset RouterBoot to default settings, and how RouterBoot works.

https://help.mikrotik.com/docs/x/SQAoC

Hi,

Perhaps you or someone has enabled protected routerboot at some stage.

https://help.mikrotik.com/docs/spaces/ROS/pages/40992878/RouterBOARD#RouterBOARD-Protectedbootloader

Routerboot/etherboot is not disabled.
It looks like I may need to try this with a serial console:
r - reset booter configuration
Is this different from what you do by pressing the button to reset the config?
And is it the only way to do this?
Does it do any more than just clear this setting?
b - booter options: select which bootloader to use by default
There are only two, and you can select them using the reset button on power-up.

I’m also seeing more extreme options:
e - format storage: destroys all data on the NAND, including RouterOS configuration and license
I’d need to get familiar with how to get it back, how to back up and restore the license, etc.
I’m guessing this clears everything, including the boot loader.
And I’d need to figure out how to get it all back.
Probably a good learning exercise anyway, and may point out a hardware problem if there really is one.
It also flies by and mentions “CAPS mode” but does not explain at all what that is.

In any case, the thing is broken and I could stand to learn something new.
Oddly, the router seems to be fine functionally.
It worked when I got it with 6.49.11 on it, but netinstall would not work.
I was able to upgrade it to 7.18.2 (current release) using Winbox and it works, but I still can’t perform a netinstall.
I hope to get to the bottom of whatever this issue actually is.
Thanks for the pointers, I’m still a bit lost but will eventually figure this out.
If you have more pointers, or thoughts on whether doing the things I mentioned above will be, um, educational, just let me know what’s in store.
Thanks!

Check the protected routerboot option.

[admin@450] > system/routerboard/settings/print

auto-upgrade: no
baud-rate: 115200
boot-delay: 2s
enter-setup-on: any-key
boot-device: nand-if-fail-then-ethernet
cpu-frequency: auto
boot-protocol: bootp
enable-jumper-reset: yes
force-backup-booter: no
silent-boot: yes
*** protected-routerboot: enabled ***
reformat-hold-button: 20s
reformat-hold-button-max: 10m

You can disable it, then turn the unit off within 60 s to get the change to stick.
(Or you can just hold the reset button on power-up for a time between, in this case, 20 s and 10 min.)

              auto-upgrade: no
                 baud-rate: 115200                    
                boot-delay: 2s                        
            enter-setup-on: any-key                   
               boot-device: nand-if-fail-then-ethernet
         preboot-etherboot: disabled                  
  preboot-etherboot-server: any                       
             cpu-frequency: 1400MHz                   
             boot-protocol: bootp                     
       enable-jumper-reset: yes                       
       force-backup-booter: no                        
               silent-boot: no                        
      protected-routerboot: disabled                  
      reformat-hold-button: 20s                       
  reformat-hold-button-max: 10m

What is “jumper reset”?
I have the cover off and have not located any jumper or anything that looks like pads.
Or is this just terminology for the reset switch that also enables etherboot?

At this point I mildly suspect a hacked or otherwise tampered-with bootloader (etherboot), which is one of the reasons I’d really like to do a netinstall on it.
And of course that’s not working.
I’m open to completely wiping the flash (NAND format, etc.) and having to rebuild it if that’s an option; it would be a good learning experience if I can find the steps to do it.
Netinstall does indeed work fine for me on a different RB3011 with the exact same network/Windows netinstall arrangement.
BOTH came to me used and in “good working service” from sites, with 6.49.11 installed and working in service on them.
The “bad” one seems to work fine: I upgraded it to 7.18.2 via Winbox (uploading, rebooting, and upgrading RouterBOOT within Winbox).
But I don’t totally trust it because of the whole netinstall-will-not-work ordeal. And I’m wanting to learn something here.

Mikrotik devices have TWO bootloaders.
Which one is selected at boot depends on the exact way you are resetting it.
Did you try both ways?

https://help.mikrotik.com/docs/spaces/ROS/pages/24805498/Reset+Button

How to reset configuration

  1. Unplug the device from power;
  2. Press and hold the button RIGHT AFTER applying power;
    Note: hold the button until the LED will start flashing;
  3. Release the button to clear the configuration;

BUT:

Loading the BACKUP RouterBOOT loader
Hold this button BEFORE applying power, and release it after three seconds since powering, to load the backup boot loader.

Yes, I am long past trying things with the two different bootloaders.
The problem is with etherboot, which is separate from the two bootloaders as far as I understand,
so loading the alternate bootloader/RouterBOOT has no effect on my problem with netinstall.

Semi-random ideas/things to try/check.

But you have normal access to the router, don’t you?
So you can try to manually set etherboot in configuration:

/system routerboard settings set boot-device=try-ethernet-once-then-nand

Even if your netinstall setup is correct and works for other router(s), you can also try to insert between the devices a “dumb” switch as this has been reported to somehow allow netinstall on pesky devices.

Also, double-check and triple-check your current firmware and exact netinstall version, just in case some mismatch is causing the issues.

What is the “ASUSTekCOMPU_09xxxx” that appears in your report?
Could it be that those ARP requests disturb netinstall?

I’m far past putting a hub or dumb switch between the two devices; I have already done this.
Also, the problem is NOT starting or getting etherboot to run; it runs with the button press method.
The problem is that it blows up, crashes, or stops in the middle of the initial TFTP transfer from the bootp server.
The single PC running netinstall is the bootp server.
The ARP requests come from the PC itself somehow, and they only happen when netinstall is running.
Netinstall (the bootp server) has its own IP address, separate from the PC’s IP address, on the same subnet.
I think it may be the PC ARPing for netinstall’s IP address.
I also see these with the other, “good” RB3011 that does the netinstall just fine.
Yes, the “bad” RB3011 works fine other than netinstall not working on it, which is what this post is about.
The PC (running netinstall) and the RB3011 are the only things on the network.
Of course, before I even posted here I tried all of the things being asked about, including a dumb switch, a hub, and a direct connection between the two.
Same results in all cases I have tried.

After all of your tests and answers, I am inclined to think you are right that there is something uniquely wrong with the one 3011 unit. Now the obvious question is, what is it?

First, just to clear up a few possible misconceptions:

Netinstall does NOT re-flash RouterBOOT. If there is something wrong with your RouterBOOT, a Netinstall will therefore NOT fix it. Netinstall only reinstalls RouterOS itself, and reformats the part of the internal flash where RouterOS itself lives and boots from. It doesn’t touch RouterBOOT at all.

Think of RouterBOOT as more like your PC’s BIOS (or EFI, to use the more modern form), which is the firmware that first runs at power-up and is responsible for bootstrapping everything. While RouterOS is (hence the “OS” part of the name) like the operating system you run on your PC. Just as reinstalling Windows on your PC doesn’t cause your BIOS/EFI to get re-flashed, reinstalling RouterOS does not cause RouterBOOT to be re-flashed.

I think the fact that MikroTik now releases RouterBOOT and RouterOS updates in concert with each other & uses the same version numbers for both has conflated these things in many people’s minds, and has caused confusion for some about what exactly RouterBOOT’s role is. But it was not always this way…versioning of each used to be completely separate and entirely unrelated (as were the cadences of the release cycles of both). Don’t let the version numbers fool you. A RouterBOOT version of 7.18.2 is just the version of RouterBOOT that happened to be released at the same time as RouterOS 7.18.2. That’s all.

The way to “re-flash” RouterBOOT is simply to issue a “/system/routerboard/upgrade” command. This will cause whatever version of RouterBOOT was bundled with the currently running version of RouterOS to get flashed in place of whatever version was in flash memory before. (And, yes, it completely replaces the former copy of RouterBOOT during this “upgrade”. So if you suspect any corruption of any kind in RouterBOOT, simply running “upgrade” should in theory fix it.) In some scenarios, it is even possible to downgrade RouterBOOT.

There are actually two copies of RouterBOOT on every RouterBOARD: the primary bootloader, and the backup bootloader. The primary bootloader is the only one that gets updated when you run a “routerboard upgrade”. The backup bootloader pretty much stays at whatever version the 'board came with from the factory. The backup bootloader version is what is represented by “factory-firmware” when issuing “/system/routerboard/print”. The backup bootloader is there JUST IN CASE your primary one becomes corrupt somehow, to make the hardware more difficult to permanently “brick”. You can engage the backup bootloader using the particular means documented for your specific RB model in its manual. Typically, for most devices, this involves holding in Reset button BEFORE applying power, and then releasing after any length of time post-power-up. (The Reset button has multiple purposes on most models, and you can actually trigger multiple different actions, depending on exactly when you start holding it down, and when you let it go. If you only start holding in Reset immediately after applying power, then Reset button only affects resetting the RouterOS config and/or triggering netboot for Netinstall.)

So if you have already tried holding in Reset BEFORE powering up, and Netinstall behaves the same way, then you have already engaged the backup bootloader’s services, and thus we can conclude that the problem isn’t with the bootloader.

Or can we…?

Are you holding down Reset before applying power EVERY TIME you try a Netinstall? And have you ALWAYS done that? Or have you ever tried to trigger netboot on this 3011 a different way? If you have always done it the exact same way, every time, then maybe the problem IS with the bootloader…specifically, the backup bootloader! Is it at all possible that “factory-firmware” on your two RB3011s is NOT identical? Is it therefore possible that the version of “factory-firmware” that shipped on the troublesome 3011 is, in fact, somehow buggy in a way that prevents Netinstall from working??

The way to test this would be to engage the netboot feature in a DIFFERENT way, one that does NOT trigger the backup bootloader. These methods would be either:


  1. …following the suggestion from “jaclaz” in the prior response, where he showed you how to trigger a netboot attempt on the next reboot, without relying at all on the Reset button (you dismissed this without considering that this does in fact behave in a qualitatively different way)
  2. …only pressing and holding down Reset AFTER first applying power

If you try method #2 and still see the same results, honestly I think it would still be worth trying #1, just in case you are being “too quick on the draw” & the timing of your Reset button press is at all in question.

I scrolled back through your prior responses, and also didn’t see you ever actually answer one of my earlier questions:

Are your two 3011s running the same version of RouterBOOT, or different versions?

You only partially answered this…you mentioned that you had upgraded RouterBOOT on the problematic 3011 to 7.18.2. But you said absolutely nothing (at least, unless I missed it) about what version of RouterBOOT the GOOD 3011 has on it.

So, at this point, I’ll ask explicitly:

Paste the output of “/system/routerboard/print” as well as “/system/routerboard/settings/print” from BOTH 3011s. That way, we all have as much detail as we can possibly glean both about the primary AND backup RouterBOOT versions on BOTH devices, how BOTH of them are configured, as well as other potentially interesting and possibly pertinent details about any differences that there might be between the two devices (e.g., maybe one is a “r2” device and the other isn’t, or something along those lines? In which case, there is also a possible hardware difference between them?)
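
Once both printouts are pasted, comparing them by eye is error-prone, so here is a minimal Python sketch that diffs two “key: value” printouts. The function names are hypothetical, and the sample text is only a few lines excerpted from the two settings printouts posted earlier in this thread (which may well come from different devices), used purely to exercise the code.

```python
def parse_settings(text: str) -> dict[str, str]:
    """Turn RouterOS-style 'key: value' print output into a dictionary."""
    settings = {}
    for line in text.splitlines():
        # Drop indentation and any '***' markers someone added for emphasis.
        line = line.strip().strip("*").strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (value_in_a, value_in_b)} for every key that differs.

    A key missing from one printout shows up with None on that side.
    """
    diffs = {}
    for key in sorted(set(a) | set(b)):
        if a.get(key) != b.get(key):
            diffs[key] = (a.get(key), b.get(key))
    return diffs


# Excerpts from the two printouts earlier in the thread (illustration only).
first = """
boot-device: nand-if-fail-then-ethernet
cpu-frequency: auto
silent-boot: yes
*** protected-routerboot: enabled ***
"""
second = """
               boot-device: nand-if-fail-then-ethernet
         preboot-etherboot: disabled
             cpu-frequency: 1400MHz
               silent-boot: no
      protected-routerboot: disabled
"""

differences = diff_settings(parse_settings(first), parse_settings(second))
for key, (left, right) in differences.items():
    print(f"{key}: {left!r} vs {right!r}")
```

The same two functions work on “/system/routerboard/print” output, since it uses the same “key: value” layout, so differences in factory-firmware or current-firmware between the two 3011s would show up in the same way.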

In the end, I have my doubts it is a RouterBOOT problem, but…who knows?

The only other possibility that occurs to me is that there could be some kind of hardware fault with the “bad” 3011, specifically with the ether1 port. I’m wondering if it is having trouble training or maintaining a solid ethernet link as a result. If the ethernet link on ether1 is “flapping” at all while it is downloading the Netinstall payload, that would also cause the symptoms you are seeing. I have, for example, run across RB devices where somebody tried to, say, plug a PoE injector into one of the ports that doesn’t support PoE input, and either fully fried it, or just partially fried it (sometimes it links, sometimes if it does it isn’t reliable, and/or it takes a long time to link or to have the link stabilize, and/or it just won’t link at higher speeds anymore like 1Gbit but will eventually auto-neg down to 100Mbit, etc.). It’s not likely that is what happened here (after all, ether1 on most RBs is PoE-input-capable, and indeed the 3011 is one such model this is true of), but it could have been damaged in a different way, it could have been defective straight from the factory, etc.

Following similar lines to @NathanA,

Your backup routerboot might well already be at the updated version, so the following will not be helpful. But just in case.

You can update your backup routerboot as described in the protected routerboot information linked to earlier.
It might then behave better when trying to netinstall.

What I don’t understand (but likely it is just me, there are so many things I don’t understand) are the ARP requests shown.

The PC has 10.73.73.99 .
The router has 10.73.73.3.
They are the only two devices in the network.

Then WHY (the heck) is the PC sending requests for 10.73.73.1 and for 10.73.73.31?

Would changing the network addresses to a /30, with the PC 10.73.73.2 and the router 10.73.73.1 make any difference?
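
For reference, this is what a /30 would look like, sketched with Python's standard ipaddress module. The network 10.73.73.0/30 is simply the one implied by the two addresses proposed above; nothing here is specific to netinstall.

```python
import ipaddress

# The /30 implied by "PC 10.73.73.2, router 10.73.73.1".
proposed = ipaddress.ip_network("10.73.73.0/30")

usable_hosts = [str(host) for host in proposed.hosts()]
netmask = str(proposed.netmask)
broadcast = str(proposed.broadcast_address)

# The two addresses the PC was seen ARPing for in the captures.
arp_targets_seen = ["10.73.73.1", "10.73.73.31"]
on_link = {ip: ipaddress.ip_address(ip) in proposed for ip in arp_targets_seen}

print(usable_hosts)  # ['10.73.73.1', '10.73.73.2']
print(netmask)       # 255.255.255.252
print(broadcast)     # 10.73.73.3
print(on_link)       # {'10.73.73.1': True, '10.73.73.31': False}
```

With a /30, 10.73.73.31 would no longer be on-link, so the PC should have no reason to ARP for it on that interface, while 10.73.73.1 would be the router itself. Note that the router's current address, 10.73.73.3, would become the broadcast address of that /30, so both ends would have to be renumbered as suggested.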

I do believe these are red herrings.

Undoubtedly ARP requests for 10.73.73.1 are being transmitted because I’d imagine that @Michiganbroadband configured the IP address on his ethernet interface in Windows statically, and habitually typed in “10.73.73.1” into “Default Gateway” instead of leaving it blank. So the Windows PC is occasionally trying to ARP for its default gateway, probably because Windows is trying to do an “am I online?” check to some Microsoft cloud server. Or maybe it’s trying to check in with a public NTP server. Or something. Who knows! The point is that some other running process on Windows is just repeatedly trying (and failing) to get to the internet. That doesn’t matter.

The ARP requests for 10.73.73.31 could have an explanation that is equally as innocent. Perhaps at one point he had another (different) device hooked up to the computer that was configured to be in the same subnet. Maybe it has a web interface, and he accessed the web interface on it, and then forgot to close the tab after unplugging the device and plugging the RB3011 in its place. So that browser tab could be trying to refresh contents of the page in the background or something. Again: doesn’t matter, shouldn’t make a difference. We can see that both these ARP requests and the ones to 10.73.73.1 are going unanswered, so clearly those hosts aren’t on the same network right now. That the computer is trying to ARP for them just indicates that some running process on the machine wants to talk to those IPs for some reason.

Thank you VERY much for your reply and for clearing up RouterBOOT as you did.
I will indeed take the time to answer each and every one of your questions in perfect clarity.
I got super busy with a critical, pressing project for 2-3 days, and I’ll have to come back to this when I get that done.
I am clear on what etherboot vs. RouterBOOT are, in terms of needing to use etherboot for netinstall and what etherboot does.
I have always started etherboot the exact same way, by holding the button until it gets past both RouterBOOT options.
Your message cleared up for me how the versioning works and what exactly gets upgraded when performing that task. I was unsure whether it was also able to update etherboot, and whether or not it did so when it was done.
And yes, both RB3011s are software-identical in terms of OS and RouterBOOT.
I (possibly unfortunately) upgraded both RouterBOOT and the backup RouterBOOT to the latest version, so I no longer have the old factory one installed.
But I did not do this until I had tried everything else while the old RouterBOOT was still present.
None of that should have any effect on etherboot anyway, if my understanding is now correct (it is the unbrickable part of the firmware), at least until I get out the serial console and attempt wiping everything (if that’s even permitted and recoverable). Sounds like it might require Xmodem, LOL. I do miss the '80s.
On my next test run I will make sure the network settings are clean (I’m pretty sure I did not have the gateway configured, and the only interface enabled was the one I was using to attempt netinstalls).
The ARP stuff is a bit weird in that I see ARP requests between the netinstall Windows host AND the bootp server IP address that is configured for the netinstall software itself.
Windows 11 also includes some “discovery protocols” bound to the interfaces by default; I could also make sure any/all of that is turned off and that only IPv4 is enabled on it.
But I was seeing that present with both routers (working and non-working).
I will work on getting you solid answers to your questions so we can compare EVERYTHING as you have asked.
But gimme a couple of days.
THANKS!!!