Hello All, I originally posted this yesterday in the main “License and V4 Issues” topic, but feel it may be a little outside that topic, so I am reposting in the main board. I was not able to move the topic, so hopefully Admin can delete the original post at his leisure.
Hark /b/rothers and hear my tale of woe…
I just had two very unfortunate experiences when upgrading a four-proc server and a two-proc PC from ROS 3.30 to ROS 4.2.
I followed the steps outlined on the download page, i.e., upgraded the license file, etc.
-
The server. After reboot, three Intel Pro/1000 PCI cards would no longer function as they no longer had drivers! The two onboard Realtek ethernet ports worked fine. I had to downgrade back to 3.30 and everything returned to normal, except that the hotspot put another set of files on top of the ones that were in use, and that had to be cleaned up. A supout.rif was emailed to Mikrotik.
-
The PC. In a moment of mental retardation, I upgraded this over the link as it is only 1km away, but lies through Taliban territory. On reboot, the PC never came back to life. Sighing mightily, I strapped on my 9mm and body armor and drove the 1km from Kandahar Airfield to FOB Lindsey. Thankfully, I had the presence of mind to take a monitor and keyboard with me, because when I got there, both the onboard and two PCI Realtek 8139 ethernet ports had become autistic and no longer spoke to the outside world. An Ouija board was not at hand, and I could not login via WinBox to drop the downgrade files, so I wound up having to use NetInstall and start from scratch. Thankfully, I had the forethought to make a configuration backup. Again the buggardly thing had no drivers.
Routerboard RB433, RB450G and RB600A boards had no problems with the upgrade except they would not act as NTP clients to the NTP server. Initially when all were at ROS 3.30 the server was set at manycast and each routerboard was set to unicast and all was well in dogpatch. Now they will not connect to the ROS 3.30 server to get NTP. I have not set the server to broadcast yet, but I would rather not if this can be resolved.
I have submitted this to Mikrotik, but I’m hoping that someone else has seen this issue and perhaps found a workaround for it. If nothing else, heed my warning and wait a little before upgrading. It strikes me that there may be a serious issue with the driver package in ROS 4.2.
Been a horrible day and I sure wish I could lay my hands on a bottle of Johnny Walker Blue!
RadioResearch
v4 has a new kernel and MikroTik cannot test it with every ethernet card in existence.
I guess it's time for them to either
- never break an ethernet driver again
or
- have a fail-safe script that would detect the missing driver/no connectivity and fall back on the previous RouterOS version
This can partially be done with a simple script written in the RouterOS scripting language that pings a host and, if the ping fails for 10 minutes, downgrades the installation.
But in order to have the npks for the downgrade, they have to be allowed to sit in Files during the upgrade of the new version. I am not sure if for example routeros-3.30.npk will survive just sitting there while you upload routeros-4.2.npk and reboot the router. I think the upgrade process will detect the old npk, put something about it in the log, and delete it from disk.
Can we rename an .npk file so that the upgrade does not see it? Like a fail-safe file on the disk that our script would rename to .npk in case a downgrade is needed to restore communication with the hosts selected for pinging …
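
To make the idea concrete, here is a rough sketch of just the decision logic (when to give up and fall back), written in Python rather than RouterOS script so it can be tried anywhere. The names Watchdog and first_trigger are made up for illustration, and the actual ping and the actual downgrade are deliberately left out, since whether the old .npk can be kept on disk is exactly the open question above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Ten minutes of failed pings, as proposed in the post above.
FAIL_WINDOW_SECONDS = 10 * 60


@dataclass
class Watchdog:
    """Hypothetical helper: tracks how long pings have been failing."""
    fail_window: float = FAIL_WINDOW_SECONDS
    first_failure_at: Optional[float] = None

    def observe(self, now: float, ping_ok: bool) -> bool:
        """Record one ping result; return True once a fallback is due."""
        if ping_ok:
            # Any successful ping resets the failure streak.
            self.first_failure_at = None
            return False
        if self.first_failure_at is None:
            self.first_failure_at = now
        return now - self.first_failure_at >= self.fail_window


def first_trigger(samples: Iterable[Tuple[float, bool]]) -> Optional[float]:
    """Return the timestamp at which the watchdog would first fire, if ever.

    samples is a sequence of (timestamp_in_seconds, ping_ok) pairs.
    """
    watchdog = Watchdog()
    for now, ping_ok in samples:
        if watchdog.observe(now, ping_ok):
            return now
    return None


if __name__ == "__main__":
    # One ping per minute, all failing: the fallback is due at t = 600 s.
    print(first_trigger([(t, False) for t in range(0, 1200, 60)]))
```

The streak is reset on any successful ping, so a single dropped packet does not cause a downgrade; only an unbroken run of failures covering the whole window does.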

Well, both of these ethernet cards are mainstream and Mikrotik has had those drivers in there for a long, long time. I see no reason why one would remove them since they are so ubiquitous and popular.
I received this strange reply to my supout.rif from Janis:
QUOTE:
"Hello,
We double checked - there are no changes in between 3.30 and 4.2 that is at least remotely connected to problematic area. At this point we have no ideas and as this is first report with this issue I can only suggest to check your installation method and hardware.
Regards,
Janis Megis"
END QUOTE
I have written back to Mikrotik and suggested they actually stage this in the lab instead of just comparing the two packages and saying they see no issues. That strikes me as being somewhat cavalier and borderline irresponsible.
Let us see if they actually attempt it in the lab and what the results are. I am still waiting for someone else to have this problem. After all, what I did was follow their exact steps (except for the routerboard upgrade), load the new packages on the equipment and reboot. The equipment follows their scripts and does the upgrade, eliminating me from the equation. If it didn’t work, then there must be, as NetworkPro suggests, a problem with a script.
No addressing of the NTP issue yet either.
RadioResearch
bad RAM maybe … ?
I wrote to MT support about my idea for a fail safe capability that would be possible if
- RouterOS enables us to keep a copy of the old .npks in Files
- Downgrade is possible through the script we would make
I don't know if they will do that since they seem so confident that the update should work. And as a matter of fact I just updated a bunch of routers, I do it all the time. And all seems stable.
On x86, strange behaviour so far has been due to bad RAM etc…
Thanks for the advice NetworkPro!
I checked all the RAM and all tests good. I think more than likely it had to do with the network cards.
Are you by chance using either Intel Pro/1000 or Realtek 8139 cards?
Are you able to make NTP work with 3.30 as a server and 4.2 as a client?
This is really maddening and I am loath to do it again. As soon as I get some more cards delivered I will do a test setup here and see what happens.
Thanks again.
Radioresearch
CORRIGENDUM
After 36 hours (!), the 4.2 NTP clients finally synchronized with the ROS 3.30 server.
The error message displayed was that the server was not synchronized. This is the error message for the last unsuccessful attempt.
Interesting that the server claims to be synched and all of the 3.30 clients happily connect to it.
So, in summary, as far as the NTP issues are concerned, this is perhaps not so drastic, but does bear an investigation, as they do tend to work in the long run.
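
For anyone who wants to check what the server is actually advertising, here is a small Python sketch that reads the first two bytes of a captured NTP reply. In the standard NTP header these carry the leap indicator, version, mode and stratum; a leap indicator of 3 or a stratum of 0 or 16 and above marks the sender as unsynchronized. I do not know whether the 4.2 client bases its "server not synchronized" message on these fields, so treat that as an assumption. The function names are made up for illustration.

```python
def parse_ntp_header(packet: bytes) -> dict:
    """Decode the leap indicator, version, mode and stratum of an NTP packet."""
    if len(packet) < 48:
        raise ValueError("an NTP packet is at least 48 bytes long")
    first = packet[0]
    return {
        "leap": first >> 6,              # top 2 bits: 3 means clock unsynchronized
        "version": (first >> 3) & 0x07,  # next 3 bits: protocol version
        "mode": first & 0x07,            # low 3 bits: 4 is a server reply
        "stratum": packet[1],            # 0 is unspecified, 16 is unsynchronized
    }


def looks_unsynchronized(packet: bytes) -> bool:
    """True if the header fields say the sender has no valid time source."""
    header = parse_ntp_header(packet)
    return (
        header["leap"] == 3
        or header["stratum"] == 0
        or header["stratum"] >= 16
    )


if __name__ == "__main__":
    # Synthetic 48-byte replies, not captures from the routers in this thread.
    unsynced = bytes([0xE4, 16]) + bytes(46)  # LI=3, version 4, mode 4, stratum 16
    synced = bytes([0x24, 2]) + bytes(46)     # LI=0, version 4, mode 4, stratum 2
    print(looks_unsynchronized(unsynced), looks_unsynchronized(synced))
```

If a capture from the 3.30 server shows a normal leap indicator and stratum while the 4.2 clients still complain, that would point at the client side rather than the server.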
RadioResearch
The thing with upgrading/downgrading is that configuration is LOST when, for example (happened to me just now), I switched from v4.2 with the routing package to v3.30 without the routing or routing-test package. My default route went missing; good thing my colleague on the other side restored it in time. Had I uploaded the routing-test package, this would not have happened.
Let this be a warning to the developers as well - be sure to carry the configs across versions and packages, including Downgrade, OK buddies? OK.