sathackr
just joined
Topic Author
Posts: 14
Joined: Thu Dec 25, 2014 5:13 am

Uptime rollover bug/SNMP

Mon Nov 16, 2020 10:31 pm

Hello,

About 497 days ago we deployed our first Mikrotik CRS326 switches running RouterOS 6.44.3 into production.

Today they are becoming unreachable via SNMP one by one, and the system uptime shown in the Web UI makes it clear that the uptime counter is stored in 32 bits and has rolled over.

We suspect this is causing SNMP to fail.
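For reference, the 497-day figure matches what a 32-bit counter of hundredths of a second would give before wrapping. The sketch below only checks that arithmetic. The tick rate of 100 per second is an assumption (it is the unit SNMP TimeTicks use), not something confirmed for the RouterOS internal counter, and the names are illustrative.

```python
from datetime import timedelta

# Assumption: the counter is an unsigned 32-bit value that advances
# 100 times per second (hundredths of a second, as SNMP TimeTicks do).
COUNTER_BITS = 32
TICKS_PER_SECOND = 100


def rollover_period(bits: int = COUNTER_BITS,
                    ticks_per_second: int = TICKS_PER_SECOND) -> timedelta:
    """Return how long a counter of the given width runs before wrapping."""
    total_ticks = 2 ** bits
    # Work in integer milliseconds to avoid floating-point rounding.
    milliseconds = total_ticks * 1000 // ticks_per_second
    return timedelta(milliseconds=milliseconds)


period = rollover_period()
hours, remainder = divmod(period.seconds, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"wraps after {period.days} days, {hours} h {minutes} min {seconds} s")
```

Under that assumption the counter wraps a little over two hours after day 497, which lines up with switches deployed around the same time failing one after another.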

Has there been any update in versions >6.44.3 to address this issue? We have over 400 of these switches deployed and do not want to have to track rebooting them every 497 days.
 
joegoldman
Long time Member
Posts: 551
Joined: Mon May 27, 2013 2:05 am

Re: Uptime rollover bug/SNMP

Mon Nov 16, 2020 11:20 pm

497 days is a long time to go without security upgrades etc.

Perhaps set up a yearly maintenance and upgrade cycle.

Or at the least - have SNMP monitoring start warning at day 450, and become critical at day 480.
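A minimal sketch of that threshold check, assuming the monitoring system already has the uptime as a raw TimeTicks value (hundredths of a second) from an SNMP poll. The function names and the polling step are hypothetical; only the 450 and 480 day thresholds come from the suggestion above.

```python
WARN_DAYS = 450   # start warning at day 450
CRIT_DAYS = 480   # become critical at day 480

TICKS_PER_DAY = 100 * 86400  # TimeTicks are hundredths of a second


def timeticks_to_days(ticks: int) -> float:
    """Convert a raw TimeTicks uptime value to days."""
    return ticks / TICKS_PER_DAY


def uptime_status(uptime_days: float) -> str:
    """Classify an uptime against the warning and critical thresholds."""
    if uptime_days >= CRIT_DAYS:
        return "critical"
    if uptime_days >= WARN_DAYS:
        return "warning"
    return "ok"


# Example values; in practice these would come from the SNMP poller.
for ticks in (1_000_000_000, 4_000_000_000, 4_200_000_000):
    days = timeticks_to_days(ticks)
    print(f"{days:6.1f} days -> {uptime_status(days)}")
```

The check has to fire before the wrap, since the devices in this thread stop answering SNMP once the counter rolls over.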

Who knows - maybe uptime is a 64-bit int in newer versions of RouterOS - there have been a lot of new versions since your current one.
 
mkx
Forum Guru
Posts: 5054
Joined: Thu Mar 03, 2016 10:23 pm

Re: Uptime rollover bug/SNMP

Mon Nov 16, 2020 11:50 pm

The Linux kernel has had a 64-bit uptime counter (regardless of the HW platform "bitness") since version 2.6, which was released in mid-December 2003.
ROSv7 is built around a much newer Linux kernel, so the issue will be gone. Not with ROSv6 though: MT is not going to upgrade the kernel inside it (it's not a trivial task, and they stuck with the same kernel for too long).

While I tend to agree that some minimum maintenance is the right thing to do, I don't see it as pressing for a switch where (almost) everything happens inside the ASIC / switch chip.
BR,
Metod
 
sathackr
just joined
Topic Author
Posts: 14
Joined: Thu Dec 25, 2014 5:13 am

Re: Uptime rollover bug/SNMP

Wed Nov 18, 2020 11:54 pm

Yep -- also we are always hesitant to upgrade firmware unless there is a specific issue to address. The risk of firmware upgrade and even just a reboot is not zero. We know that 6.44.3 & 6.44.5 work very well on hundreds of switches and thousands of customers. We're not in a hurry to change it every month when there is a new firmware upgrade and/or potential new firmware regression.

More than a couple of times I've had a MT device fail after a firmware upgrade or a simple reboot (corrupt RouterBOOT, corrupt flash, and self-recovery fails, which causes an outage and requires a subsequent truck roll).

We protect the devices with a robust firewall rule set, and while not perfectly secure, it serves our purposes.

The rollover bug itself isn't necessarily a problem, but SNMP dies somehow in connection with it and makes the devices unmonitorable.
