I am a newbie, so sorry if I am not perfect in reporting this issue. I am an end user with a 450G.
It seemed everything was perfect…
Then I noticed that randomly, about once a day or even once every few days, I would lose connectivity through the router for a brief period. I then also noticed that during these periods winbox could not connect to the router. These outages last around 5 minutes. They self-correct and everything then works fine.
The router does not reboot. It makes no entry in the log. The watchdog is disabled.
I am doing quite a bit with the router, mostly for education purposes. I am running a Metarouter.
I turned on the watchdog and it has reset the router once. However, a short time before that I was playing with webconfig, and that is really new, so it might have caused some issue.
Strangely, the winbox service got turned off. It's possible, I suppose, that I did it by accident, but that seems highly unlikely. I did use webconfig to re-enable it, though.
I have noticed another issue. I am running a metarouter, mostly just for fun and experience. It crashed separately from the main router and required a power-cycle reboot.
I should point out that I used 4.5 with this exact same config for months with no issues at all. Now, rather randomly, after days of perfect operation, data flow stops and I can't log in using winbox. After some number of minutes, it corrects itself and everything is fine again. NOTHING in the logs. It did this with the watchdog turned off.
Maybe try setting up NetWatch with a short interval against a reliable host, and fire a script that just logs something. If you get that log message when you experience the outage, you could change the script to generate a supout.rif instead. Having it generated when the condition occurs is probably more useful than one from after/before the fact unless you have steps to reliably reproduce the issue.
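A rough sketch of what that NetWatch entry might look like. The host 8.8.8.8, the 10s/2s timings, the log text and the supout file name are all placeholders chosen for illustration, not values from this thread, so adjust them to your setup:

```
# Sketch only: host, interval, timeout and log text are placeholder values.
/tool netwatch add host=8.8.8.8 interval=10s timeout=2s \
    down-script=":log warning \"netwatch: test host unreachable\"" \
    up-script=":log warning \"netwatch: test host reachable again\""

# Once the log line is confirmed to show up during an outage, the down-script
# could be swapped for one that writes a support file (file name is made up):
# /tool netwatch set [find host=8.8.8.8] down-script="/system sup-output name=outage-supout.rif"
```

Starting with a log-only script keeps the test cheap. The supout step is left commented out until the log line has been seen at least once.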
Mate, are you making it ping a single IP address? What happens if there is some outage along the way? Or when Google shuts down that machine, since they have DNS fail-over and load balancing? Hm?
The watchdog timer isn't pinging anything. It's just doing its internal check.
I am also pinging Google as a separate thing using netwatch, just to let me know that the internet might be down, beeping if it can't get through. I've been doing that for a year and Google has never missed a ping, actually. But yes, your point is really valid. I did want a good target on the net that would most likely not be down… It's worked really well…
I am also using siteuptime to monitor the router from the WAN side. It detects the router being down during the intervals when it sort of locks up, i.e. when I don't use the watchdog to reset it.
ANYWAY… The watchdog timer was tripping almost every few hours on 5.0 B1 and B2..
12 hours ago I moved it back to 4.9. No other changes…
No issues at all. If it goes 24 hours, then I will switch back to 5.0 B2 and see what happens. If it starts tripping the watchdog timer again, then it's got to be a combo of my config + 5.0B2 + 450G.
Besides the Metarouter, I am also using a microSD card and reading from and writing to it once an hour for a DDNS update. Yes, I should trigger it better than just checking once an hour, but as I said I am a newbie and I'm learning scripts. I have not checked to see if the issue occurs when the microSD card is written to. I will when I change it back to 5.0 B2 in 12+ hours.
Upgraded to 5.0 B2 and 2 hours later, watchdog timer tripped. Then again another 3 hours later…
Some combo of 450g+5.0B2+myconfig is causing issues…
While the watchdog timer is set to generate automatic supout files, it did not generate one for either watchdog trip. It did not email them either…
The log does show an improper shutdown caused by the watchdog timer, though.
I'm going to let it run 36 or so hours and see how it goes. These reboots are annoying enough, though, that I might just downgrade to 4.8 till the next beta…
Too many reboots for me on 5.0B2… I'm back on 4.9… I saw 5 watchdog-triggered reboots in 9 hours. No doubt, 5.0B2 does not work for me with my config on a 450G. 4.9 is perfectly stable.
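For anyone comparing settings, this is roughly what a watchdog setup with automatic supout and email looks like. It is a sketch from memory of the `/system watchdog` menu, not a copy of my config: the mail addresses and SMTP server below are documentation placeholders, and the property names should be checked against `/system watchdog print` on your version.

```
# Sketch only: addresses and SMTP server are placeholders, not real values.
/system watchdog set watchdog-timer=yes \
    automatic-supout=yes auto-send-supout=yes \
    send-email-to=admin@example.com \
    send-email-from=router@example.com \
    send-smtp-server=192.0.2.25

# Show the resulting settings.
/system watchdog print
```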
These may be occurring when my script to update ddns runs. It does talk to the sd card. However it does not always result in a reboot. It sometimes goes 5 hours without rebooting.
:log info "DNSoMatic: Updating dynamic IP on DNS for host $matichost"
:log info "DNSoMatic: User $maticuser and Pass $maticpass"
:log info "DNSoMatic: Last IP $previousIP"
# get the current IP address from the internet (in case of double-nat)
/tool fetch mode=http address="checkip.dyndns.org" src-path="/" dst-path="/micro-sd/dyndns.checkip.html"
:local result [/file get micro-sd/dyndns.checkip.html contents]
:if ($currentIP != $previousIP) do={
    :log info "DNSoMatic: Update needed"
    :set previousIP $currentIP
    :log info "DNSoMatic: Sending update $currentIP"
    :log info [ :put [/tool fetch host=MT user=$maticuser password=$maticpass mode=http address="updates.dnsomatic.com" src-path=$str dst-path=$matichostp]]
    :log info "DNSoMatic: Host $matichost updated on DNSoMatic with IP $currentIP"
} else={
    :log info "DNSoMatic: Previous IP $previousIP and current $currentIP equal, no update needed"
}
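The excerpt above leaves out the step that turns the fetched page into $currentIP. A minimal sketch of that step, assuming checkip.dyndns.org answers with a page whose body reads "Current IP Address: x.x.x.x" and assuming these lines would sit between the `/file get` line and the `:if`. The names startLoc and endLoc are only illustrative:

```
# Sketch only: assumes the page body looks like "Current IP Address: x.x.x.x".
:local result [/file get micro-sd/dyndns.checkip.html contents]
# position of the first ": ", which precedes the address
:local startLoc [:find $result ": " -1]
:set startLoc ($startLoc + 2)
# the address ends where the closing body tag begins
:local endLoc [:find $result "</body>" -1]
:local currentIP [:pick $result $startLoc $endLoc]
:log info "DNSoMatic: Current IP $currentIP"
```

If the page layout differs, the two `:find` patterns are the only parts that would need to change.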
I completely reworked my metarouter. I built one from scratch, using a MikroTik image. I did all the work on 5.0B2. It's up and working… BUT it's disabled. I want to run 5.0B2 without a metarouter for a bit and see what happens, then enable it…
There were some hiccups.
It would randomly drop the winbox connection to it. Reconnecting after a drop produced odd results: sometimes it said my connection was refused, and sometimes it seemed to lose the user database. I could, however, log in fine from the Metarouter console. No doubt it was acting a bit weird during config.
When I decide to turn on the metarouter again I will run multiping through it so I can monitor if it goes down or is weird…
Not that I even need a metarouter. But I guess I can help isolate issues with beta 5.0
With the metarouter disabled I have gone 23 hours with no reboots at all on 5.0B2…
I'm going to let it run longer just to be sure.
It looks like it's the metarouter causing the whole router to hang and then the watchdog timer to reboot the router. I will confirm this soon by starting the metarouter again. What will be interesting is whether some specific thing the metarouter is doing causes the hang. I will try to play with it some to maybe isolate what action in the metarouter causes the hang…
I'm going to start a metarouter now, but not assign any VM interfaces to it, so it's isolated. Just let it run and see what happens.
It ran fine for around 8 hours and then the watchdog timer started rebooting it.
I have disabled the metarouter and I am testing again, just to be 100% sure it's the metarouter.
Other issues I have noticed along the way. supout.rif hangs and does not complete. Mikrotik support is aware and the issue is being worked on. When I do a backup, the log says “error creating backup file: could not read all configuration files”..
I gave Mikrotik support a login to my router. They are looking at it.
This is an annoyingly hard problem to isolate. I wish I could help more to find it.
36 hours of perfect operation with the metarouter disabled. No doubt the problem is with the metarouter.
I then enabled the metarouter again and rebooted. After around 8 hours, it rebooted and I had a new symptom: I could no longer log into the main router with winbox. I got the error “Could not get index: fatal error”. I could get in via SSH. I power cycled and looked around at everything and could not discover why it suddenly started doing that. I also could not fix it.
The router was working correctly, I just could no longer login via winbox.
I could log into the metarouter with winbox.
I decided to try loading a backup. I did this and it then worked perfectly. So something in the config of the main router got messed up when the router crashed/hung during my testing, and I could only resolve it by loading a backup.
Then I made a mistake…
I loaded a backup from the metarouter into the main router. Oops. Don't try this at home, folks. This bricked my router and required a full reset by shorting out the reset “hole”. hehehehe oops. But it took zero time and was very easy to recover. This also did a nice full reset. I decided to hand-configure everything and did not load a backup file.
I then had a clean fresh install and config. I started the metarouter and 5 hours later it crashed. Stopped the meta router and it ran for 24 hrs. Started the metarouter and it has been crashing randomly for the last 24 hours. I am running it so Mikrotik support could play with it. They have a login.. If anyone else wants to play with it let me know and I will get you a login..
It has been running with metarouter now for 19 hours ?!?.. So it had a few watchdog reboots and has now been stable for 19 hours ?!? I have now enabled debug logging, on both the main router and the metarouter… We shall see…
With the watchdog disabled, I see about 2 or 3 events a day that last about 3-8 minutes. Each time the router is 100% not responsive and does not pass any data and you cannot log into it. The issue will clear on its own and the router will resume normal operations.
During these periods there are no log entries at all. The CPU graph does not show anything during this period.
This affects both the metarouter and the main router.