751G-2HnD RouterBOARD anomalies

Although there are several other issues regarding the 751U RouterBOARD, I have little or no experience with them, so I have opened this issue specifically for the 751G.

I have 20 751G-2HnD units purchased over the last 5 months or so. We are testing them as a possible replacement, in new installations, for the 433AH units now being placed into production. The 433AH units display none of the issues itemized below.

Although I would expect the 751G units to all work the same, and to work the same after every reboot, I find the units behave somewhat erratically. Our test environment for now is all 20 units, stacked together in the same location. All have their wireless enabled, but there is virtually no inbound or outbound traffic. All have the same ROS 5.22 and firmware 1.30 versions installed. (FYI: we have been trying ROS versions as far back as 5.6 without any significant difference in the test results.) All are loaded with exactly the same configuration; the complete ROS configuration and metarouter environment are always loaded with a script.

I find the following:

  1. Most of the units indicate 64MB of file space. Four in the latest batch of 12 purchased indicate 128MB of file space. The MikroTik documentation does not specify a size (as far as I can tell). Is this an upgrade, or will they continue to be produced with different sizes of file space?

  2. On a reboot, the systems do not come up in the same operational state every time. On occasion, the following issues are found immediately after a reboot has completed:
    2.a The DHCP client attached to the ether1 port comes up indicating ‘Invalid’. If the DHCP client is disabled and then re-enabled, it works properly. We have worked around this with a script that checks for a status of ‘bound’ and, if that is not found, disables and then re-enables the client.
    2.b The SSTP client is not found when printed from the command line. Attempting to produce a supout file just hangs (for at least 5 minutes). If the hardware is rebooted (without any other changes), this problem is no longer evident.
    2.c A virtual Ethernet interface definition is not found on system startup. This interface is used by the metarouter to communicate with the outside world, so its absence keeps the metarouter application from running properly. The metarouter | interface tab indicates that the interface is undefined. Again, a reboot with no changes will very likely fix the problem.

  3. Enabling the watchdog causes significantly more automatic reboots to occur. Disabling the watchdog eliminates these ‘watchdog’ restarts. From the documentation, I believe the watchdog should expire after 1 minute. However, from testing with an external system, I can verify the following:

  • The ROS system has responded to an external ping within the last 10 seconds before the restart.
  • The metarouter has sent a TCP message to a remote system within the last 10 seconds before the restart.
  • Many (if not most) of the watchdog restarts occur after significantly less than the 1 minute the documentation indicates (“…reboot if system is unresponsive for a minute”).
  • The watch address does seem to work properly, and will reboot the system if no ping responses are received within about 1 minute.
  • Unlike the 751U comments, I have not seen the ‘flash’ utilization climb to a very high percentage, but it is possible that I have not been looking at the time it happens.
  • If the watchdog is disabled, I see no missing pings reported by the external system pinging the 751G, and the units continue to run with no watchdog reboots.

As I mentioned above, the 50 or so 433AH units with essentially the same functionality are in production and we do not see any similar problems. All of them, again, are configured identically to each other (and are running ROS 5.6).

At this point in our testing, I also cannot comment on the reliability of the wireless interface on these units.

From the dwindling responses to the 751U questions, I can’t tell if the problems have been addressed or the participants have given up.

Are others seeing issues with the startup reliability and the watchdog reboots?

Thanks in advance.

  1. Every RouterBOARD is produced and shipped with NAND that is fully appropriate to run RouterOS on it (size might be different).

  2. Please make support output file before (configuration exists) and try to make another one after reboot.
    Reboot should not erase any configuration (RouterBOARD independent).

  3. Check /log print for the watchdog restart reason. watch-address reboots the router when 6 consecutive pings fail.
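
As a rough illustration of the watch-address rule in item 3, the following minimal Python sketch counts consecutive failed pings and signals a reboot once six in a row have failed. The function name `should_reboot` and the list-of-booleans input are hypothetical and chosen only for illustration. This is not RouterOS code, and the interval between pings is not stated in this thread, so timing is left out.

```python
def should_reboot(ping_results, threshold=6):
    """Return True once `threshold` consecutive pings have failed.

    ping_results: iterable of booleans, True for a ping that got a reply.
    threshold: number of consecutive failures that triggers a reboot
               (6, per the reply above).
    """
    consecutive_failures = 0
    for replied in ping_results:
        if replied:
            # Any successful ping resets the failure counter.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= threshold:
                return True
    return False
```

Under this rule, a single reply in the middle of a run of failures resets the count, so five failures, one reply, and five more failures would not trigger a reboot.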

Thanks for the response.

RE: 1) Every RouterBOARD is produced and shipped with NAND that is fully appropriate to run RouterOS on it (size might be different).

Understood. I was not complaining that 64MB appeared on some 751G RouterBOARDs while others had 128MB. I merely expected them all to be produced with the same components, and I was wondering whether future units would all be produced with the 128MB NAND parts, or whether to expect some 751G units with 64MB and others with 128MB. This might affect how much we could expect to store on the NAND before we ran out of space.


RE: 2) Please make support output file before (configuration exists) and try to make another one after reboot.
Reboot should not erase any configuration (RouterBOARD independent).

I will do this.
Just to be sure we understand each other:
I do not believe the configuration stored on the RouterBOARD is being corrupted or changed. But I do expect that, after each manually executed system reboot, the router would start up with the same information. This does not happen. On some reboots the dhcp-client works properly; on other reboots the dhcp-client indicates it is invalid. On some reboots the SSTP-Client is present and operating properly; on other reboots the SSTP-Client is not found when printing it from the metarouter command line interface (/interface sstp-client print). On some reboots the virtual Ethernet interface is present; on other reboots the virtual Ethernet interface does not exist in the interface device list.

Between each of these reboots, the stored ROS configuration is not changed by us.


RE: 3) Check /log print for the watchdog restart reason. watch-address reboots the router when 6 consecutive pings fail.

Although at times we have turned on both the watchdog timer and watch-address, the problems we are experiencing are with the standard watchdog timer and not the watch-address ping timer. We are certain of this because, after the unit has rebooted itself, we examine the log to find the reason code. The indications we are concerned about say “system,error,critical router was rebooted without proper shutdown by watchdog timer”. These messages are found in the log after a reboot where the system was not unresponsive for 1 minute, where the RouterBOARD had successfully responded to a ping (from another system) within 10 seconds of the reboot, and where messages had been received from the router (by another system) within the previous 10 seconds.
We are not confusing this with the log messages which indicate “system,error,critical System rebooted because of ping watchdog timeout”. These, we understand, are due to the router receiving no response from a watched (pinged) address.
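
For anyone sorting through many logs, the distinction between the two reboot reasons could be automated with something like the Python sketch below. Only the two message texts come from the logs quoted above. The function name `classify_reboot` and the category labels are hypothetical, and the matching is a simple substring check rather than a parser for any official log format.

```python
def classify_reboot(log_line):
    """Classify a reboot log line by the two messages quoted above.

    Returns one of three illustrative labels:
      "watch-address"  - the ping watchdog (watch-address) timed out
      "watchdog-timer" - the standard watchdog timer fired
      "other"          - neither message was found
    """
    text = log_line.lower()
    if "ping watchdog timeout" in text:
        return "watch-address"
    if "rebooted without proper shutdown by watchdog timer" in text:
        return "watchdog-timer"
    return "other"


if __name__ == "__main__":
    samples = [
        "system,error,critical router was rebooted without proper shutdown by watchdog timer",
        "system,error,critical System rebooted because of ping watchdog timeout",
    ]
    for line in samples:
        print(classify_reboot(line), "<-", line)
```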

I will pull together a set of support output files and send them to you with comments as to what was done and found. If it would be of help, I can also include the script that is used to configure every one of the units identically. Where would you like these files sent? (I’m new to sending support files.)

Thanks much for the response.

The following 5 router restarts produced different running configurations after the 751G-2HnD reboot was completed. No changes were entered manually between the tests (reboots). All were done on the same 751G-2HnD. These erratic boot-up results can be duplicated on all of the routers we have purchased if we boot them enough times.

We normally run with the watchdog timer enabled and watch-address set to a valid IP address. For these tests we have temporarily turned both of these features off to simplify the testing and eliminate the reported unexpected watchdog timeouts.


1.
Set configuration on 751G
Power up 751G, wait for 751G startup to complete, attach to ether3 (IP=169.254.0.1) using winbox
System booted successfully
Created supout1.rif

2.
Reboot 751G using winbox, wait for 751G reboot complete, attach to ether3 (IP=169.254.0.1) using winbox
System booted incorrectly: DHCP client shows ‘I’ (invalid) and status=stopped
Created supout2.rif

3.
Manually Reboot 751G using winbox, wait for 751G reboot complete, attach to ether3 (IP=169.254.0.1) using winbox
System booted incorrectly, this time:
DHCP client shows invalid as in #2 above, AND
Virtual Ethernet interface ‘ve-meta’ is not in the interface list, the address list indicates IP address 10.219.8.2/29 as having an invalid interface, and the metarouter interface list shows an ‘invalid’ static interface (where it should be ‘ve-meta’)
Created supout3.rif

4.
Manually Reboot 751G using winbox, wait for 751G reboot complete, attach to ether3 (IP=169.254.0.1) using winbox
System booted incorrectly, this time:
DHCP client shows valid
Virtual Ethernet interface ‘ve-meta’ is not in the interface list, the address list indicates IP address 10.219.8.2/29 as having an invalid interface, and the metarouter interface list shows an ‘invalid’ static interface (where it should be ‘ve-meta’)
Created supout4.rif

5.
Set configuration on 751G
Manually Reboot 751G using winbox, wait for 751G reboot complete, attach to ether3 (IP=169.254.0.1) using winbox
System booted successfully
Created supout5.rif

Thanks in advance

Please send your findings and report to support@mikrotik.com. Thank you very much for the cooperation.

Mine says 64MiB of space :slight_smile: