5.16 - x86 configured as l7 shaper rebooted

We had a reboot without a proper shutdown today on an x86 box configured as an L7 shaper.

Yesterday we upgraded from 5.14 to 5.16 because the new release from MT supports l2mtu on the Intel 82574L.


My guess is that the appliance was rebooted by the BIOS watchdog after a freeze.

Has anyone had a similar experience?

regards
Ros

I would blame the 82574L - it has a weird bug that causes it to freeze/restart under heavy load (500+Mbit) with MSI-X turned on. See Intel’s errata.

I have no idea whether this is the real cause of your reboot. On my configurations the problem always appeared as the interface going down/up, but from the errata it seems it could actually cause a kernel panic.

I will try to disable MDI-X.

thanks
Ros

We recently configured a dual-CPU Xeon x86 box to be the core router and PPPoE concentrator. Everything was fine for about a month.

We did the same upgrade, 5.14 to 5.16, and since then we have had random reboots! Sometimes once a day, sometimes twice in one hour, and sometimes everything is OK for several days.
This is very annoying, as all the users get disconnected and we have to wait for the session timeout in RADIUS before they can log in again.

I will upgrade to 5.17 tomorrow, but if this continues I will downgrade to 4.17.

Any idea what could cause these reboots?

Everything seems related to the ethernet chipset and multicore.

My advice is to still use RPS on the ethernet port carrying the most traffic and to disable MDI-X on the ethernet ports.

regards
Ros

Thank you for your feedback!

This router is actually an HP ProLiant DL380 G5 with one additional quad gigabit ethernet card.
Max. bandwidth utilization is < 50Mbit.
The router has 8GB RAM, but of course only 2GB is visible.

What kills me is that I have the same configuration in 4 different places in Europe, and it never reboots like this! The utilization there is up to 400Mbit! And it works like a charm.

I will disable MDI-X and we’ll see whether it makes a difference.

Where can I find the option to disable MDI-X?
I can’t find it.

You can’t find it in WinBox; you must use the console.

It should be here:

/interface ethernet edit ether1 mdix-enable

The editor will then open and you can edit the value.

Everything is documented here:

http://www.mikrotik.com/testdocs/ros/3.0/interface/ethernet.php
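
As a non-interactive alternative to the editor, the same property can usually be set directly. This is only a sketch: ether1 is a placeholder interface name, and it assumes the driver exposes the mdix-enable property at all, which may not be the case for every x86 NIC.

```
# ether1 is a placeholder; repeat for each port you want to change
/interface ethernet set ether1 mdix-enable=no

# check the resulting value
/interface ethernet print detail
```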

I have 6 ethernet ports: 2 integrated and one quad-port gigabit NIC.
All ports already had MDI-X disabled. However, I still experience random reboots.

Any other ideas on what to change? Or would it really be best to downgrade to the latest stable 4.x version?
I never had problems like this with 4.x, and I’m using almost the same hardware everywhere.

I had similar issues and got an email from support saying there were problems with Simple Queues and this kernel.

I am running 5.7 right now. I found this thread looking for any info about 5.17 or better on x86.

Hopefully they won’t mind me posting this.

Hello,

At the moment there are a few known issues that can and will be fixed only in
upcoming RouterOS v6.x, which will be ready for public release sometime this
summer. Perhaps the best choice for you would be to keep using RouterOS v4.17
until stable version 6 is released.

Currently (v5.11 and above) all x86 crashes/reboots/halts are related to 2 things:

  1. resource management in complex setups on multi-core systems - as a temporary
    solution you can:
    *) reduce number of CPU cores (by disabling Hyperthreading on Intel CPU boards)
    *) reduce possible amount of core switching -
    a) disable all entries in “/system resource irq rps” menu
    b) while monitoring “/system resource cpu” menu, manually allocate “/system
    resource irq” to specific cores
    c) change your configuration to avoid using “global-in” and “global-out” HTBs
    (no simple queues)
    d) change your configuration to avoid deep packet inspection (“layer-7”,
    “content”, all 3 “use-ip-firewall” options)
    e) in “/queue interface” menu set all queues to “multi-queue-ethernet-default”
    *) switch to single-core Linux Kernel - “/system hardware set multi-cpu=no”

  2. interface driver problems - we are not maintaining other vendors' drivers, so
    fixes to driver-related problems can be done only with a driver update. Some driver
    updates require a Linux Kernel update, and in RouterOS that is possible only with
    the next major RouterOS release (v6.x). As a temporary solution you should change your
    interfaces to other type/vendor cards.

Regards,
Janis Becs
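
For reference, some of the workarounds from the email translate into console commands roughly as follows. This is a sketch, not a tested recipe: it assumes the v5.x menus named in the email, ether1 and the item number 0 are placeholders taken from your own print output, and the last command is copied from the email as-is.

```
# 1a) disable all RPS entries
/system resource irq rps set [find] disabled=yes

# 1b) watch per-core load, then pin an IRQ to a specific core
/system resource cpu print
/system resource irq print
# 0 = item number from the print above, cpu=1 = example core
/system resource irq set 0 cpu=1

# 1e) use the multi-queue type on an ethernet port (repeat per port)
/queue interface set ether1 queue=multi-queue-ethernet-default

# last option in the list: switch to the single-core kernel
/system hardware set multi-cpu=no
```

The multi-cpu change most likely needs a reboot to take effect, so plan it for a maintenance window.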

I do wonder if the same problem exists on AMD systems. I have a backup for our Core i7 router running a Phenom II that hasn’t had any issues, however it is just sitting there, no traffic passing through.