I frequently (a couple of times per day), but neither periodically nor reproducibly, observe a flood of (blue-colored) messages starting in the log: hundreds to thousands of identical log messages per second, and it seems to be unstoppable.
System: RB3011 with 7.18.23, 7.21rc3 and 7.21rc4 (same behavior with all versions)
MT-Script: sends metrics every minute via MQTT to my broker (consumed by InfluxDB/Grafana), connecting to and disconnecting from the broker on every run, i.e. I don't keep the session open during the minute
It runs for hours without any problem and then suddenly starts; it is neither reproducible nor periodic
Profile "total": when it is running normally, "mqtt" is at 0% (it disappears from the list); during sending it uses about 1.5% CPU
Profile "total": once the flood starts, "mqtt" is constantly at around 30% CPU usage. Together with the logging, the CPU is at almost 100%
A reboot of the RB3011 always helps - until it starts again. So far I don't know any other way to stop it or recover from it.
Questions:
Any idea what the root cause could be and how to recover without rebooting? Why does it run for hours and then suddenly start? Such intermittent behavior is not easy to investigate...
Script: Is it possible to check the CPU usage of "mqtt" regularly from a script in order to react somehow? Happy for any script snippet...
Script: Is it possible to kill the mqtt process via script and restart it? That would at least self-heal the symptoms.
I appreciate any ideas on my questions, ideally ones that avoid the root cause altogether.
I've always used auto-connect=yes on the broker. I'd imagine manually reconnecting on every run uses more resources, but I don't know for sure.
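For illustration, a minimal sketch of that setup might look like the following. The broker name "mybroker" and the topic are hypothetical placeholders, and the payload is just an example; adjust them to your configuration. The idea is that the broker entry keeps the session open, so the per-minute script only publishes and never connects or disconnects itself.

```
# Assumption: a broker entry named "mybroker" already exists (hypothetical name).
# Let RouterOS keep the session up and re-establish it if it drops.
/iot mqtt brokers set [find name="mybroker"] auto-connect=yes

# Per-minute script body: publish only, no connect/disconnect.
# The topic and the JSON payload are placeholders for illustration.
:local load [/system resource get cpu-load]
/iot mqtt publish broker="mybroker" topic="metrics/rb3011" message="{\"cpu-load\":$load}"
```

Whether this avoids the flood is untested; it only removes the repeated connect/disconnect cycle from the picture.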
That said, it should not leak resources when connecting and disconnecting, which is what the error message suggests is happening... You may want to grab a supout.rif while it's in the bad state (as well as one while it's working) and open a ticket with MikroTik.
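As a rough sketch of how capturing the bad state could be automated: the script below is meant to be run from the scheduler (for example once per minute) and uses the overall CPU load as a proxy for the flood. That is an assumption on my part, since I am not sure the per-process "mqtt" figure from the profiler can be read reliably from a script. The threshold of 90%, the count of 3 consecutive checks, the global variable name and the file name are all arbitrary choices for illustration.

```
# Hypothetical watchdog sketch; threshold, counter name and file name are assumptions.
:global mqttHighCount
:if ([:typeof $mqttHighCount] = "nothing") do={ :set mqttHighCount 0 }

# Overall CPU load in percent (proxy for the flood, not the mqtt process itself).
:local load [/system resource get cpu-load]

:if ($load > 90) do={
    :set mqttHighCount ($mqttHighCount + 1)
} else={
    :set mqttHighCount 0
}

# After 3 consecutive high readings, log it and save a supout.rif once.
:if ($mqttHighCount = 3) do={
    :log warning "CPU load high for 3 consecutive checks, saving supout file"
    /system sup-output name="mqtt-flood.rif"
}
```

Since a reboot is reported to clear the state, a /system reboot could be added after the supout step as a last resort. Note that with the CPU at almost 100% the scheduler itself may run late, so treat this as a starting point rather than a guaranteed catch.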