I have had issues with a lot of RouterBOARDs causing internet service disruption and massive packet loss.
Randomly, the router becomes inaccessible for 10-30 seconds. No interface flapping is logged.
Even weirder, a subnet behind the router also loses connectivity when this happens.
A netwatch on the router pinging the ISP gateway every second logs problems at the exact time I am unable to access the location remotely.
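To put numbers on those dropouts, the once-per-second probe results can be collapsed into outage windows. The Python below is only a rough sketch of that idea, not netwatch itself: the `find_outages` name, the `(timestamp, ok)` sample format and the `min_len` threshold are my own assumptions, and the samples would have to be exported from whatever does the pinging.

```python
from typing import Iterable, List, Tuple


def find_outages(
    samples: Iterable[Tuple[float, bool]], min_len: int = 3
) -> List[Tuple[float, float, int]]:
    """Group consecutive failed probes into outage windows.

    samples: (timestamp, ok) pairs in time order, one per probe
             (hypothetical format, e.g. one ping per second).
    min_len: runs with fewer failed probes than this are ignored as noise.
    Returns a list of (first_failed_ts, last_failed_ts, failed_probe_count).
    """
    outages = []
    run = []  # timestamps of the current run of failed probes
    for ts, ok in samples:
        if ok:
            # A successful probe ends the current run, if any.
            if len(run) >= min_len:
                outages.append((run[0], run[-1], len(run)))
            run = []
        else:
            run.append(ts)
    # Handle a run that is still open when the samples end.
    if len(run) >= min_len:
        outages.append((run[0], run[-1], len(run)))
    return outages
```

With one probe per second, a 10-30 second dropout shows up as a window of roughly 10-30 failed probes, which makes it easy to line the windows up against the times remote access was lost.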
I have tried several RB1100AHx4 units running The Dude and finally moved to CHR in a VM on a Supermicro server with an Intel Xeon, ECC RAM and Intel 10 Gb/s NICs.
At first I thought my ISP had issues, since so many MikroTik routers and even CHR behaved the same.
For a while I simply ignored the problem, until I decided to disable The Dude.
Lo and behold, the packet loss and service disruptions stopped.
Kept The Dude disabled for one week: no problems at all.
Enabled The Dude again for one week: several issues a day for the whole week.
Disabled The Dude again: no problems at all.
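For anyone who wants to repeat this on/off test, the comparison can be made less anecdotal by counting logged outages per phase. This is a minimal Python sketch under my own assumptions: `outages_per_phase` and the `(label, start, end)` phase format are hypothetical names, and timestamps can be any comparable unit (days since the test began, Unix time, etc.).

```python
def outages_per_phase(outage_starts, phases):
    """Count how many outages began inside each test phase.

    outage_starts: iterable of outage start timestamps.
    phases: list of (label, start, end) tuples; start is inclusive,
            end is exclusive. Labels may repeat (e.g. "off", "on", "off").
    Returns a list of (label, count) in the same order as phases.
    """
    counts = [0] * len(phases)
    for ts in outage_starts:
        for i, (_, start, end) in enumerate(phases):
            if start <= ts < end:
                counts[i] += 1
                break  # an outage belongs to at most one phase
    return [(label, n) for (label, _, _), n in zip(phases, counts)]
```

Usage with made-up placeholder values (not my real logs): `outages_per_phase([7.5, 8, 13.9], [("off", 0, 7), ("on", 7, 14), ("off", 14, 21)])` attributes all three outages to the "on" week.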
Now it seems very clear to me that The Dude is the cause of all these issues I am having, but at the same time, very few people are reporting this. Only one other user, to be precise.
I have tried limiting the number of monitored devices, increasing the polling interval, and monitoring only ICMP without any other services, all without success.
The Dude database is only 8 MB, so very small.
If this were a Dude issue, I would expect a lot more topics to show up in a search. Then again, I cannot see any other possibility, except maybe the ISP seeing a lot of ICMP at once, considering it a threat and temporarily cutting the connection (which they do not admit to).
Is anyone else having massive packet loss while using The Dude?