When items go down, Dude keeps sending TCP SYN packets to them and either doesn't close these connections, keeps retrying them over and over, or waits too long before giving up.
I can see these connections in netstat (SYN_SENT) filling up to 10, which is the XP Pro limit, and staying at that limit.
This causes bad performance on the Windows box when many items on the Dude map are down.
It is very simple to calculate the number of TCP services that have to be down
(timing out) for XP SP2 to be unable to make any new TCP connections:
if
number of services down
is greater than
(probe interval / probe timeout) * 10
new TCP connections will be practically impossible, because each timing-out
service holds one of the 10 pending connection slots for the whole probe timeout.
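As a rough illustration of the rule of thumb above, here is a minimal Python sketch. It assumes each timing-out service occupies one half-open (SYN_SENT) slot for the whole probe timeout once per probe interval, and that probes are spread evenly over time. The function names and the 30 s / 10 s example values are made up for illustration and are not Dude defaults.

```python
def saturation_threshold(probe_interval_s, probe_timeout_s, half_open_limit=10):
    """Approximate number of timing-out TCP services that keeps every
    half-open connection slot busy on average.

    Assumption: half_open_limit=10 is the XP SP2 figure quoted in this thread.
    """
    if probe_interval_s <= 0 or probe_timeout_s <= 0:
        raise ValueError("interval and timeout must be positive")
    # In this model a service holds at most one slot at a time,
    # so the interval/timeout ratio is clamped at 1.
    ratio = max(1.0, probe_interval_s / probe_timeout_s)
    return half_open_limit * ratio


def new_connections_blocked(services_down, probe_interval_s, probe_timeout_s):
    """True when the down services alone are expected to exhaust the limit."""
    return services_down > saturation_threshold(probe_interval_s, probe_timeout_s)


# Hypothetical example: 30 s probe interval, 10 s probe timeout.
print(saturation_threshold(30, 10))         # 30.0
print(new_connections_blocked(45, 30, 10))  # True
```

This is only an average-case estimate; bursts of probes that start at the same moment could hit the limit with fewer services down.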
You can limit the number of services in ‘down’ state by configuring proper
dependencies between devices. In that case devices and their services that
are behind a device that is down go into ‘unknown’ state, and in this state
they are not polled and no connection attempts are made to them.
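The effect of dependencies described above could be modelled roughly like this. It is a sketch of the idea only, not how Dude implements it: the `parents` mapping, the device names and the `effective_states` function are all hypothetical, and the dependency graph is assumed to be a tree with no cycles.

```python
def effective_states(parents, reachable):
    """Return a dict mapping each device to 'up', 'down' or 'unknown'.

    parents:   device -> parent device (None for a root device)
    reachable: device -> bool, the raw result a poll would give
    A device whose parent is not 'up' is marked 'unknown' and would not be polled.
    """
    states = {}

    def resolve(device):
        if device in states:
            return states[device]
        parent = parents[device]
        if parent is not None and resolve(parent) != "up":
            states[device] = "unknown"  # hidden behind a down or unknown parent
        else:
            states[device] = "up" if reachable[device] else "down"
        return states[device]

    for device in parents:
        resolve(device)
    return states


# Hypothetical map: gateway -> access point -> two client devices.
parents = {"gw": None, "ap1": "gw", "cpe1": "ap1", "cpe2": "ap1"}
reachable = {"gw": True, "ap1": False, "cpe1": False, "cpe2": False}
states = effective_states(parents, reachable)
print(states["ap1"], states["cpe1"], states["cpe2"])  # down unknown unknown
```

In this example only one device counts as ‘down’ instead of three, so only its services keep generating connection attempts.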
Well, if you enable the ‘Router OS’ checkbox in device settings, Dude tries to
maintain connectivity with the router using a special protocol. The timeout
for this connection is hard coded to 10 seconds; if it fails, Dude waits
another 10 seconds and then tries again.
We will change those values to 3 and 60 seconds respectively in the next Dude
release.
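To get a feel for what the new timing means, here is a small Python estimate. It assumes one retry cycle is the connection timeout plus the wait before the next attempt, that a down router holds one half-open slot for the whole timeout, and that the limit is the 10 connections mentioned earlier in the thread. The function name is made up for this sketch.

```python
def down_devices_to_saturate(connect_timeout_s, retry_wait_s, half_open_limit=10):
    """Approximate number of unreachable RouterOS devices that keeps all
    half-open slots busy on average (assumed model, see the text above)."""
    if connect_timeout_s <= 0 or retry_wait_s < 0:
        raise ValueError("timeout must be positive and wait non-negative")
    cycle_s = connect_timeout_s + retry_wait_s  # one attempt plus the pause after it
    return half_open_limit * cycle_s / connect_timeout_s


print(down_devices_to_saturate(10, 10))  # 20.0  (current hard-coded values)
print(down_devices_to_saturate(3, 60))   # 210.0 (values planned for the next release)
```

Under these assumptions the saturation point moves from about 20 to about 210 unreachable RouterOS devices.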
Yeah, that will solve the problem (until someone has hundreds of down devices), but what about just not trying to connect to devices which are considered “down”…?
Because there can be devices that have zero services, but that doesn’t mean
that they are not reachable.
Also, some or all services being down doesn’t mean that the special RouterOS
service is down as well.
And what state will they have? For sure it will not be “down” but “unknown”…
I am not talking about partially down devices…
I think a “router os” device with all services down should be considered “dead”…
Ping service down… what the hell can that mean other than that the device is dead??? LOL (maybe you have in mind that a hacker is playing with the firewall on that box, and it should be considered “up” whatever happened…)
But thx for the fix in the next release, the 3s/60s timing will solve all the problems anyway, as I have only 100 MikroTik devices on the map for now…