After upgrading from 6.27 to 6.29.1, several of our routerboards have stopped passing traffic.
routerboard 1 (rb433AH): IP 10.50.50.1/26 on ether2. (The port is not in a bridge; it is the gateway address for the attached network.)
connected by UTP cable to:
routerboard 2 (rb750UP): IP 10.50.50.2/26 on ether1. (This port is PoE-in via some powershot and is bridged with ether2, which is master of the rest of the ports in the switch config.)
On the rb750UP we set up a watchdog with ping address 10.50.50.1.
This setup ran for months without problems on v6.27. Now, some 20 hours after the upgrade, traffic stopped passing from the rb750UP towards the rb433AH.
On the rb433AH the port statistics showed 0 bps, and it stayed like that until, after a 20-minute car drive, I power-cycled the rb750UP and traffic came back.
Then it happened again after another day or so.
This has now happened on 2 other occasions with similar setups, but in completely different parts of the network.
What I don't understand is why the watchdog on the rb750UP, which is supposed to ping the IP of the rb433AH, doesn't make the rb750UP reboot. According to the log, the rb750UP only rebooted after the manual power cycle. Before that there is not a single reference to a reboot.
So, is the watchdog not working?
Or is the ping still being answered even though the opposite port (the one being pinged) shows that it doesn't receive traffic?
Should I maybe point the watchdog ping at a router further up the route? Or is it actually the rb433AH that is malfunctioning and putting forwarded traffic on hold while it still answers ping packets?
Does anybody have any clues?
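To make the question concrete, here is a minimal sketch (plain Python, not RouterOS code) of how I understand the watchdog's ping check. It assumes the watchdog only reboots after a run of consecutive failed pings; the threshold of 6 is my reading of the documentation, and the function name is made up for illustration.

```python
from typing import Iterable


def watchdog_would_reboot(ping_ok: Iterable[bool], max_failures: int = 6) -> bool:
    """Return True if at least `max_failures` pings in a row failed.

    `ping_ok` is a sequence of ping results (True = reply received).
    `max_failures` = 6 is an assumption, not a confirmed RouterOS value.
    """
    consecutive = 0
    for ok in ping_ok:
        if ok:
            # Any successful reply resets the failure counter.
            consecutive = 0
        else:
            consecutive += 1
            if consecutive >= max_failures:
                return True
    return False


# Link completely dead: every ping fails, so a reboot would be expected.
print(watchdog_would_reboot([False] * 10))

# Occasional replies still get through: the counter keeps resetting.
print(watchdog_would_reboot([False, False, True] * 10))
```

If the check really works like this, then no reboot in the log would mean either that at least some pings to 10.50.50.1 were still being answered while the forwarded traffic was stuck, or that the watchdog was not running at all.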
(All routers have hardware queues enabled, the latest ROS and firmware installed, and fastpath enabled.
All ports are set to manual rate negotiation at 100 Mbps. 'Auto' really affects the throughput in a negative way...)