Wondering if anyone has experienced anything like this. We have a CCR1072 core router that links to several other CCR routers (1016, 1036, 1072). The router runs both BGP and OSPF and links our AS to the internal network.
We decided to embark on a routine upgrade since it had not been done for close to a year. Due to the high availability requirements of the router, we are always reluctant to do so. However, we went ahead with an upgrade to the current Long Term release.
Absolute disaster. The router did not come back up after the upgrade. This necessitated a drive to the data center to hard reboot it. After that the router came back up. However, this is when we discovered that almost every CCR router connected to this router had also frozen up. This required driving to multiple data centers to reboot routers.
This has been completely unprecedented for us. A lockup on upgrade, while very rare, has occurred before. However, I have never seen a failure on one router manage to lock up multiple other routers simultaneously. The locked-up routers could still be logged into locally, but they did not pass any traffic through their interfaces. Only a hard reboot solved the problem.
This is extremely worrisome as it locked up our network for several hours. Has anyone else ever experienced anything like this? Any feedback is appreciated.
Unfortunately, at the time I did not think to collect supout files manually, and since the routers did not think they were down, they did not generate an automatic supout.