Say you have a ring of 5 routers, A-B-C-D-E-A, and the OSPF costs dictate that traffic from D to A travels in the East direction of the ring (i.e., through E). It comes time to do maintenance on the link between D and E, but the link is traffic-bearing. No big deal, right? Just change the costs so that traffic starts travelling West (i.e., through C and B), right?
Except that when you change the cost on D's interface to E, the change tears down the adjacency, causing all the routes to be withdrawn for the duration of the hello interval on the adjacency between D and C (until D reannounces its routes to C). Traffic bound for customer subnets is suddenly rejected with ICMP Unreachable, and you've got yourself a 5-10 second outage (wait, isn't this why we build the ring - so we wouldn't have outages?).
In Ciscoland, one would simply increase the cost on E's interface to D, and D's interface to E, and the next update would have new costs that would cause the FIBs to be updated.
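To make the intended outcome concrete, here is a rough Python sketch of the SPF result on the ring before and after raising the cost on both ends of the D-E link. The uniform cost of 10 per link and the raised cost of 100 are assumptions for illustration only, and the function names are made up; this models the shortest-path calculation, not any router's actual implementation.

```python
import heapq

def ring_links(cost=10):
    """Build directed link costs for the ring A-B-C-D-E-A (cost is an assumed value)."""
    ring = ["A", "B", "C", "D", "E"]
    links = {}
    for i, u in enumerate(ring):
        v = ring[(i + 1) % len(ring)]
        links[(u, v)] = cost  # cost on u's interface towards v
        links[(v, u)] = cost  # cost on v's interface towards u
    return links

def shortest_path(links, src, dst):
    """Plain Dijkstra over {(u, v): cost}; returns (total_cost, path) or None."""
    graph = {}
    for (u, v), cost in links.items():
        graph.setdefault(u, []).append((v, cost))
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == dst:
            return dist, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (dist + cost, nxt, path + [nxt]))
    return None

links = ring_links()
before = shortest_path(links, "D", "A")  # East, through E

# Raise the cost on D's interface to E and on E's interface to D.
links[("D", "E")] = 100
links[("E", "D")] = 100
after = shortest_path(links, "D", "A")   # West, through C and B

print(before)
print(after)
```

The point is that only the cost inputs to SPF change; nothing about the change requires the D-E adjacency itself to go away.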
Either the OSPF daemon under the hood is incapable of being reconfigured on the fly, or the calls that are being made do not take advantage of its ability to do so. The ideal fix is to resolve this and make it possible to change OSPF costs without tearing down adjacencies.
Failing that, if one could influence the costs of OSPF updates with filters (i.e., if announcement interface = X, increment the cost of those announcements by Y), that would be an acceptable workaround. This is a little hacky as it means the filter is meddling with the OSPF database, but I can live with a kludge much more easily than I can live with ICMP unreachables to customer subnets.
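Roughly, the filter behaviour I have in mind looks like the Python sketch below. Everything here is hypothetical (the `Announcement` record, the `apply_cost_penalty` function, the interface names and the numbers); it is only meant to show the "if interface = X, add Y" rule, not an existing RouterOS filter feature.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Announcement:
    """Hypothetical stand-in for a route learned via OSPF."""
    prefix: str
    in_interface: str
    cost: int

def apply_cost_penalty(announcements, interface, penalty):
    """Return a new list where routes learned on `interface` cost `penalty` more."""
    return [
        replace(a, cost=a.cost + penalty) if a.in_interface == interface else a
        for a in announcements
    ]

# Example data: interface names, prefixes and costs are made up.
learned = [
    Announcement("10.0.1.0/24", "ether-to-E", 20),
    Announcement("10.0.1.0/24", "ether-to-C", 30),
]

# X = "ether-to-E", Y = 100: the path via E now loses to the path via C.
adjusted = apply_cost_penalty(learned, "ether-to-E", 100)
best = min(adjusted, key=lambda a: a.cost)
print(best.in_interface)
```

The adjacency and the advertised costs stay untouched; only the locally evaluated cost is changed, which is exactly why it is a kludge.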
We work around this today with a script that generates static routes; basically turning off our IGP so that we can turn a knob. I don't have to tell you how much this approach sucks.
Thoughts? Can this be fixed in ROS 6.28/29, or ROS7?
Thanks!