suran
just joined
Topic Author
Posts: 15
Joined: Fri Dec 16, 2011 9:43 pm

Feature Request: OSPF Cost Changes Without Adjacency Loss

Tue Mar 31, 2015 9:18 pm

Say you have a ring of 5 routers, A-B-C-D-E-A, and the OSPF costs dictate that traffic from D to A travels in the East direction of the ring (i.e., through E). It comes time to do maintenance on the link between D and E, but the link is traffic-bearing. No big deal, right? Just change the costs so that traffic starts travelling West (i.e., through C and B), right?
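
To make the intended effect concrete, here is a minimal Python sketch of the shortest-path calculation on that ring. The cost values (10 per segment, 1000 for the raised link) are assumptions picked for illustration and do not come from any real configuration; the point is only that raising the D-E cost in both directions should flip D's best path to A from East to West.

```python
import heapq

def shortest_path(links, src, dst):
    """Dijkstra over directed link costs; returns (total_cost, path) or None."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for (a, b), link_cost in links.items():
            if a == node and b not in visited:
                heapq.heappush(queue, (cost + link_cost, b, path + [b]))
    return None

def ring(costs):
    """Expand each ring segment into two directed links with the same cost."""
    links = {}
    for (a, b), c in costs.items():
        links[(a, b)] = c
        links[(b, a)] = c
    return links

# Hypothetical costs: every segment of the ring A-B-C-D-E-A costs 10.
links = ring({("A", "B"): 10, ("B", "C"): 10, ("C", "D"): 10,
              ("D", "E"): 10, ("E", "A"): 10})
before = shortest_path(links, "D", "A")

# Raise the cost on D's interface to E and on E's interface to D.
links[("D", "E")] = 1000
links[("E", "D")] = 1000
after = shortest_path(links, "D", "A")

print(before)
print(after)
```

OSPF cost is applied per outgoing interface, which is why the sketch raises both directions of the D-E link.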

Except that when you change the costs on D's interface to E, it tears down the adjacency, causing all the routes to be withdrawn for the duration of the hello interval of the adjacency between D and C (when D reannounces its routes to C). Traffic bound for customer subnets is suddenly rejected with ICMP Unreachable, and you've got yourself a 5-10 second outage (wait, isn't this why we build the ring - so we wouldn't have outages?).

In Ciscoland, one would simply increase the cost on E's interface to D, and D's interface to E, and the next update would have new costs that would cause the FIBs to be updated.

Either the OSPF daemon under the hood is incapable of being reconfigured on the fly, or the calls that are being made do not take advantage of its ability to do so. The ideal fix is to resolve this and make it possible to change OSPF costs without tearing down adjacencies.

Failing that, if one could influence the costs in OSPF updates with filters (i.e., if announcement interface = X, increment the cost of those announcements by Y), that would be an acceptable workaround. This is a little hacky as it means the filter is meddling with the OSPF database, but I can live with a kludge much more easily than I can live with ICMP unreachables to customer subnets.

We work around this today with a script that generates static routes; basically turning off our IGP so that we can turn a knob. I don't have to tell you how much this approach sucks. :)
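
For readers wondering what such a workaround might look like, below is a rough Python sketch of a static-route generator. It is not the script described above, which was not posted. The prefixes, gateway address, and comment tag are hypothetical; the only RouterOS syntax assumed is the `/ip route add` command with its `dst-address`, `gateway`, `distance`, and `comment` parameters.

```python
def static_route_commands(routes, distance=1, comment="maint-override"):
    """Render one RouterOS '/ip route add' line per (prefix, gateway) pair."""
    lines = []
    for prefix, gateway in sorted(routes.items()):
        lines.append(
            f"/ip route add dst-address={prefix} gateway={gateway} "
            f"distance={distance} comment={comment}"
        )
    return lines

# Hypothetical prefixes and next hop: steer traffic West out of router D.
routes = {"10.1.0.0/24": "10.0.34.3", "10.5.0.0/24": "10.0.34.3"}
for line in static_route_commands(routes):
    print(line)
```

A static route with distance 1 is preferred over an OSPF route (default distance 110), so routes like these would override the IGP until they are removed again.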

Thoughts? Can this be fixed in ROS 6.28/29, or ROS7?

Thanks!
 
ZeroByte
Forum Guru
Posts: 4051
Joined: Wed May 11, 2011 6:08 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Wed Apr 01, 2015 12:29 am

I agree that changing an interface's cost should not break an adjacency, but even breaking an adjacency on a working link shouldn't cause what you're seeing.

In Ciscoland, I always just remove OSPF from an interface before doing maintenance on it.
Zero packets lost while the network re-converges (the link still is up, so it will pass the odd packet or two which were in the buffer while LSAs go out announcing the detour). After about 30 seconds or so, I take the link down for real and do the work. Something sounds wrong if you're getting destination unreachable messages.
(granted, I didn't use a ring topology)

Changing cost, dropping link, raising link, then fixing cost = 4 reconverge events.
Removing OSPF, then restoring OSPF when done = 2 reconverge events.
When given a spoon,
you should not cling to your fork.
The soup will get cold.
 
suran
just joined
Topic Author
Posts: 15
Joined: Fri Dec 16, 2011 9:43 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Wed Apr 01, 2015 2:06 am

Appreciate the feedback. This behavior is reproducible in a lab environment with a trivial configuration. :)

Just to be clear, router D has 3 interfaces. 2 face other routers, one faces customers. Recosting the active interface causes packets bound for the customer interface to go ICMP unreachable (because that route has been temporarily removed from the FIBs of the surrounding routers).
 
ZeroByte
Forum Guru
Posts: 4051
Joined: Wed May 11, 2011 6:08 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Wed Apr 01, 2015 2:15 am

This wouldn't be the first "not like Cisco" thing I've found in Mikrotik OSPF - I'll be honest.

The thing that's NOT like Cisco that I like the most about Mikrotik, though, is that you don't have to sell your soul just to own a measly branch office router, and then lease it out again every year for support contracts JUST TO HAVE ACCESS TO FIRMWARE UPDATES!!!!!

(Just so people don't think I'm a big ol' Cisco worshiper)
 
ZeroByte
Forum Guru
Posts: 4051
Joined: Wed May 11, 2011 6:08 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Wed Apr 01, 2015 2:18 am

suran wrote: "Appreciate the feedback. This behavior is reproducible in a lab environment with a trivial configuration. :)"
I think I will lab this up just for my own edification - in your experience, how long does convergence take?
How many routes are in your table?
(so I make a fair comparison)
Do you use static default route or OSPF default-information?
 
suran
just joined
Topic Author
Posts: 15
Joined: Fri Dec 16, 2011 9:43 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Wed Apr 01, 2015 7:35 pm

ZeroByte wrote: "I think I will lab this up just for my own edification - in your experience, how long does convergence take? How many routes are in your table? (so I make a fair comparison) Do you use static default route or OSPF default-information?"

~1000 routes, including default ('just another route'). But you should see it with fewer; actual convergence is extremely fast, the delay is largely the product of the hello timer as far as I can tell.
 
ZeroByte
Forum Guru
Posts: 4051
Joined: Wed May 11, 2011 6:08 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Wed Apr 01, 2015 10:57 pm

Here's my test topo:
5-tiks-ospf ring.png
Here's my node 4 config: (cost = 50 was just the last cost I set on the link to node 5)
[admin@Mikrotik-4] /routing ospf interface> /export compact
# apr/01/2015 19:24:10 by RouterOS 6.27
#
/interface bridge
add name=lo1
/ip neighbor discovery
set ether1 discover=no
/routing ospf instance
set [ find default=yes ] router-id=4.4.4.4
/ip address
add address=10.1.1.4/24 interface=ether1 network=10.1.1.0
add address=4.4.4.4/32 interface=lo1 network=4.4.4.4
add address=10.0.34.4/24 interface=ether2 network=10.0.34.0
add address=10.0.45.4/24 interface=ether5 network=10.0.45.0
add address=10.4.0.1/24 interface=ether3 network=10.4.0.0
/ip route vrf
add interfaces=ether1 routing-mark=mgmt
/routing ospf interface
add cost=50 interface=ether5 network-type=point-to-point
/routing ospf network
add area=backbone
/system identity
set name=Mikrotik-4
Would you call this a fair representation?
(not shown = a single 10.x.0.1/24 on each router as a "customer" interface)
(Also - for some reason, internally, GNS3 swapped ether3 and ether5 on tik 4 so ether5 -> node5 on node 4, the drawing label is incorrect)

When I change the cost of the interface on router D, router E shows:

echo: route,ospf,info OSPFv2 neighbor 4.4.4.4: state change from Full to Init
(and sadly, there's no info-level logging event when the state changes Init -> Full)

Debug level shows all of the LSA flooding and route changing, etc. that you would expect, and indeed, the uptime on the adjacency does reset. (Tested simpler topo in Cisco - adjacency doesn't reset, just sends a LS-Update)

During the change, I ran a constant ping from 1.1.1.1 to 4.4.4.4 - Even running this in GNS3, I only dropped one ping (due to TTL exceeded @ node 5) during the re-converge, but it didn't take a neighbor timeout. This behavior was the same for broadcast and point-to-point network types. Granted, I didn't stuff the routing table full of routes, which might take slightly longer if more LSAs have to be sent/updated. You did say that convergence is fast once the other side times out....

I must still have something different because the route never gets torn down completely (which would cause the unreachable). It just switches direction around the ring, and if a node between the propagating topo-change LSA updates happens to send the old way, then the packet bounces back and forth between the two routers that disagree.
(which is why I got TTL expired as my single ping that fails)
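
The bouncing described above can be sketched with a toy forwarding model in Python. The next-hop tables are assumptions reconstructed from the 5-node ring in this test (ping from node 1 to node 4, loop between nodes 1 and 5), and the TTL handling is simplified to one decrement per hop, so treat it as an illustration of a transient loop rather than a model of RouterOS.

```python
def trace(tables, src, dst, ttl=8):
    """Forward hop by hop using each router's own next-hop table.

    Returns (outcome, hops), where outcome is 'delivered' or 'ttl-exceeded'.
    """
    node, hops = src, [src]
    while node != dst:
        if ttl == 0:
            return "ttl-exceeded", hops
        node = tables[node][dst]
        hops.append(node)
        ttl -= 1
    return "delivered", hops

# Next hop toward node 4 on the ring 1-2-3-4-5-1 (hypothetical tables).
old = {1: {4: 5}, 2: {4: 3}, 3: {4: 4}, 5: {4: 4}}   # before the cost change
new = {1: {4: 2}, 2: {4: 3}, 3: {4: 4}, 5: {4: 1}}   # after full convergence

# Transient state: node 5 has already recalculated, node 1 has not yet.
transient = {**old, 5: new[5]}

print(trace(old, 1, 4))
print(trace(new, 1, 4))
print(trace(transient, 1, 4))
```

In the transient case the packet alternates between nodes 1 and 5 until its TTL runs out, which matches a single lost ping reported as TTL exceeded.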
 
suran
just joined
Topic Author
Posts: 15
Joined: Fri Dec 16, 2011 9:43 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Thu Apr 02, 2015 7:08 pm

Yea, that's a fair representation. You might only see a single TTL exceeded due to the small size of the route database; I get a few more in the lab but I'm pinging at 50-100ms intervals. In production, it takes a while longer due to the sheer number of routers.

In either case, those TTL exceeded errors are sufficient to cause TCP connections to close, calls to drop, etc.

Bottom line: adjacencies simply shouldn't drop when you change the cost metric on an interface. You go from sending a simple LSUpdate to having to load and synchronize the entire LS database. During the synchronization, packets will be lost due to the resulting routing loops.
 
ZeroByte
Forum Guru
Posts: 4051
Joined: Wed May 11, 2011 6:08 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Thu Apr 02, 2015 9:03 pm

I agree 100% with your statement.
Bouncing an adjacency is just a lazy way for the OSPF process to handle a cost change.
It should not do this.

A sizeable routing table of 1000+ prefixes means that a topo change is not as instantaneous as in the lab; I completely understand.

Obviously you know your network and I don't, but if this sort of thing happens on maintenance, then it certainly happens on a real failure - perhaps a different topology than a ring could keep link failures from being so disruptive. Your comment leads me to believe that there are more than 5 routers in your ring.

For what it's worth - in my experience, rings are for SONET, Brocade RRP, and other protocols that are designed as rings.
Higher layer protocols that operate hop by hop tend to be sloppy at handling topo changes.
(when I started at a telephone company, the CCIE had made a ring of switches around town running spanning tree. A fiber cut or a bounced interface of any kind would cause 90 seconds of downtime while spanning tree (regular spanning tree, not rapid spanning tree) dealt with the change. I came to call them netquakes because they were common enough to deserve a name.)

Compare these two topologies with 12 access routers:
(of course, if physical or budget constraints don't allow a design like the 'fancy' one, then it's not possible)
Drawing16.png
Drawing16a.png
Again - I agree that cost change should not bounce the adjacency - I'm just not a big fan of rings...
 
fallegretti
newbie
Posts: 33
Joined: Thu Jul 20, 2017 1:23 pm

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Fri Nov 02, 2018 1:35 pm

Hi all,
Does anyone know if this has been introduced in one of the latest versions? Running 6.39.1 and it's still an issue; planning the upgrades soon. Thanks
 
TheCiscoGuy
Frequent Visitor
Posts: 51
Joined: Fri Jun 22, 2018 8:32 am

Re: Feature Request: OSPF Cost Changes Without Adjacency Loss

Fri Jan 04, 2019 7:50 pm

The problem still exists; moreover, it restarts the BFD session too. This is a pretty lazy way of handling a simple cost change..... MikroTik, please add this feature request to the development cycle. I hope it doesn't come back to the RouterOS7 unicorn.....
Network Solutions Engineer and Trainer
Cisco | Juniper | Mikrotik | Ubiquiti
