OSPFv3 stuck in EXSTART between Cisco and CCR

Hi, I’m stuck and need some wisdom from the forums.

Summary:
OSPFv3 between a Cisco Cat 3560G and a CCR-1036 is stuck in the EXSTART state.
OSPFv3 between the same Cisco and an RB-2011 works every time.

The link between the Cisco and the CCR is via a pair of AirFiber24 radios.

The link between the Cisco and the RB2011 is via a Rocket M5 radio, PTP.

We are using OSPFv3 to pass IPv6 routes.
This config also has OSPFv2 enabled, and it’s “just working”.

We have tested for MTU issues and do not see any MTU errors from the Cisco or the CCR.

When running debug ipv6 ospf adj on the Cisco, we see the output below.
What is interesting is that the Cisco reports a “Bad request” received from the CCR.
I don’t know what this bad request is.

** CISCO DEBUG IPV6 OSPF NEIG **
Mar 20 08:00:44: OSPFv3: Send DBD to XXX.XXX.XXX.28 on GigabitEthernet0/1 seq 0x2C29E305 opt 0x0013 flag 0x7 len 28
Mar 20 08:00:44: OSPFv3: Retransmitting DBD to XXX.XXX.XXX.28 on GigabitEthernet0/1 [1]
Mar 20 08:00:44: OSPFv3: Rcv DBD from XXX.XXX.XXX.28 on GigabitEthernet0/1 seq 0x2C29E305 opt 0x0013 flag 0x0 len 748 mtu 1500 state EXSTART
Mar 20 08:00:44: OSPFv3: NBR Negotiation Done. We are the MASTER
Mar 20 08:00:44: OSPFv3: GigabitEthernet0/1 Nbr XXX.XXX.XXX.28: Summary list built, size 36
Mar 20 08:00:44: OSPFv3: Send DBD to XXX.XXX.XXX.28 on GigabitEthernet0/1 seq 0x2C29E306 opt 0x0013 flag 0x1 len 748
Mar 20 08:00:44: OSPFv3: Send LS REQ to XXX.XXX.XXX.28 length 84 LSA count 7
Mar 20 08:00:45: OSPFv3: Rcv DBD from XXX.XXX.XXX.28 on GigabitEthernet0/1 seq 0x2C29E306 opt 0x0013 flag 0x0 len 28 mtu 1500 state EXCHANGE
Mar 20 08:00:45: OSPFv3: Exchange Done with XXX.XXX.XXX.28 on GigabitEthernet0/1
Mar 20 08:00:45: OSPFv3: Rcv LS UPD from XXX.XXX.XXX.28 on GigabitEthernet0/1 length 536 LSA count 7
Mar 20 08:00:45: OSPFv3: Bad request received from XXX.XXX.XXX.28 on GigabitEthernet0/1
Mar 20 08:00:45: OSPFv3: GigabitEthernet0/1 Nbr XXX.XXX.XXX.28: Prepare dbase exchange
Mar 20 08:00:45: OSPFv3: Send DBD to XXX.XXX.XXX.28 on GigabitEthernet0/1 seq 0x2C29E308 opt 0x0013 flag 0x7 len 28
Mar 20 08:00:45: OSPFv3: Rcv DBD from XXX.XXX.XXX.28 on GigabitEthernet0/1 seq 0x2C29E307 opt 0x0013 flag 0x7 len 28 mtu 1500 state EXSTART
Mar 20 08:00:45: OSPFv3: First DBD and we are not SLAVE
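
For anyone reading the Cisco debug above, the DBD flag values and lengths can be decoded with a few lines of Python. This is only a sketch: the flag bits (I = 0x4, M = 0x2, MS = 0x1) and the sizes (16-byte OSPFv3 header, 12-byte fixed DBD part, 20-byte LSA header) follow the OSPFv3 packet formats, the function names are made up for illustration, and it assumes the "len" printed by the debug is the full OSPFv3 packet length.

```python
# DBD flag bits as defined for OSPF Database Description packets.
DBD_FLAG_BITS = (("I", 0x4), ("M", 0x2), ("MS", 0x1))

OSPFV3_HEADER_LEN = 16  # common OSPFv3 packet header
DBD_FIXED_LEN = 12      # options, interface MTU, flags, DD sequence number
LSA_HEADER_LEN = 20     # one LSA header carried in a DBD


def decode_dbd_flags(flags):
    """Return the names of the bits set in a DBD flag byte."""
    return [name for name, bit in DBD_FLAG_BITS if flags & bit]


def lsa_headers_in_dbd(packet_len):
    """Number of LSA headers in a DBD of the given total OSPFv3 length.

    Assumption: packet_len includes the 16-byte OSPFv3 header.
    """
    body = packet_len - OSPFV3_HEADER_LEN - DBD_FIXED_LEN
    if body < 0 or body % LSA_HEADER_LEN:
        raise ValueError(f"unexpected DBD length {packet_len}")
    return body // LSA_HEADER_LEN


if __name__ == "__main__":
    for flags in (0x7, 0x1, 0x0):
        print(hex(flags), decode_dbd_flags(flags))
    for length in (28, 748):
        print(length, lsa_headers_in_dbd(length))
```

With the values from the log, 0x7 decodes to I, M and MS, which is what a router sends when it starts (or restarts) the exchange, and a 748-byte DBD carries 36 LSA headers, which matches "Summary list built, size 36".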


** CCR LOG **
02:40:50 route,ospf,info OSPFv3 neighbor XXX.XXX.XXX.32: state change from ExStart to Down
02:40:51 system,info device changed by admin
02:40:55 route,ospf,info Database Description packet has init bit set in middle of an exchange
02:40:55 route,ospf,info OSPFv3 neighbor XXX.XXX.XXX.32: state change from Full to 2-Way
02:40:59 route,ospf,info Database Description packet has init bit set in middle of an exchange
02:40:59 route,ospf,info OSPFv3 neighbor XXX.XXX.XXX.32: state change from Full to 2-Way
02:41:02 route,ospf,info Ignoring Link State Acknowledgment packet: wrong peer state
02:41:02 route,ospf,info state=ExStart
02:41:04 route,ospf,info Database Description packet has init bit set in middle of an exchange
02:41:04 route,ospf,info OSPFv3 neighbor XXX.XXX.XXX.32: state change from Full to 2-Way


Relevant config from the CCR

set [ find default-name=ether3 ] l2mtu=1598 name=eth3-af24-505-6001

/routing ospf-v3 instance
set [ find default=yes ] router-id=XXX.XXX.XXX.28

/routing ospf-v3 interface
add area=backbone interface=eth11-uplink network-type=point-to-point
add area=backbone interface=eth3-af24-505-6001 network-type=point-to-point

/ipv6 address
add address=XXXX:YYYY:a000:103::1/126 advertise=no interface=eth3-af24-505-6001



Relevant config from the Cisco

interface GigabitEthernet0/1
 description Airfiber to CCR Downtown
 no switchport
 bandwidth 10000
 ip address 172.16.31.4 255.255.255.248 secondary
 ip address YYY.YYY.YYY.34 255.255.255.252
 ip ospf network point-to-point
 ip ospf cost 110
 ip ospf 7850 area 0
 load-interval 30
 ipv6 address XXXX:YYYY:A000:103::2/126
 ipv6 enable
 ipv6 ospf network point-to-point
 ipv6 ospf 7850 area 0
 no cdp enable
 spanning-tree portfast


ipv6 router ospf 7850
 router-id XXX.XXX.XXX.32
 redistribute connected


Many thanks for any help.

Just curious

What ROS version is on the CCR? Also, what ROS version is on the 2011?

Good point, I should have included that in the original post.

CCR v6.34.3
RB2011 v6.32.3

Cisco IOS: c3560-ipservicesk9-mz.150-2.SE2

The CCR was running 6.29 or similar, and I upgraded it last night in case that was the issue.
It didn’t have any impact on the problem.

I upgraded the RB2011 to 6.34.3, so it’s the same version as the CCR now.

That OSPFv3 session came right up with no issues. I can reset it and it comes right back.

So to recap:

RB2011 <----Rocket M5 PTP Link----> Cisco 3560G: works just fine

CCR <----AirFiber 24 PTP Link----> Cisco 3560G: stuck in EXSTART

Cisco devices usually get stuck in ExStart when there’s an MTU mismatch between the neighbors (as in, their interface MTUs are not the same).

I know you said you checked it, but it could be worth making sure that the MTU on the 3560 interface facing the CCR is the same as the MTU on the CCR interface facing the 3560.


Hi, when looking at the OSPF debug from the Cisco, there is no telltale MTU mismatch showing up in the logs.
Also, OSPFv2 between these two routers is working just fine, which further leads me to believe that the MTU
is OK. The Layer 3 MTUs match, the Layer 2 MTUs match, and there are no other OSPF errors.

Thanks

I know you said the MTUs are good, but this really does sound like an MTU issue - even if your transport MTUs are good it could be a bug or other issue with MTU at the endpoints. The quickest (but maybe not the easiest) test would be to bring the CCR to the Cisco switch and directly cable it in to see if the adjacency comes up.

It might be helpful to do a packet capture and post it here if you aren’t able to directly cable the Cisco and CCR together.

Hi, as another data point that it’s not MTU: the CCR in question has a hard cable connection to another Cisco 3560G, and that session is just fine. When running debug on the Cisco we get “Bad request received”; this request is from the CCR.

Another note: NBMA does not bring up the adjacencies either.

OSPFv2 is working just fine on all of the links, which I think helps rule out an MTU issue and shows that PTP as the network type is valid…

Posted this in the other thread as well…

I would seriously check the AirFiber firmware version and test MTU throughput. There seem to be a number of threads in the Ubiquiti forums reporting that jumbo frames are enabled but not actually passed…

https://community.ubnt.com/t5/airFiber/AF24-MPLS-Issue/td-p/1099827
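
If you want to test what the AirFiber link really passes, a do-not-fragment ping sized to the MTU is the usual approach. Here is a rough sketch of the size arithmetic, assuming plain IPv4/IPv6 headers with no options or extension headers (the function name is just for illustration):

```python
IPV4_HEADER_LEN = 20       # IPv4 header without options
IPV6_HEADER_LEN = 40       # fixed IPv6 header, no extension headers
ICMP_ECHO_HEADER_LEN = 8   # echo request header, same size for ICMP and ICMPv6


def max_echo_payload(mtu, ipv6=False):
    """Largest echo payload that fits in one unfragmented packet of size mtu."""
    ip_header = IPV6_HEADER_LEN if ipv6 else IPV4_HEADER_LEN
    payload = mtu - ip_header - ICMP_ECHO_HEADER_LEN
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an echo request")
    return payload


if __name__ == "__main__":
    for mtu in (1500, 1598):
        print(mtu, max_echo_payload(mtu), max_echo_payload(mtu, ipv6=True))
```

For a 1500-byte MTU this gives 1472 bytes of echo payload over IPv4 and 1452 over IPv6. Check whether your ping tool's size option counts only the payload or the whole IP packet before comparing numbers between the Cisco and the CCR.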

Are you saying that with a direct cable OSPFv3 works between the Cisco and CCR?

Yes, to a DIFFERENT Cisco. So:


Cisco <--AF24--> CCR <--wire--> Cisco <--fiber--> Juniper MX480 <--fiber--> [The world, BGP]

Cisco via AF24 to CCR OSPF v2 works, OSPF v3 DOES NOT

CCR via wire to another Cisco, OSPF v2/3 WORKS
Cisco to Juniper OSPF v2/3 Works

Hate to say it but if a direct cable works and the UBNT AF doesn’t then you have some kind of issue with the AirFiber transport that probably is MTU related.

The easiest way to prove that is to do a packet capture on both sides and see the MTU of the OSPFv3 packets being exchanged. My guess is you’ll see a disparity in one of the captures.
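
When comparing the captures, the fields worth lining up are in the fixed 12 bytes of the Database Description packet that follow the 16-byte OSPFv3 header. Below is a minimal sketch of pulling them out with Python, assuming you have already extracted those bytes from the capture (the function names and the sample bytes, built from the values in the Cisco debug earlier in the thread, are illustrative only):

```python
import struct

DBD_FIXED_LEN = 12  # bytes that follow the 16-byte OSPFv3 header


def parse_dbd_fixed(body):
    """Decode the fixed part of an OSPFv3 Database Description packet."""
    if len(body) < DBD_FIXED_LEN:
        raise ValueError("DBD body too short")
    # Layout (network byte order): reserved(8) + options(24),
    # interface MTU(16), reserved(8), flags(8), DD sequence number(32).
    word0, mtu, _reserved, flags, seq = struct.unpack(
        "!IHBBI", body[:DBD_FIXED_LEN]
    )
    return {
        "options": word0 & 0xFFFFFF,
        "interface_mtu": mtu,
        "flags": flags & 0x7,
        "dd_sequence": seq,
    }


def differing_fields(a, b):
    """Return the field names whose values differ between two parsed DBDs."""
    return sorted(key for key in a if a[key] != b[key])


if __name__ == "__main__":
    # Sample built from the debug line: opt 0x0013, mtu 1500, flag 0x7.
    sample = struct.pack("!IHBBI", 0x000013, 1500, 0, 0x7, 0x2C29E305)
    print(parse_dbd_fixed(sample))
```

Running the same parse on the DBD seen at each end of the link and passing both results to differing_fields would show directly whether the advertised interface MTU, or anything else in the fixed part, changes in transit.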

OK, I can see your point. So then why does OSPFv2 come up and just work on this exact same link?
It would also have problems if there were MTU issues.

I am going to try to capture data.

Make a packet capture on the working link and on the non-working link and look for differences.

I thought I would post a follow-up / resolved message.

First, THANK YOU to the MikroTik support / eng team.

The issue ended up being an interesting corner-case bug in the OSPFv3 code of RouterOS.
I’ve tested 6.36rc10 and it’s fixed in that version.

All OSPFv3 sessions are now operating as expected.