
OSPF neighborship keeps dropping and reforming

deca24
Level 1

Hello all,

I am having an issue for which I have not been able to find a cause or resolution. The OSPF neighbor relationship between 2 routers keeps dropping, and the adjacency then immediately reforms.

ROUTER1GW#
May 13 13:35:51 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from LOADING to FULL, Loading Done
ROUTER1GW#
May 13 13:36:40 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from FULL to DOWN, Neighbor Down: Dead timer expired

GOLABINET1GW#
May 13 13:36:48 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from LOADING to FULL, Loading Done

 

This keeps happening and I am not sure where to start looking to get this fixed.

I can watch the dead timer count down until it hits 0; the neighborship is then re-established and the process repeats.

A little help to get pointed in the right direction would be awesome!

 

Thanks!!

12 Replies

Giuseppe Larosa
Hall of Fame

Hello @deca24 ,

The OSPF dead timer expired means that no OSPF hello or other OSPF packet has been received from the neighbor for 40 seconds (default dead timer value).
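To confirm this on the router itself, one can check which timers the interface is actually using and whether hellos from the neighbor are arriving. A minimal sketch, assuming the interface name from the log above (Port-channel2.30) and the default broadcast timers (hello 10, dead 40):

```
! Show the hello/dead intervals in use on the interface
show ip ospf interface Port-channel2.30
! Show the neighbor state and its dead-timer countdown
show ip ospf neighbor detail
! Log each hello sent and received (run briefly; debug output can be noisy)
debug ip ospf hello
```

If hellos from 10.5.1.1 never show up in the debug while the adjacency is FULL, that would suggest the packets are being lost somewhere between the two devices rather than being rejected by OSPF.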

After the OSPF adjacency is declared down, it reforms and is then declared down again when the dead timer expires:

 

>>ROUTER1GW#
May 13 13:35:51 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from LOADING to FULL, Loading Done
ROUTER1GW#
May 13 13:36:40 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from FULL to DOWN, Neighbor Down: Dead timer expired

 

Here, the cause may be excessive user traffic flowing from the other router (the one with IP 10.5.1.1) towards the local router, combined with the lack of a QoS policy to protect OSPF.

Depending on what the device with IP 10.5.1.1 is, you can build a policy map and apply it outbound to protect OSPF traffic and possibly other routing traffic.

 

Example:

On the other device (the one with IP 10.5.1.1):

access-list 189 remark OSPF
access-list 189 permit ospf any any

class-map match-any ROUTING-PROTOCOLS
 match access-group 189
 match dscp cs6

policy-map SAFE
 class ROUTING-PROTOCOLS
  bandwidth percent 5
 class class-default
  fair-queue

interface Port-channel x.30
 service-policy output SAFE

 

Edit:

Depending on the platform, the policy map can be applied at the port-channel level, or you may need to configure it on each physical member link.

For example, I had to use the second approach on the ME 3600.

 

Hope to help

Giuseppe

 

Martin L
VIP

 

Any other issues? Any Layer 1 or Layer 2 issues? Are the port channels and the individual links staying up?

Try increasing the OSPF hello/dead timer intervals to larger values to see if that helps (which would point to a load/bandwidth problem).
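As a rough sketch of that test, the intervals could be raised on the interface as shown below. The values 30 and 120 are arbitrary examples, not a recommendation, and the same values would have to be configured on every OSPF router on the segment, because neighbors with mismatched hello/dead intervals will not form an adjacency.

```
interface Port-channel2.30
 ! Example values only; must match on all neighbors on this subnet
 ip ospf hello-interval 30
 ip ospf dead-interval 120
```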

 

Regards, ML
**Please Rate All Helpful Responses **

Hello

Post the output of

sh etherchannel summary
sh ip ospf interface brief

debug ip ospf adj


Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

Hi Paul,

Please forgive my oversight. I didn't mention it initially because we didn't think it was contributing to the problem, but I am going to add it now: we have a transparent firewall between the core switch and the router. It does not seem to be blocking anything, because there are no problems between the core switch and another router that also sits behind the same firewall.

 

As requested here is the result of the commands you asked for:

*************CORE SWITCH**********************
CORE#sh etherc summ
Flags: D - down P - bundled in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use N - not in use, no aggregation
f - failed to allocate aggregator

M - not in use, no aggregation due to minimum links not met
m - not in use, port not aggregated due to minimum links not met
u - unsuitable for bundling
d - default port

w - waiting to be aggregated
Number of channel-groups in use: 35
Number of aggregators: 37

Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
2 Po20(SU) LACP Gi1/2/2(P) Gi2/2/2(P)

CORE#sh ip ospf int br
Interface PID Area IP Address/Mask Cost State Nbrs F/C
Vl20 10 0 10.5.1.1/23 10 BDR 2/2
CORE#debug ip ospf adj
OSPF adjacency debugging is on
CORE#term mon
CORE#
May 14 08:32:28 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:32:28 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:32:34 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:32:44 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:32:53 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:32:54 EDT: OSPF-10 ADJ Vl20: Nbr 10.5.1.4: Clean-up dbase exchange
CORE#
May 14 08:33:02 EDT: OSPF-10 ADJ Vl10: Send with youngest Key 1
May 14 08:33:03 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Cannot see ourself in hello from 10.5.1.4, state INIT
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Neighbor change event
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR/BDR election
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect BDR 10.5.1.1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect DR 10.5.1.3
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR: 10.5.1.3 (Id) BDR: 10.5.1.1 (Id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Remember old DR 10.5.1.4 (id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Rcv DBD from 10.5.1.4 seq 0x1C16 opt 0x52 flag 0x7 len 32 mtu 1500 state INIT
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: 2 Way Communication to 10.5.1.4, state 2WAY
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Neighbor change event
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR/BDR election
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect BDR 10.5.1.1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect DR 10.5.1.3
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR: 10.5.1.3 (Id) BDR: 10.5.1.1 (Id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Nbr 10.5.1.4: Prepare dbase exchange
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send DBD to 10.5.1.4 seq 0xD8C opt 0x52 flag 0x7 len 32
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: NBR Negotiation Done. We are the SLAVE
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Nbr 10.5.1.4: Summary list built, size 46
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send DBD to 10.5.1.4 seq 0x1C16 opt 0x52 flag 0x2 len 952
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Neighbor change event
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR/BDR election
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect BDR 10.5.1.1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect DR 10.5.1.4
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR: 10.5.1.4 (Id) BDR: 10.5.1.1 (Id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Remember old DR 10.5.1.3 (id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Neighbor change event
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR/BDR election
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect BDR 10.5.1.1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect DR 10.5.1.4
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR: 10.5.1.4 (Id) BDR: 10.5.1.1 (Id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Rcv DBD from 10.5.1.4 seq 0x1C17 opt 0x52 flag 0x1 len 72 mtu 1500 state EXCHANGE
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Exchange Done with 10.5.1.4
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send LS REQ to 10.5.1.4 length 36 LSA count 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send DBD to 10.5.1.4 seq 0x1C17 opt 0x52 flag 0x0 len 32
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Rcv LS UPD from 10.5.1.4 length 64 LSA count 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Synchronized with 10.5.1.4, state FULL
CORE#
May 14 08:33:10 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.4 on Vlan20 from LOADING to FULL, Loading Done
CORE#
May 14 08:33:12 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:12 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:33:16 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:33:19 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:19 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:19 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:33:22 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1

***********ROUTER1GW**************
ROUTER1GW#sh etherc summ
Flags: D - down P/bndl - bundled in port-channel
I - stand-alone s/susp - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use f - failed to allocate aggregator

M - not in use, minimum links not met
u - unsuitable for bundling
w - waiting to be aggregated
d - default port


Number of channel-groups in use: 2
Number of aggregators: 2

Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
4 Po4(RU) LACP Gi0/0/1(bndl) Gi0/1/1(bndl)
14 Po14(RD)

RU - L3 port-channel UP State
SU - L2 port-channel UP state
P/bndl - Bundled
S/susp - Suspended

ROUTER1GW#sh ip ospf int br
Interface PID Area IP Address/Mask Cost State Nbrs F/C
Po2.30 10 0 10.5.1.4/23 5 DR 1/1
ROUTER1GW#debug ip ospf adj
OSPF adjacency debugging is on
ROUTER1GW#term mon
ROUTER1GW#
May 14 08:42:41 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:42:50 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:43:00 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:43:02 EDT: OSPF-10 ADJ Po2.30: Nbr 10.5.1.1: Clean-up dbase exchange
ROUTER1GW#
May 14 08:43:10 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: 10.5.1.1 address 10.5.1.1 is dead
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: 10.5.1.1 address 10.5.1.1 is dead, state DOWN
May 14 08:43:11 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from FULL to DOWN, Neighbor Down: Dead timer expired
ROUTER1GW#
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: Neighbor change event
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: DR/BDR election
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: Elect BDR 0.0.0.0
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: Elect DR 10.5.1.4
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: DR: 10.5.1.4 (Id)
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: BDR: none
ROUTER1GW#
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: 2 Way Communication to 10.5.1.1, state 2WAY
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Neighbor change event
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: DR/BDR election
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Elect BDR 10.5.1.1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Elect DR 10.5.1.4
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: DR: 10.5.1.4 (Id)
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: BDR: 10.5.1.1 (Id)
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Nbr 10.5.1.1: Prepare dbase exchange
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send DBD to 10.5.1.1 seq 0x2469 opt 0x52 flag 0x7 len 32
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Neighbor change event
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: DR/BDR election
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Elect BDR 10.5.1.1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Elect DR 10.5.1.4
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: DR: 10.5.1.4 (Id)
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: BDR: 10.5.1.1 (Id)
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Rcv DBD from 10.5.1.1 seq 0x331 opt 0x52 flag 0x7 len 32 mtu 1500 state EXSTART
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: First DBD and we are not SLAVE
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Rcv DBD from 10.5.1.1 seq 0x2469 opt 0x52 flag 0x2 len 952 mtu 1500 state EXSTART
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: NBR Negotiation Done. We are the MASTER
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Nbr 10.5.1.1: Summary list built, size 45
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send DBD to 10.5.1.1 seq 0x246A opt 0x52 flag 0x1 len 72
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Rcv LS REQ from 10.5.1.1 length 36 LSA count 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send LS UPD to 10.5.1.1 length 64 LSA count 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Rcv DBD from 10.5.1.1 seq 0x246A opt 0x52 flag 0x0 len 32 mtu 1500 state EXCHANGE
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Exchange Done with 10.5.1.1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Synchronized with 10.5.1.1, state FULL
May 14 08:43:19 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from LOADING to FULL, Loading Done
ROUTER1GW#
May 14 08:43:20 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:20 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:43:24 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:25 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:43:29 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#no deb
May 14 08:43:38 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1

 

I am not sure what is going on, as the router seems to be getting the hello packets but not responding to them until the dead timer expires.

ROUTER1GW#sh ip ospf nei

Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:01 10.5.1.1 Port-channel2.30
ROUTER1GW#sh ip ospf nei

Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:00 10.5.1.1 Port-channel2.30
ROUTER1GW#sh ip ospf nei

Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:00 10.5.1.1 Port-channel2.30
ROUTER1GW#sh ip ospf nei

Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:00 10.5.1.1 Port-channel2.30
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei

Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:39 10.5.1.1 Port-channel2.30

 

Thoughts?

 

THANKS FOR YOUR HELP! ! ! !

Hi,

 

Timers, MTU and authentication are okay; otherwise the adjacency would not get past the 2WAY state and reach FULL as shown in the debug.

Is it possible the issue is load on the Po once traffic is routed across it? Do you have any load information on the links?

 

Are there any other devices in between the 2 having the issue? Any L2 simulated circuit?

deca24
Level 1

Wonder if it would do any good to remove and reconfigure ospf on one or both devices?

Is it possible that you have a different MTU on the interfaces of the two routers? I have seen numerous times where a difference in MTU size will cause OSPF peers to flap like this.
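A quick way to check this, assuming the interface names that appear in the outputs earlier in the thread (Port-channel2.30 on the router, Vlan20 on the core switch), might be:

```
! On ROUTER1GW
show interfaces Port-channel2.30 | include MTU
show ip interface Port-channel2.30 | include MTU
! On CORE
show ip interface Vlan20 | include MTU
```

Note that the DBD lines in the debug output posted above already report mtu 1500 in both directions, which suggests the MTU seen by OSPF matches on this link.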

Martin L
VIP

Checking the MTU size is a good idea; make sure it is the same on both ends of the link. But since you have a firewall between the core switch and the router, and it looks like authentication fails, I would double-check the firewall.

 

Regards, ML

Hello

Have you tried disabling one of the interfaces within the port channel to see if that rectifies the flapping?



Kind Regards
Paul

Martin L
VIP

You wrote that you have an OSPF neighbor relationship between 2 routers, but the log shows 3 neighbors, right? 10.5.1.4, 10.5.1.1 and 10.5.1.3. See:

Nbr 10.5.1.4: Clean-up dbase exchange

Cannot see ourself in hello from 10.5.1.4, state INIT
 Neighbor change event
DR/BDR election
Elect BDR 10.5.1.1
Elect DR 10.5.1.3
 DR: 10.5.1.3 (Id) BDR: 10.5.1.1 (Id)
Remember old DR 10.5.1.4 (id)
Send with youngest Key 1

 

What is your topology? How many OSPF routers do you have? Could your transparent firewall be leaking OSPF packets?

 

Regards, ML

deca24
Level 1

Thank you all for the replies. We were finally able to get it working last week but I have been unable to get back to this thread until now to give an update.

We found that it was definitely the firewall, which was somehow stripping out the multicast packets between one of the routers and the core switch. Once we added a rule specifying the OSPF multicast IP addresses, the timers and the updates started working as expected between the 2 devices. I just do not understand why it would allow the packets through to form the adjacency but not the hello keepalive packets.
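For readers looking for the shape of such a rule: the well-known OSPF multicast groups are 224.0.0.5 (AllSPFRouters) and 224.0.0.6 (AllDRouters). The thread does not name the firewall model or show the rule that was actually added, so the following is only a sketch in Cisco ASA syntax; the ACL name OSPF_IN and the interface name outside are hypothetical.

```
! Permit OSPF (IP protocol 89) to the two OSPF multicast groups
access-list OSPF_IN extended permit ospf any host 224.0.0.5
access-list OSPF_IN extended permit ospf any host 224.0.0.6
! Apply inbound on the interface facing the affected router (hypothetical name)
access-group OSPF_IN in interface outside
```

In practice these entries would be added to whatever ACL is already applied on that interface; an ACL containing only these lines would drop all other inbound traffic because of the implicit deny at the end.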

Martin L,

You are correct that we have 3 OSPF devices: 2 routers connected through the same transparent firewall via 2 different port channels. I was only having problems with one of the routers' connections to the firewall.

We ruled out a Layer 1 or Layer 2 problem: the physical connection was fine and data was continually passing, and the port channels never went down, so there was never a loss of connection between the devices.

 

Thank you all again for your help.

Yes. By default the ASA in transparent mode will allow OSPF multicast, but the unicast needs to be allowed from the low-security to the high-security interface.
This lets the OSPF peers see their neighbor (multicast), but the updates (unicast) are dropped by the ASA.
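A sketch of what permitting the unicast OSPF packets could look like in ASA syntax, using the two neighbor addresses from the thread. Which device sits behind the lower-security interface is not stated, so the direction, the ACL name OSPF_IN and the interface name outside are all assumptions:

```
! Permit unicast OSPF (e.g. database description and LS request packets)
! from one neighbor to the other
access-list OSPF_IN extended permit ospf host 10.5.1.4 host 10.5.1.1
! ACL applied inbound on the lower-security interface (hypothetical name)
access-group OSPF_IN in interface outside
```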
