05-13-2021 12:53 PM - edited 05-13-2021 02:06 PM
Hello all,
I am having an issue for which I have not been able to find a cause or resolution. The OSPF neighbor relationship between 2 routers keeps dropping and then immediately re-forming the adjacency.
ROUTER1GW#
May 13 13:35:51 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from LOADING to FULL, Loading Done
ROUTER1GW#
May 13 13:36:40 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from FULL to DOWN, Neighbor Down: Dead timer expired
GOLABINET1GW#
May 13 13:36:48 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from LOADING to FULL, Loading Done
This keeps happening and I am not sure where to start looking to get this fixed.
I can watch the dead timer count down until it hits 0; the neighborship then re-establishes and the process repeats.
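To put a number on the flapping, the ADJCHG syslog lines can be parsed to see how long each adjacency stays FULL before it drops. The following is only a minimal Python sketch under stated assumptions: the function name `full_durations`, the regular expression, and the default year are illustrative choices, and it only handles the timestamp format shown in the log lines above.

```python
import re
from datetime import datetime

# Matches lines such as:
# May 13 13:35:51 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on
#   Port-channel2.30 from LOADING to FULL, Loading Done
ADJCHG = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) \w+: %OSPF-5-ADJCHG: "
    r"Process \d+, Nbr (?P<nbr>\S+) on (?P<intf>\S+) "
    r"from \w+ to (?P<new>\w+),"
)

def full_durations(lines, year=2021):
    """Return [((neighbor, interface), seconds_in_FULL), ...].

    The syslog timestamp carries no year, so `year` is an assumption
    supplied by the caller.
    """
    went_full = {}   # (nbr, intf) -> time of the last transition to FULL
    durations = []
    for line in lines:
        m = ADJCHG.match(line.strip())
        if not m:
            continue  # prompts, debug lines, anything that is not ADJCHG
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (m["nbr"], m["intf"])
        if m["new"] == "FULL":
            went_full[key] = when
        elif m["new"] == "DOWN" and key in went_full:
            up_for = (when - went_full.pop(key)).total_seconds()
            durations.append((key, up_for))
    return durations

log = [
    "May 13 13:35:51 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on "
    "Port-channel2.30 from LOADING to FULL, Loading Done",
    "May 13 13:36:40 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on "
    "Port-channel2.30 from FULL to DOWN, Neighbor Down: Dead timer expired",
    "May 13 13:36:48 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on "
    "Port-channel2.30 from LOADING to FULL, Loading Done",
]
print(full_durations(log))
```

For the three log lines above this reports 49 seconds in FULL, which would be consistent with hellos no longer arriving shortly after the adjacency forms.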
A little help to get pointed in the right direction would be awesome!
Thanks!!
05-13-2021 02:03 PM - edited 05-14-2021 12:40 AM
Hello @deca24 ,
"Dead timer expired" means that no OSPF hello or other OSPF packet has been received from the neighbor for 40 seconds (the default dead timer value).
After the OSPF adjacency is declared down, it re-forms and is then declared down again when the dead timer expires:
>>ROUTER1GW#
May 13 13:35:51 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from LOADING to FULL, Loading Done
ROUTER1GW#
May 13 13:36:40 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from FULL to DOWN, Neighbor Down: Dead timer expired
Here, you may have an issue with excessive user traffic in the direction from the other router (the one with IP 10.5.1.1) towards the local router, combined with the lack of a QoS policy to protect OSPF.
Depending on what the device with IP 10.5.1.1 is, you can build a policy map and apply it outbound to protect OSPF traffic and possibly other routing traffic.
Example:
On the other device, with IP 10.5.1.1:
access-list 189 remark OSPF
access-list 189 permit ospf any any
class-map match-any ROUTING-PROTOCOLS
 match access-group 189
 match dscp cs6
policy-map SAFE
 class ROUTING-PROTOCOLS
  bandwidth percent 5
 class class-default
  fair-queue
interface Port-channel x.30
 service-policy output SAFE
Edit:
Depending on the platform, the policy map can be applied at the port-channel level, or you may need to configure it on each physical member link.
For example, I had to do it the second way on the ME 3600.
Hope to help
Giuseppe
05-13-2021 05:59 PM
Any other issues? Any L1 or L2 issues? Are the port channels and the individual links staying up?
Try increasing the OSPF hello/dead timer intervals to larger values to see if that helps (and whether it is a load/bandwidth problem).
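To illustrate what this test can and cannot show, here is a small Python sketch of the dead timer logic. It is a simplified model rather than anything from IOS: the function `first_expiry`, its parameters, and the arrival times are hypothetical, and the only rule modeled is that each received hello restarts the dead timer. Note that hello and dead intervals have to match on every OSPF router on the segment.

```python
def first_expiry(hello_arrivals, dead_interval=40, observed_until=None):
    """Return the time (seconds) at which the dead timer first expires,
    or None if it never expires within the observed window.

    hello_arrivals: sorted times of hellos that were actually received.
    observed_until: optional end of the observation window, used to
                    detect an expiry after the last received hello.
    """
    times = list(hello_arrivals)
    if observed_until is not None:
        times.append(observed_until)
    for prev, nxt in zip(times, times[1:]):
        # A gap longer than the dead interval means the timer ran out
        # before the next hello could restart it.
        if nxt - prev > dead_interval:
            return prev + dead_interval
    return None

# Hypothetical case 1: hellos are sent every 10 s, but a burst of
# congestion drops the ones due at 30..70 s.
bursty = [0, 10, 20, 80, 90]
print(first_expiry(bursty, dead_interval=40))    # neighbor drops
print(first_expiry(bursty, dead_interval=120))   # neighbor survives

# Hypothetical case 2: no hello is received at all after t=0.
print(first_expiry([0], dead_interval=40, observed_until=300))
print(first_expiry([0], dead_interval=120, observed_until=300))
```

In the first case a longer dead interval rides out the loss, which would point to load. In the second case the longer interval only delays the drop, which would suggest that hellos are being blocked rather than occasionally lost.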
Regards, ML
**Please Rate All Helpful Responses **
05-14-2021 12:29 AM - edited 05-14-2021 12:31 AM
Hello
Post the output of
sh etherchannel summary
sh ip ospf interface brief
debug ip ospf adj
05-14-2021 06:08 AM - edited 05-14-2021 06:09 AM
Hi Paul,
Please forgive my oversight. I didn't mention it initially because we didn't think it was contributing to the problem, but I am going to add it now: we have a transparent firewall between the core switch and the router. It does not seem to be blocking anything, because there are no problems between the core switch and another router that also connects through the same firewall.
As requested here is the result of the commands you asked for:
*************CORE SWITCH**********************
CORE#sh etherc summ
Flags: D - down P - bundled in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use N - not in use, no aggregation
f - failed to allocate aggregator
M - not in use, no aggregation due to minimum links not met
m - not in use, port not aggregated due to minimum links not met
u - unsuitable for bundling
d - default port
w - waiting to be aggregated
Number of channel-groups in use: 35
Number of aggregators: 37
Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
2 Po20(SU) LACP Gi1/2/2(P) Gi2/2/2(P)
CORE#sh ip ospf int br
Interface PID Area IP Address/Mask Cost State Nbrs F/C
Vl20 10 0 10.5.1.1/23 10 BDR 2/2
CORE#debug ip ospf adj
OSPF adjacency debugging is on
CORE#term mon
CORE#
May 14 08:32:28 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:32:28 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:32:34 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:32:44 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:32:53 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:32:54 EDT: OSPF-10 ADJ Vl20: Nbr 10.5.1.4: Clean-up dbase exchange
CORE#
May 14 08:33:02 EDT: OSPF-10 ADJ Vl10: Send with youngest Key 1
May 14 08:33:03 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Cannot see ourself in hello from 10.5.1.4, state INIT
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Neighbor change event
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR/BDR election
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect BDR 10.5.1.1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect DR 10.5.1.3
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR: 10.5.1.3 (Id) BDR: 10.5.1.1 (Id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Remember old DR 10.5.1.4 (id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Rcv DBD from 10.5.1.4 seq 0x1C16 opt 0x52 flag 0x7 len 32 mtu 1500 state INIT
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: 2 Way Communication to 10.5.1.4, state 2WAY
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Neighbor change event
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR/BDR election
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect BDR 10.5.1.1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect DR 10.5.1.3
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR: 10.5.1.3 (Id) BDR: 10.5.1.1 (Id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Nbr 10.5.1.4: Prepare dbase exchange
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send DBD to 10.5.1.4 seq 0xD8C opt 0x52 flag 0x7 len 32
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: NBR Negotiation Done. We are the SLAVE
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Nbr 10.5.1.4: Summary list built, size 46
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send DBD to 10.5.1.4 seq 0x1C16 opt 0x52 flag 0x2 len 952
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Neighbor change event
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR/BDR election
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect BDR 10.5.1.1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect DR 10.5.1.4
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR: 10.5.1.4 (Id) BDR: 10.5.1.1 (Id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Remember old DR 10.5.1.3 (id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Neighbor change event
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR/BDR election
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect BDR 10.5.1.1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Elect DR 10.5.1.4
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: DR: 10.5.1.4 (Id) BDR: 10.5.1.1 (Id)
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Rcv DBD from 10.5.1.4 seq 0x1C17 opt 0x52 flag 0x1 len 72 mtu 1500 state EXCHANGE
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Exchange Done with 10.5.1.4
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send LS REQ to 10.5.1.4 length 36 LSA count 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send DBD to 10.5.1.4 seq 0x1C17 opt 0x52 flag 0x0 len 32
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Rcv LS UPD from 10.5.1.4 length 64 LSA count 1
May 14 08:33:10 EDT: OSPF-10 ADJ Vl20: Synchronized with 10.5.1.4, state FULL
CORE#
May 14 08:33:10 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.4 on Vlan20 from LOADING to FULL, Loading Done
CORE#
May 14 08:33:12 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:12 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:33:16 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:33:19 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:19 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
May 14 08:33:19 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
CORE#
May 14 08:33:22 EDT: OSPF-10 ADJ Vl20: Send with youngest Key 1
***********ROUTER1GW**************
ROUTER1GW#sh etherc summ
Flags: D - down P/bndl - bundled in port-channel
I - stand-alone s/susp - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use f - failed to allocate aggregator
M - not in use, minimum links not met
u - unsuitable for bundling
w - waiting to be aggregated
d - default port
Number of channel-groups in use: 2
Number of aggregators: 2
Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
4 Po4(RU) LACP Gi0/0/1(bndl) Gi0/1/1(bndl)
14 Po14(RD)
RU - L3 port-channel UP State
SU - L2 port-channel UP state
P/bndl - Bundled
S/susp - Suspended
ROUTER1GW#sh ip ospf int br
Interface PID Area IP Address/Mask Cost State Nbrs F/C
Po2.30 10 0 10.5.1.4/23 5 DR 1/1
ROUTER1GW#debug ip ospf adj
OSPF adjacency debugging is on
ROUTER1GW#term mon
ROUTER1GW#
May 14 08:42:41 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:42:50 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:43:00 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:43:02 EDT: OSPF-10 ADJ Po2.30: Nbr 10.5.1.1: Clean-up dbase exchange
ROUTER1GW#
May 14 08:43:10 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: 10.5.1.1 address 10.5.1.1 is dead
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: 10.5.1.1 address 10.5.1.1 is dead, state DOWN
May 14 08:43:11 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from FULL to DOWN, Neighbor Down: Dead timer expired
ROUTER1GW#
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: Neighbor change event
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: DR/BDR election
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: Elect BDR 0.0.0.0
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: Elect DR 10.5.1.4
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: DR: 10.5.1.4 (Id)
May 14 08:43:11 EDT: OSPF-10 ADJ Po2.30: BDR: none
ROUTER1GW#
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: 2 Way Communication to 10.5.1.1, state 2WAY
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Neighbor change event
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: DR/BDR election
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Elect BDR 10.5.1.1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Elect DR 10.5.1.4
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: DR: 10.5.1.4 (Id)
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: BDR: 10.5.1.1 (Id)
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Nbr 10.5.1.1: Prepare dbase exchange
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send DBD to 10.5.1.1 seq 0x2469 opt 0x52 flag 0x7 len 32
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Neighbor change event
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: DR/BDR election
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Elect BDR 10.5.1.1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Elect DR 10.5.1.4
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: DR: 10.5.1.4 (Id)
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: BDR: 10.5.1.1 (Id)
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Rcv DBD from 10.5.1.1 seq 0x331 opt 0x52 flag 0x7 len 32 mtu 1500 state EXSTART
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: First DBD and we are not SLAVE
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Rcv DBD from 10.5.1.1 seq 0x2469 opt 0x52 flag 0x2 len 952 mtu 1500 state EXSTART
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: NBR Negotiation Done. We are the MASTER
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Nbr 10.5.1.1: Summary list built, size 45
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send DBD to 10.5.1.1 seq 0x246A opt 0x52 flag 0x1 len 72
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Rcv LS REQ from 10.5.1.1 length 36 LSA count 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Send LS UPD to 10.5.1.1 length 64 LSA count 1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Rcv DBD from 10.5.1.1 seq 0x246A opt 0x52 flag 0x0 len 32 mtu 1500 state EXCHANGE
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Exchange Done with 10.5.1.1
May 14 08:43:19 EDT: OSPF-10 ADJ Po2.30: Synchronized with 10.5.1.1, state FULL
May 14 08:43:19 EDT: %OSPF-5-ADJCHG: Process 10, Nbr 10.5.1.1 on Port-channel2.30 from LOADING to FULL, Loading Done
ROUTER1GW#
May 14 08:43:20 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:20 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:43:24 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
May 14 08:43:25 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#
May 14 08:43:29 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
ROUTER1GW#no deb
May 14 08:43:38 EDT: OSPF-10 ADJ Po2.30: Send with youngest Key 1
I am not sure what is going on, as the router seems to be getting the hello packets, just not responding to them till the dead timer expires.
ROUTER1GW#sh ip ospf nei
Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:01 10.5.1.1 Port-channel2.30
ROUTER1GW#sh ip ospf nei
Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:00 10.5.1.1 Port-channel2.30
ROUTER1GW#sh ip ospf nei
Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:00 10.5.1.1 Port-channel2.30
ROUTER1GW#sh ip ospf nei
Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:00 10.5.1.1 Port-channel2.30
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
ROUTER1GW#sh ip ospf nei
Neighbor ID Pri State Dead Time Address Interface
10.5.1.1 1 FULL/BDR 00:00:39 10.5.1.1 Port-channel2.30
Thoughts?
THANKS FOR YOUR HELP! ! ! !
05-29-2021 05:26 PM
Hi,
Timers, MTU and authentication are okay; otherwise the session would not get to the 2WAY state as shown in the debug.
Is it possible the issue is load on the Po once traffic is routed across it? Do you have any load information on the links?
Are there any other devices in between the 2 having the issue? Any L2 simulated circuit?
05-17-2021 10:04 AM
Wonder if it would do any good to remove and reconfigure ospf on one or both devices?
05-20-2021 05:31 AM
Is it possible that you have a different MTU on the interfaces of the two routers? I have seen numerous times where a difference in MTU size will cause OSPF peers to flap like this.
05-29-2021 04:27 PM - edited 05-29-2021 07:59 PM
Checking the MTU size is a good idea; make sure it is the same on both ends of the link. But since you have a firewall between the core switch and the router, and it looks like authentication fails, I would double-check the firewall.
Regards, ML
05-30-2021 01:24 AM
Hello
Have you tried disabling one of the interfaces within the port channel to see if that rectifies the flapping?
05-31-2021 09:11 AM
You wrote that you have an OSPF neighbor relationship between 2 routers, but the log shows 3 neighbors, right? 10.5.1.4, 10.5.1.1 and 10.5.1.3. See:
Nbr 10.5.1.4: Clean-up dbase exchange
Cannot see ourself in hello from 10.5.1.4, state INIT
Neighbor change event
DR/BDR election
Elect BDR 10.5.1.1
Elect DR 10.5.1.3
DR: 10.5.1.3 (Id) BDR: 10.5.1.1 (Id)
Remember old DR 10.5.1.4 (id)
Send with youngest Key 1
What is your topology? How many OSPF routers do you have? Could your transparent firewall be leaking OSPF packets?
Regards, ML
05-31-2021 06:32 PM
Thank you all for the replies. We were finally able to get it working last week but I have been unable to get back to this thread until now to give an update.
We found that it was definitely the firewall, which was somehow stripping out the multicast packets between one of the routers and the core switch. Once we added a rule specifying the OSPF multicast IP addresses, the timers and the updates started working as expected between the 2 devices. I just do not understand why it would allow the packets through to form the adjacency but not the hello keepalive packets.
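One thing that may help explain this: on a broadcast segment, different OSPF packet types go to different destinations. Per RFC 2328, hellos are sent to the AllSPFRouters multicast address 224.0.0.5, while Database Description and Link State Request packets are unicast to the neighbor, so a firewall rule that treats multicast and unicast differently can pass some packet types and drop others. The Python sketch below only classifies destinations; the function name `classify` and the packet list are illustrative, with addresses taken from this thread.

```python
import ipaddress

ALL_SPF_ROUTERS = ipaddress.ip_address("224.0.0.5")  # RFC 2328 AllSPFRouters
ALL_D_ROUTERS = ipaddress.ip_address("224.0.0.6")    # RFC 2328 AllDRouters

def classify(dst):
    """Label an OSPF packet's destination IP as multicast or unicast."""
    addr = ipaddress.ip_address(dst)
    if addr == ALL_SPF_ROUTERS:
        return "multicast (AllSPFRouters)"
    if addr == ALL_D_ROUTERS:
        return "multicast (AllDRouters)"
    if addr.is_multicast:
        return "multicast (other)"
    return "unicast"

# Illustrative packets from the router (10.5.1.4) toward the core
# switch (10.5.1.1) on a broadcast segment.
packets = [
    ("Hello", "224.0.0.5"),
    ("DBD", "10.5.1.1"),
    ("LS REQ", "10.5.1.1"),
]
for kind, dst in packets:
    print(f"{kind:7} -> {dst:10} {classify(dst)}")
```

This does not show which rule on the firewall was responsible; it only shows why the database exchange and the periodic hellos can be affected differently.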
Martin L,
You are correct that we have 3 OSPF devices: 2 routers connected through the same transparent firewall via 2 different port channels. I was only having problems with one of the routers' connections through the firewall.
We ruled out a Layer 1 or Layer 2 problem: the physical connection was fine and data was continually passing, and the port channels never went down and there was never a loss of connectivity between the devices.
Thank you all again for your help.
06-02-2021 01:42 PM
Yes. By default the ASA in transparent mode will allow OSPF multicast, but the unicast needs to be allowed from the low-security to the high-security interface.
This makes the OSPF peer see its neighbor (multicast), but the update (unicast) is dropped by the ASA.