10-08-2008 06:22 AM - edited 03-11-2019 06:54 AM
Hi all,
We've got a redundant network with two routers, two ASAs, and then two more routers. We run OSPF between everything, and the ASAs are configured in an active/standby setup. When we fail over, however, OSPF fails for about a minute before it comes back and everything starts working again.
Is there a way to make OSPF stateful across a failover, just like the other sessions terminating on the ASA (VPN) or passing through it? One can of course tweak the OSPF timers to speed up convergence, but I want it to be stateful.
Anyone?
10-08-2008 07:04 AM
Routing information, including OSPF routing information, is not replicated to the standby PIX Firewall on Stateful Failover. The standby unit will not show OSPF routing information until a failover occurs and its table gets updated.
One workaround is what you mentioned: tweak the OSPF hello timers. The other option would be to lower the failover poll time; however, don't lower it too much, as that might trigger false failovers/switchovers.
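As a rough sketch of the two workarounds above, the ASA side might look like the fragment below. The interface name and all timer values are illustrative assumptions, not recommendations; the OSPF timers must match on the neighboring router (on IOS, `ip ospf hello-interval` and `ip ospf dead-interval` under the interface), and the failover hold time cannot be set lower than three times the unit poll time.

```
! ASA: interface facing the OSPF neighbor (interface name is an assumption)
interface GigabitEthernet0/1
 ospf hello-interval 1
 ospf dead-interval 4
!
! ASA: failover unit polling, hello every 1 s with a 3 s hold time
! (illustrative values; overly aggressive settings risk false failovers)
failover polltime unit 1 holdtime 3
```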
Do rate if helpful !
10-08-2008 11:41 AM
Hi,
Thanks for your answer. It clears things up. However, it still doesn't solve our problem. Do you know if EIGRP will do a stateful failover? We run all Cisco, so it should not be too hard to switch over to EIGRP.
10-09-2008 12:50 AM
Neither EIGRP nor any other routing protocol on the ASA can fail over gracefully. There will always be a period of reconvergence. This means that if you are running a dynamic routing protocol on your ASA and a failover occurs, you will see a network outage for as long as it takes the routing protocol to reconverge.
03-08-2012 04:34 AM
Hi Abinjola
Here, 4 years later, we have a similar issue on a set of 5585s running 8.4(3). They are connected to two 6509s with a 2 x 10 Gig port channel.
The software now supports replication of the route table from the active FW to the standby unit.
But when we initiate a failover, we still see that traffic that should be running over the OSPF link (inside) is blackholed for the 5 seconds (OSPF timers: 1 sec hello / 4 sec dead) it takes the former standby ASA to bring up the OSPF neighborship with the 6509.
The question is: is it the ASA or the 6509 that flushes the route entries from the route table when we initiate a failover, a reload, or any other event that could trigger a failover?
Is there any way to mitigate this problem?
This coming weekend we will test whether it is the route table on the 6509 or the one on the ASA that causes the problem.
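One way such a test could be run is to capture the neighbor and route state on both sides immediately before and immediately after the failover, then compare which table loses its OSPF entries first. The commands below are standard show commands; what they will display in this particular setup is not something this sketch can predict.

```
! On the ASA that has just become active
show failover
show ospf neighbor
show route
!
! On the 6509
show ip ospf neighbor
show ip route ospf
```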
Thanks for your time.
Jesper Damsgaard
04-07-2012 03:30 PM
Hello
I ran into the same situation while labbing stateful failover on 8.4(1) with OSPF enabled. I ran 'debug ospf' on the ASA and 'debug ip ospf events' and 'debug ip ospf packets' on the IOS router during the failover. The output does not make it clear to me which side is causing the OSPF adjacency to flap:
On the IOS router:
*Mar 1 03:38:30.611: OSPF: Neighbor change Event on interface FastEthernet0/1
*Mar 1 03:38:30.611: OSPF: DR/BDR election on FastEthernet0/1
*Mar 1 03:38:30.611: OSPF: Elect DR 10.0.1.145
*Mar 1 03:38:30.611: DR: 10.0.1.145 (Id)
*Mar 1 03:38:30.611: OSPF: Neighbor change Event on interface FastEthernet0/1
*Mar 1 03:38:30.611: OSPF: DR/BDR election on FastEthernet0/1
*Mar 1 03:38:30.611: OSPF: Elect DR 10.0.1.145
*Mar 1 03:38:30.611: DR: 10.0.1.145 (Id)
*Mar 1 03:38:30.615: OSPF: End of hello processing
*Mar 1 03:38:33.431: OSPF: Send hello to 224.0.0.5 area 0 on FastEthernet0/1 from 100.0.11.1
R1#un all
On the ASA:
ASA1# failover exec standby failover active
OSPF: rcv. v:2 t:1 l:48 rid:10.0.1.145
aid:0.0.0.0 chk:5439 aut:0 auk: from inside
OSPF: Rcv hello from 10.0.1.145 area 0 from inside 100.0.11.1
OSPF: End of hello processing
OSPF: Interface inside going Down
OSPF: Neighbor change Event on interface inside
OSPF: DR/BDR election on inside
Switching to Standby
OSPF: Elect BDR 0.0.0.0
ASA1# Elect DR 10.0.1.145
OSPF: Elect BDR 0.0.0.0
OSPF: Elect DR 10.0.1.145
DR: 10.0.1.145 (Id) BDR: none
OSPF: 10.0.1.145 address 10.0.1.145 on inside is dead, state DOWN
OSPF: Neighbor change Event on interface inside
OSPF: DR/BDR election on inside
OSPF: Elect BDR 0.0.0.0
OSPF: Elect DR 0.0.0.0
DR: none BDR: none
OSPF: Remember old DR 10.0.1.145 (id)
OSPF: Interface outside going Down
OSPF: Neighbor change Event on interface outside
OSPF: DR/BDR election on outside
OSPF: Elect BDR 0.0.0.0
OSPF: Elect DR 10.0.8.254
OSPF: Elect BDR 0.0.0.0
OSPF: Elect DR 10.0.8.254
DR: 10.0.8.254 (Id) BDR: none
OSPF: 10.0.8.254 address 10.0.8.254 on outside is dead, state DOWN
OSPF: Neighbor change Event on interface outside
OSPF: DR/BDR election on outside
OSPF: Elect BDR 0.0.0.0
OSPF: Elect DR 0.0.0.0
DR: none BDR: none
OSPF: Remember old DR 10.0.8.254 (id)
OSPF: Interface inside going Up
OSPF: Interface outside going Up
R1 (F0/1 100.0.11.1) --- (E0/0 100.0.11.10 inside) ASA (E0/1 outside 100.0.12.10) ---- (F0/1 100.0.12.2) R2
Did you manage to clarify the situation? It looks like zero downtime with OSPF on 8.4(x) is not possible. Even with a 1 sec hello-interval I had a 10 sec outage, and FTP sessions running through the ASA of course failed.
Thank you
09-11-2012 06:47 PM
I faced the same problem while testing my OSPF area with ASA 8.4.4.
I got a similar result: around 5 seconds for OSPF to converge. But in my case the ASA sits between 2 routers in a totally NSSA area. One of them is an ABR connected to the backbone and the other is an ASBR that redistributes 700+ routes from BGP. OSPF convergence time is now acceptable with 8.4, but when redistribution is involved it takes some 30-40 seconds before connectivity between the backbone and BGP is restored. Needless to say, that is more than enough to kill all active connections.
I tried many timer configurations, and apparently the HELLO/DEAD timers no longer make much difference with 8.4: they only affect the exact moment the downtime begins. With the routing table replicated to the standby, the ASA passes traffic without any problem until the "new" active unit sends out its HELLOs, which breaks the area. So if the timer is set to 1s it fails immediately after failover, and with the default 10s it can take up to 10s.
I also tried tuning the LSA and SPF timers on all devices, but that had no effect either. Well, actually it did: the ASA crashed with an OSPF page fault when the LSA and SPF timers were set to something tiny like 10 100. The ASA also has fewer timer configuration commands than IOS routers.
I tested the same topology with EIGRP instead of the OSPF NSSA area, and the results were impressive. It takes the area just a couple of seconds to converge completely, even with those 700 BGP routes. No connectivity loss; at most 1 or 2 ICMP pings might be missed.
Apparently the ASA still does not support OSPF graceful restart, while IOS routers seem to support both Cisco’s own nonstop forwarding and the RFC 3623 approach.
IOS routers have nsf commands under router ospf configuration, and it seems to be enabled by default. There is nothing like that on the ASA.
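For anyone who wants to check this on their own IOS router, a minimal sketch follows. The process ID 1 is an assumption, and the available nsf keywords differ between IOS releases, so the context help is used here rather than a specific keyword.

```
! On the IOS router, in privileged EXEC mode:
! review the OSPF process summary for its NSF / helper status
show ip ospf
!
! List the nsf keywords this release offers
! (process ID 1 is an assumption)
configure terminal
 router ospf 1
  nsf ?
```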
02-26-2014 11:01 AM
OSPF Failover causes 5 second convergence delay
Symptom:
When using OSPF dynamic routing combined with active/standby failover on an ASA running 8.4, the routes learned via OSPF are replicated to the standby ASA. This is so that, if a failover event occurs, traffic using these routes will continue to pass through the new active unit. The problem is that upon a failover event, a 5 second delay is seen in OSPF convergence, which could cause a brief traffic outage.
Conditions:
ASA running 8.4 or later with OSPF as the routing protocol. This does not impact other routing protocols.
Workaround:
Upgrade
8.6(0.0)
100.8(20.1)
8.4(2)
8.5(1.5)
8.5(1.242)
100.7(8.34)
9.0(0.99)
9.0(1)
9.1(1)