08-12-2022 09:00 AM - edited 08-16-2022 04:46 AM
I've attached an LLD of my design. The ASRs have eBGP neighborships with the WAN side. In the LAN, both ASRs and the firewall are in the same OSPF broadcast network. The L2 switch in between is a 6800 in VSS. Both ASRs learn the same customer routes from eBGP and redistribute them into OSPF, but ASR2 redistributes with a higher metric.
Our OSPF timers are hello 3 seconds and dead 9 seconds.
Update: we have OSPF priorities set as follows: FW 50, ASR1 30, ASR2 20.
We had a planned failover activity in which both LAN cables on ASR1 were pulled. We expected failover to happen in 9 seconds. However, it took 13 seconds for the route on the firewall to move from ASR1 to ASR2. Did the firewall being the DR result in a longer convergence time? Would it have helped if ASR1 was the DR? I would really like to understand in detail how convergence works in a broadcast OSPF network.
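To frame the 9 versus 13 second question, here is a rough sketch of the timing budget in Python. It assumes the firewall never sees a link-down event (the switch sits between it and ASR1), so detection relies purely on the dead timer. The 4 second post-detection figure is a hypothetical placeholder for LSA flooding, acknowledgement, and SPF scheduling, not a measured value.

```python
from dataclasses import dataclass


@dataclass
class OspfTimers:
    hello_s: float  # hello interval in seconds
    dead_s: float   # dead interval in seconds


def detection_window(t: OspfTimers) -> tuple:
    """Return (best, worst) seconds from failure until the dead timer expires.

    The dead timer restarts on every received hello, so the last hello may
    have arrived anywhere between 0 and hello_s seconds before the failure.
    """
    return (t.dead_s - t.hello_s, t.dead_s)


def convergence_window(t: OspfTimers, post_detection_s: float) -> tuple:
    """Add an assumed post-detection delay to both ends of the window."""
    best, worst = detection_window(t)
    return (best + post_detection_s, worst + post_detection_s)


timers = OspfTimers(hello_s=3, dead_s=9)
print(detection_window(timers))
# Hypothetical 4 s for flooding, acknowledgement and SPF scheduling.
print(convergence_window(timers, 4.0))
```

With hello 3 and dead 9, the dead timer alone accounts for 6 to 9 seconds, so anything the devices do after detection lands on top of that.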
08-16-2022 05:40 AM
Hi, yes the L2 switches are in VSS
08-16-2022 06:26 AM - edited 08-16-2022 06:32 AM
OK, the VSS has a port-channel (PO) on the side toward the ASR and a single link to FW Pri on the other side.
The exact mechanism is not clear to me, but this may add some delay.
Instead, use a PO on both sides, toward the ASR and toward FW Pri, and then check again.
Before you make the change, let me double-check this solution.
08-17-2022 10:35 AM
"the VSS always tries to forward traffic on the locally available links. This is true for both Layer-2 and Layer-3 links. The primary motivation for local forwarding is to avoid unnecessarily sending of data traffic over the VSL link in order to reduce the latency (extra hop over the VSL) and congestion."
I have an idea, but you need to check it on your side.
ASR 60 sends a Hello message, and the hash algorithm selects the link to VSS61 rather than VSS60. Since VSS60 is directly connected to FW Sec, FW Sec receives the Hello message and FW Pri does not,
so FW Pri starts a new OSPF election and a new OSPF adjacency is established.
The idea behind connecting a PO on both sides and using the same hash algorithm on both sides is to prevent this case.
To check, disable one member port of the PO on the ASR and see whether OSPF recovers again.
08-17-2022 11:40 AM
Hi MHM, The Palo Alto Firewalls are in active/passive mode and only the MAC of the active firewall is learnt by the switches. When traffic from ASR1 via switch1 (upper switch) needs to reach the firewall, it will go over the VSL and then onto the firewall.
The main cause of delay seems to be the ACK and SPF re-calculation. Please do see my response to Joseph along with debugs.
08-17-2022 11:53 AM - edited 08-17-2022 03:56 PM
...see below comment
08-17-2022 01:09 PM - edited 08-17-2022 03:56 PM
... see below comment
08-14-2022 12:27 AM
Hello
I assume you have the default SSO enabled, but do you have NSF enabled for OSPF? Also, you mention OSPF timers: are these the defaults, or have you set them manually?
08-14-2022 09:40 AM
@paul driver SSO and/or NSF, on what device(s)? (Just trying to follow your train of troubleshooting thought.)
08-14-2022 11:54 AM
Hello @Joseph W. Doherty
On the VSS. However, I checked the OP again and it's at L2, so I guess it isn't applicable.
08-17-2022 04:51 PM
I cannot sleep when something is on my mind,
so I started writing notes to solve this issue, because 9-13 seconds is long.
NOTES:
1- @lionell01, you mention that "only the MAC of the active firewall is learnt by the switches." That is correct, and it means your FW HA config is correct, because only the active FW participates in the OSPF process.
But I ask you again to check the PO on both ASRs.
2- @David Ruess mentions using BFD, which for me is the best solution to reduce the long convergence time. Why?
The long time occurs because the surviving ASR and the FW do not quickly detect the failed ASR.
But you mention:
"Yes, we have considered BFD but caveat in Palo Alto Firewalls is that BFD forms only between DR and BDR. When ASR1 is down there is only DR (FW) and DROTHER (ASR2), there would be no BFD so perhaps OSPF would be shut down by BFD?"
Here comes the trick. I asked myself why there would be a new election, since there are already a DR, a BDR, and a DROther, but I stopped immediately:
this is a broadcast domain, so there must be one DR, one BDR, and one or more DROthers.
When the failed ASR, which was the old BDR (the FW is the DR), goes away, the surviving routers run a new election, and since only two routers remain (ASR and FW),
one is elected DR and the other is elected BDR.
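A simplified sketch of that election in Python, using the priorities from the original post (FW 50, ASR1 30, ASR2 20) and assuming the FW is the current DR as the OP describes. It follows the general shape of the RFC 2328 election (existing DR is not preempted, BDR is chosen among routers not claiming DR), but it is not a full implementation. The numeric router IDs are hypothetical tie-breakers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str
    priority: int
    router_id: int            # hypothetical numeric router ID (tie-breaker only)
    claims_dr: bool = False   # router currently declares itself DR
    claims_bdr: bool = False  # router currently declares itself BDR


def _best(routers):
    """Highest priority wins; highest router ID breaks ties."""
    return max(routers, key=lambda r: (r.priority, r.router_id), default=None)


def elect(routers):
    """Return (dr, bdr) for the routers still present on the segment."""
    eligible = [r for r in routers if r.priority > 0]
    # BDR: only routers not claiming DR; prefer one already claiming BDR.
    non_dr = [r for r in eligible if not r.claims_dr]
    bdr = _best([r for r in non_dr if r.claims_bdr]) or _best(non_dr)
    # DR: a router already claiming DR keeps the role (no preemption).
    dr = _best([r for r in eligible if r.claims_dr])
    if dr is None:
        # No DR left: promote the BDR, then pick a new BDR from the rest.
        dr, bdr = bdr, _best([r for r in non_dr if r != bdr])
    return dr, bdr


fw = Router("FW", 50, 3, claims_dr=True)
asr1 = Router("ASR1", 30, 1, claims_bdr=True)
asr2 = Router("ASR2", 20, 2)

before = elect([fw, asr1, asr2])
after = elect([fw, asr2])  # ASR1 has failed and is no longer a neighbor
print([r.name for r in before], [r.name for r in after])
```

Under these assumptions the FW stays DR and ASR2 moves from DROther to BDR once ASR1 is gone, so the two survivors end up as a DR/BDR pair.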
Since this network contains a non-Cisco FW, I went back to the Palo Alto website and found the following:
"When you enable BFD for OSPFv2 or OSPFv3 broadcast interfaces, OSPF establishes a BFD session only with its Designated Router (DR) and Backup Designated Router (BDR). On point-to-point interfaces, OSPF establishes a BFD session with the direct neighbor. On point-to-multipoint interfaces, OSPF establishes a BFD session with each peer."
https://docs.paloaltonetworks.com/pan-os/9-1/pan-os-admin/networking/bfd/bfd-overview/bfd-for-dynamic-routing-protocols
In our case the remaining ASR will certainly be elected DR or BDR, since only two routers remain in the broadcast domain.
So we can use BFD to detect the failed ASR and reduce the recovery time.
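To put a rough number on the gain, here is a small Python sketch of the BFD detection time in asynchronous mode as described in RFC 5880: the remote detect multiplier times the larger of the local receive interval and the remote transmit interval. The 300 ms interval and multiplier of 3 are hypothetical example values, not settings taken from this network.

```python
def bfd_detection_ms(local_rx_ms: int, remote_tx_ms: int, remote_multiplier: int) -> int:
    """Detection time in milliseconds for one direction of a BFD session.

    The agreed interval is the slower of what we can receive and what the
    peer wants to send; the peer's multiplier says how many may be missed.
    """
    return remote_multiplier * max(local_rx_ms, remote_tx_ms)


# Hypothetical BFD settings: 300 ms intervals, multiplier 3.
bfd_ms = bfd_detection_ms(local_rx_ms=300, remote_tx_ms=300, remote_multiplier=3)
ospf_dead_ms = 9 * 1000  # dead interval from the original post

print(bfd_ms, ospf_dead_ms)
```

With these example values, failure detection would drop from up to 9 seconds to under 1 second, and the remaining convergence time would be whatever flooding and SPF scheduling add afterwards.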