
HA ASA-5525 pair failed to send gratuitous ARPs during failover. Why?

Hello.

I performed a task as instructed by my senior, to reboot the primary ASA-5525 (9.14(3)), then when it returned online, to reboot the secondary. At the beginning, I believe I did execute "failover active", but I am not certain because I remember that I concluded it was irrelevant. I did reboot the primary ASA. I verified that this primary came back online by witnessing on this device a normal reaction to pressing "enter" a few times. 

Then on the secondary, I entered "failover active", waited about 15 seconds, and rebooted the secondary. The secondary went offline, and every connection that traversed the ASA-5525 pair lost connectivity. Clearly the failover technology somehow failed, because it is confirmed that the connected devices did not receive gratuitous ARPs.

The correct question now is: in an HA ASA-5525 pair, when executing "failover active", why would the active secondary device not send gratuitous ARPs to the downstream devices?

Thank you.

2 Accepted Solutions


@jmaxwellUSAF "Generally, when a failover occurs, the new active unit takes over the active IP addresses and MAC addresses. Because network devices see no change in the MAC to IP address pairing, no ARP entries change or time out anywhere on the network."....that is a quote from this guide - https://www.cisco.com/c/en/us/td/docs/security/asa/asa917/configuration/general/asa-917-general-config/ha-failover.html

 


As a helpful and cautionary note, the dynamic described below caused a "big nightmare" event that resulted in significant financial impact to an enterprise. Professionals would do best to take it seriously by remembering to configure VIRTUAL MAC ADDRESSES...

Active/Standby IP Addresses and MAC Addresses
For Active/Standby Failover, see the following for IP address and MAC address usage during a failover event:
1. The active unit always uses the primary unit's IP addresses and MAC addresses.
2. When the active unit fails over, the standby unit assumes the IP addresses and MAC addresses of the failed unit and begins passing traffic.
3. When the failed unit comes back online, it is now in a standby state and takes over the standby IP addresses and MAC addresses.

MAC Addresses and IP Addresses in Failover
However, if the secondary unit boots without detecting the primary unit, then the secondary unit becomes the active unit and uses its own MAC addresses, because it does not know the primary unit MAC addresses. When the primary unit becomes available, the secondary (active) unit changes the MAC addresses to those of the primary unit, which can cause an interruption in your network traffic. Similarly, if you swap out the primary unit with new hardware, a new MAC address is used.

Virtual MAC addresses guard against this disruption, because the active MAC addresses are known to the secondary unit at startup, and remain the same in the case of new primary unit hardware. If you do not configure virtual MAC addresses, you might need to clear the ARP tables on connected routers to restore traffic flow. The ASA does not send gratuitous ARPs for static NAT addresses when the MAC address changes, so connected routers do not learn of the MAC address change for these addresses.

CLI Book 1: Cisco ASA Series General Operations CLI Configuration Guide, 9.17 - Failover for High Availability [Cisco Secure Firewall ASA] - Cisco
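
The MAC address rules quoted above can be summarized as a small decision model. The Python sketch below is only an illustration of those rules for the case where the secondary unit is active. It is not ASA code, and the function name, parameters, and sample MAC addresses are hypothetical.

```python
from typing import Optional


def mac_used_by_active_secondary(primary_mac: str,
                                 secondary_mac: str,
                                 secondary_has_seen_primary: bool,
                                 virtual_active_mac: Optional[str] = None) -> str:
    """Return the MAC an active secondary unit presents on a data interface.

    Models only the rules quoted from the configuration guide:
    - with a virtual MAC configured, the active MAC never changes;
    - otherwise the active unit uses the primary unit's MAC if it knows it;
    - a secondary that booted without detecting the primary uses its own MAC.
    """
    if virtual_active_mac is not None:
        return virtual_active_mac
    if secondary_has_seen_primary:
        return primary_mac
    return secondary_mac


# Hypothetical sample addresses, for illustration only.
PRIMARY = "00a0.c969.87c8"
SECONDARY = "00a0.c918.95d8"
VIRTUAL = "0000.5e00.5301"

# Normal failover: neighbors see the same MAC, so no ARP entry changes.
print(mac_used_by_active_secondary(PRIMARY, SECONDARY, True))
# Secondary booted alone: neighbors now see a different MAC.
print(mac_used_by_active_secondary(PRIMARY, SECONDARY, False))
# Virtual MAC configured: the active MAC is stable in both cases.
print(mac_used_by_active_secondary(PRIMARY, SECONDARY, False, VIRTUAL))
```

The second case is the one that can leave stale ARP entries on connected routers, which is why the guide recommends virtual MAC addresses.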

 


14 Replies

Very relevant info here...

Solved: What protocol does HA in cisco ASA uses??? - Cisco Community

Shared with the Failover Link

Sharing a failover link is the best way to conserve interfaces. However, you must consider a dedicated interface for the state link and failover link, if you have a large configuration and a high traffic network.

 

Dedicated Interface

You can use a dedicated data interface (physical, redundant, or EtherChannel) for the state link. For an EtherChannel used as the state link, to prevent out-of-order packets, only one interface in the EtherChannel is used. If that interface fails, then the next interface in the EtherChannel is used.

Connect a dedicated state link in one of the following two ways:

  • Using a switch, with no other device on the same network segment (broadcast domain or VLAN) as the failover interfaces of the ASA device.
  • Using an Ethernet cable to connect the appliances directly, without the need for an external switch.

If you do not use a switch between the units, if the interface fails, the link is brought down on both peers. This condition may hamper troubleshooting efforts because you cannot easily determine which unit has the failed interface and caused the link to come down.

The ASA supports Auto-MDI/MDIX on its copper Ethernet ports, so you can either use a crossover cable or a straight-through cable. If you use a straight-through cable, the interface automatically detects the cable and swaps one of the transmit/receive pairs to MDIX.

For optimum performance when using long distance failover, the latency for the state link should be less than 10 milliseconds and no more than 250 milliseconds. If latency is more than 10 milliseconds, some performance degradation occurs due to retransmission of failover messages.
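
As a rough illustration of the two latency thresholds quoted above, the following Python sketch classifies a measured state-link latency. The function name and the category labels are hypothetical, and the reading that 10 ms separates "optimal" from "degraded" while 250 ms is the upper limit is an interpretation of the quoted sentence, not something the guide spells out.

```python
def classify_state_link_latency(latency_ms: float) -> str:
    """Classify state-link latency against the 10 ms and 250 ms figures in the guide."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms < 10:
        return "optimal"          # below 10 ms
    if latency_ms <= 250:
        return "degraded"         # failover messages may be retransmitted
    return "exceeds-limit"        # above the 250 ms maximum


for sample in (2.5, 40, 300):
    print(sample, classify_state_link_latency(sample))
```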

---

Avoiding Interrupted Failover and Data Links

We recommend that failover links and data interfaces travel through different paths to decrease the chance that all interfaces fail at the same time. If the failover link is down, the ASA can use the data interfaces to determine if a failover is required. Subsequently, the failover operation is suspended until the health of the failover link is restored.

See the following connection scenarios to design a resilient failover network.

---

Scenario 1—Not Recommended

If a single switch or a set of switches are used to connect both failover and data interfaces between two ASAs, then when a switch or inter-switch-link is down, both ASAs become active. Therefore, the following two connection methods shown in the following figures are NOT recommended.

Figure 1. Connecting with a Single Switch—Not Recommended

jmaxwellUSAF_0-1677176122646.jpeg

 

Figure 2. Connecting with a Double-Switch—Not Recommended

jmaxwellUSAF_1-1677176122648.jpeg

---

Scenario 2—Recommended

We recommend that failover links NOT use the same switch as the data interfaces. Instead, use a different switch or use a direct cable to connect the failover link, as shown in the following figures.

Figure 3. Connecting with a Different Switch

jmaxwellUSAF_2-1677176122649.jpeg

 

Figure 4. Connecting with a Cable

jmaxwellUSAF_3-1677176122652.jpeg

Screenshot (318).png

Sorry for the late reply, but sometimes I need time to run a test before replying.
Anyway,
I see you mention NSK on one side of the ASA HA pair.
You can use
ethanalyzer local interface inband limit-captured-frames 30 <<- run this on the NSK when you do "failover active" on the standby ASA, to capture whether the ASA sends a G-ARP or not.
Thanks
MHM

Thank you MHM.

This is very helpful, and clearly you put much work into this response.

What is "NSK"?

@jmaxwellUSAF "Generally, when a failover occurs, the new active unit takes over the active IP addresses and MAC addresses. Because network devices see no change in the MAC to IP address pairing, no ARP entries change or time out anywhere on the network."....that is a quote from this guide - https://www.cisco.com/c/en/us/td/docs/security/asa/asa917/configuration/general/asa-917-general-config/ha-failover.html

 

You have located the essential literature for this issue. Thank you Rob!

"Generally, when a failover occurs, the new active unit takes over the active IP addresses and MAC addresses. Because network devices see no change in the MAC to IP address pairing, no ARP entries change or time out anywhere on the network."

This seems to imply that the connected devices' ARP and mac-address tables would hold identical entries for two interfaces, so all traffic destined to the HA pair would always exit 2 interfaces on any redundantly connected device. Is that correct?

@jmaxwellUSAF from the book - Cisco ASA all in one

1.png

2.png

 

Is there a link to this text?

And because I own the physical book, could you provide the page number?

Thank you.

@jmaxwellUSAF the top of page 662 - Cisco ASA All-in-One Next Generation Firewall, Third Edition.

Hi Rob. I could not find this on Google. Could you tell me, or send me a link explaining, what "Cold Standby" and "Active Drain" mean in the output below? Thank you!

FW/sec/stby# sh failo hist
==========================================================================
From State                 To State                   Reason
==========================================================================
12:48:46 EST Feb 22 2023
Not Detected               Negotiation                No Error

12:48:50 EST Feb 22 2023
Negotiation                Cold Standby               Detected an Active mate

12:48:52 EST Feb 22 2023
Cold Standby               Sync Config                Detected an Active mate

12:49:03 EST Feb 22 2023
Sync Config                Sync File System           Detected an Active mate

12:49:03 EST Feb 22 2023
Sync File System           Bulk Sync                  Detected an Active mate

12:49:16 EST Feb 22 2023
Bulk Sync                  Standby Ready              Detected an Active mate

13:41:35 EST Feb 22 2023
Standby Ready              Just Active                Other unit wants me Active

13:41:35 EST Feb 22 2023
Just Active                Active Drain               Other unit wants me Active

13:41:35 EST Feb 22 2023
Active Drain               Active Applying Config     Other unit wants me Active

13:41:35 EST Feb 22 2023
Active Applying Config     Active Config Applied      Other unit wants me Active

13:41:35 EST Feb 22 2023
Active Config Applied      Active                     Other unit wants me Active

13:42:15 EST Feb 22 2023
Active                     Standby Ready              Other unit wants me Standby

 

@jmaxwellUSAF table 3 in this guide https://www.cisco.com/c/en/us/td/docs/security/asa/asa-cli-reference/S/asa-command-ref-S/show-f-to-show-ipu-commands.html

 

Cold Standby

The unit waits for the peer to reach the Active state. When the peer unit reaches the Active state, this unit progresses to the Standby Config state. This is a transient state.

Active Drain

Queued messages from the peer are discarded. This is a transient state.
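
For anyone who wants to line up state transitions like the ones in the "show failover history" output earlier in this thread against an outage window, here is a minimal Python sketch that parses that output. It assumes each entry is a timestamp line followed by one line whose three columns are separated by two or more spaces; the names `Transition` and `parse_failover_history` are hypothetical, and the time zone token is ignored rather than interpreted.

```python
import re
from datetime import datetime
from typing import List, NamedTuple


class Transition(NamedTuple):
    when: datetime
    from_state: str
    to_state: str
    reason: str


# Matches e.g. "13:41:35 EST Feb 22 2023"; the zone token is skipped.
_TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}) \S+ (\w{3} \d{1,2} \d{4})$")


def parse_failover_history(text: str) -> List[Transition]:
    """Parse timestamp/transition line pairs into Transition records."""
    transitions = []
    pending_time = None
    for raw in text.splitlines():
        line = raw.strip()
        match = _TIMESTAMP.match(line)
        if match:
            pending_time = datetime.strptime(" ".join(match.groups()),
                                             "%H:%M:%S %b %d %Y")
            continue
        if pending_time is None:
            continue  # header, separator, or blank line
        columns = re.split(r"\s{2,}", line)
        if len(columns) == 3:
            transitions.append(Transition(pending_time, *columns))
        pending_time = None
    return transitions


SAMPLE = """
13:41:35 EST Feb 22 2023
Standby Ready              Just Active                Other unit wants me Active

13:41:35 EST Feb 22 2023
Active Config Applied      Active                     Other unit wants me Active

13:42:15 EST Feb 22 2023
Active                     Standby Ready              Other unit wants me Standby
"""

history = parse_failover_history(SAMPLE)
became_active = next(t for t in history if t.to_state == "Active")
left_active = next(t for t in history if t.from_state == "Active")
print("Seconds spent Active:", (left_active.when - became_active.when).total_seconds())
```

On the excerpt above this reports how long the secondary held the Active role before the other unit requested it to go back to standby.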

That is not correct as far as I know. In active/standby, the unit that is elected as the new active always sends a G-ARP.

Why?

Because it lets the switch know that the port toward the new active unit has changed.

That is why I mentioned NSK and how you must detect the G-ARP.

Sometimes this G-ARP is missed and the switch keeps using the previous port, which leads to the old active unit, and hence packets are dropped.
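
To make it concrete what a capture would need to show, here is a minimal Python sketch of what a gratuitous ARP looks like on the wire and how to recognize one in a captured frame. It builds the request form (opcode 1) as an assumption; some implementations send the reply form instead, and this thread does not establish which one the ASA uses. The function names and the sample MAC and IP are hypothetical, VLAN-tagged frames are not handled, and nothing is transmitted.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Accept aa:bb:cc:dd:ee:ff, aa-bb-cc-dd-ee-ff, or Cisco aabb.ccdd.eeff."""
    return bytes.fromhex(mac.replace(":", "").replace("-", "").replace(".", ""))


def build_gratuitous_arp(sender_mac: str, sender_ip: str) -> bytes:
    """Build an untagged Ethernet frame carrying a gratuitous ARP request."""
    sha = mac_to_bytes(sender_mac)
    spa = socket.inet_aton(sender_ip)
    ethernet = BROADCAST_MAC + sha + struct.pack("!H", ETHERTYPE_ARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, oper=1 (request)
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Gratuitous: the target IP equals the sender IP.
    return ethernet + arp_header + sha + spa + b"\x00" * 6 + spa


def is_gratuitous_arp(frame: bytes) -> bool:
    """True if an untagged Ethernet frame is ARP with sender IP == target IP."""
    if len(frame) < 42:
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    sender_ip = frame[28:32]
    target_ip = frame[38:42]
    return ethertype == ETHERTYPE_ARP and sender_ip == target_ip


frame = build_gratuitous_arp("00a0.c969.87c8", "192.0.2.1")
print(len(frame), is_gratuitous_arp(frame))
```

In a capture taken during "failover active", a frame passing this check with the active IP as sender is what the upstream switch would use to move the MAC address to the new port.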

I now realize I did not fundamentally understand how "protocol 105" technology works. I thought it was the same as HSRP technology; it is NOT.

"Generally, when a failover occurs, the new active unit takes over the active IP addresses and MAC addresses. Because network devices see no change in the MAC to IP address pairing, no ARP entries change or time out anywhere on the network." https://www.cisco.com/c/en/us/td/docs/security/asa/asa917/configuration/general/asa-917-general-config/ha-failover.html

 

 


 
