Core issue
PIX Firewall failover is a high-availability mechanism in which two PIX units maintain stateful connectivity with each other. When one unit fails, the other takes over and continues forwarding packets.
Resolution
The two PIX failover units send special failover hello packets to each other every 15 seconds over the failover cable and over all interfaces except those that are administratively shut down. If hello packets are not received on an interface, or if an interface is still waiting for a hello more than 2.5 minutes after the other interface went into the normal state, the interface is placed in testing mode, provided it is not shut down and its link status is up.
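As a rough illustration of the trigger condition described above, the following Python sketch decides whether an interface would enter testing mode. It is not PIX code: the `InterfaceStatus` fields and the `should_enter_testing` function are hypothetical names chosen for this example, and only the 2.5-minute limit comes from the text.

```python
from dataclasses import dataclass

# 2.5 minutes, as stated in the text above.
HELLO_WAIT_LIMIT_S = 150.0


@dataclass
class InterfaceStatus:
    """Hypothetical snapshot of one monitored interface (illustration only)."""
    admin_shutdown: bool          # interface is administratively shut down
    link_up: bool                 # physical link status
    hello_missed: bool            # expected hello packets were not received
    wait_after_peer_normal_s: float  # seconds spent waiting for a hello since
                                     # the other interface went into normal state


def should_enter_testing(status: InterfaceStatus) -> bool:
    """Return True if the interface would be placed in testing mode."""
    # Shut-down interfaces and interfaces with no link are never tested.
    if status.admin_shutdown or not status.link_up:
        return False
    # Either trigger from the text is enough to start testing.
    return (
        status.hello_missed
        or status.wait_after_peer_normal_s > HELLO_WAIT_LIMIT_S
    )
```

For example, `should_enter_testing(InterfaceStatus(False, True, True, 0.0))` evaluates to `True`, while the same call with `admin_shutdown=True` evaluates to `False`.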
While an interface is in testing mode, normal traffic can flow, provided the interface is functioning properly. Testing is started only if an error condition has occurred and is therefore based on the idea that "if I'm okay, then you must have failed." Testing consists of the following four consecutive tests:
- NIC status test - This test is a Link Up and Down check of the Network Interface Cards (NIC). If an interface card is not plugged into an operational network, it is considered failed.
- Network activity test - This is a received network activity test. The unit counts all received packets for up to five seconds. If any packets are received at any time during this interval, the interface is considered operational, and testing stops. If no traffic is received, the unit performs an Address Resolution Protocol (ARP) test.
- ARP test - In the ARP test, the unit reads the ten most recently acquired entries from its ARP cache. Then, one at a time, the unit sends ARP requests to these machines, attempting to stimulate network traffic. After each request, the unit counts all received traffic for up to five seconds. If traffic is received, the interface is considered operational. If no traffic is received, an ARP request is sent to the next machine. If no traffic has been received by the end of the list, the unit performs the Ping test.
- Ping test - To perform the Ping test, the unit sends out a broadcast ping request. The unit then counts all received packets for up to five seconds. If any packets are received at any time during this interval, the interface is considered operational and testing stops. If no traffic is received, the testing starts over again with the ARP test.
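The order of the four tests can be sketched in Python as follows. This is a simplified single-unit model, not the PIX implementation: every function and parameter name is hypothetical, the network operations are passed in as callables so the sketch can run without hardware, and `max_rounds` is an assumption added so the ARP/Ping loop ends (the text does not state a limit). The ARP entry list is assumed to be ordered newest first.

```python
from typing import Callable, Sequence

LISTEN_WINDOW_S = 5   # each test counts received packets for up to five seconds
ARP_ENTRIES = 10      # ten most recently acquired ARP cache entries


def run_interface_tests(
    link_is_up: Callable[[], bool],
    packets_received: Callable[[int], int],
    recent_arp_entries: Sequence[str],
    send_arp_request: Callable[[str], None],
    send_broadcast_ping: Callable[[], None],
    max_rounds: int = 3,  # assumption: bound on ARP/Ping repetitions
) -> str:
    """Return 'failed', 'operational', or 'undetermined'."""
    # 1. NIC status test: no link means the interface is considered failed.
    if not link_is_up():
        return "failed"

    # 2. Network activity test: any received packet ends testing.
    if packets_received(LISTEN_WINDOW_S) > 0:
        return "operational"

    for _ in range(max_rounds):
        # 3. ARP test: probe each recent ARP entry, listening after each one.
        for ip_address in recent_arp_entries[:ARP_ENTRIES]:
            send_arp_request(ip_address)
            if packets_received(LISTEN_WINDOW_S) > 0:
                return "operational"

        # 4. Ping test: broadcast ping, then listen once more.
        send_broadcast_ping()
        if packets_received(LISTEN_WINDOW_S) > 0:
            return "operational"
        # No traffic: start over again with the ARP test.

    return "undetermined"
```

Passing the probes in as callables keeps the control flow visible and lets the sequence be exercised with stub functions, for instance a `packets_received` stub that returns zero until the broadcast ping has been sent.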
At the beginning of each test, both units clear the receive count for the interface. At the conclusion of each test, the testing unit first checks whether it has received any traffic. If so, it considers itself operational and fails the other unit. If not, it asks the other unit whether that unit received any traffic. If the other unit has, the testing unit considers itself failed. If neither unit received any traffic, testing moves to the next test. If at any time the asking unit does not hear the test results from the other unit, it considers the other unit failed, just as if it had not heard the hello message over the failover cable. If the active unit is determined to have failed, a switchover occurs. If the standby unit is determined to have failed, it is not allowed to take active control. Both the active and standby units send the results of these tests through syslog.
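The comparison made at the end of each test can be summarized in a small Python sketch. The function names and return strings are hypothetical labels for this illustration, and a peer count of `None` is an assumed way to represent "no result heard from the other unit".

```python
from typing import Optional


def evaluate_test_round(local_rx: int, peer_rx: Optional[int]) -> str:
    """Decide the outcome of one test from the testing unit's point of view.

    local_rx: packets the testing unit received during the test.
    peer_rx:  packets the other unit reports, or None if it did not answer.
    """
    # The testing unit checks its own receive count first.
    if local_rx > 0:
        return "peer_failed"
    # No answer from the other unit is treated like a missed hello.
    if peer_rx is None:
        return "peer_failed"
    # The other unit saw traffic but this unit did not.
    if peer_rx > 0:
        return "self_failed"
    # Neither unit received traffic: move on to the next test.
    return "next_test"


def action_for_failed_unit(role: str) -> str:
    """Map the role of the failed unit ('active' or 'standby') to the result."""
    if role == "active":
        return "switchover"
    if role == "standby":
        return "standby_blocked"  # standby may not take active control
    raise ValueError(f"unknown role: {role!r}")
```

Checking the local count before asking the peer mirrors the "if I'm okay, then you must have failed" reasoning described earlier.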
For more information, refer to How Failover Works on the Cisco Secure PIX Firewall.