ASA55XX dual active problem on a active/standby configuration

wayne wan
Level 1

I have a pair of ASA55XX series firewalls configured as an active/standby pair.
I added an add-on network adapter card to both of them and configured the management interface on it. (The original management port is used for IPS.)
On one unit of the pair, the add-on network adapter suddenly could no longer be detected by the firewall.
After that, the network seemed very busy/unstable, and the servers occasionally could not ping their gateway on the firewall.
I connected to the console ports on both firewalls, and both of them showed that they were active.
In the end, I had to power off one of them to settle the problem.

It seems that I'm getting into an active/active (dual active) situation. And because a prerequisite for keeping an active/standby pair in sync is that both units have the same hardware configuration, it looks as if this problem cannot be avoided.

My question: is there really no way to fix this kind of problem other than disconnecting all the cables from the firewall with the faulty adapter or, as I did, powering off one of the firewalls? Is there no configuration that can prevent this from happening?


fk01ssc-1# show run failover
failover
failover lan unit primary
failover lan interface LAN_Link GigabitEthernet0/6
failover polltime unit msec 500 holdtime 2
failover polltime interface msec 500 holdtime 5
failover key <key>
failover link State_Link GigabitEthernet0/7
failover interface ip LAN_Link 192.168.20.1 255.255.255.252 standby 192.168.20.2
failover interface ip State_Link 192.168.20.5 255.255.255.252 standby 192.168.20.6
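
For readers following the configuration above: how a unit sees the failover link, the state link, and the monitored interfaces can be read from its console. This is a minimal sketch using standard ASA show commands, with the interface names taken from the output above. No output is shown because none was captured in this thread.

```
! Overall failover status, including the LAN_Link/State_Link and monitored interfaces
show failover
! Past state transitions on this unit, with the reason recorded for each
show failover history
! Physical status of the interfaces carrying the failover and state links
show interface GigabitEthernet0/6
show interface GigabitEthernet0/7
```

Running the same commands on both consoles and comparing the results is one way to see whether each unit has lost contact with its mate.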


Case 1) Occurred in APR-2022
fk01ssc-1 (Primary) is active and fk02ssc-1 (Secondary) has the defective add-on network adapter card.
The management port was configured on the add-on network adapter card.
Thus, syslog could still be transferred to the CSM for log consolidation.

log in CSM
==========
4/11/22 10:47:03AM Alert (Primary) Lost Failover communications with mate on interface management
4/11/22 10:47:03AM Alert (Primary) Testing Interface management
4/11/22 10:47:03AM Alert (Primary) Testing on Interface management Passed
4/11/22 10:47:03AM Alert (Primary) Testing Interface ek31ssc_vlan_34
:
4/11/22 10:47:03AM Alert (Primary) Testing Interface ek21ssc_vlan_20
4/11/22 10:47:03AM Alert (Primary) Lost Failover communications with mate on interface ek31ssc_vlan_32
:
4/11/22 10:47:03AM Alert (Primary) Lost Failover communications with mate on interface ek31ssc_vlan_34
4/11/22 10:47:03AM Alert (Primary) Testing Interface ek01ssc
4/11/22 10:47:03AM Alert (Primary) Testing on interface ek31ssc_vlan_31 Passed
:
4/11/22 10:47:03AM Alert (Primary) Testing on interface ek01ssc Passed
:
4/11/22 10:47:03AM Alert (Primary) Testing on interface ek11ssc_vlan_13 Status Undetermined
4/11/22 10:47:03AM Alert (Primary) No response from other firewall (reason code = 4) No response from failover mate
4/11/22 10:47:03AM Alert (Primary) No response from other firewall (reason code = 3) No response from failover mate


We connected the console to one of the firewalls (not sure whether it was the primary or the secondary), and a lot of these messages were displayed:


Number of interfaces on Active and Standby are not consistent.
If the problem persists, you should disable and re-enable failover on the Standby.

After the Secondary was rebooted, since its add-on network adapter card was defective, we could see that the rules could not be applied to the management interface. The message
“Number of interfaces on Active and Standby are not consistent. If the problem persists, you should disable and re-enable failover on the Standby.” was also displayed again after the Secondary was rebooted.

====================================================================================
Console on Secondary
====================================================================================
mtu management 1500
^
ERROR: % Invalid input detected at '^' marker.

http 192.168.10.0 255.255.255.0 management
^
ERROR: % Invalid input detected at '^' marker.

ssh 192.168.10.0 255.255.255.0 management
^
ERROR: % Invalid input detected at '^' marker.

Number of interfaces on Active and Standby are not consistent.
If the problem persists, you should disable and re-enable failover on the Standby.
====================================================================================
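
The console message suggests disabling and re-enabling failover on the standby unit. Below is a minimal sketch of that step, assuming it is entered on the Secondary's console. With a defective adapter card the interface mismatch would presumably come back, so this only illustrates the step the message refers to and is not a fix.

```
! On the console of the standby (Secondary) unit
configure terminal
 ! Disable failover on this unit, then enable it again
 no failover
 failover
 end
! Check which role this unit reports afterwards
show failover state
```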

We connected the console to the Primary (active) and observed that the same error message was displayed. We checked the front panels, and the active LEDs on both units were on.

====================================================================================
Console on Primary
====================================================================================
Number of interfaces on Active and Standby are not consistent.
If the problem persists, you should disable and re-enable failover on the Standby.

Switching to Failed state
====================================================================================

My question for this case is: why did the whole network seem so busy that some of the programs went down because they could not communicate with the gateway while both firewalls were powered on?
I also noticed that connecting to the gateway port with ASDM took a very long time.
The problem settled down when we powered off the defective firewall.

 

Case 2) Occurred in DEC-2022

We replaced the defective firewall (Secondary) in April. In this case, the Primary firewall had the defective add-on network adapter card.

The symptoms were nearly the same as in the first case. Some of the programs went up and down between 04:52 PM and 06:09 PM.

After the incident was resolved, we checked the CSM and found the following log messages:

12/20/22 04:52:49 PM Alert (Secondary) Switching to ACTIVE - HELLO not heard from mate.
12/20/22 06:16:16 PM Alert (Secondary) Failover interface Failed
12/20/22 06:16:16 PM Alert (Secondary) No response from other firewall (reason code = 4). No response from failover mate

The main difference is that at 06:09 PM we wanted to power off the defective firewall to settle the problem, based on our experience from last time.
However, we did it on the wrong firewall: we powered off the Secondary. Reviewing the log after the incident, we understood that the Secondary firewall had switched to active
at 04:52 PM because the Primary had the defective add-on adapter card.

Seven minutes later the Secondary was back up. During that period, we observed that the network was still "busy" and programs went up and down.
Then we reloaded the Primary (the defective one). Things became worse; many programs were totally down.
Then we realized that the Primary (the defective one) was the real active unit and needed to be up.
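
Since the wrong unit was powered off here, it may help to make each console identify its unit before any power is removed. This is a minimal sketch, assuming the `prompt` command is configured in advance on the active unit (it is part of the running configuration, so it is normally replicated to the standby). In a dual-active condition both units will report active, so the priority shown in the prompt and the history on each unit are what tell them apart.

```
! One-time setup: show hostname, failover priority and failover state in the CLI prompt
configure terminal
 prompt hostname priority state
 end
! On each console, before removing power from either unit
show failover state
show failover history
```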

At this point, we observed the following warning message on both consoles:
"WARNING: Failover message decryption failure. Please make sure both units have the same failover shared key and crypto license or system is not out of memory"

So, we had to power off the Secondary (the healthy one) to make the network settle down.

Finally, we replaced the Primary firewall with a spare one to solve the problem.


balaji.bandi
Hall of Fame

Just trying to understand: what are you looking for the community to help you with?

At the end of your message you mentioned: "Finally, we replaced the Primary firewall with a spare one to solve the problem."

BB


How to Ask The Cisco Community for Help

>Just trying to understand: what are you looking for the community to help you with?

This is my question, posted at the beginning:
"Is there really no way to fix this kind of problem other than disconnecting all the cables from the firewall with the faulty adapter or, as I did, powering off one of the firewalls? Is there no configuration that can prevent this from happening?"

Since Cisco Support could not answer my question, I posted here to see if someone has experienced a similar situation.


>Before asking the question, have you spent some time googling the problem encountered in your situation (rather than dumping the question on hundreds of people’s heads to think about your issue)?
Yes. I had been googling for nearly one month before I posted here.
>Have you posted the issue on the right Forum – if not it will take ages for the community to understand the issue and address the issue.
My question is related to a Cisco firewall, a security product.
I can see the label "Cisco ASA" under "Technology and Support" -> "Security" -> "Network Access Control".

>Describe the problem in a manner so the community can understand it easily and address the problem in the correct way.
>Device Model.

ASA55XX series

>IOS code running on the device.
>What is the issue in brief in your own way of description?
A brief description of my problem was posted at the top, with a question included.
Details of the two cases I faced were posted below it.

>Provide the configuration where possible (most of the time) by removing sensitive information.
>post some output with show commands.

"show run failover" was posted.

Today I will test this in a lab and share the results with you.
