FTD failover issues - CD App Sync error is Failure in Standby/Slave

Chess Norris
Participant

Hi, 

After a power outage and issues with the UPS, we see an issue with failover. 

The first thing I noticed when logging in to the CLI of the secondary firewall was this message: "You have logged in while system startup is in progress. Please wait, some feature may be unavailable until startup is complete". I still see this message, even after 3 days. A reboot of the firewall didn't help either. I am not sure whether this is related to the failover issue or is a separate issue. Either way, failover doesn't work on the secondary firewall and the state is "Failover Off (pseudo-Standby)". A "show failover history details" gives me the following output:

==========================================================================
From State                 To State                   Reason
==========================================================================
12:57:25 UTC Aug 19 2022
Not Detected               Negotiation                No Error

12:57:32 UTC Aug 19 2022
Negotiation                Cold Standby               Detected an Active mate

12:57:33 UTC Aug 19 2022
Cold Standby               App Sync                   Detected an Active mate

12:57:59 UTC Aug 19 2022
App Sync                   Disabled                   CD App Sync error is Failure in Standby/Slave
==========================================================================
PEER History Collected at 00:00:00 UTC Jan 1 1970 (Current Status Failed)
===========================PEER-HISTORY===================================
From State                 To State                   Reason
===========================PEER-HISTORY===================================
===========================PEER-HISTORY===================================
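
For anyone who wants to pull the last state transition out of this kind of output automatically (for example, to alert when a unit ends up in "Disabled"), here is a minimal Python sketch. It is not a Cisco tool: the names `Transition` and `parse_failover_history` are made up for illustration, and it assumes the raw CLI text still has its columns separated by two or more spaces, which may not hold for output pasted through a web form.

```python
import re
from typing import NamedTuple


class Transition(NamedTuple):
    """One row of the local failover history (hypothetical helper type)."""
    timestamp: str
    from_state: str
    to_state: str
    reason: str


# Matches timestamps such as "12:57:25 UTC Aug 19 2022".
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2} \S+ \w{3} +\d{1,2} \d{4}$")


def parse_failover_history(text: str) -> list[Transition]:
    """Parse the local (non-peer) part of 'show failover history' output.

    Assumption: each timestamp line is followed by one line holding
    From State / To State / Reason, separated by runs of 2+ spaces.
    """
    transitions = []
    pending_ts = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "PEER" in line:
            break  # the peer history section starts here; stop parsing
        if TIMESTAMP.match(line):
            pending_ts = line
            continue
        if pending_ts is None:
            continue  # header or separator line before the first timestamp
        cols = re.split(r"\s{2,}", line, maxsplit=2)
        if len(cols) == 3:
            transitions.append(Transition(pending_ts, *cols))
        pending_ts = None
    return transitions


# Sample text: the local history from the post above, with column spacing restored.
SAMPLE = """\
From State                 To State                   Reason
12:57:25 UTC Aug 19 2022
Not Detected               Negotiation                No Error

12:57:32 UTC Aug 19 2022
Negotiation                Cold Standby               Detected an Active mate

12:57:33 UTC Aug 19 2022
Cold Standby               App Sync                   Detected an Active mate

12:57:59 UTC Aug 19 2022
App Sync                   Disabled                   CD App Sync error is Failure in Standby/Slave
PEER History Collected at 00:00:00 UTC Jan 1 1970 (Current Status Failed)
"""

if __name__ == "__main__":
    last = parse_failover_history(SAMPLE)[-1]
    print(f"{last.timestamp}: {last.from_state} -> {last.to_state} ({last.reason})")
```

Splitting on runs of two or more spaces is what keeps multi-word states such as "Cold Standby" and "App Sync" intact; a plain whitespace split would break them apart.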

At first I thought I should break the failover from FMC and re-create it, but when trying to break it, I received the following message:

[Attached screenshot: Capture.JPG]

Is it safe to go ahead with breaking the failover, or should I contact TAC? I'm also curious about the message: "You have logged in while system startup is in progress. Please wait, some feature may be unavailable until startup is complete". Does anyone know how to fix this?

Thanks

/Chess

4 Replies

ianwatts
Beginner

Similar issue here. My task was to move the power connections into a redundant configuration, at least between my HA pair of ASA 5525s (I kind of wish they had dual power supplies).

I started by pulling the plug on my secondary and moving its power cable over to the new location. Half an hour after plugging it back in, the system startup is still in progress.

This doesn't seem like a good failover option to me, and I'm a bit troubled that your post received no traction. I'll be contacting TAC about my situation.

My plan was to cut power over on my secondary, bring it back and happy, fail over to the second firewall, cut power over on the first firewall, make things happy again, fail back to my primary.  As it stands, I am now stuck in my maintenance and will have to leave the first appliance plugged into a wonky power situation.  Lovely.

In our case, TAC was able to resolve this issue without the need to re-image. After an investigation, they saw we were hitting the following bug https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvt09318

To fix this, TAC had to replace some configuration files and run scripts to repair the database, so this is not something we could have fixed on our own.

/Chess

Marvin Rhoads
VIP Community Legend

Devices running FTD (or ASA Firepower service modules) can be corrupted as a result of non-graceful power loss. When that happens, TAC can sometimes help with low-level database recovery scripts, but sometimes it ends up requiring a reimage and a restore from backup.

From what I can tell, the root cause is FTD's heavy reliance on databases under the covers to store configuration state, rather than a flat text file like the one ASA code used.

So they had me run the restore, which looked like it threw a stack trace, but more importantly my primary firewall stopped passing traffic at all. We had an outage and were impacted, especially with our hybrid/mostly remote work model.

Indeed, TAC is suggesting a reimage. I need them to address that other issue before I proceed with them further.
