Recover from "bricked" FTD firewall. Can't make further changes.


Do any of these topics sound familiar to you?

  • FTD Upgrade failed, and I can't do anything on the firewall because there's a "staged" deployment that fails but Discard All won't clear it (such as VRT or VDB updates).
  • For some reason, any attempt to Deploy Now, create a Troubleshoot file, or change other System Settings fails immediately with the error, "Failed to start deployment job".
  • Upgrade fails on an HA (High-Availability) pair from CDO (Cisco Defense Orchestrator) and you can't upgrade or deploy changes now because of different OS versions and deployment changes that can't be deployed on both code versions.


Synopsis

It's been obvious to me that the FTD upgrade process is a bit fragile and requires extreme care. Cisco's hostage state for VRT/VDB deployments is also a real show-stopper in more ways than one. The good news is that Cisco designed the FTD devices to be resilient and sort of "smart" so that you don't completely destroy them. Also, in most cases, HA works even with disparate FXOS versions. That said, if you don't perform upgrades exactly as you should, they can still self-destruct in a way that leaves TAC unable to work around some issues without recommending a rebuild ("should" is ambiguous, btw, because I still don't truly know what that entails, procedurally).

This post is to show you a method I've pieced together that worked for me in the following scenario:

My HA upgrade failed, and now I can't do anything with the Primary device.

I first attempted an upgrade for an HA pair of FTD devices using CDO. It succeeded in upgrading the Secondary device but failed to upgrade the Primary device, and rollback also failed. The OS versions were too different, requiring changes to be deployed before proceeding, but the deployment of those staged changes also failed (Catch-22). This caused issues because of the HA error that the two OS versions were different. So, I performed a manual upgrade of each device separately, while maintaining HA mode. Again, the Secondary was successful, but on the Primary an update (for example, an Intrusion Rule and Vulnerability Database update) that was downloaded and saved somehow failed to deploy. That placed the firewall into a "brick" state, where I can't perform any further upgrades, deploy, or discard staged changes.

NOTE This post assumes you use the Management interfaces on the FTD firewalls to access them, and you do NOT use the FTD devices (AnyConnect) for VPN access as a means to access the Management interfaces (why I always recommend a separate VPN solution).


Bricked FTD Firewall

Nothing is more frustrating than a firewall that is still working but that you cannot make any changes to. Most experts will suggest there's a positive way to deploy the stuck change (there must be something wrong), and for most problems like this, they are correct. I've run into many of them.

Murphree's Law (a little background)

In the 1990s I worked for Adtran. My job was to break things and recreate those issues for design engineers. It was the job made for me, and my last name was used in, "If anyone can break it, Brian Murphree can!". So, here I am, back at it, but as a long-time (22 years) customer of Cisco. And the relatively new FTD has provided me with quite the cannon fodder. I'm seemingly breaking them at every turn. Log4J was certainly a classic example of "hold my beer."

Why I'm here

It's frustrating to the nth degree when you seriously have no choice but to contact TAC to support an open-source OS that uses a database-driven configuration for firewalls. Where there was once an incredibly reliable Finesse OS that remained rock-solid for well over a decade, FTD now runs on Linux (or FXOS). It's a bit more complicated now. The configurations are deployable in an Ansible fashion, and OS updates are software upgrades. While these concepts are cool, they're a bit frail and slow.

Having used FTD FXOS for over 2 years now, I've learned to love it and, at times, hate it. But this scenario is where I drew the line, and I feel it's time to help.

Recover a bricked firewall

I want to recover a broken FTD firewall, but I don't want to screw up the Management interface config, nor default the admin password. Everything else I either have a backup of or I use the firewall in an HA pair. This procedure is written as if you have an HA pair, and the PRIMARY device is all but bricked. Alter the procedure as you see fit.

NOTE If the devices are being used in production, you should schedule a maintenance window of at least 2 hours.

Procedure:

  1. Ensure the Active firewall is the Secondary HA device.
  2. Break HA mode on the Secondary. (You may have to manually re-enable interfaces afterward, so do this in a maintenance window.)
  3. Shut down switchports to the Primary for interfaces that normally pass traffic. Leave the Management interface switchport up. (This is just a safety precaution and not required.)
  4. SSH into the CLI of the Primary.
  5. Perform the following commands in order. Accept the changes as the dialogues request. Wait for results. Each command takes several seconds to complete.
    1. > configure manager delete <enter>.
    2. > configure firewall transparent <enter>. (This will wipe out everything but the Management interface configuration and your Admin password, even though it claims it will wipe out those as well).
    3. > configure firewall routed <enter>. (This converts the firewall back to Routed (default) mode).
    4. Assuming your SSH access is still there, the FDM (Firewall Device Manager) may not be. To bring FDM back, run: > configure https-access-list 0.0.0.0/0 <enter>.
  6. Now, log in to the FDM using your previous admin credentials. If that doesn't work, Admin123 is the default password. If the FDM is still broken, try going here to get further help.
  7. Next, we need a new config on this now defaulted Primary firewall. It will not have HA configured, and in Step 2, you did Break HA, which wiped out HA on the Secondary device.
    1. Set up the firewalls' HA orientation in reverse. In other words, their roles will be reversed for HA. The current Secondary FTD firewall will now become your new Primary in HA mode, and the defaulted Primary FTD firewall will become Secondary (Otherwise, you stand to lose your good configuration for a default one). It's possible(?) you may have to manually enable your interface used for HA sync.

      NOTE For Step 7, use the Standby IPs for each interface of the new Secondary FTD device so that you don't experience an IP overlap!

  8. After HA is set up, the two should replicate configurations. Both FTD devices should soon have the full, synchronized configuration.
  9. After that completes, and you verify that the newly configured FTD firewall (currently the HA Secondary) has a fully restored configuration, you can Break HA on both of them again and recreate HA as you require. Again, beware that you may have to manually re-enable and re-IP interfaces, so do this inside a maintenance window.
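
For the Standby IP note in Step 7, here is a minimal sketch of a sanity check you could run on your addressing plan before re-creating HA. This is not a Cisco tool; the function name, the interface names, and the addresses (taken from documentation ranges) are all made up for illustration. It only checks that each Standby IP differs from the Active IP, sits in the same subnet, and isn't reused on another interface.

```python
import ipaddress


def check_standby_ips(interfaces):
    """Return a list of problems found in an HA addressing plan.

    `interfaces` maps an interface name to a tuple of
    (active_ip, standby_ip, prefix_length). All values are hypothetical
    inputs you type in yourself; nothing is read from the firewall.
    """
    problems = []
    seen = {}  # IP address -> interface name that first used it
    for name, (active, standby, prefix) in interfaces.items():
        network = ipaddress.ip_network(f"{active}/{prefix}", strict=False)
        active_ip = ipaddress.ip_address(active)
        standby_ip = ipaddress.ip_address(standby)

        if standby_ip == active_ip:
            problems.append(f"{name}: standby IP equals active IP ({active})")
        if standby_ip not in network:
            problems.append(f"{name}: standby IP {standby} is outside {network}")

        # Flag any address already assigned to a different interface.
        for ip in (active_ip, standby_ip):
            if ip in seen and seen[ip] != name:
                problems.append(f"{name}: {ip} is already used on {seen[ip]}")
            seen[ip] = name
    return problems


# Hypothetical plan with deliberate mistakes on "inside" and "dmz".
plan = {
    "outside": ("192.0.2.1", "192.0.2.2", 24),
    "inside": ("198.51.100.1", "198.51.100.1", 24),
    "dmz": ("203.0.113.1", "192.0.2.2", 24),
}

for problem in check_standby_ips(plan):
    print(problem)
```

An empty result only means the plan is internally consistent. It doesn't tell you whether those addresses are free on your network, so still verify that before you enable HA.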

Thanks to Todd Lammle for the Transparent mode tip, and to Cisco for providing some of the best documentation in the industry!

RFC 1925