09-07-2022 02:14 PM
Hello all, we have an FMCv Version 7.1.0.1 supporting 6 devices.
4 FTD 2130's version 7.1.0.1
2 FTD 2110's version 7.1.0.1
The 2130s are in HA pairs, one at my location and one at a remote location. A couple of weeks ago I was doing my morning check to see if new VDB, LSP and SRU files had downloaded and deployed when all of a sudden I got warning messages from the FMC that no heartbeat was detected on the port between the two FTDs. This resulted in a smooth failover to the secondary standby, so no connectivity was lost.
I tried to SSH into the other FTD during this alert time and got kicked out. I finally went and checked the device and it was dark: no lights except the power light, so no data flowing. After about a minute the unit came back online and all lights went green.
I went back downstairs and started collecting data via ssh commands from that unit.
I got a TAC case open; they are reviewing the data and have confirmed from it that the unit did indeed crash.
They continue to review so the case is still open.
Right now the device that is normally Primary Active is Primary Standby, and I'm not ready to try swapping it back until I find out the why behind the crash. This means no code upgrades, service updates or patching of any kind. If we were forced into one by an IAVA or something similar, that would be different.
Has anyone else experienced a random and unprovoked crash of their FTDs on this or any version?
ej
09-18-2022 11:25 PM
Hello Eric,
FTDs can crash in certain instances, for example when CPU or memory usage is high at that moment. The reason behind this could be anything: NAT issues, a large number of IPS rules, a memory leak, a software bug, etc. Usually a core file is generated and TAC should be able to check the traceback. The core file usually gives the reason why the crash happened and how to avoid it in future. If your secondary is working fine, let it work until you find the RCA.
I would also recommend doing a basic health check of your secondary sensor, just to avoid a similar issue on that one.
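If several core files turn up, it can help to narrow them down to the ones written around the time of the crash before sending anything to TAC. Below is a minimal Python sketch of that idea, assuming the files have already been copied off the device into a local directory. The function name, the "core" filename filter and the one-hour window are illustrative assumptions, not part of any Cisco tooling.

```python
from datetime import datetime, timedelta
from pathlib import Path


def core_files_near(directory, crash_time, window=timedelta(hours=1)):
    """Return files in `directory` whose name contains 'core' and whose
    modification time lies within `window` of `crash_time`, newest first.

    `directory` is a hypothetical local folder holding files already
    copied from the device; nothing here talks to the FTD itself.
    """
    matches = []
    for path in Path(directory).iterdir():
        # Skip sub-directories and anything that does not look like a core file.
        if not path.is_file() or "core" not in path.name.lower():
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if abs(modified - crash_time) <= window:
            matches.append((modified, path))
    # Sort on the timestamp only, newest first.
    matches.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in matches]
```

Call it with the local folder and the approximate local time of the failover alert; widen `window` if the device clock and the alert time are not closely aligned.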
Regards
Divya jain
09-19-2022 02:32 PM
OK, figured it out: I used the FMC Advanced Troubleshooting to get the file and uploaded it to Cisco TAC.
09-19-2022 02:23 PM
Thank you for the reply. I immediately opened a TAC case and sent all the pertinent information. At that time I didn't check whether a core file had been generated. When the TAC engineer asked about it I checked and found that one core file had been generated, but I am having an issue getting that file off the device. SecureCRT/FX doesn't recognize the filesystem, so I need to fix that.
We were having issues with high CPU usage and throwing more resources at the VM fixed that. We haven't seen any of those warnings come up since then.
We are currently running on the secondary with no recurrence of the problem and TAC is reviewing but hasn't
09-19-2022 11:03 PM
Hi Eric
High CPU / memory issues can cause the devices to crash, but the core file review will tell you more. TAC is the best way to get the core file analysed. I would say keep the secondary running until you get the RCA, or at least get a basic check done for both primary and secondary.
Regards
Divya Jain