09-11-2024 02:36 PM - edited 09-11-2024 03:00 PM
Hi,
I have two questions about the behaviour of Secure Firewall 3100s in a cluster and interface health checks.
We have a pair of Secure Firewall 3100s (running FTD) that we have set up in a cluster, managed by FMC. All devices connected to the two FWs use an EtherChannel spanning both cluster members. Each FW has two connections to the other devices, as most of those devices (such as switches) are clustered as well.
What we have found is that if one member of any of the EtherChannel links has a fault for whatever reason (maintenance, etc.), the FW cluster member that EtherChannel member connects to goes into a disabled state, taking down all other links on that cluster member. This appears to happen because interface health monitoring detects that an interface is down on one cluster member but not on the other. It makes no difference that it is only a single EtherChannel member (1 out of 4), so the FW still has a connection to the destination device.
This behaviour can be stopped by disabling interface health monitoring globally in the health policy for the cluster.
So, question number one: is there a way to stop a single member of a four-link EtherChannel causing a whole firewall to disable itself, without losing all interface monitoring capability across the cluster? That is, can the way the cluster reacts to a link being down be changed without losing interface alerts?
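For reference, what I would ideally like is per-interface control rather than the global off switch. On ASA-style clustering the CLI has something along these lines (a sketch only: the cluster group and Port-channel names are placeholders, and on an FMC-managed FTD these settings would normally be driven from the cluster health-monitor settings in FMC rather than typed at the CLI):

```
cluster group ftd_cluster
  ! keep overall unit (keepalive) health checking
  health-check holdtime 3
  ! stop health monitoring on one specific data interface only
  no health-check monitor-interface Port-channel1
  ! or wait longer before an interface flap is treated as a failure
  health-check monitor-interface debounce-time 9000
```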
My second question: it seems quite hard to find out which interface caused the FW cluster member to disable itself. When it happens I check the cluster and health logs, but they do not tell you which interface's health issue caused the cluster member to disable itself in the first place, only that there is a mismatch and one member has disabled itself. I was not able to find much documentation on how to check this, or on this behaviour in general. The only thing I could find was:
Troubleshoot Firepower Threat Defense (FTD) Cluster - Cisco
The logs it said to check did not reveal which interface was mismatched and caused the cluster member to be disabled.
Thanks in advance for any help with this.
09-11-2024 11:14 PM
Do you have a cluster, or an HA active/standby pair of FWs?
If you have HA, then
I would start by checking whether the port-channel (PO) config is correct.
On the switch:
show port-channel summary
MHM
09-12-2024 06:17 AM
Is the firewall in an active/standby HA pair or is it a "proper" cluster?
As for finding out which interface is failing: if you go into expert mode on the FTD and look through /var/log/messages for the date the issue happened, you might see which interface is causing the issue there.
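Something like the following from expert mode (the grep pattern is only a suggestion; the exact syslog wording for link and cluster events varies by version):

```
> expert
admin@ftd:~$ sudo grep -iE 'port-channel|link (up|down)|cluster' /var/log/messages
```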
09-12-2024 07:49 AM
It is a "proper" active/active cluster, so it is a bit annoying when half of it goes down due to a single interface issue.
09-12-2024 09:54 AM
With an active/active cluster you must be careful when configuring the PO.
Can I see the topology?
MHM
09-12-2024 05:27 PM - edited 09-13-2024 01:59 AM
09-12-2024 11:43 PM
09-13-2024 02:08 AM - edited 09-13-2024 02:13 AM
No, Spanned EtherChannels are supported with the firewalls in this clustering setup, as per the document I linked:
Your layout would be the only option in an HA active/standby setup. The setup we have in place is a proper cluster with both firewalls active/active. Hence, in the switch output I attached, all four EtherChannel members show up in the EtherChannel to both firewalls at the same time.
Be aware that I have updated the original topology I posted, as I had not documented the Cluster Control Link topology properly.
09-13-2024 02:11 AM - edited 09-13-2024 02:33 AM
MHM
09-13-2024 02:22 AM - edited 09-13-2024 02:23 AM
Sure, that is an alternative supported layout, but the way we have ours laid out is also supported, as in that document.
09-13-2024 02:24 AM
I suggest opening a TAC case and mentioning my reply, to be sure.
Thanks a lot
MHM
09-12-2024 02:06 PM
Active/Active is still an active/standby HA failover pair; it is just that you have an active context on both ASAs. A "proper" cluster would have all ASAs acting as a single logical firewall, all actively forwarding traffic. I bring this up because the solution would be different in each of these setups, so it is important to define what you, the poster, mean by a cluster.
In your case you would want to configure a failover to occur only if three interfaces fail. This is possible. Unfortunately, you cannot specify which three interfaces or port-channels this should apply to. So you could end up in a situation where you have three port-channels with four interfaces each, one interface in each has failed for whatever reason, and then you would have a failover.
The configuration for this is done in FMC under Devices > Device Management > Edit the device > High Availability > Failover Trigger Criteria; edit the failover limit to be 3.
09-12-2024 04:49 PM - edited 09-12-2024 04:51 PM
Apologies. I assumed just saying "cluster" in my original post would make it clear. It is a cluster created as described in the below:
09-12-2024 07:52 AM
Hi,
Thanks for that. My response to both of your points is: how would I do that?
09-13-2024 02:20 AM
I think I have answered my second question after some experimentation. The CLI command 'show cluster history' gives detail down to the port/EtherChannel causing the issue. The catch is that it matters which FW you run the command on: if it is run on the FW that disabled itself, you do not get the detail. This may be related to which FW was the control unit at the time of the interface issue.
You then need to dig into the port/EtherChannel interface status to work out which member or members have the issue.
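For anyone finding this later, the sequence that worked for me was roughly the following, run from the FTD CLI on the control unit (Port-channel1 is a placeholder for whichever channel 'show cluster history' points at):

```
show cluster info
show cluster history
show port-channel summary
show interface Port-channel1
```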
I would still like a way to stop the firewall disabling itself without losing all interface monitoring alerts.