
Firepower 2110 FTD image in HA mode fails when forcing a switch mode

Hello everyone! We have Firepower FTD images in an HA configuration, deployed on VM servers in different data centers (CPDs), with the failover VLAN extended at Layer 2. When we forced a failover on the primary FTD, flows did not migrate to the secondary firewall. We suspect this is because the primary FTD's MAC address remains registered on the virtual switch port facing the primary firewall. Does that make sense? Is something set up wrong?
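For context, our expectation is that on an ASA/FTD active/standby failover the new active unit takes over the active unit's interface MAC addresses and announces them via gratuitous ARP, so the core switch should relearn each data-interface MAC on the port toward the new active unit. A quick check on the core switch (a Catalyst-style IOS switch is assumed; the MAC below is one of our FTD interface addresses):

show mac address-table address 0050.5682.63e8

If the port in this output still points at the old active unit after the switchover, the gratuitous ARP is not reaching, or not updating, the switch.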


7 Replies

balaji.bandi
Hall of Fame

Do you have a high-level diagram of how these are connected, since you suspect the ARP table? Can you see the ARP table on the switch?

BB


Hi Balaji! Thank you for your attention, and sorry for my mistake: the image is FTD, not ASA! I have attached the HLD here, and yes, we saw the MAC address of the primary FTD still registered on the core switch after the mode switch was executed. Here is some evidence we collected after we forced the failover via the switch-mode command:

Hostname: BRVIX5SECXFW002 (Primary)


> show version
----------------[ BRVIX5SECXFW002 ]-----------------
Model : Cisco Firepower Threat Defense for VMWare (75) Version 6.6.1 (Build 91)
----------------------------------------------------
Cisco Adaptive Security Appliance Software Version 9.14(1)150
SSP Operating System Version 2.8(1.129)
Compiled on Tue 15-Sep-20 23:14 GMT by builders
System image file is "boot:/asa9141-150-smp-k8.bin"
Config file at boot was "startup-config"
BRVIX5SECXFW002 up 7 days 4 hours
failover cluster up 7 days 4 hours
Hardware: NGFWv, 8192 MB RAM, CPU Xeon 4100/6100/8100 series 2300 MHz, 1 CPU (4 cores)
Internal ATA Compact Flash, 50176MB
Slot 1: ATA Compact Flash, 50176MB
BIOS Flash Firmware Hub @ 0x0, 0KB
0: Int: Internal-Data0/0 : address is 0050.5682.fa34, irq 7
1: Ext: GigabitEthernet0/0 : address is 0050.5682.63e8, irq 9
2: Ext: GigabitEthernet0/1 : address is 0050.5682.b48d, irq 11
3: Ext: GigabitEthernet0/2 : address is 0050.5682.a8c4, irq 10
4: Ext: GigabitEthernet0/3 : address is 0050.5682.f581, irq 7
5: Ext: GigabitEthernet0/4 : address is 0050.5682.6d24, irq 9
6: Ext: GigabitEthernet0/5 : address is 0050.5682.5934, irq 11
7: Ext: GigabitEthernet0/6 : address is 0050.5682.cb47, irq 10
8: Ext: GigabitEthernet0/7 : address is 0050.5682.2fe5, irq 7
9: Int: Internal-Control0/0 : address is 0000.0001.0001, irq 0
10: Int: Internal-Data0/0 : address is 0000.0000.0000, irq 0
11: Ext: Management0/0 : address is 0050.5682.fa34, irq 0
12: Int: Internal-Data0/1 : address is 0000.0100.0001, irq 0
13: Int: Internal-Data0/2 : address is 0000.0000.0000, irq 0
14: Int: Internal-Control0/1 : address is 0000.0001.0001, irq 0


> show failover
Failover On
Failover unit Primary
Failover LAN Interface: failover-link GigabitEthernet0/3 (up)
Reconnect timeout 0:00:00
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 4 of 311 maximum
MAC Address Move Notification Interval not set
failover replication http
Version: Ours 9.14(1)150, Mate 9.14(1)150
Serial Number: Ours 9A29XJM6DWW, Mate 9A6S9UB4NXQ
Last Failover at: 14:28:32 UTC Aug 18 2021
This host: Primary - Active
Active time: 14630 (sec)
slot 0: ASAv hw/sw rev (/9.14(1)150) status (Up Sys)
Interface diagnostic (0.0.0.0): Normal (Waiting)
Interface idmz (X.X.X.X): Normal (Monitored)
Interface inside (Y.Y.Y.Y): Normal (Waiting)
Interface dmz-satelite (Z.Z.Z.Z): Normal (Waiting)
slot 1: snort rev (1.0) status (up)
slot 2: diskstatus rev (1.0) status (up)
Other host: Secondary - Standby Ready
Active time: 278 (sec)
Interface diagnostic (0.0.0.0): Normal (Waiting)
Interface idmz (X.X.X.X): Normal (Monitored)
Interface inside (Y.Y.Y.Y): Normal (Waiting)
Interface dmz-satelite (Z.Z.Z.Z): Normal (Waiting)
slot 1: snort rev (1.0) status (up)
slot 2: diskstatus rev (1.0) status (up)
Stateful Failover Logical Update Statistics
Link : failover-link GigabitEthernet0/3 (up)
Stateful Obj xmit xerr rcv rerr
General 1739355 0 83899 0
sys cmd 80321 0 80321 0
up time 0 0 0 0
RPC services 0 0 0 0
TCP conn 506321 0 0 0
UDP conn 1151447 0 3578 0
ARP tbl 1208 0 0 0
Xlate_Timeout 0 0 0 0
IPv6 ND tbl 0 0 0 0
VPN IKEv1 SA 0 0 0 0
VPN IKEv1 P2 0 0 0 0
VPN IKEv2 SA 0 0 0 0
VPN IKEv2 P2 0 0 0 0
VPN CTCP upd 0 0 0 0
VPN SDI upd 0 0 0 0
VPN DHCP upd 0 0 0 0
SIP Session 0 0 0 0
SIP Tx 0 0 0 0
SIP Pinhole 0 0 0 0
Route Session 3 0 0 0
Router ID 0 0 0 0
User-Identity 1 0 0 0
CTS SGTNAME 0 0 0 0
CTS PAC 0 0 0 0
TrustSec-SXP 0 0 0 0
IPv6 Route 0 0 0 0
STS Table 0 0 0 0
Umbrella Device-ID 0 0 0 0
Rule DB B-Sync 0 0 0 0
Rule DB P-Sync 54 0 0 0
Rule DB Delete 0 0 0 0
Logical Update Queue Information
Cur Max Total
Recv Q: 0 11 84107
Xmit Q: 0 11 2065244


VLAN Name Status Ports
---- -------------------------------- --------- -------------------------------
2000 BRVIX5SECXFW002-FAILOVER active Po102


BRVIX5TOVMBELESA-RACK36-CORE01#sh mac address-table interface po102 | in f581
2000 0050.5682.f581 DYNAMIC Po102
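As a temporary workaround, if a stale entry persists after a switchover, the dynamic entries on the core switch can be cleared (to be validated first, since it briefly forces flooding on that port-channel while the switch relearns):

clear mac address-table dynamic interface Po102

A healthy failover should update these entries via gratuitous ARP without manual intervention, so this is only a stopgap while the root cause is investigated.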

 

 

Hi @GiovanniStavale53399,

Is this a Firepower 2110 deployment or an FTDv deployment (on some virtualization platform)? You are mentioning an FPR2110, while your output shows it is FTDv for VMware. If it is a physical deployment, there is a possibility that you'll need to troubleshoot connectivity at the physical level. If it is a VM deployment, then you need to involve the VMware side as well.

Another thing is that you are running FTD 6.6.1. I would advise upgrading to either the recommended 6.6.4 or the more recent 6.6.5. There are a couple of HA-related bug fixes that might help you.

BR,

Milos

Hi Mr. Milos! Thank you very much for your attention! Below are the answers to your questions:

 

Is this a Firepower 2110 deployment or an FTDv deployment (on some virtualization platform)?

FTDv deployment - VMWare

 

If it is a physical deployment, there is a possibility that you'll need to troubleshoot connectivity at the physical level. If it is a VM deployment, then you need to involve the VMware side as well.

 

We agree with you, but our hope is to find someone here who has lived through this experience and can tell us whether any fine-tuning needs to be done in the VMware environment. In any case, we will schedule a change window to reproduce the problem and analyze it with the help of the VMware team.

 

Another thing is that you are running FTD 6.6.1. I would advise upgrading to either the recommended 6.6.4 or the more recent 6.6.5. There are a couple of HA-related bug fixes that might help you.

 

So we'll start searching for those bugs on the Cisco site.

Hi @GiovanniStavale53399,

For the physical part, I would check connectivity: if basic connectivity for the given interfaces works properly in normal mode (e.g., ping between the active and standby addresses), I would start with the assumption that everything is OK on the physical network. I have never seen a Cisco switch misbehave this badly.

Since it is a VMware deployment, I would start by checking the VMware network. Please make sure your vSwitch is configured properly; you can find the prerequisites here. I've seen HSRP/VRRP misbehave when these settings are not honored. The documentation also explicitly states that you must configure them properly, otherwise they will cause issues with HA (which is exactly the behavior you are experiencing).
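As a sketch of what to check on a standard vSwitch (vSwitch0 is just an example name; repeat this on every ESXi host that can run the FTDv units):

esxcli network vswitch standard policy security get --vswitch-name=vSwitch0
esxcli network vswitch standard policy security set --vswitch-name=vSwitch0 --allow-promiscuous=true --allow-mac-change=true --allow-forged-transmits=true

On a distributed vSwitch, the equivalent security settings are edited per port group in vCenter.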

BR,

Milos

Hi Mr. Milos! I hope all is well! Just today we had the opportunity to apply the VMware prerequisites you mentioned a long time ago (together with a Cisco TAC engineer in a maintenance window). I am coming back here to tell you that after applying the three tunings indicated in the document to the firewall interfaces (all interfaces, not only the management and failover interfaces as the aforementioned document suggests!), the problem was solved!
But unfortunately, during the maintenance window, we faced another problem: when we forced failover using Cisco FDM, the failover worked fine, but the output of the show failover command showed the Waiting status where the correct status would be Monitored for all interfaces in use. We decided to keep the TAC case open, and the Cisco engineer will continue to look into this issue.
Have you ever seen this kind of problem?

 

> show failover
Failover On
Failover unit Secondary
Failover LAN Interface: failover-link GigabitEthernet0/3 (up)
Reconnect timeout 0:00:00
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 4 of 311 maximum
MAC Address Move Notification Interval not set
failover replication http
Version: Ours 9.14(1)150, Mate 9.14(1)150
Last Failover at: 14:51:20 UTC Sep 30 2021
This host: Secondary - Active
Active time: 14336 (sec)
slot 0: ASAv hw/sw rev (/9.14(1)150) status (Up Sys)
Interface idmz (172.31.3.140): Testing (Waiting)
Interface inside (10.199.0.253): Normal (Waiting)
Interface dmz-satelite (10.12.20.2): Testing (Waiting)
Interface diagnostic (0.0.0.0): Normal (Waiting)
slot 1: snort rev (1.0) status (up)
slot 2: diskstatus rev (1.0) status (up)
Other host: Primary - Standby Ready
Active time: 1598 (sec)
Interface idmz (172.31.3.141): Normal (Waiting)
Interface inside (10.199.0.250): Normal (Waiting)
Interface dmz-satelite (10.12.20.3): Normal (Waiting)
Interface diagnostic (0.0.0.0): Normal (Waiting)
slot 1: snort rev (1.0) status (up)
slot 2: diskstatus rev (1.0) status (up)

Stateful Failover Logical Update Statistics
Link : failover-link GigabitEthernet0/3 (up)
Stateful Obj xmit xerr rcv rerr
General 2234881 0 349581397 0
sys cmd 575612 0 575606 0
up time 0 0 0 0
RPC services 0 0 0 0
TCP conn 497437 0 119291793 0
UDP conn 1151016 0 229657866 0
ARP tbl 10769 0 43784 0
Xlate_Timeout 0 0 0 0
IPv6 ND tbl 0 0 0 0
VPN IKEv1 SA 0 0 0 0
VPN IKEv1 P2 0 0 0 0
VPN IKEv2 SA 0 0 0 0
VPN IKEv2 P2 0 0 0 0
VPN CTCP upd 0 0 0 0
VPN SDI upd 0 0 0 0
VPN DHCP upd 0 0 0 0
SIP Session 0 0 0 0
SIP Tx 0 0 0 0
SIP Pinhole 0 0 0 0
Route Session 0 0 3 0
Router ID 0 0 0 0
User-Identity 0 0 2 0
CTS SGTNAME 0 0 0 0
CTS PAC 0 0 0 0
TrustSec-SXP 0 0 0 0
IPv6 Route 0 0 0 0
STS Table 0 0 0 0
Umbrella Device-ID 0 0 0 0
Rule DB B-Sync 0 0 1 0
Rule DB P-Sync 47 0 12340 0
Rule DB Delete 0 0 2 0

Logical Update Queue Information
Cur Max Total
Recv Q: 0 49 351904077
Xmit Q: 0 1 2250942
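For reference, our understanding is that in the output above, Normal (Waiting) means the interface is up but has not yet received a hello packet from the corresponding interface on the peer unit, while Testing means active interface tests are still running. One command that can help dig further from the diagnostic CLI is:

> show monitor-interface

This lists each monitored interface with both units' views of its status, which may help narrow down whether the hellos are being lost on a specific port group.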

Hi @GiovanniStavale53399,

Glad to hear that it helped.

I haven't noticed where the document states that it applies only to the management interface. From my standpoint, it is clear that it must be applied to all interfaces, as these settings allow the ASA to receive traffic that is not destined to its physical MAC address (in a simplified explanation). This is required because the active MAC address is a floating address and can be seen on both devices at some point.

Have you applied this configuration on all hosts in your infrastructure (all hosts on which the ASA can run)? Based on the outputs, it looks to me like the change may not have been made on all vSwitches.
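To audit this quickly, keep in mind that a port group can override the vSwitch-level security policy, so it is worth checking both levels on every host (the names below are just examples; adjust to your environment):

esxcli network vswitch standard policy security get --vswitch-name=vSwitch0
esxcli network vswitch standard portgroup policy security get --portgroup-name=FTD-inside

A port group with its own security policy silently overrides the vSwitch default, which is an easy way to end up with one interface still rejecting forged transmits.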

BR,

Milos
