
FTD 4145 Firewall Cluster Etherchannel

MARTIN HUERTER
Level 1

We have two Firepower 4145 FTD firewalls operating in FTD cluster mode. Below the firewall cluster is a pair of Catalyst 6807 switches running in VSS mode. Above the firewall cluster is a pair of stacked Catalyst 9500 switches. For the cluster control link, two 10GigE interfaces in one etherchannel group go from the primary 4145 chassis to each of the stacked Catalyst 9500 switches. For the second cluster control link, two 10GigE interfaces in a different etherchannel group go from the secondary 4145 chassis to each of the stacked Catalyst 9500 switches. The northbound receiving/forwarding interfaces are four 10GigE interfaces, all in one etherchannel between the firewall cluster and the Catalyst 9500 switch stack. The southbound receiving/forwarding interfaces are four 10GigE interfaces, all in one etherchannel between the firewall cluster and the Catalyst 6807 VSS pair.

This deployment has been running in our production network this way for not quite a year without any problems. However, when we were building this infrastructure, we had trouble getting all four of the 10G forwarding/receiving interfaces to come up in the etherchannels without some of them being suspended. We worked on this for a while and eventually got all four 10GigE interfaces to come up in the north and south etherchannel groups. I do not recall what specifically was done to get them all up, but they came up and were stable for several months.

A couple of days ago we detected that two of the 10GigE interfaces in the north and south etherchannel groups were suspended on the secondary firewall cluster chassis (see diagram below, secondary 4145 Te1/1, 1/2, 1/3, and 1/4). The primary chassis continued to pass traffic without an outage, but the secondary could not. The cluster control links on the secondary cluster chassis remained up; only the two 10G interfaces in the north and south etherchannels were suspended.

We looked at the firewall chassis with the suspended interfaces and could not see a reason why the interfaces were suspended. We looked at the logs on the Catalyst 9500 stack and the Catalyst 6807 VSS and saw that these interfaces were suspended because the firewall interfaces were not sending LACP PDUs to the switches. There was no spanning-tree port blocking on the Catalyst 9500 or Catalyst 6807 switches. Both the Catalyst 9500 and the 6807 switches had the same LACP error in their logs. We disabled and re-enabled the interfaces on the firewalls and the switches, but this did not un-suspend them. We rebooted the secondary firewall chassis, and when it came back up the interfaces stayed suspended. The frightening thing is that while we were collecting log information and creating troubleshooting files on the firewall, the interfaces came back up on their own.
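
For anyone checking the switch side of a similar problem, here is a minimal Python sketch that picks suspended members out of `show etherchannel summary` text from a Catalyst switch. It assumes the usual legend printed by that command, where `(P)` marks a bundled port and lowercase `(s)` marks a suspended one. The sample text is illustrative only and was not captured from the devices in this thread, and the function name `suspended_members` is hypothetical.

```python
import re

# Illustrative text shaped like `show etherchannel summary` output.
# NOT captured from the switches in this thread; interface names are made up.
SAMPLE = """\
Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
10     Po10(SU)      LACP        Te1/0/1(P)  Te1/0/2(P)  Te2/0/1(s)  Te2/0/2(s)
"""

# Member ports look like Te1/0/1(P): a name containing at least one slash,
# followed by flag letters in parentheses. Po10(SU) has no slash, so it is skipped.
MEMBER = re.compile(r"\b([A-Za-z]+\d+(?:/\d+)+)\((\w+)\)")


def suspended_members(summary_text):
    """Return {port_channel: [suspended member ports]} from summary text."""
    result = {}
    for line in summary_text.splitlines():
        fields = line.split()
        # Data rows start with the numeric channel-group number.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        port_channel = fields[1].split("(")[0]
        # Lowercase "s" is the suspended flag; uppercase "S" means Layer 2.
        suspended = [name for name, flags in MEMBER.findall(line) if "s" in flags]
        if suspended:
            result[port_channel] = suspended
    return result


if __name__ == "__main__":
    print(suspended_members(SAMPLE))
```

Run against saved output from both the 9500 stack and the 6807 VSS, a sketch like this would show at a glance whether the same firewall chassis is behind every suspended member.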

We have opened a TAC case on this, but have not received any feedback yet. This is the second time this has happened on these platforms, and because they are now in production it is a bigger concern. I am 99.99% certain this is a bug in the FTD software, but the same scenario happened on two different software revisions. So I thought I would post to the Cisco community to see if anyone else has encountered this type of problem in an FTD firewall cluster.

Thanks!!

[Diagram: MARTINHUERTER_0-1663772738370.png]

4 Replies

Marvin Rhoads
Hall of Fame

I did encounter similar weirdness with a 9300 cluster into a VSS core switch. Configs had been working fine. Stopped working after a reload for hardware replacement. TAC was engaged (firepower and switching engineers both), debugs taken, heads collectively scratched. At the end of the day only a power down (cold power down - not reload) and reload after physically reseating the netmods fixed the issue. Never did get a bug assigned but we were just happy to get it working again.

Marvin,

Thanks for your reply. How often did you see this event on the 9300s? Ours has only happened twice, with almost a year between events. Hopefully it will not become more frequent.

Thanks again,
Martin

You're welcome @MARTIN HUERTER 

It only happened that once and it has been about 10 months so far without any more such occurrences.

Friend, a port is suspended when it does not receive LACP PDUs.
So I think the issue is that one side is running LACP while the other is in ON mode, which puts the link into suspend.
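
The mode mismatch described above can be sketched as a small lookup. This is a minimal illustration of the general LACP rule (at least one side must be active, and static "on" only pairs with "on"), not a confirmed diagnosis for this cluster. The function name `bundle_outcome` and the result strings are hypothetical.

```python
def bundle_outcome(local_mode, peer_mode):
    """Describe what two channel-group modes are expected to negotiate.

    Modes: "active" (sends LACP PDUs), "passive" (answers LACP PDUs only),
    "on" (static bundle, no LACP at all).
    """
    valid = {"active", "passive", "on"}
    local, peer = local_mode.lower(), peer_mode.lower()
    if local not in valid or peer not in valid:
        raise ValueError("mode must be one of: active, passive, on")

    pair = {local, peer}
    if pair == {"on"}:
        # Both sides static: the bundle forms without any LACP exchange.
        return "static bundle"
    if "on" in pair:
        # One side expects LACP PDUs that the static side never sends.
        return "mismatch: LACP side receives no PDUs"
    if "active" in pair:
        # At least one side initiates negotiation.
        return "LACP bundle"
    # Both passive: neither side initiates, so no PDUs are exchanged.
    return "no bundle: both sides passive"


if __name__ == "__main__":
    for modes in [("active", "active"), ("active", "on"), ("passive", "passive")]:
        print(modes, "->", bundle_outcome(*modes))
```

Comparing the configured mode on the firewall port-channels with the `channel-group` mode on the 9500 and 6807 member ports would confirm or rule out this explanation.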
