How To: TACACS Failover with F5 BIG-IP Virtual Servers


This post is aimed at anyone who already runs TACACS behind F5 load balancers or plans to. It is not meant to be an all-encompassing guide, but rather a supplement covering one issue you need to be aware of. There is a well-known guide, jointly developed by Cisco and F5, that covers configuring BIG-IP for use with ISE. It was written before ISE supported TACACS, so it omits the requirements and planning for TACACS entirely. Luckily, most of the configuration is identical; you are just adding TCP port 49.

https://community.cisco.com/t5/security-documents/how-to-cisco-amp-f5-deployment-guide-ise-load-balancing-using/ta-p/3631159

 

I recently had the opportunity to develop and test a TACACS deployment using F5 load balancers, and I want to draw attention to an issue that has a solution, just not an obvious one. For this discussion, assume two ISE PSNs in an F5 virtual server group. The F5 runs health checks against each PSN so that the load balancer knows when a node has failed. There is nothing unusual about this; it is standard practice in load-balancing design.

One of our tests involved taking down both PSNs in one of the F5 virtual server groups. This is when the issue presented itself: the NAD, a 3850 in this case, never failed over to the other servers configured in its TACACS server group. Instead, the 3850 kept attempting to authenticate against what was effectively a dead VIP. I took a packet capture and have included the exchange below. The 3850 is 10.10.10.10 and the F5 VIP is 10.20.20.20.

10.10.10.10 -> 10.20.20.20 TCP 5157 > tacacs [SYN]
10.20.20.20 -> 10.10.10.10 TCP TACACS > 5157 [SYN, ACK]
10.10.10.10 -> 10.20.20.20 TCP 5157 > tacacs [ACK]
10.10.10.10 -> 10.20.20.20 TACACS+ Q: Authentication
10.20.20.20 -> 10.10.10.10 TCP TACACS > 5157 [SYN, ACK]
10.20.20.20 -> 10.10.10.10 TCP TACACS > 5157 [RST, ACK]

Flags: 0x14 (RST, ACK)
Reset cause: BIG-IP: [0x23d6b9c:2328] No pool member available

The F5 VIP completes the TCP handshake even though it knows that the virtual server's members (the PSNs) are down. Because the handshake succeeds, the 3850 believes the TACACS server (the VIP) is still alive. The cycle repeats indefinitely on the switch: a connection is established, then reset by the F5. As administrators, we can see that an exchange ending in a connection reset (RST) is not healthy, but the switch does not treat it as a server failure.

To make the F5 ignore the NADs while the PSNs are in a failed state, I had to add the following setting to the virtual server. Your naming will vary slightly.


list ltm virtual ise_tcp49_vs | grep -E "ltm virtual|service-down"
ltm virtual ise_tcp49_vs {
service-down-immediate-action drop
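
The listing above only displays the setting. As a rough sketch, assuming the same example virtual server name (ise_tcp49_vs) and access to the BIG-IP bash prompt, the option could be applied, confirmed, and saved with tmsh along these lines. Treat it as an illustration and check it against your own version and naming before use.

```shell
# Sketch only: ise_tcp49_vs is the example name from this post; substitute your own.
# Silently drop new connections while the virtual server has no available members,
# instead of completing the handshake and then resetting.
tmsh modify ltm virtual ise_tcp49_vs service-down-immediate-action drop

# Confirm the setting, then persist the running configuration.
tmsh list ltm virtual ise_tcp49_vs service-down-immediate-action
tmsh save sys config
```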

Once the drop action is in place, the 3850 behaves the way we want. In the following packet capture, the 3850 is 10.10.10.10, the F5 VIP that should be dead is 10.20.20.20, and the alternate F5 VIP that is active is 10.30.30.30.

10.10.10.10 -> 10.20.20.20 TCP 8317 > tacacs [SYN]
10.10.10.10 -> 10.20.20.20 TCP 8317 > tacacs [SYN]
10.10.10.10 -> 10.30.30.30 TCP 8318 > tacacs [SYN]
10.30.30.30 -> 10.10.10.10 TCP tacacs > 8318 [SYN, ACK]
10.10.10.10 -> 10.30.30.30 TCP 8318 > tacacs [ACK]
10.10.10.10 -> 10.30.30.30 TACACS+ Q: Authentication
10.30.30.30 -> 10.10.10.10 TCP tacacs > 8318 [SYN, ACK]
10.30.30.30 -> 10.10.10.10 TACACS+ R: Authentication

After failing to handshake with the first configured TACACS server (VIP), the NAD continues on to authenticate against the alternate, 10.30.30.30 in this case.


If you already have ISE load balanced on F5s or are setting up new F5s, test for this failure scenario and apply the fix above if you observe the same behavior. Your F5s may already be set up this way, but dropping connections was not the default behavior.
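
One way to run that test without waiting on a switch is to probe the VIP from any Linux host while both PSNs are deliberately down. The helper below is a minimal sketch, not part of the original deployment: the function name probe_vip and its output labels are made up for illustration, and it assumes bash and the coreutils timeout command are available. It only reports whether the TCP handshake completes, so it is meaningful for this scenario only while the pool members are known to be down.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify how a TCP endpoint answers a connection attempt.
# Usage: probe_vip HOST [PORT] [TIMEOUT_SECONDS]
probe_vip() {
    local host=$1 port=${2:-49} wait=${3:-3} rc=0
    # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null || rc=$?
    case $rc in
        0)   echo "handshake-completed" ;;     # the SYN was answered with a SYN/ACK
        124) echo "no-response" ;;             # nothing came back before the timeout
        *)   echo "refused-or-unreachable" ;;  # immediate failure, e.g. RST to the SYN
    esac
}
```

With both PSNs down, running probe_vip 10.20.20.20 49 3 and getting handshake-completed suggests the VIP is still answering as in the first capture, while no-response is consistent with the drop action being in effect.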

 

This TCP RST/ACK behavior is documented in a couple of F5 knowledge base articles, for anyone interested in reading further.
https://support.f5.com/csp/article/K9812
https://support.f5.com/csp/article/K8082

1 Comment
VIP Engager

Nice one! I will have to revisit my last F5 deployment and see whether the customer implemented this. It was handled by a third party and I never saw the details under the hood. Thanks for the details!
