
Load Balancing and ISE

craiglebutt
Level 4

Hi,

Just a query. We currently use 6 PSNs for wireless only, across 3 sites, with 2 ISE nodes and WLCs in HA at each site. We have mobility anchors with several organisations; the issue is that they are limited in how many of our ISE nodes they can connect to, as a WLC can only support up to 17 AAA servers.

With this in mind, we are looking at using F5 load balancers. I have been watching the Cisco Live session from Barcelona and reading the notes for the F5. My question is: are there any gotchas I should know about first?

The reason I ask is that I currently have a call open with TAC, and one of the first things they asked was whether these are load balanced. That set alarm bells ringing in my head.

Just want to see if anyone has any advice.


Cheers

3 Replies

Jason Kunst
Cisco Employee

Not sure I would recommend a load balancer for such a small deployment. You would be introducing more complexity where it is likely not needed.

Having each site point to a local PSN, with a PSN at another site as a backup, would likely be sufficient, but this all depends on how you have your data centers implemented and where the PSNs are.

Poor man's load balancing would likely be fine.

There are other ways to skin that cat as well. You could have a main site with 2 load balancers and 3 PSNs behind each; if you have lots of load to distribute, you can point half of the site NADs at one VIP as primary and the other VIP as secondary, and the other half of the NADs the other way around.
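
To make the half-and-half split concrete, here is a minimal Python sketch of how such an assignment could be planned. The NAD names and VIP addresses are hypothetical placeholders, and alternating over a sorted list is just one simple way to split the NADs evenly; it is not taken from any Cisco or F5 tooling.

```python
from typing import Dict, List, Tuple


def assign_vips(nads: List[str], vip_a: str, vip_b: str) -> Dict[str, Tuple[str, str]]:
    """Return {nad: (primary_vip, secondary_vip)}, alternating the order.

    Sorting first keeps the plan stable from run to run, so a NAD does not
    flip its primary VIP just because the input order changed.
    """
    plan: Dict[str, Tuple[str, str]] = {}
    for index, nad in enumerate(sorted(nads)):
        if index % 2 == 0:
            plan[nad] = (vip_a, vip_b)  # even positions: VIP A primary
        else:
            plan[nad] = (vip_b, vip_a)  # odd positions: VIP B primary
    return plan


if __name__ == "__main__":
    # Hypothetical NAD names and VIP addresses, for illustration only.
    nads = ["wlc-site1", "wlc-site2", "wlc-site3", "wlc-site4"]
    for nad, (primary, secondary) in assign_vips(nads, "10.0.0.10", "10.0.1.10").items():
        print(f"{nad}: primary={primary} secondary={secondary}")
```

If either VIP fails, every NAD still has the other one configured as its secondary, so the split only affects where load lands in normal operation.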

Damien Miller
VIP Alumni

Load balancers do add some complexity and, as you have noticed, TAC AAA engineers will often ask about them as a first discovery question. I have had cases where I had to prove that the issue was not a load balancer problem as one of the troubleshooting steps. Troubleshooting can be more complicated with load balancers, depending on how you are load balancing. If you are hashing on two attributes, it becomes harder to say deterministically that endpoint X's session will hit PSN X. It only really bothers me if I'm trying to run a tcpdump from the GUI or watch a debug log on a PSN.
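
As a rough illustration of the determinism point, the Python sketch below picks a PSN by hashing a persistence key. It is a simplified model, not how an F5 implements persistence: the hash function, the PSN names, and the choice of Calling-Station-ID and NAS-IP-Address as the two attributes are all assumptions made for the example.

```python
import hashlib
from typing import Sequence


def pick_psn(key_parts: Sequence[str], psns: Sequence[str]) -> str:
    """Map a persistence key (one or more attribute values) to a PSN."""
    key = "|".join(key_parts).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    return psns[int.from_bytes(digest[:8], "big") % len(psns)]


if __name__ == "__main__":
    psns = ["psn-1", "psn-2", "psn-3"]        # hypothetical PSN names
    mac = "AA-BB-CC-00-11-22"                 # Calling-Station-ID
    nas_ips = ["192.0.2.10", "192.0.2.20"]    # NAS-IP-Address before and after a roam

    # Key on the endpoint only: the PSN depends on the MAC alone.
    print("MAC only:", pick_psn([mac], psns))

    # Key on two attributes: the PSN can change when the NAD changes.
    for nas_ip in nas_ips:
        print(f"MAC + NAS {nas_ip}:", pick_psn([mac, nas_ip], psns))
```

With a single-attribute key you only need the endpoint's MAC to predict which PSN to watch; with a two-attribute key you also need to know which NAD the request came through.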

I find that once a load balancer config has been ironed out, it's usually rock solid. So long as no one is in there changing anything, it has not been the source of my concerns when troubleshooting. The exception to this was when an engineer silently modified the global MTU setting to 9000. The next morning was a little rough, as client authentications were being reassembled and sent on to ISE as jumbo frames. It looked as if clients were just misconfigured and failing to respond, until you dug into the debugs.

There is a very relevant Cisco Live session, BRKSEC-3699 "Designing ISE for Scale & High Availability". It has a whole section on load balancing ISE; hopefully BRKSEC-3699 is what you were referring to. Craig Hyps presented this yesterday and made a great argument for load balancing. It requires a significant amount of planning and forethought to manually distribute load across your PSNs. If you can place 3 PSNs behind each of two different VIPs, it simplifies config and reduces this manual effort.

One extra thing to keep in mind, which has helped us out in an odd but impactful failure scenario:

Configure RADIUS health checks from the load balancer to use an Active Directory/LDAP account and not a local ISE account. If for some reason (an AD engineer deletes your ISE computer account) ISE loses its ability to authenticate against the directory, the load balancer will be able to mark this PSN down and pull it out of the loop.
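
For anyone curious what such a probe amounts to on the wire, here is a minimal Python sketch that builds a PAP Access-Request following RFC 2865. A real load balancer monitor does this natively; the account name, password, and shared secret below are hypothetical, and the health test is deliberately simplified (it only checks the response code and does not verify the Response Authenticator).

```python
import hashlib
import os
import struct
from typing import Optional

ACCESS_REQUEST = 1
ACCESS_ACCEPT = 2
ATTR_USER_NAME = 1
ATTR_USER_PASSWORD = 2


def encrypt_pap_password(password: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Hide the User-Password as described in RFC 2865 section 5.2."""
    padded = password + b"\x00" * (-len(password) % 16)
    if not padded:
        padded = b"\x00" * 16  # always send at least one 16-byte block
    result = b""
    previous = authenticator
    for offset in range(0, len(padded), 16):
        mask = hashlib.md5(secret + previous).digest()
        block = bytes(p ^ m for p, m in zip(padded[offset:offset + 16], mask))
        result += block
        previous = block  # each block is chained to the previous ciphertext
    return result


def build_access_request(username: str, password: str, secret: bytes,
                         identifier: int = 1,
                         authenticator: Optional[bytes] = None) -> bytes:
    """Build a minimal Access-Request with User-Name and User-Password only."""
    authenticator = authenticator or os.urandom(16)
    user = username.encode("utf-8")
    hidden = encrypt_pap_password(password.encode("utf-8"), secret, authenticator)
    attributes = (struct.pack("!BB", ATTR_USER_NAME, 2 + len(user)) + user +
                  struct.pack("!BB", ATTR_USER_PASSWORD, 2 + len(hidden)) + hidden)
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, 20 + len(attributes))
    return header + authenticator + attributes


def psn_is_healthy(response: bytes) -> bool:
    """Simplified check: treat only an Access-Accept as healthy."""
    return len(response) >= 20 and response[0] == ACCESS_ACCEPT


if __name__ == "__main__":
    # Hypothetical directory probe account and shared secret.
    packet = build_access_request("lb-probe@example.com", "probe-password", b"shared-secret")
    print(len(packet), "byte Access-Request ready to send to UDP/1812")
```

Because the probe account lives in the directory, an Access-Accept shows that the PSN can still reach and authenticate against AD, which a local ISE account would not tell you.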

Thanks Damien for the plug!

There is a lot of feedback listed at the bottom of the guide post from different teams, where certain tweaks may be required for specific environments or software versions. In addition to fragmentation and reassembly caveats, the two key call-outs in the customer notes include:

  • May need to set CoA SNAT to "IP Forwarding" versus "Standard".
  • Variances in use of CLIENT_DATA versus CLIENT_ACCEPTED in iRules.  I met with F5 at Cisco Live and we will be working together to hash this topic out.  F5 is reportedly deprecating CLIENT_DATA, but issues have been seen when switching to CLIENT_ACCEPTED.

As Damien mentioned, once you vet the nuances of your specific config/environment, the deployment tends to be solid. It is in the ongoing operational cost of the network, and in the ability to facilitate upgrades and migrations, that load balancers pay dividends beyond basic load distribution. F5s can also handle more sophisticated stitching of flows, to help ensure that DHCP or even HTTP hits the same PSN that is handling RADIUS. This increases ISE stability and scale, and you avoid the need to constantly manage distribution manually at the NAD.
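
The flow stitching idea can be pictured as a shared persistence table, sketched below in Python. This is a conceptual model only, not F5 configuration: the class, its method names, and the use of Calling-Station-ID and Framed-IP-Address as the lookup keys are assumptions made for illustration.

```python
from typing import Dict, Optional


def normalize_mac(mac: str) -> str:
    """Strip separators so AA-BB-.. and aa:bb:.. compare as equal."""
    return "".join(ch for ch in mac if ch.isalnum()).lower()


class PersistenceTable:
    """Remember which PSN handled RADIUS for an endpoint, by MAC and by IP."""

    def __init__(self) -> None:
        self._by_mac: Dict[str, str] = {}
        self._by_ip: Dict[str, str] = {}

    def record_radius(self, calling_station_id: str,
                      framed_ip: Optional[str], psn: str) -> None:
        # Learned from RADIUS traffic passing through the load balancer.
        self._by_mac[normalize_mac(calling_station_id)] = psn
        if framed_ip:
            self._by_ip[framed_ip] = psn

    def psn_for_dhcp(self, client_mac: str) -> Optional[str]:
        # DHCP carries the client MAC, so match on that.
        return self._by_mac.get(normalize_mac(client_mac))

    def psn_for_http(self, client_ip: str) -> Optional[str]:
        # HTTP only exposes the client IP, so match on that.
        return self._by_ip.get(client_ip)


if __name__ == "__main__":
    table = PersistenceTable()
    table.record_radius("AA-BB-CC-00-11-22", "10.1.1.50", "psn-2")  # hypothetical values
    print(table.psn_for_dhcp("aa:bb:cc:00:11:22"))
    print(table.psn_for_http("10.1.1.50"))
```

A lookup that returns nothing would fall back to the normal load balancing decision; how that fallback behaves is left out of the sketch.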

Craig