10-13-2024 08:10 AM
Running Cisco ISE v3 with 6 PSNs and no load balancer across a campus network.
On 3 separate occasions over the last few weeks clients have stopped authenticating and we have seen the health score drop. The workaround has been to restart the ISE services on 2 of the PSNs.
I'm not sure if this is a problem with an individual PSN or if it's due to the lack of load balancing, with certain PSNs being overloaded as large numbers of students authenticate when they move between buildings.
Any suggestions as to likely causes or troubleshooting steps? Is there a way of seeing how many clients are authenticated to each PSN at any one time?
Thanks
10-13-2024 08:34 AM
If you have a large user base, make sure the load is split so that all the PSNs can handle it.
If you do not have a load balancer, configure a different PSN order per building and across the campus, and see whether that gives you stable connections.
What model of ISE appliance is it? Do you think that model is being overloaded?
Some troubleshooting tips:
You can run the reports:
10-13-2024 02:06 PM
If your NAD devices are IOS-XE based, then I highly recommend using the built-in RADIUS load balancer in IOS-XE. I was sceptical at first, but it works astoundingly well. After configuring all my IOS devices with this, I eventually removed the F5 VIP. It's literally one command under the aaa group server radius configuration:
load-balance method least-outstanding
You can load balance across 2 or more PSNs - and if a PSN is not in service (reboot/patch etc) then the load balancer skips it until it's back again. I think this is Cisco's best kept secret.
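To show where that one line sits, here is a minimal sketch of a switch-side RADIUS group. The server names, group name, IP addresses and shared secret are placeholders made up for illustration, not values from this thread, so adjust them (and the ports) to your own deployment.

```
! Placeholder PSN definitions (names, addresses and key are hypothetical)
radius server ISE-PSN1
 address ipv4 192.0.2.11 auth-port 1812 acct-port 1813
 key ExampleSharedSecret
radius server ISE-PSN2
 address ipv4 192.0.2.12 auth-port 1812 acct-port 1813
 key ExampleSharedSecret
!
! Group the PSNs and enable the built-in load balancer
aaa group server radius ISE-PSNS
 server name ISE-PSN1
 server name ISE-PSN2
 load-balance method least-outstanding
```

The group still has to be referenced by your existing AAA method lists. On the device, show aaa servers lists per-server state and request counters, which gives a NAD-side view of how requests are being spread.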
In ISE 3.3 you can also see how many sessions have been authenticated in Log Analytics. Go to Authentication Summary and scroll down to a table of PSNs and their counts. At the top of the page the range defaults to 15 minutes; you can increase it up to 7 days to see the values over a longer period. It's a nice way to validate that the IOS load balancer is working.
10-13-2024 11:40 PM
Hi @balaji.bandi the model is 3695 so in theory should be able to handle the load.
Hi @Arne Bier will this command load-balance based on the number of connections to avoid overloading? Can we add all 6 PSNs to the group? I believe the command originally restricted the number of PSNs in a group, which would not allow us to spread the load over all 6. Maybe that is no longer the case.
10-13-2024 11:48 PM
Hi @Arne Bier sorry, meant to add: we have the load-balancing command on the Wireless LAN controllers but not on the wired switches. Given the majority of our auth load is on wireless, do you feel this will make much difference?
10-14-2024 12:54 AM
I have done this on C9800 Cisco Wireless controllers, as well as on countless Catalyst LAN switches. The concept is the same and so is the result. It works like a charm. Cisco doesn't tell you too much about how the algorithm works, and there are a few tweaks you can play with - but I would use the defaults. The "round robin" algorithm (or whatever they are using) works in batches of 25 requests - and I think that in an EAP request where there are multiple Access-Requests and Challenges occurring, you will still get persistence to the same PSN - that means the IOS is smart enough to NOT split an EAP conversation across 6 PSNs - it remains sticky to one PSN. I wish Cisco would document exactly how this works. From my observations, the EAP clients are very happy, and the ISE Log Analytics shows an almost perfect balance across all PSNs. I only put it across 2 PSNs, but other customers have done it over 6. If you are interested in this detail, there was a very good podcast on Packet Pushers about Yale University's experience. That podcast gave me the idea to try it myself.
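For reference, the tweaks are optional keywords on the same command. A rough sketch, assuming a hypothetical group name ISE-PSNS; as far as I know 25 is the documented default batch size, so the explicit form should behave the same as leaving the keyword off, but check the command reference for your release.

```
aaa group server radius ISE-PSNS
 ! Explicit batch size (25 is believed to be the default)
 load-balance method least-outstanding batch-size 25
 ! Alternative: do not try to reuse the server a session used earlier
 ! load-balance method least-outstanding ignore-preferred-server
```

I would leave both at their defaults unless you have a specific reason to change them.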
10-14-2024 04:00 AM
Thank you @Arne Bier. I had no idea there was a load balancing method in IOS-XE, and I will be looking into it with great interest. I recently had to fix the F5 persistence profiles that our supplier originally created for us, but being able to do without it altogether would be a bonus.
10-14-2024 01:15 PM
Yes - give it a try. I enabled this on a few devices and observed for a while. Then gradually converted all the RADIUS services away from using the VIP, to native IOS-XE load balancing. The F5 LTM is an excellent product, and it has its place - but there is no technical reason to implement RADIUS load balancing using any external vendor product when IOS-XE does such an admirable job.