Cisco ISE 2.7 Patch 7 - High Load Average
10-04-2022 01:31 AM
Hi All,
I am running two Cisco ISE 2.7 Patch 7 nodes on physical SNS-3515-K9 appliances. Between them they carry the expected personas: both are PSNs, and they split Primary MnT, Secondary MnT, Primary PAN, Secondary PAN and PxGrid. There is also an SDA deployment, and we have 3 DNAC nodes running 2.2.3.6.
Every morning I am greeted with alarms on DNAC saying that the RADIUS server was not reachable from the network switches. It always happens at close to the same time, around 03:40 in the morning.
Looking at the Cisco ISE alarms, two are logged in close succession: High Load Average and High Authentication Latency.
I have already looked at scheduled tasks like feed updates and endpoint purge to see if any of them coincide with the time of alert generation, but have not found anything so far.
The email alerts I get indicate that the High Load and Authentication Latency are seen on the Secondary PAN; this node is also the one reported as the active PxGrid node.
Has anyone experienced a similar issue, and if so, did you manage to locate the culprit or a fix?
Labels: AAA, Identity Services Engine (ISE)
10-04-2022 06:49 AM
I haven't encountered this exact issue, but in case you aren't aware, the 3515 is EOL (from a software perspective) and does not support ISE 3.1 or 3.2. If you haven't already, it would be a good time to start thinking about replacing this appliance.
10-04-2022 08:53 AM
I would get off the SNS-3515-K9 units for ISE v2.7 immediately. A while back we had big issues with our SNS-3515-K9 units. They were lab units we'd been running ISE v2.2 on for a long time. We tried to use them to test the upgrade process (and look for issues) before upgrading our production SNS-3595-K9 units from v2.2 to v2.7. Unfortunately, the SNS-3515-K9 lab units sometimes installed v2.7 cleanly and sometimes went into service crash / restart loops; sometimes they restored prod data fine for testing, and other times they went into crash / restart loops after "restoring". The patch level we ran didn't matter (none, 3, 6). ISE v2.7 was so unstable on the 3515 platform that we had to switch to using VMs for testing the upgrade process for each of our different ISE cubes/clusters. I'm amazed you're able to get v2.7 running stably on it. My Cisco advanced services people told me to get off the SNS-3515/3595 before we upgrade next year from v2.7 to v3.1 (or maybe v3.2), so I'd do the same if I were you.
Regards,
David
10-04-2022 11:32 AM
Thanks for your replies,
I am fully aware that this is a dead platform, but I am trying to get the most out of it, as the previous guy bundled it together with the DNAC and SDA deployment in 2019 (even though the 36x5 was already available).
It has been stable most of the time, but I have to say that we did encounter a crash a couple of months ago where a full rebuild and restore was necessary. One morning I found that the Application Server process was stuck in Initializing; a full application restart got the PSN and Guest portal working, but we had issues signing in and permissions were messed up.
However, this current problem, which existed even before that crash, appears to be caused by some triggered event. As mentioned, I have already spread all recurring tasks across the week so that none of them starts or ends at the time the alerts are logged.
I will keep trying to establish a pattern, and I will try switching the active PxGrid node to see whether the problem follows whichever node handles it. If so, that will point at DNAC being the culprit, and I will keep investigating from there.
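For the pattern part, something like the rough sketch below could tally the alarms by node and time of day, so it is easy to see whether they move when the active PxGrid node changes. It assumes the timestamps are copied by hand from the alarm emails; the node names `ise-01` and `ise-02`, the sample entries and the `bucket_alarms` helper are all made up for illustration and are not anything ISE provides.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: (timestamp, node, alarm name), typed in by hand
# from the alarm emails. Node names and times are placeholders.
ALARMS = [
    ("2022-10-01 03:41", "ise-02", "High Load Average"),
    ("2022-10-01 03:43", "ise-02", "High Authentication Latency"),
    ("2022-10-02 03:39", "ise-02", "High Load Average"),
    ("2022-10-03 03:40", "ise-01", "High Load Average"),
]


def bucket_alarms(alarms, bucket_minutes=30):
    """Count alarms per (node, time-of-day bucket), ignoring the date."""
    counts = Counter()
    for stamp, node, _name in alarms:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        minute_of_day = t.hour * 60 + t.minute
        # Round down to the start of the bucket, e.g. 03:41 -> 03:30.
        start = (minute_of_day // bucket_minutes) * bucket_minutes
        label = f"{start // 60:02d}:{start % 60:02d}"
        counts[(node, label)] += 1
    return counts


if __name__ == "__main__":
    for (node, label), n in sorted(bucket_alarms(ALARMS).items()):
        print(f"{node}  {label}  {n} alarm(s)")
```

If the counts for the 03:30 bucket shift from one node to the other after the PxGrid role is moved, that would support the idea that whatever talks to PxGrid is the trigger.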
