High Load Average and Monitoring issue on Cisco ISE

cghaderpour
Level 1

Hello

I have two different issues that might be related.

I have a small Cisco ISE deployment with two nodes: one on Azure and the second on Hyper-V. There is a low number of clients and authentication requests. The client attribute filter is enabled, and no heavy profiling is in use.

I opened a TAC case with Cisco and provided the log bundle. They recommended installing Patch 4 on ISE 3.3, as they could not find a real cause for the high load alarm.

I installed the patch last weekend and the issue was fixed for a couple of days. Last night I got two more high load alarms, and today when I go to Operation/Reports/Monitoring I see no data for system health. All graphs are empty and show "no data".

So there are now two issues:

1- No data in monitoring. I cannot pick a host because the drop-down menu shows none.

2- High load average, and I have no clue what is wrong.

6 Replies

What node sizes do you have deployed?  What is the latency between them?

Hi @cghaderpour ,

If you restart both nodes, does Operation/Reporting/Monitoring return to normal? If yes, for how long?

Note: on ISE 3.3 P2 I had an issue with MFC Profiling and AI Rules (at Administration > System > Settings > Profiling) and had to disable it (my authentications per day went from 1.2 million to 14.0 million). This issue was fixed in ISE 3.3 P4.

Regards.

cghaderpour
Level 1

Hello @Marcelo Morais @ahollifield 

I reloaded the ISE application and the monitoring issue is resolved.

I have followed the Cisco recommendation for a two-node deployment: 16 vCPUs, 32 GB RAM, 600 GB storage.

The latency seems to be fine, but there is a disk I/O issue on my second node on Hyper-V because it uses older disks.

I followed Cisco's advice to turn off Log Analytics and install Patch 4, but that did not get rid of the alarms.

The deployment is as follows:

Primary on Azure: PAN, PSN and MnT

Secondary on-prem: secondary PAN, secondary MnT and PSN

Can I turn off the secondary MnT on Hyper-V to reduce the load on this node?

This is the output of tech top when the issue occurred:
--------------------------------------
top - 18:50:14 up 26 days, 21:23, 0 users, load average: 29.75, 16.97, 12.02
Threads: 1767 total, 1 running, 1766 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6.8 us, 9.0 sy, 0.0 ni, 80.9 id, 2.2 wa, 0.5 hi, 0.5 si, 0.0 st
MiB Mem : 31874.1 total, 5964.2 free, 19370.0 used, 6540.0 buff/cache
MiB Swap: 7999.9 total, 4986.7 free, 3013.2 used. 7025.7 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2000922 iseadmi+ 20 0 18.1g 7.4g 57356 R 85.0 23.7 579:00.36 VM Thre+
2000899 iseadmi+ 20 0 18.1g 7.4g 57356 S 0.0 23.7 1:27.40 jsvc
2000907 iseadmi+ 20 0 18.1g 7.4g 57356 S 0.0 23.7 74:41.62 GC task+
2000908 iseadmi+ 20 0 18.1g 7.4g 57356 S 0.0 23.7 74:42.86 GC task+
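
To put the load average above in context, it can be divided by the number of vCPUs. The following is a minimal Python sketch, assuming the 16 vCPUs mentioned earlier in this post; the function names are hypothetical and not part of any ISE tooling. On Linux the load average counts runnable tasks plus tasks in uninterruptible sleep (typically waiting on disk), so a high value alongside a mostly idle CPU does not necessarily point to CPU saturation.

```python
import re


def parse_load_average(top_header: str) -> tuple[float, float, float]:
    """Extract the 1-, 5-, and 15-minute load averages from a top header line."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", top_header
    )
    if match is None:
        raise ValueError("no load average found in line")
    one, five, fifteen = (float(group) for group in match.groups())
    return one, five, fifteen


def load_per_cpu(load: float, vcpus: int) -> float:
    """Normalize a load average by the number of vCPUs."""
    return load / vcpus


# Header line copied from the tech top output above.
header = (
    "top - 18:50:14 up 26 days, 21:23, 0 users, "
    "load average: 29.75, 16.97, 12.02"
)
VCPUS = 16  # assumption: the node sizing stated in this post

one, five, fifteen = parse_load_average(header)
for label, value in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
    print(f"{label}: load {value:.2f}, per vCPU {load_per_cpu(value, VCPUS):.2f}")
```

For the header shown, the 1-minute value works out to roughly 1.86 per vCPU.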

Hi @cghaderpour ,

First, please take a look at ISE - Slow Replication and search for Troubleshooting - Slow Replication ("external actors").

Second, you can turn off the SMnT on your secondary on-prem node to test whether the SMnT is the issue.

Hope this helps !!!


@cghaderpour wrote:
top - 18:50:14 up 26 days, 21:23, 0 users, load average: 29.75, 16.97, 12.02
MiB Swap: 7999.9 total, 4986.7 free, 3013.2 used. 7025.7 avail Mem

26 days of uptime does not really indicate that something is wrong.

FN74271 - Cisco Identity Services Engine - Operating System Swap Memory Issue May Cause System Instability and Latency Issues
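
Since the field notice concerns swap usage, one quick check is how much swap the node was using in the quoted output. This is a minimal Python sketch with a hypothetical helper name; it does not reproduce any threshold from the field notice, it only does the arithmetic on the posted numbers.

```python
import re


def swap_used_percent(swap_line: str) -> float:
    """Return the percentage of swap in use from a top 'MiB Swap' line."""
    match = re.search(
        r"MiB Swap:\s*([\d.]+) total,\s*([\d.]+) free,\s*([\d.]+) used",
        swap_line,
    )
    if match is None:
        raise ValueError("no swap figures found in line")
    total, _free, used = (float(group) for group in match.groups())
    if total == 0:
        return 0.0  # no swap configured
    return used / total * 100


# Swap line copied from the quoted tech top output.
swap_line = "MiB Swap: 7999.9 total, 4986.7 free, 3013.2 used. 7025.7 avail Mem"
percent = swap_used_percent(swap_line)
print(f"swap in use: {percent:.1f}%")
```

For the quoted line this comes to roughly 37.7% of swap in use.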

 

imhessam
Level 1

I had the same problem on ISE 3.3 Patch 4 and tested everything. Finally, I found that it was a hardware issue. When I moved ISE to another host server with the same configuration (CPU, RAM, etc.), the problem was resolved.