10-02-2023 01:21 PM
I have a cluster of ISE 3.2 patch-3:
node1: PAN/MNT
node2: SAN/SMNT
node3: PSN
node4: PSN
There is no activity on this cluster, and yet I received this message on the secondary Admin/secondary MNT node:
ISE Alarm : Critical : High Load Average: Server=ISE32P3SANMNT
Any ideas?
10-02-2023 01:52 PM
Furthermore, I am also getting this message: ISE Alarm : Critical : NTP Sync Failure : Server=ISE32P3SANMNT. However, when I do a "show ntp", everything looks legit:
ISE32P3SANMNT/admin#show ntp
Configured NTP Servers:
ntp1.cisco.com
ntp2.cisco.com
ntp3.cisco.com
Reference ID : 0A072896 (ntp1.cisco.com)
Stratum : 2
Ref time (UTC) : Mon Oct 02 20:48:01 2023
System time : 0.010108152 seconds fast of NTP time
Last offset : -0.034693789 seconds
RMS offset : 0.129994914 seconds
Frequency : 38.751 ppm slow
Residual freq : -1538.192 ppm
Skew : 0.067 ppm
Root delay : 0.000583669 seconds
Root dispersion : 0.046592101 seconds
Update interval : 25.0 seconds
Leap status : Normal
210 Number of sources = 2
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* ntp1.cisco.com 1 6 377 9 +5651us[ -29ms] +/- 311us
^- ntp2.cisco.com 1 6 377 34 +5263us[ +48ms] +/- 3277us
M indicates the mode of the source.
^ server, = peer, # local reference clock.
S indicates the state of the sources.
* Current time source, + Candidate, x False ticker, ? Connectivity lost, ~ Too much variability
Warning: Output results may conflict during periods of changing synchronization.
ISE32P3SANMNT/admin#
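As an aside on reading the output above: the Reach column is an octal shift register of the last eight poll attempts (377 means all eight were answered). A value below 377 means recent polls went unanswered, which is exactly the kind of intermittent loss that can raise an NTP alarm even while the node is currently synced. A small sketch to decode it (the helper name is mine, not from any Cisco tool):

```python
def decode_reach(reach_octal):
    """Decode chrony's octal Reach register into per-poll success flags.

    Returns a list of eight booleans: oldest poll first, most recent last.
    """
    bits = int(str(reach_octal), 8)  # the CLI prints the register in octal
    return [(bits >> i) & 1 == 1 for i in range(7, -1, -1)]

print(decode_reach(377))  # all eight recent polls answered
print(decode_reach(376))  # the most recent poll went unanswered
```

Anything other than a solid run of 377 over time would support the lost-UDP theory below.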
10-02-2023 11:31 PM
Regarding topic #1 (High Load Average), have you checked Reports / Diagnostics / Health Summary? What values do you see there? I had a similar issue where one node constantly reported warnings, and it turned out a mistake had been made with the VM resource reservation (it was configured as a limit instead of a reservation).
For topic #2 (NTP), I can see that you have 3 NTP servers configured, but only 2 are synced. I would assume the alarm is for the 3rd one.
Kind regards,
Milos
10-03-2023 03:51 AM
#1: I checked and it looks normal. I haven't had time to open a TAC case yet because Cisco TAC is not very helpful.
#2: Yes, I have three NTP servers configured, but only two ever show up. I see that with versions 3.0, 3.1, and 3.2 with the latest patches. Therefore, I don't think your assumption is correct.
10-03-2023 04:18 AM
#1 I meant to check on your VM infrastructure, assuming it is a VM. TAC is the next option.
#2 Quite possible. I have never had more than 2 NTP servers in my installations, so I don't know how it appears in that case, but it was my first idea. When I see this alarm, I analyze where the NTP server is and how traffic traverses to it. Most often, it is multiple hops away and behind a firewall. You only need to lose some traffic (NTP is UDP-based) and this alarm will pop up. Since you are using Internet NTP servers, I assume that at some point sync fails against at least one server, which is enough to raise this alarm. In my case, that was the explanation I settled on, since the alarms are irregular and nothing else can be concluded.
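If you want to test that theory from any host on the same path, a crude SNTP probe shows whether each configured server answers on UDP/123. This is a minimal sketch, not an ISE feature: the offset math is a rough estimate, and an unreachable or unresolvable server simply reports no response.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def sntp_offset(server, timeout=2.0):
    """Send one SNTP request; return (reachable, approx_offset_seconds)."""
    packet = b"\x1b" + 47 * b"\x00"  # LI=0, VN=3, Mode=3 (client), 48 bytes
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            t_send = time.time()
            sock.sendto(packet, (server, 123))
            data, _ = sock.recvfrom(512)
            t_recv = time.time()
        except OSError:  # timeout, DNS failure, port unreachable, etc.
            return False, None
    # Transmit timestamp: 32-bit seconds field at byte offset 40
    tx_secs = struct.unpack("!I", data[40:44])[0] - NTP_EPOCH_OFFSET
    # Crude offset estimate: server clock vs. midpoint of our round trip
    return True, tx_secs - (t_send + t_recv) / 2


# Server names taken from the "show ntp" output in this thread
for server in ("ntp1.cisco.com", "ntp2.cisco.com", "ntp3.cisco.com"):
    ok, offset = sntp_offset(server)
    print(server, "reachable" if ok else "NO RESPONSE", offset)
```

Running this periodically from the ISE subnet would reveal whether one of the three servers intermittently stops answering.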
Kind regards,
Milos
10-03-2023 04:33 AM - edited 10-03-2023 04:34 AM
- The VM infrastructure checked out, and there was no issue at the time of the alarm in ISE.
- I had a SPAN port capturing traffic between ISE and the NTP servers and confirmed that the NTP packets DID reach the NTP server and come back to the ISE interface, so the communication path is fine. I saw this issue quite often with ISE 3.2 patch-2 but not as much in patch-3, so I assumed the issue was resolved, but I guess NOT. I listed Internet NTP servers above to mask my internal NTP servers; the ISE nodes and NTP servers are actually on the same network, and the NTP servers are Stratum 1.
10-06-2023 02:52 PM
@adamscottmaster2013 The MNT node performs housekeeping tasks on its data in the early morning hours each day, so you may ignore the alarms if they come around 3 or 4 AM.
As for the NTP servers, I hope you are not using Windows servers for this, as they are not as reliable.
10-07-2023 06:29 AM
#1: It is NOT in the early morning hours; it is actually in the afternoon.
#2: I am using Stratum 1 NTP servers from Microsync, not Windows Servers.
10-07-2023 12:35 PM
@adamscottmaster2013 If they have no obvious impact, then you may ignore them. If they surface for a prolonged period of time, please engage Cisco TAC to troubleshoot.
On #1, we may get a better idea from "show tech". Below are sample outputs from my lab:
...
*****************************************
IO On Host - under threshold
*****************************************
Threshold % : 20
Actual IO Wait% : 0.35
Linux 4.18.0-372.9.1.el8.x86_64 (hslai-i32d) 10/07/2023 _x86_64_ (4 CPU)
12:00:00 AM CPU %user %nice %system %iowait %steal %idle
12:10:01 AM all 7.49 0.00 3.47 3.07 0.00 85.97
12:20:01 AM all 4.13 0.00 3.01 0.49 0.00 92.37
...
Scheduler Jobs:
==================================================
JOB_NAME               REPEAT_INT                                    ENABLED   STATE       LAST_START_DATE                        LAST_RUN_DURATION
---------------------  --------------------------------------------  --------  ----------  -------------------------------------  --------------------------
STATS_JOB              freq=daily;byhour=2;byminute=0;bysecond=0     TRUE      SCHEDULED   07-OCT-23 02.00.00.811640 AM ETC/UTC   +000000000 00:00:02.041135
NORMALISING_RACC_JOB   freq=minutely;bysecond=0;                     TRUE      SCHEDULED   07-OCT-23 06.51.00.570271 PM ETC/UTC   +000000000 00:00:00.192542
NORMALISING_RAUTH_JOB  freq=minutely;bysecond=0;                     TRUE      SCHEDULED   07-OCT-23 06.51.00.603777 PM ETC/UTC   +000000000 00:00:00.012001
HOURLY_STATS_JOB       freq=hourly;byminute=15                       TRUE      SCHEDULED   07-OCT-23 06.15.58.145313 PM UTC       +000000000 00:00:00.030927
COLLATION_JOB          freq=minutely;bysecond=0;                     TRUE      SCHEDULED   07-OCT-23 06.51.00.366588 PM ETC/UTC   +000000000 00:00:00.287541
COLLATIONPURGE_JOB     freq=hourly;byminute=0,15,30,45;bysecond=0;   TRUE      SCHEDULED   07-OCT-23 06.45.00.689337 PM ETC/UTC   +000000000 00:00:00.032142
...
*****************************************
Hourly Database Metrics
*****************************************
DAY Avg Redo Per Sec-MB Avg TPS Avg Read IOPS Avg Write IOPS Avg Read MBPS Avg Write MBPS Max Redo Per Sec-MB Max TPS Max Read IOPS Max Write IOPS Max Read MBPS Max Write MBPS
-------------------------- ------------------- ---------- ------------- -------------- ------------- -------------- ------------------- ---------- ------------- -------------- ------------- --------------
06-OCT-2023 00:00 0 .09 .91 .41 .06 0 .01 .89 3.15 1.53 .3 .03
06-OCT-2023 01:00 0 .11 .95 .42 .07 0 .01 1.16 2.76 1.5 .29 .03
06-OCT-2023 02:00 0
...
*****************************************
db_log info for last 48 hours
*****************************************
TIMESTAMP COMPONENT TEXT
------------------------------ ------------------------- --------------------------------------------------------------------------------
...
06-OCT-23 04.41.48.623840 AM collation Total Data max_space = 154, threshold_space = 123, total_space = 3, free_space =
06-OCT-23 04.41.48.624914 AM purge_audit Total Data threshold_space = 123 GB, used_space = 2 GB
06-OCT-23 04.41.48.974386 AM purge_audit purge skipped; no data available for 06-SEP-23
06-OCT-23 04.41.49.073688 AM purge_tbl MNT_AAA_DIAGNOSTICS purging skipped
06-OCT-23 04.41.49.131767 AM purge_tbl MNT_SYSTEM_DIAGNOSTICS purging skipped
06-OCT-23 04.41.49.133845 AM purge_tbl MNT_SECURESYSTEM_DIAGNOSTICS purging skipped
06-OCT-23 04.41.49.463200 AM purge_tbl dropping partition...SYS_P3509 in DB_LOG for 29-SEP-23
06-OCT-23 04.41.49.464342 AM purge_tbl DB_LOG purging completed successfully
06-OCT-23 04.41.49.469887 AM purge_tbl RADIUS_AUTH_AGGR purging skipped
06-OCT-23 04.41.49.470978 AM purge_tbl MISCONFIGURED_NAS purging skipped
06-OCT-23 04.41.49.471978 AM purge_tbl MISCONFIGURED_SUPPL_MONTH purging skipped
06-OCT-23 04.41.49.472986 AM purge_tbl RADIUS_ERRORS_MONTH purging skipped
06-OCT-23 04.41.49.475525 AM purge_tbl RADIUS_ERRORS_48 purging skipped
06-OCT-23 04.41.49.476574 AM purge_tbl MISCONFIGURED_SUPPLICANTS_48 purging skipped
06-OCT-23 04.41.49.477562 AM purge_tbl RADIUS_AUTH_SUPPRESSED purging skipped
06-OCT-23 04.41.49.478815 AM purge_tbl COLLATION_RADIUS_AUTH purging skipped
06-OCT-23 04.41.49.673046 AM purge_tbl dropping partition...SYS_P3485 in RADIUS_AUTH_48_LIVE for 28-SEP-23
06-OCT-23 04.41.49.674725 AM purge_tbl RADIUS_AUTH_48_LIVE purging completed successfully
06-OCT-23 04.41.49.677111 AM purge_tbl RADIUS_ACC_48_LIVE purging skipped
06-OCT-23 04.41.49.859627 AM purge_tbl dropping partition...SYS_P3486 in RADIUS_AUTH_DETAILS for 28-SEP-23
06-OCT-23 04.41.49.859969 AM purge_tbl RADIUS_AUTH_DETAILS purging completed successfully
06-OCT-23 04.41.49.861922 AM purge_tbl ALARM_EVALUATION_DETAILS purging skipped
06-OCT-23 04.41.49.863059 AM purge_tbl TACACS_ACC_48_LIVE purging skipped
06-OCT-23 04.41.49.864088 AM purge_tbl TACACS_AUTH_48_LIVE purging skipped
06-OCT-23 04.41.49.867339 AM purge_audit purging Tacacs data older than 06-SEP-23
06-OCT-23 04.41.49.956713 AM purge_tbl TACACS_ACC_MONTH purging skipped
06-OCT-23 04.41.49.957934 AM purge_tbl TACACS_AUTHZ purging skipped
06-OCT-23 04.41.49.959337 AM purge_tbl TACACS_ACC_DETAILS purging skipped
06-OCT-23 04.41.49.960372 AM purge_tbl TACACS_AUTH_MONTH purging skipped
06-OCT-23 04.41.49.961379 AM purge_tbl TACACS_ACC_ARCHIVE purging skipped
06-OCT-23 04.41.49.962411 AM purge_tbl TACACS_AUTH_48_LIVE purging skipped
06-OCT-23 04.41.49.963423 AM purge_tbl TACACS_AUTH_DETAILS purging skipped
06-OCT-23 04.41.49.964397 AM purge_tbl TACACS_ACC_48_LIVE purging skipped
06-OCT-23 04.41.49.965379 AM purge_tbl TACACS_AUTH_AGGR purging skipped
06-OCT-23 04.41.49.966387 AM purge_tbl TACACS_AUTH_ARCHIVE purging skipped
...