
ISE distributed deployment nodes freeze but no clue about what has happened

ajtm
Level 1

Four nodes distributed in two Datacenters (all Virtual Machines).

Datacenter1: Node 1: Primary Admin + Primary Monitoring; Node 2: PSN1
Datacenter2: Node 3: Secondary Admin + Secondary Monitoring; Node 4: PSN2

 

Last week both nodes in Datacenter1 became unresponsive: there was no WebUI access, and SSH was reachable but we were unable to log in.

All RADIUS requests, on both PSNs, received an authz failed reply (AD and internal users).

TACACS+ was failing on PSN1 but working fine on PSN2 (internal users only).

 

The only way we were able to restore service was to reload the Datacenter1 nodes (using VMware).

We were not able to find any evidence of what happened in the log bundle: there are no logs at all from the time of the incident.

We do not understand why PSN2 was affected.
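
To pin down exactly when the nodes stopped logging, one option is to scan the log timestamps for unusually long silences. The following is only a minimal sketch: the function name `find_log_gaps`, the five-minute threshold, and the `YYYY-MM-DD HH:MM:SS` timestamp format are all assumptions for illustration, not something taken from ISE itself, so the format string would need adjusting to match the actual log lines.

```python
from datetime import datetime, timedelta


def find_log_gaps(timestamps, min_gap=timedelta(minutes=5),
                  fmt="%Y-%m-%d %H:%M:%S"):
    """Return (last_seen, next_seen) pairs where logging went quiet.

    timestamps: iterable of timestamp strings pulled from the log lines.
    min_gap:    silence longer than this counts as a gap (assumed value).
    fmt:        strptime format of the timestamps (assumed format).
    """
    # Sort so that entries merged from several files are compared in order.
    parsed = sorted(datetime.strptime(t, fmt) for t in timestamps)
    gaps = []
    # Compare each entry with the one that follows it.
    for earlier, later in zip(parsed, parsed[1:]):
        if later - earlier > min_gap:
            gaps.append((earlier, later))
    return gaps
```

Each returned pair marks the last entry before a silence and the first entry after it, which gives a time window to compare against hypervisor activity.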

2 Accepted Solutions

Surendra
Cisco Employee
This looks like a classic case of VMware snapshots or hot vMotion playing a role. I would suggest checking with your VM team whether either of those operations was performed on those VMs that day.

Marvin Rhoads
Hall of Fame

This symptom can occur when a snapshot of the VMs is taken as part of a storage backup.

ISE VMs cannot be backed up with snapshots while they are running.
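
If the VM team can export the hypervisor events for the day of the incident, a short script can help narrow them down to snapshot or migration activity on the ISE VMs. This is a rough sketch only: it assumes a hypothetical CSV export with `time`, `vm`, and `description` columns, and the keyword list is a guess at typical event wording, so both would need adjusting to whatever the real export looks like.

```python
import csv
import io

# Keywords that suggest a snapshot or a live migration. These are
# assumptions; adjust them to the wording used in your own event export.
KEYWORDS = ("snapshot", "vmotion", "migrat")


def suspicious_events(csv_text, vm_names):
    """Return event rows for the given VMs that mention a keyword.

    Assumes a hypothetical CSV export with 'time', 'vm', and
    'description' columns; it is not a real vCenter export format.
    """
    wanted = {name.lower() for name in vm_names}
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Skip events that belong to other VMs.
        if row["vm"].lower() not in wanted:
            continue
        # Case-insensitive keyword match on the event description.
        description = row["description"].lower()
        if any(keyword in description for keyword in KEYWORDS):
            hits.append(row)
    return hits
```

Any rows returned for the Datacenter1 nodes around the time they froze would be the first thing to raise with the VM team.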
