
My nightmare of upgrading from ISE 3.0 to ISE 3.2 is finally over

Here is my nightmare of refreshing my ISE environment from ISE 3.0 patch-3 to ISE 3.2 patch-4. It took over 24 hours
to complete everything on a four-node cluster. Here is what it took to finish the job with ZERO downtime:
Appliances are SNS-3655
nodeA=Primary Admin/MNT
nodeB=Secondary Admin/MNT
nodeC=PSN
nodeD=PSN

Week #1: Took around 11 hours
0- nodeB and nodeD leave Active Directory
1- Deregister nodeB and nodeD from the existing cluster - 15 minutes
2- reimage nodeB and nodeD at the same time using the CIMC interface - 3 hours and 30 minutes
3- run "setup" - 75 minutes
4- patch nodeB with patch-2 (this is because we tested the restore with patch-2 before patch-4 became available) - 45 minutes
5- patch nodeD with patch-4 - 45 minutes
6- restore the backup taken from nodeA into nodeB - 60 minutes
7- patch nodeB to patch-4 - 45 minutes
8- add nodeD from nodeA to form a cluster (ISE restarted on nodeD) - 30 minutes
9- add external certificates for EAP, Admin, RADIUS DTLS, Portal; only one node can be done at a time (ISE restarted) - 50 minutes
10- add nodeB and nodeD into Active Directory - 5 minutes
11- perform BIOS and firmware update on the SNS-3655 from 4.1(3)d to 4.2(3)c - 60 minutes
12- confirmed that the new ISE cluster is up and running without any issues - 30 minutes

Week #2:
0- nodeA and nodeC leave Active Directory
1- reimage nodeA and nodeC at the same time using the CIMC interface - 4 hours and 45 minutes
2- run setup - 75 minutes
3- patch nodeA and nodeC with patch-4 - 45 minutes
4- add external certificates for EAP, Admin, RADIUS DTLS, Portal - 25 minutes
5- add nodeA and nodeC into the cluster - 35 minutes
6- add nodeA and nodeC into Active Directory - 5 minutes
7- perform BIOS and firmware update on the SNS-3655 from 4.1(3)d to 4.2(3)c - 5 hours
8- confirmed that all four nodes are up and running without any issues - 30 minutes
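
For anyone who wants to check the arithmetic, here is a minimal sketch that adds up the timed steps listed above. The durations are copied from the two lists (the "leave Active Directory" steps have no time given, so they are left out); the variable names and the `fmt` helper are just illustrative.

```python
# Durations in minutes, in the order the timed steps appear in the post.
# The untimed "leave Active Directory" steps are not included.
WEEK1 = [15, 210, 75, 45, 45, 60, 45, 30, 50, 5, 60, 30]
WEEK2 = [285, 75, 45, 25, 35, 5, 300, 30]


def fmt(minutes: int) -> str:
    """Format a minute count as 'XhYYm'."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}h{mins:02d}m"


week1_total = sum(WEEK1)
week2_total = sum(WEEK2)
grand_total = week1_total + week2_total

print("Week #1:", fmt(week1_total))
print("Week #2:", fmt(week2_total))
print("Total:  ", fmt(grand_total))
```

The sums come to 11h10m for week 1, 13h20m for week 2, and 24h30m overall, which matches the "around 11 hours" and "over 24 hours" figures. Note that 5 of the week 2 hours are the BIOS/firmware step alone.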

It took 5 hours to troubleshoot the BIOS/firmware issue because, during the BIOS upgrade, it somehow installed
BIOS version C220M5.4.2.3c.0.0129230853 instead of C220M5.4.2.3c.0_ISE, so it couldn't boot the ISE partition. I had to re-install the BIOS
to resolve the issue.
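
A quick sanity check on the reported BIOS version string before rebooting might have caught this sooner. The sketch below is only an assumption based on the two version strings quoted above: it treats a trailing `_ISE` as the marker of the ISE-specific BIOS build. That suffix rule is inferred from this one incident, not from Cisco documentation, and `is_ise_bios` is a hypothetical helper name.

```python
def is_ise_bios(version: str) -> bool:
    """Return True if the BIOS version string ends with '_ISE'.

    Assumption: the '_ISE' suffix identifies the ISE-specific build.
    This is inferred only from the two strings in the post above.
    """
    return version.strip().endswith("_ISE")


# The two version strings quoted in the post.
wrong_build = "C220M5.4.2.3c.0.0129230853"
expected_build = "C220M5.4.2.3c.0_ISE"

for v in (wrong_build, expected_build):
    status = "OK" if is_ise_bios(v) else "NOT the ISE build"
    print(f"{v}: {status}")
```

How you read the running BIOS version off the appliance (CIMC web UI, CLI, or API) is left out on purpose, since that depends on your environment.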

It was smooth sailing and it still took 24 hours to refresh the cluster from ISE 3.0 to ISE 3.2 patch-4. There must be a better
way to do this in 2024. It seems like Cisco ISE is still stuck in 1999....

2 Replies

marce1000
VIP

 

> ... There must be a better way to do this in 2024.

I agree. The basic challenge (and that itself is an understatement) is that ISE is 1) complex and 2) business critical (it can't be missed). I used to install and prepare upgrades on 'offline nodes', and then switch RADIUS servers on the fly on the network devices once the new ISE cluster was ready. Cisco should address this and move toward architectures where upgrades are no longer executed on an online ISE environment that is servicing the business. I organized that manually, but it should be engineered into ISE, ready for customers to use.

 M.



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    The mirror then always responds with ' The only thing that exceeds your brilliance is your beauty! '

Arne Bier
VIP

When there are physical appliances involved, then you're in for a world of hurt (CIMC access, BIOS updates and spinning disks etc.). Life gets a bit easier with VM-only deployments - you can stage a lot of this, and automate a lot too (as a bare minimum, using ZTP to install and patch). With ISE 3.3 you can (theoretically) use APIs to do a lot of the other heavy lifting.

All of the above comes with a massive "Theoretically" disclaimer - because

  • ISE code is notoriously buggy and one person's success does not ensure another's success
  • automation takes a little while to perfect (and to handle all possible failure scenarios) 
  • automation takes a little while to perfect ... OH did I mention that already? - how many hours/days are you going to invest for a potentially once-in-a-blue-moon upgrade event?

The dream would be for us engineers to NOT have to care about any of this and rather, to use a SaaS service (e.g. Meraki Dashboard etc.) - send your encrypted RADIUS traffic over the internet to the cloud and let them deal with it. But that pipe dream comes with its own list of disclaimers.

Once you've thought this through and finally caught your own tail again, you realise that upgrading ISE is actually not that bad - a rite of passage, and good character building.  And that little nagging voice on your shoulder reminding you to get your head around Ansible ...