
Questions about ISE fail over

Automatic Failover to the Secondary PAN

You can configure ISE to automatically promote the secondary PAN when the primary PAN becomes unavailable. The configuration is done on the primary administration node (primary PAN) on the Administration > System > Deployment page. The failover period is the value configured in Number of Failure Polls Before Failover multiplied by the number of seconds configured in Polling Interval. With the default configuration, that time is 10 minutes. Promotion of the secondary PAN to primary takes another 10 minutes. So by default, the total time from primary PAN failure to a working secondary PAN is 20 minutes.
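The arithmetic in the quoted passage can be sketched in a few lines of Python. This is only an illustration, not anything ISE itself runs. The function names are hypothetical, and the values of 5 failure polls and a 120-second polling interval are assumptions chosen because their product matches the 10 minutes the guide states. Check the Deployment page on your own primary PAN for the actual settings.

```python
# Illustrative sketch of the failover-time arithmetic described in the guide.
# ASSUMED_FAILURE_POLLS and ASSUMED_POLLING_INTERVAL_S are assumptions, picked
# so that their product equals the 10 minutes quoted above.
ASSUMED_FAILURE_POLLS = 5
ASSUMED_POLLING_INTERVAL_S = 120


def detection_seconds(failure_polls: int, polling_interval_s: int) -> int:
    """Failover period: failed polls multiplied by the polling interval."""
    return failure_polls * polling_interval_s


def total_failover_minutes(
    failure_polls: int,
    polling_interval_s: int,
    promotion_minutes: float = 10.0,  # promotion time quoted in the guide
) -> float:
    """Time from primary PAN failure to a working secondary PAN, in minutes."""
    return detection_seconds(failure_polls, polling_interval_s) / 60 + promotion_minutes


if __name__ == "__main__":
    total = total_failover_minutes(ASSUMED_FAILURE_POLLS, ASSUMED_POLLING_INTERVAL_S)
    print(f"Estimated automatic failover time: {total:.0f} minutes")
```

Lowering either setting shortens only the detection part. The promotion time is not affected by the polling configuration.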

 

https://www.cisco.com/c/en/us/td/docs/security/ise/2-3/admin_guide/b_ise_admin_guide_23/b_ise_admin_guide_22_chapter_010.html#ID330

 

 

I am reading an article on ISE failover.

 

What does the default configuration in bold mean?

 

The document states that automatic failover takes a total of 20 minutes.

 

Does manual failover also take 20 minutes?

 

 

1 Accepted Solution


Arne Bier
VIP

The default figure comes from the failed-heartbeat timers: the number of failure polls multiplied by the polling interval. You can make the polling more aggressive to detect a failure sooner. Don't do it! Failover is not a fun topic. It causes processes to restart, and ISE is not fast to restart. Let's say you innocently restart the active PAN processes because of a TAC case or whatever. If your polling is too aggressive, the standby PAN could try to take over. What's the rush anyway? Failover in 30 minutes is more than enough in the greater scheme of things. When the PAN is down, the worst that can happen is that you cannot create a new policy, or you cannot create a new Sponsored Guest, and things like that. In my book this doesn't constitute a need for fast failover.

 

Manual failover takes only as long as the standby needs to restart its processes. So if it typically takes you 10 minutes to stop and then start the application processes, that is how long it will take from the time you manually promote the standby PAN. You do this in the standby PAN GUI.

You'll also notice that the previously active PAN node restarts its processes. It's been a while, but I believe that is still the case.
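To make the warning about aggressive polling concrete, here is a rough Python sketch of the restart scenario described above. It is a simplification, not a model of how ISE actually performs its health checks: the function name is hypothetical, and it assumes the standby promotes itself once the active PAN has been unreachable for at least failure polls multiplied by polling interval.

```python
# Simplified illustration: does a planned restart of the active PAN outlast
# the detection window? The real health-check logic is more involved.
def standby_may_promote(
    outage_seconds: int, failure_polls: int, polling_interval_s: int
) -> bool:
    """Return True if the outage lasts at least as long as the detection window."""
    detection_window_s = failure_polls * polling_interval_s
    return outage_seconds >= detection_window_s


if __name__ == "__main__":
    restart_s = 8 * 60  # assumed duration of an application restart on the active PAN

    # Assumed relaxed settings: 5 polls x 120 s gives a 600 s window.
    print(standby_may_promote(restart_s, failure_polls=5, polling_interval_s=120))

    # Assumed aggressive settings: 2 polls x 30 s gives a 60 s window.
    print(standby_may_promote(restart_s, failure_polls=2, polling_interval_s=30))
```

With the relaxed values the restart finishes inside the window, while with the aggressive values the same restart outlasts it, which is the unwanted takeover described above.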

 


2 Replies


Thanks, Arne. You can also look at BRKSEC-3432 and the other performance and scale overview at https://community.cisco.com/t5/security-documents/ise-performance-amp-scale/ta-p/3642148#toc-hId-118574828