ISE HA Split Brain?

gugonza2
Cisco Employee

Hi Team,

 

I am reviewing an ISE design for two data centers that uses two appliances for PAN and MnT in redundancy and two appliances as PSNs. I would like to confirm whether there is any way to avoid a split-brain situation.

For example, we install one appliance with the PAN and MnT personas and one appliance with the PSN persona in each data center. We can then configure automatic failover, with each PSN appliance configured as a health check node.

If the connection between the data centers fails, what procedure or mechanism avoids a split brain?

(The active PAN would continue working as the active PAN, and the passive PAN would be automatically promoted to active PAN because of the interruption of keepalive messages and of the responses to the health check node in its data center.)

Any comments?

 

Guillermo.

6 Replies

Arne Bier
VIP

The node that performs the health check should be in the same DC as the node it is monitoring. It checks its local Admin node. If the local Admin node croaks, the health check node tries to promote the Standby. If the Standby is reachable over the WAN, the promotion succeeds. If the WAN is down, the promotion is not even attempted, and the Standby will never promote itself. Therefore there is no risk of split brain as long as you follow the Admin Guide recommendations.
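The decision described above can be written down as a small truth table. This is only a toy model of the behaviour described in this reply, not ISE code or an ISE API; the names `Action` and `health_check_decision` are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    NONE = "local PAN is healthy, nothing to do"
    PROMOTE_STANDBY = "ask the standby PAN to take over"
    CANNOT_PROMOTE = "standby unreachable, promotion not attempted"


def health_check_decision(local_pan_up: bool, standby_reachable: bool) -> Action:
    """Decision taken by the health check node that sits next to the active PAN.

    Hypothetical model: the health check node only looks at its own local PAN,
    and it can only promote the standby if it can reach it over the WAN.
    """
    if local_pan_up:
        return Action.NONE
    if standby_reachable:
        return Action.PROMOTE_STANDBY
    return Action.CANNOT_PROMOTE


if __name__ == "__main__":
    for pan_up in (True, False):
        for reachable in (True, False):
            action = health_check_decision(pan_up, reachable)
            print(f"local PAN up={pan_up}, standby reachable={reachable}: {action.value}")
```

Note that the standby PAN does not appear as a decision maker anywhere in this model, which mirrors the statement that it never promotes itself.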

Thanks, Arne.

Is it possible to have two health check nodes? What is the best practice for defining the health check nodes?

Absolutely. I have done this in some deployments. You can configure two health check nodes if you have two nodes that can perform this role and that are not Admin nodes, e.g. a PSN or an MnT.

Check the Admin Guide: it also advises putting each health check node as close as possible to the PAN that it monitors. A health check node only checks the health of a single node.

PAN HA and failover from Admin Guide 

 

It says: “If the PANs are in different data centers, you must have a health check node for each PAN.”
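As a rough illustration, the placement rules mentioned in this thread (one health check node per PAN, not an Admin node, close to the PAN it monitors) can be checked against a planned layout. This sketch is not an ISE tool; `Node`, `check_layout`, and the node names are hypothetical, and "close" is simplified to "same data center".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    datacenter: str
    personas: frozenset  # e.g. frozenset({"PAN", "MnT"}) or frozenset({"PSN"})


def check_layout(pans, health_checks):
    """Return a list of problems found in a planned layout.

    pans: list of Node objects carrying the PAN persona.
    health_checks: dict mapping a PAN name to the Node that monitors it.
    Assumption: "close to the PAN" is modelled as "in the same data center".
    """
    problems = []
    for pan in pans:
        hc = health_checks.get(pan.name)
        if hc is None:
            problems.append(f"{pan.name}: no health check node")
            continue
        if "PAN" in hc.personas:
            problems.append(f"{pan.name}: health check node {hc.name} is an Admin node")
        if hc.datacenter != pan.datacenter:
            problems.append(
                f"{pan.name}: health check node {hc.name} is in "
                f"{hc.datacenter}, not {pan.datacenter}"
            )
    return problems


if __name__ == "__main__":
    # The design from the original question: PAN+MnT and a PSN in each DC.
    pan1 = Node("pan1", "DC1", frozenset({"PAN", "MnT"}))
    pan2 = Node("pan2", "DC2", frozenset({"PAN", "MnT"}))
    psn1 = Node("psn1", "DC1", frozenset({"PSN"}))
    psn2 = Node("psn2", "DC2", frozenset({"PSN"}))
    print(check_layout([pan1, pan2], {"pan1": psn1, "pan2": psn2}))
```

With the layout from the original question (each PSN monitoring the PAN in its own data center), the check reports no problems.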

 

 

 

Thanks, Arne.

In the case of two data centers, with a PAN and a health check node in each data center, if communication between the two DCs fails, the health check node near the passive PAN will detect a failure of the active PAN, but the active PAN will still be active. What I meant is: if I have two DCs, with a PAN and a health check node in each DC, is it not possible to have a split brain? Sorry for the confusion.

 

Guillermo.

The health check node only checks that one specific node that you are interested in. It does not jump across the WAN and snoop on the health of the other PAN ... hence, there is no confusion.

P(A) <--- Hcheck(a)
  |
 WAN
  |
P(S) <--- Hcheck(b)

 

If (a) detects that P(A) is down, then it has a duty to try to promote P(S). If the WAN is down, the promotion will not happen. P(S) does not promote itself if P(A) is down. There is no mechanism for that to happen.

Put another way, (b) does not perform any health check on P(A).

Promotion cannot occur in isolation - the WAN must be up for promotion to occur.
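To make the argument concrete, here is a small sketch that enumerates every combination of "P(A) up or down" and "WAN up or down" and counts how many PANs end up active. It is a simplified model of the rules stated in this reply, not ISE behaviour verified against the product; `active_pans` is a made-up name.

```python
from itertools import product


def active_pans(primary_up: bool, wan_up: bool) -> int:
    """Count PANs acting as active once health check node (a) has reacted.

    Model assumptions taken from this thread:
    - (a) only acts when its local PAN, P(A), is down;
    - (a) can only reach P(S) over the WAN;
    - P(S) never promotes itself, and (b) never checks P(A).
    """
    still_active = 1 if primary_up else 0
    promoted = (not primary_up) and wan_up
    return still_active + (1 if promoted else 0)


if __name__ == "__main__":
    for primary_up, wan_up in product([True, False], repeat=2):
        count = active_pans(primary_up, wan_up)
        print(f"P(A) up={primary_up!s:5} WAN up={wan_up!s:5} -> active PANs: {count}")
```

Under these assumptions no combination produces two active PANs. The case with P(A) down and the WAN down leaves no active PAN at all, which is the situation where someone would have to promote the standby by hand.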

 

Some folks may choose not to use automatic PAN failover and instead, when they know PAN(A) is dead, they connect over HTTPS to the standby PAN and click the "Promote" button themselves. But the auto failover does all that for you too.

 

Thanks, Arne. Thanks a lot for the clarification.