10-15-2019 07:12 AM
Hi Team,
I am reviewing an ISE design for 2 data centers, using 2 appliances for PAN and MnT in redundancy and 2 appliances as PSNs. I would like to confirm whether there is any way to avoid a split-brain situation.
For example, if we install one appliance with the PAN and MnT personas and one appliance with the PSN persona in each data center, we can configure automatic failover, and each PSN appliance can be configured as a health check node.
If the connection between the data centers fails, what is the procedure or mechanism to avoid a split brain?
(My concern: the active PAN would continue working as the active PAN, and the passive PAN would be automatically promoted to active PAN because the keepalive messages and the health check node's responses are interrupted.)
Any comments?
Guillermo.
10-16-2019 04:46 AM
The node that performs the health check should be in the same DC as the node it monitors. It checks its local Admin node. If the local Admin node goes down, the health check node tries to promote the Standby. If the Standby is reachable over the WAN, the promotion succeeds. If the WAN is down, the promotion is not even attempted, and the Standby will never promote itself. Therefore there is no risk of split brain as long as you follow the Admin Guide recommendations.
10-16-2019 11:20 PM
Thx Arne,
Is it possible to have 2 health check nodes? What is the best practice for defining the health check nodes?
10-17-2019 04:41 AM
Absolutely. I have done this in some deployments. You can configure two health check nodes if you have two nodes available for this role that are not Admin nodes, e.g. a PSN or an MnT node.
The Admin Guide also advises placing each health check node as close as possible to the PAN it monitors. A health check node only checks the health of a single node.
See "PAN HA and failover" in the Admin Guide.
It says "If the PANs are in different data centers, you must have a health check node for each PAN."
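The placement advice above can be written down as a small sanity check. This is only a sketch in Python, not anything ISE itself provides: the `Node` class, the `check_health_check_placement` function, and the node names are hypothetical, and the rules encoded are just the ones stated in this post (one health check node per PAN, in the same data center, and not an Admin node).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one ISE appliance (not an ISE API)."""
    name: str
    data_center: str
    personas: frozenset  # e.g. frozenset({"PAN", "MnT"}) or frozenset({"PSN"})


def check_health_check_placement(pans, health_checks):
    """Return a list of placement problems; an empty list means none found.

    pans          -- list of Node objects that carry the PAN persona
    health_checks -- dict mapping a PAN name to the Node that monitors it
    """
    problems = []
    for pan in pans:
        hc = health_checks.get(pan.name)
        if hc is None:
            problems.append(f"{pan.name}: no health check node assigned")
            continue
        if "PAN" in hc.personas:
            problems.append(f"{pan.name}: health check node {hc.name} is an Admin node")
        if hc.data_center != pan.data_center:
            problems.append(f"{pan.name}: health check node {hc.name} is in a different DC")
    return problems


# Example layout from this thread: one PAN/MnT and one PSN per data center.
pan_a = Node("pan-a", "DC1", frozenset({"PAN", "MnT"}))
pan_s = Node("pan-s", "DC2", frozenset({"PAN", "MnT"}))
psn_a = Node("psn-a", "DC1", frozenset({"PSN"}))
psn_b = Node("psn-b", "DC2", frozenset({"PSN"}))

good_layout = {"pan-a": psn_a, "pan-s": psn_b}
crossed_layout = {"pan-a": psn_b, "pan-s": psn_a}  # each monitor sits across the WAN
```

Calling `check_health_check_placement([pan_a, pan_s], good_layout)` returns an empty list, while the crossed layout reports one problem per PAN.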
10-17-2019 11:06 PM
Thx Arne,
With 2 data centers, a PAN in each data center and one health check node in each data center, if the communication between the two DCs fails, the health check node near the passive PAN will detect a failure of the active PAN, but the active PAN will still be active. What I meant is: if I have 2 DCs with a PAN and a health check node in each DC, is a split brain not possible? Sorry for the confusion.
Guillermo.
10-20-2019 03:13 PM
Each health check node checks only the one specific PAN it is assigned to. It does not jump across the WAN and snoop on the health of the other PAN ... hence, there is no confusion.
P(A) <--- Hcheck(a)
|
WAN
|
P(S) <--- Hcheck(b)
If (a) detects that P(A) is down, then it has a duty to try to promote P(S). If the WAN is down, the promotion will not happen. P(S) does not promote itself if P(A) is down. There is no mechanism for that to happen.
Put another way, (b) does not perform any health check on P(A).
Promotion cannot occur in isolation - the WAN must be up for promotion to occur.
Some folks may choose not to use automatic PAN failover and instead, when they know P(A) is dead, they browse to the standby PAN over HTTPS and click the "Promote" button themselves. But the auto failover does all that for you too.
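To make the reasoning concrete, here is a toy Python model of the diagram above. It is a sketch of the logic as described in this thread, not of ISE's actual implementation: `failover_outcome` and `count_active` are hypothetical names, and the `hcheck_a_up` parameter is an added assumption covering the case where the health check node itself is unavailable.

```python
def failover_outcome(primary_up: bool, wan_up: bool, hcheck_a_up: bool = True) -> dict:
    """Return the role of each PAN after one health check cycle.

    Rules taken from the explanation above:
    - Only Hcheck(a) monitors P(A); Hcheck(b) never checks P(A).
    - Hcheck(a) promotes P(S) only if P(A) is down AND P(S) is reachable over the WAN.
    - P(S) never promotes itself.
    """
    roles = {
        "P(A)": "primary" if primary_up else "down",
        "P(S)": "secondary",
    }
    if hcheck_a_up and not primary_up and wan_up:
        roles["P(S)"] = "primary"  # promotion triggered by Hcheck(a) across the WAN
    return roles


def count_active(roles: dict) -> int:
    """Number of PANs currently acting as primary."""
    return sum(1 for role in roles.values() if role == "primary")


# The scenario from the question: the WAN fails while P(A) is healthy.
wan_failure = failover_outcome(primary_up=True, wan_up=False)

# The normal failover case: P(A) dies while the WAN is up.
pan_failure = failover_outcome(primary_up=False, wan_up=True)
```

In the WAN-failure case P(A) stays primary and P(S) stays secondary, so there is a single active PAN. Under this model, no combination of inputs yields two primaries. The flip side is that when P(A) is down and the WAN or Hcheck(a) is also down, nothing promotes P(S) automatically, which is where the manual "Promote" button comes in.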
10-21-2019 08:11 AM