Operation process in case of APIC cluster failure

MInchulKim6989
Level 1

In a multi-pod configuration with 1 APIC in pod1, 2 APICs in pod2, and 2 APICs in pod3 (5 controllers in total), I want to know the details of what happens as APICs die one by one: 1 dead, 2 dead, 3 dead. How many APICs must die before read/write is limited? I am curious what the cluster does as the number of controllers keeps decreasing. I know there is no service impact even when all APICs die. Also, if you have any material on how shards are distributed in detail across a 5-controller cluster, please share it. For reference, the number of leaves is 100 or less.


5 Replies

Sergiu.Daniluk
VIP Alumni

Hi,

I would like to start by making a small correction: in a multi-pod environment, all APIC controllers are part of the same cluster. So in your scenario, you have 1 cluster with 5 controllers.

The general rule when it comes to APIC distribution in a multi-pod environment is to avoid placing more than two controllers in the same pod. As a guideline, you can refer to this:

[image: cluster_distribution.png]

The reason for this distribution is to avoid the total loss of information for a shard. Now, what is a shard, you might ask? No worries, I got you:

The APIC cluster uses a technology from large databases called sharding - a concept very similar to horizontal database partitioning, but better. Better in the sense that it increases redundancy and performance, because the DB tables are split across servers and the smaller tables are replicated as complete units. Cisco APIC uses a replication factor of 3, meaning each shard has 3 replicas across the cluster.

To get a visual understanding of what sharding actually looks like, here is how it looks in a 3-node cluster:

[image: shards.png]

For each shard, only one replica is active, while the other 2 are standby. If one of the 3 replicas dies, the remaining two still allow read-write. If 2 out of 3 die, the remaining replica becomes read-only. This is why, the moment 2 out of 3 controllers in a 3-node APIC cluster experience a hardware failure, the remaining APIC goes into what is called a minority and becomes read-only.

If you increase the number of APICs, the replication factor remains the same (3) and the shards are distributed across all APICs. Very importantly, some APICs may hold all shards while others hold only a subset. Why is this important? Because in case of a failure, some shards may end up read-only while others remain read-write.

This is a bad scenario for a single-pod fabric, where some shards are read-only and others are read-write:

[image: readonly5shards.png]

In your topology, the distribution is OK, as you have a maximum of 2 nodes per pod.

Now that we have discussed sharding and shard distribution, we can talk about the potential loss of data in case of a hardware failure. How can that happen? Like this:

[image: lost.png]

Here the "green" shard is lost. Why this happend is pretty clear in the picture - all 3 APICs containing the green shard died.

If you have a maximum of 2 APICs per pod, you eliminate this potential problem. However, if two APICs go down, there is still a chance that some shards go read-only, but no data is lost.

If there is a 3-out-of-5 failure, the chance of losing some shards is high. If you get into a situation like this (or like the one illustrated above), you will need to contact TAC and the BU. The procedure is called 'ID Recovery' and is used to restore the whole fabric state from the latest configuration snapshot (if you have one).
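To make the arithmetic concrete, here is a small illustrative Python sketch (not Cisco code: the round-robin shard placement is a made-up assumption, since the real placement algorithm is internal to the APIC) that models 32 shards with a replication factor of 3 across a 5-controller cluster and reports the shard states when 1, 2, or 3 controllers fail:

# Illustrative sketch (not Cisco code). It models 32 shards, each with 3
# replicas, spread over a 5-APIC cluster, and reports the state of every
# shard when a given set of APICs is down. The round-robin placement below
# is hypothetical; the real placement algorithm is internal to the APIC.

APICS = [1, 2, 3, 4, 5]
NUM_SHARDS = 32
REPLICAS = 3

# Hypothetical placement: shard i keeps its 3 replicas on 3 consecutive APICs.
placement = {
    shard: [APICS[(shard + k) % len(APICS)] for k in range(REPLICAS)]
    for shard in range(NUM_SHARDS)
}

def shard_state(shard, failed):
    """State of one shard given a set of failed APIC ids."""
    alive = [a for a in placement[shard] if a not in failed]
    if len(alive) >= 2:      # majority of the 3 replicas still up -> read-write
        return "read-write"
    if len(alive) == 1:      # minority: the last replica goes read-only
        return "read-only"
    return "lost"            # all 3 replicas down -> data for this shard is gone

for failed in ({5}, {4, 5}, {3, 4, 5}):
    states = [shard_state(s, failed) for s in range(NUM_SHARDS)]
    summary = {st: states.count(st) for st in ("read-write", "read-only", "lost")}
    print("APICs down", sorted(failed), "->", summary)

With this toy placement, one APIC down leaves every shard read-write, two down push some shards to read-only, and three down lose at least one shard entirely, which mirrors the behaviour described above.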

I hope I did not bore you with all these details, and that you will find them useful.

 

Note: I took all these details and images from the following documents:

https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/unified-fabric/white-paper-c11-730021.html 

Ciscolive presentation - BRKACI-2003 https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2019/pdf/BRKACI-2003.pdf 

 

 

Regards,

Sergiu

Thank you for your kind answer. I have additional questions.

 

** Question 1

When 3 or 4 out of 5 APICs suffer hardware failures and some shards are lost, are you saying that data traffic from the endpoints connected to the leaves could be lost and the service affected?
We understand that when all APICs are dead there is no service impact, since in an ACI environment the control plane and data plane are separated.

 

** Question 2

I have seen in the documentation that one shard has three replicas. Then how many shards are in one APIC?

 

** Question 3

Up to 80 leaves can be controlled with 3 APICs, and up to 300 leaves with 5 APICs.

We are currently running more than 112 leaves with 5 APICs.

If 1 or 2 APICs are dead, can we still control more than 100 leaves?

I wonder what is lost and how the fabric behaves when one, two, or three APICs die.

 

 

Hello,

 


** Question 1

When 3 or 4 out of 5 APICs suffer hardware failures and some shards are lost, are you saying that data traffic from the endpoints connected to the leaves could be lost and the service affected?
We understand that when all APICs are dead there is no service impact, since in an ACI environment the control plane and data plane are separated.

No. The data traffic will not be affected, regardless of how many APICs are down. What I mentioned is about the shards, which can be affected.

 


** Question 2

I have seen in the documentation that one shard has three replicas. Then how many shards are in one APIC?

You can use the following command to see the number of shards (32), the replicas (3 per shard), and their current state:

 

apic1# acidiag rvread
\- unexpected state;    /- unexpected mutator;
s->  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 lcl
r-> 123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123 lcl
  (one row per service follows, showing the per-replica state for all 32 shards)
...

The column on the left represents the services (policyelem, policymgr, eventmgr, etc.).

 

More details about the command can be found here: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/4-x/troubleshooting/Cisco-APIC-Troubleshooting-Guide-42x/Cisco-APIC-Troubleshooting-Guide-42x_appendix_010101.html 
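If you prefer the REST API over the CLI, a minimal sketch along these lines can be used to check the health of each cluster member (this assumes the infraWiNode class, which the APIC uses to represent appliance-cluster members; the hostname and credentials are placeholders, and the attribute names are best verified against your APIC release):

# Minimal sketch: log in to the APIC REST API and print per-controller health.
# Host and credentials are placeholders; infraWiNode attribute names should be
# verified on your APIC release.
import requests

APIC = "https://apic1.example.com"   # placeholder hostname
session = requests.Session()

# Authenticate; aaaLogin returns a session cookie reused by later requests.
login = {"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}
session.post(APIC + "/api/aaaLogin.json", json=login, verify=False).raise_for_status()

# Class query for the appliance-cluster member objects.
resp = session.get(APIC + "/api/node/class/infraWiNode.json", verify=False)
resp.raise_for_status()

for obj in resp.json()["imdata"]:
    attrs = obj["infraWiNode"]["attributes"]
    print(attrs.get("nodeName"), attrs.get("addr"), attrs.get("health"), attrs.get("operSt"))

A healthy member is typically reported as "fully-fit"; anything else on the remaining controllers after a failure is a good hint that some shard replicas are still re-syncing.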

 


** Question 3

Up to 80 leaves can be controlled with 3 APICs, and up to 300 leaves with 5 APICs.

We are currently running more than 112 leaves with 5 APICs.

If 1 or 2 APICs are dead, can we still control more than 100 leaves?

I wonder what is lost and how the fabric behaves when one, two, or three APICs die.

 


Once again, regardless of how many APICs are down, data plane / user data forwarding will not be affected. The problems we will discuss further are related to the shards, the logical model (the config residing on the APICs), and APIC performance. When I say "down", I am referring to a hardware problem where the APIC needs to be replaced.

If 1 APIC goes down, I do not expect any problems.

If 2 APICs are down, again no shards will be lost; the only thing you might notice is that the remaining APICs respond a bit slower, but as long as you do not make any config changes you should be fine. If you are at the stage of 2 APICs down, you will in any case need to recover the APICs which are down.

If 3 APICs are down, two things will happen: the remaining APICs will be read-only, and there is a very high chance of losing some shards. Here you will need to contact TAC directly. Do not try to recover the APICs by yourself.

If 4 APICs are down, I am not quite sure whether this is recoverable. However, TAC should still be involved.

If 5 APICs are down... well, that is a scenario hard to imagine. If you reach this stage, you will need to recover the hard way. I explained the recovery and the different scenarios here: https://community.cisco.com/t5/application-centric/all-apic-controllers-in-fabric-go-down/td-p/4053569

 

Regards,

Sergiu

Gaurav Gambhir
Cisco Employee

One thing I would like to add on top of what Sergiu already shared here:

The Cold Standby APIC feature was introduced in the 2.2(x) release to cover exactly this type of scenario: if you lose 2 APICs at the same time due to lost connectivity to a pod which has 2 APICs, you can promote a standby APIC to an active APIC to bring the cluster back to a healthy state.

 

https://www.cisco.com/c/en/us/support/docs/cloud-systems-management/application-policy-infrastructure-controller-apic/215209-configure-standby-apic.html

 

