All APIC Controllers in Fabric Go Down

williammanurung
Level 1

Hi All,

 

I want to ask about a problem with APIC and how to resolve it.

Let's say I have three APIC controllers in the fabric and all of them go down.

1. What happens when all APIC controllers in the fabric go down?

2. How do I troubleshoot this problem?

3. If I have a configuration backup, can we just replace the APICs with new ones and import the backup onto the new APICs?

4. If I don't have a configuration backup, do I have to configure the new APICs manually?

As far as I know, traffic on the fabric is not impacted if all of the APICs go down, correct?

 

Thank you.

1 Accepted Solution

Sergiu.Daniluk
VIP Alumni

Hello,

 

Interesting scenario, although the probability of all 3 APICs failing is very low. Allow me to answer your queries:


1. What happens when all APIC controllers in the fabric go down?


The fabric (spines and leaves) will continue to forward traffic as if nothing happened. Of course, you will not be able to make any configuration changes, but from the perspective of the services, there will be no impact.

 


2. How do I troubleshoot this problem?


If the APICs went down just because of a reload, then once they are back up, the controllers will re-form the cluster as if nothing happened.

If all 3 APICs have failed (for example, a hardware failure) and are not coming up, then there is not much to troubleshoot other than replacing the APICs.

If, however, at least one of the APICs comes back alive, you can replace the other two APICs using the well-known procedure: https://www.cisco.com/c/en/us/support/docs/cloud-systems-management/application-policy-infrastructure-controller-apic/118918-technote-aci-00.html
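
If it helps, the cluster state after a reload can also be checked from a script. Below is a minimal Python sketch, not an official tool. It assumes you have already fetched the `infraWiNode` class from one controller over the REST API (for example with `GET /api/node/class/infraWiNode.json`) and that each entry carries `nodeName` and `health` attributes, with `fully-fit` as the healthy value. The class and attribute names come from my recollection of the APIC object model, so please verify them on your release. The function name and the sample data are made up for illustration.

```python
import json


def unhealthy_controllers(response_text):
    """Return (nodeName, health) for every entry that is not 'fully-fit'.

    Assumes the usual APIC REST envelope: {"imdata": [{"<class>": {"attributes": {...}}}]}.
    """
    data = json.loads(response_text)
    problems = []
    for item in data.get("imdata", []):
        attrs = item.get("infraWiNode", {}).get("attributes", {})
        health = attrs.get("health", "unknown")
        if health != "fully-fit":
            problems.append((attrs.get("nodeName", "?"), health))
    return problems


# Hypothetical sample response, trimmed to the two attributes used above.
sample = json.dumps({
    "imdata": [
        {"infraWiNode": {"attributes": {"nodeName": "apic1", "health": "fully-fit"}}},
        {"infraWiNode": {"attributes": {"nodeName": "apic2", "health": "data-layer-partially-diverged"}}},
        {"infraWiNode": {"attributes": {"nodeName": "apic3", "health": "fully-fit"}}},
    ]
})

print(unhealthy_controllers(sample))
```

An empty list would mean every controller in the response reports `fully-fit`; anything else is worth a closer look before you make changes.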

 


3. If I have a configuration backup, can we just replace the APICs with new ones and import the backup onto the new APICs?


If all 3 APICs are dead but you have a backup of the configuration, you are still in luck. You can recover the fabric with low impact. (Note: there will be some impact, so this needs to be done outside of production hours.)

Steps are detailed here: https://www.cisco.com/c/en/us/support/docs/cloud-systems-management/application-policy-infrastructure-controller-apic/118935-technote-aci-00.html

 


4. If I don't have a configuration backup, do I have to configure the new APICs manually?


This is a situation nobody wants to be in, so make sure you take backups every day :-)

In this case you need to reconfigure everything from scratch on the new APICs, then erase all fabric nodes and hope that not many servers are impacted when you join the nodes to the new fabric.

Note: in both situations 3 and 4, the fabric will continue to forward traffic until you run setup-clean-config.sh on each leaf/spine.
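
Since situations 3 and 4 both hinge on having a backup, here is a minimal Python sketch of building the request body for a one-time configuration export to a remote location. A snapshot kept only on the controllers would be lost together with them, which is why the sketch targets a remote location. Assumptions: the `configExportP` class with its `adminSt`, `format` and `snapshot` attributes, the `configRsRemotePath` child, and the `uni/fabric/configexp-<name>` DN are taken from my recollection of the APIC object model, so check them against the REST API documentation for your release. `build_export_payload`, the policy name and the remote location name are hypothetical. The sketch only builds the JSON; logging in and posting it to the APIC is left out.

```python
import json


def build_export_payload(policy_name, remote_location):
    """Build the body for a one-time config export to a remote location.

    Assumed object model: configExportP with a configRsRemotePath child
    that points at an already defined remote location.
    """
    if not policy_name or not remote_location:
        raise ValueError("policy_name and remote_location are required")
    return {
        "configExportP": {
            "attributes": {
                "dn": f"uni/fabric/configexp-{policy_name}",
                "name": policy_name,
                "format": "json",
                "adminSt": "triggered",   # assumed: run the export once, now
                "snapshot": "false",      # assumed: send to remote path, not a local snapshot
            },
            "children": [
                {"configRsRemotePath": {
                    "attributes": {"tnFileRemotePathName": remote_location}
                }}
            ],
        }
    }


# Hypothetical names, for illustration only.
payload = build_export_payload("daily-backup", "backup-server")
print(json.dumps(payload, indent=2))
```

In practice you would more likely attach a scheduler to the export policy in the GUI than trigger it from a script every day, but the payload shows which pieces a backup policy needs.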

 

Cheers,

Sergiu

2 Replies

Hi Sergiu,

 

Thanks for your great answer.

So, to make sure: in situations 3 and 4, we must run the clean setup on each leaf/spine one by one after the new APICs come up, right?

I think anyone would be in a big mess if all of the APICs go down :D

 

Best Regards,

 

William
