
ACI Multi-Pods: Caveats and Considerations


By: Jody

 


Hey folks! Welcome to another blog where we tackle yet another ACI topic.


I was going to create a detailed configuration guide for Multi-Pod; however, after checking out the Cisco ACI Multi-Pod Configuration Whitepaper on CCO, I realized I would be duplicating effort at best.


The configuration whitepaper on CCO contains detailed configuration examples for the IPN and screenshots from the APIC. If you are looking to deploy Multi-Pod, this is what you will want to use!


So instead, I have worked to collect important caveats, design guidelines, and requirements that can serve as a one-stop shop for ACI Multi-Pod deployments.

 

Why Multi-Pod?


Before we dig in, let’s examine some of the benefits that come with deploying ACI Multi-Pod.

 


  • ACI Multi-Pod allows for fault isolation of Control Plane Protocols inside of each respective Pod (IS-IS and COOP protocols operate intra-Pod). This is important because it ensures that a routing or protocol issue in one Pod will not take down the entire fabric.
  • Centralized Management through one APIC Cluster (a cluster of 3, 5, or 7).
  • Eliminates fate sharing if one DC goes down
  • End-to-end Policy Enforcement
  • Pervasive GW across all leafs in all Pods

 

Here are some hardware and software requirements

 

APIC Software Requirement

 

  • 2.0(1) on APIC
  • 12.0(1) on Leafs and Spines
  • Ability to change MTU for Inter-Pod traffic (CPU generated traffic) – available starting with 2.2(2e)
  • 50msec latency across Pods – Supported starting 2.3(1)

APIC Hardware Requirement

 

  • Any APIC Model
  • All Nexus 9000 Platforms for Fabric nodes

 

Here are IPN node options


While the list below is not all-inclusive, it gives a quick reference of platforms that can act as Inter-Pod Network (IPN) devices.

 

  • Nexus 9200, 9300-EX or later (not first generation 9300/9500).
  • Nexus 7000
  • ASR9k
  • ASR1k

 

Regardless of which one you go with, the IPN device must support the following:

 

  • OSPF
  • PIM Bi-Dir (will handle unknown unicast, multicast, and broadcast traffic)
  • Jumbo MTU (will allow for VXLAN-encapsulated traffic between Pods)
  • DHCP Relay (will allow for Spine and Leaf nodes to be discovered across the IPN)
  • QOS (will support the prioritization of APIC-to-APIC communication over the IPN)
  • Support for 40/100G interfaces (or 10Gig with QSA adapters on the Spine)
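To make these requirements concrete, here is a rough NX-OS-style sketch of an IPN interface facing a Pod's spines. All names, addresses, and the RP design are placeholder assumptions on my part, not a validated configuration; note that Multi-Pod traffic rides a dot1q VLAN 4 sub-interface toward the spines:

```
feature ospf
feature pim
feature dhcp

! Phantom bidir RP covering the underlay multicast range used for BUM traffic
ip pim rp-address 192.168.100.1 group-list 225.0.0.0/15 bidir

interface Ethernet1/1.4
  description To Pod1 Spine (Multi-Pod uses a dot1q VLAN 4 sub-interface)
  encapsulation dot1q 4
  mtu 9216                         ! jumbo MTU for VXLAN-encapsulated inter-Pod traffic
  ip address 192.168.1.1/30
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
  ip dhcp relay address 10.0.0.1   ! APIC, so remote nodes can be discovered across the IPN
  no shutdown
```

The full, validated version of this configuration (including QOS and the RP redundancy design) is in the configuration whitepaper on CCO mentioned above.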


Here are the Multi-Pod scalability numbers

 

  • 2.2(x) – 10 pods / 200 leafs per pod / No more than 400 leafs per fabric
  • 3.2(x) – 12 pods / 200 leafs per pod / No more than 400 leafs per fabric


Multi-Site vs Multi-Pod from an Availability Zone / Region perspective


First, some definitions (courtesy of AWS). If you’d like to know more about AZs/Regions, check out the AWS documentation.


Regions: Geographically separated areas, each of which can be comprised of multiple availability zones.


Availability Zones: Locations within a Region that are designed to be isolated fault domains which provide low latency network connectivity to other Availability zones in the same Region.


Note: What you read below is not a mandate for every use case of a Multi-Pod or Multi-Site design; rather, it is a guideline you can use when deciding which is more appropriate for your overall design.


When it comes to mapping the Availability Zone / Region concepts to ACI designs, here are 5 things to keep in mind:

 

  • Each “Fabric”, which is made up of multiple Pods, is considered a Single Availability Zone.
  • Configuration Zones (an ACI concept) can span certain switches in your fabric and can be used to map a configuration zone to an availability zone (applies to configuration and policy only).
  • “Multiple Fabrics (whether a single Pod or Multi-Pod) connected by Multi-Site” is considered a Region with multiple Availability Zones.
  • The original use case for Multi-Pod is to have multiple “pods” of ACI at a customer location, linked together via Multi-Pod (i.e., AZ in one Region)
  • Multi-Site allows you to manage and connect multiple sites (Several Fabrics) via the Multi-Site Orchestrator (MSO)*

 

* The ability to connect Multi-Pod Fabrics with Multi-Site became available beginning with APIC 3.2 code.


Check out this Multi-Pod vs Multi-Site comparison at a glance.


Here are 4 Guidelines and Limitations:

 

  1. Spine switches must have an active Leaf-facing link (LLDP up), otherwise the Spine is deemed unused and cannot be used by the fabric.
  2. LLDP must be enabled on the IPN switch.
  3. At least one Spine must be configured with a BGP-EVPN session for peering with remote Pod(s). For production environments, you should have redundant connections.
  4. Up to 50msec RTT between Pod(s) is supported (this support was added beginning with the 2.3 software release).


Here are 2 Caveats:

 

1) Traffic loss for Multi-Pod traffic after a Spine reboot (CSCvd75131)

 

In a Multi-Pod setup, when a Spine switch is rebooted, traffic should not flow through the switch for 10 minutes after its routing protocol adjacencies come up. This is achieved by advertising a high metric into OSPF and IS-IS for 10 minutes. This works fine for OSPF, but not for IS-IS. The problem is that within the Cisco APIC, the default metric for routes redistributed into IS-IS (such as the remote Pod TEP pool) is 63.

 

This also happens to be the max metric that can be set for redistributed routes into IS-IS in Cisco APIC. So when the spine that was just rebooted comes back up, it cannot advertise a worse metric than the default, which causes all of the leaf switches to immediately install the rebooted spine as an ECMP path for the route, even though things such as COOP might not have converged yet. If traffic is hashed to this ECMP path, there could be traffic loss for several seconds.

 

This issue can be seen in any Multi-Pod environment. Beginning with APIC 3.0 code, users have a way to change the default IS-IS metric to prevent this issue.

 

To change the default ISIS metric for redistributed routes, go to Fabric > Fabric Policies > Pod Policies > ISIS Policy default, and change the value from 63 to 32.
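If you prefer the CLI, you can check the current value from the APIC shell with moquery. (The class name isisDomPol and attribute redistribMetric are from memory here, so treat them as assumptions and verify on your release.)

```
apic1# moquery -c isisDomPol | grep redistribMetric
redistribMetric      : 63
```

If you still see 63 after applying the policy change, re-check that you edited the default ISIS Policy and not a copy.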

 


2) Spines dropping COS6 Inter-Pod traffic (CSCva15076)


ACI Spines will drop Inter-Pod traffic (traffic coming across the IPN) when it is marked CS6, except iTraceroute. The bigger problem with this is that several Cisco platforms (Nexus 7000) will send control plane protocol traffic (i.e. STP, BGP, OSPF, etc.) marked as CS6 (CoS 6), and this cannot be changed on those platforms.


If this traffic is dropped, it can result in the drop of BUM traffic between ACI Pods.


As a workaround, change the QOS settings and re-mark COS6 traffic to COS4. If you need a detailed configuration guide to walk you through setting up QOS for Multi-Pod, check out the ACI Multi-Pod QOS guide.
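As an illustration only, the re-mark on an NX-OS-based IPN node might look like the following sketch. The class and policy names are made up, and the exact match criteria depend on your platform; use the referenced QOS guide for a validated configuration:

```
class-map type qos match-any MATCH-CS6
  match dscp 48                    ! CS6

policy-map type qos REMARK-CS6-TO-CS4
  class MATCH-CS6
    set dscp 32                    ! CS4, so the ACI Spines will not drop it

interface Ethernet1/1.4
  service-policy type qos input REMARK-CS6-TO-CS4
```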


Design Considerations for APICs


When it comes to deploying APICs in an ACI Multi-Pod Design, there are several ways to spread out your APICs across the various Pods, depending on how many Pods you intend to connect together. Cisco has a lot of good information to consider in their Multi-Pod Whitepaper on CCO.


If you are just going with (2) Pods, which is very prevalent in DR/BC deployments, my rule of thumb is (2) APICs in Pod1, (1) APIC in Pod2, and a Standby APIC in both Pod1/2. My main reason for this is that you will always have a full copy of the APIC Shards in both Pods.

 

 

  • You are always assured that your APIC Shards are replicated between APIC1/2/3. This means if Pod1 goes down, you will have a full replica available on APIC3 in Pod2.
  • If Pod1 were to go down, you can promote the Standby APIC in Pod2 to active and be ready to resume configuration (if needed).
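To see why this layout guarantees a full copy in each Pod, consider a toy model: APIC data is partitioned into shards, and each shard is replicated to three controllers. With only three active APICs, every shard necessarily lands on all of them, so the single APIC in Pod2 retains a complete (though read-only, minority) copy if Pod1 is lost. The sketch below is purely illustrative; the real shard count and placement logic are internal to the APIC:

```python
# Toy model of APIC shard replication (illustrative only; the real shard
# count and placement algorithm are internal to the APIC).
REPLICAS_PER_SHARD = 3

def shard_placement(num_shards: int, apics: list[str]) -> dict[int, list[str]]:
    """Place each shard's replicas round-robin across the APIC cluster."""
    placement = {}
    for shard in range(num_shards):
        placement[shard] = [apics[(shard + i) % len(apics)]
                            for i in range(min(REPLICAS_PER_SHARD, len(apics)))]
    return placement

apics = ["APIC1-Pod1", "APIC2-Pod1", "APIC3-Pod2"]
placement = shard_placement(32, apics)

# With a 3-node cluster, every shard has a replica on every APIC,
# so Pod2's lone APIC still holds a full copy if Pod1 is lost.
assert all("APIC3-Pod2" in replicas for replicas in placement.values())
```

With larger clusters (5 or 7 APICs), this is no longer true: each shard lives on only 3 of the controllers, which is why APIC placement across Pods matters more as the cluster grows.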


Design Considerations with Firewalls


Deploying a pair of Active/Standby firewalls across Pods in an ACI Multi-Pod Fabric is a common requirement from most customers. However, please be aware that Active/Standby FW connectivity was broken with older versions of APIC code.

 

Please review the options below and ensure you select the appropriate solution, depending on your needs.


Option 1 – Active/Standby FW with NO use of vPC

 

  • Active and Standby FWs are connected to a single Leaf node with a physical link or a local port-channel.
  • The “bounce” of traffic between BorderLeaf in Pod2 and BorderLeaf in Pod1 will work.
  • This is supported in the 2.1 release starting with 2.1(2e) and in the 2.2 release starting with 2.2(2e) on ACI Leaf nodes (E, EX, FX)


Option 2 – Active/Standby FW with vPC

 

  • Active and Standby FWs can be connected to respective Leaf pairs in each Pod.
  • This is supported in the 2.3 release starting with 2.3(1) on EX/FX Leaf switches ONLY. This option is not supported on first generation ACI Leaf switches due to CSCux29124.

Pointer about troubleshooting


APIC in Pod1 cannot ssh to spine/leaf in Pod2


APICs are not VRF aware; because of this, if you are trying to SSH from your APIC to Leafs or Spines in other Pods, you will have to specify the interface or use the attach command.


Using the “attach” command will automatically source the correct interface to allow for SSH.


coast-apic1# acidiag fnvread
ID Pod ID Name Serial Number IP Address Role State LastUpdMsgId
--------------------------------------------------------------------------
101 1 Spine1 FOX2123PLDK 10.0.32.65/32 spine active 0
102 2 Spine2 FOX2123PLD2 10.1.152.64/32 spine active 0
201 1 Leaf201 FDO21250W9K 10.0.32.67/32 leaf active 0
202 1 Leaf202 FDO21260TFC 10.0.32.66/32 leaf active 0
203 1 Leaf203 FDO21242YD1 10.0.32.68/32 leaf active 0
204 1 Leaf204 FDO21260TBZ 10.0.32.64/32 leaf active 0
205 2 Leaf205 FDO21250W1Y 10.1.152.66/32 leaf active 0
206 2 Leaf206 FDO21253CGH 10.1.152.65/32 leaf active 0

Total 8 nodes

coast-apic1# attach Leaf206
This command is being deprecated on APIC controller, please use NXOS-style equivalent command
# Executing command: ssh Leaf206 -b 10.0.0.1
Warning: Permanently added 'leaf206,10.1.152.65' (RSA) to the list of known hosts.

Password:
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
 
Specifying the interface in your SSH command
 
coast-apic1# ssh admin@10.1.152.65 -b 10.0.0.1
Password:
Last login: Mon Jul 23 11:24:48 2018 from 10.0.0.1
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac

 

That’s all for now folks. But don’t fret, the ACI blog series will continue right here on the ACI Board on Cisco Community.


In an effort to make sure we’re providing you with top-notch content that’s helpful and most fitting to where you are in your current journey, drop us a comment and let us know if the caveats and considerations we provided for ACI Multi-Pod were helpful!


And while you’re at it, let us know what specific ACI topics you’d like to see addressed in this blog series.

 

 

5 Comments
Beginner

Very well explained, and a lot of thanks.

Also, what about active/active FW across a Multi-Pod fabric?

I'm also interested in Multi-Site fabric.

Community Manager

Hi Shehab,

 

Thanks so much for sharing the feedback! It's great to hear that you found this helpful!

 

I'm in the process of gathering some additional information regarding active/active FW across Multi-Pod fabric as well as Multi-Site fabric for you to review. I'll be in touch as soon as possible.

 

Thanks so much.

 

Ashley

Beginner

Thanks Ashley, looking forward to your helpful information.

Community Manager

Hi Shehab,

 

I hope that you're doing well - and that you're continuing to have a great week!

 

Below are links to some additional resources regarding active/active FW across multi-pod fabric as well as multi-site fabric for you to check out:

 

Cisco ACI MultiPod and Service Node Integration WP

 

ACI MultiPod Whitepaper

 

For MultiSite

 

I hope this helps! But don't hesitate to let me know if you have any other questions!

 

Thanks so much!

 

Ashley

Beginner

Thanks a lot Ashley,

 

Will check it out and get back to you.
