
Everything You Need to Know About ACI’s Common Pervasive Gateway.

Community Manager

By: Jody

 

Let’s start at the beginning…what is it?

 

Common Pervasive Gateway is an older feature that was used to connect multiple ACI fabrics together over an L2 connection before ACI MultiPod and ACI MultiSite were available. While most customers now use a newer feature such as ACI MultiPod or ACI MultiSite, a number of customers still use the Common Pervasive Gateway feature to support L2 connectivity between fabrics.

 

What will we be covering here?

 

We’ll be discussing the Common Pervasive Gateway feature in ACI, which has been available since 1.2(1i). This feature addresses the requirement to extend a Bridge Domain (BD) across multiple ACI Fabrics and provide the same default gateway to servers in each ACI Fabric.

 

A normal default gateway configured as an ACI BD subnet is called a pervasive gateway, which is why this feature is called “Common Pervasive Gateway.”

 

The cool thing about this feature is that it lets you move one or more virtual machines (VMs) or conventional hosts across different ACI Fabrics seamlessly, without any configuration changes or extra operations. It’s as if the VMs are moving within the same ACI Fabric.

 

Interested in checking out in-depth documentation? We’ve listed some out for you below!

 

 

What problem is solved by Common Pervasive Gateway?

 

Short answer: Common Pervasive Gateway resolves the problem of EndPoint (EP) learning on another ACI Fabric.

Image 1.jpg

However, here’s the detailed walkthrough of the problem:

 

※ pMAC : physical MAC

1) H1 ARPs for GW IP 192.168.0.254
    L1 responds to ARP with pMAC

2) H1 sends the packet with DMAC as pMAC and DIP as 192.168.1.2

3) L1 routes the packet and sends it to one of the Spines for proxy

4) Spine, being unable to find the entry for 192.168.1.2, sends glean packets

5) BL1 ARPs for 192.168.1.2 with pMAC as ARP sender MAC and 192.168.1.254 as ARP sender IP. BL2 doesn’t learn pMAC and 192.168.1.254 since ACI Fabric2 owns these as its own MAC and IP as well.

6) If 192.168.1.2 is already learned:
    The ARP request from BL1 is forwarded to L2, and L2 sends it to H2.
    If 192.168.1.2 is not learned yet:
    The ARP request from BL1 is sent to one of the Spines for proxy, and glean packets are sent for 192.168.1.2 so that it is learned as an EP.
    The next ARP request from BL1 is then forwarded to H2.
    If ARP flood is enabled:
    The ARP request from BL1 is flooded to H2.

7) H2 responds to ARP with DMAC as pMAC, DIP as 192.168.1.254, SMAC as H2MAC, SIP as 192.168.1.2

8) L2 receives the ARP reply and doesn’t forward it anywhere because the DMAC and DIP are L2’s own router MAC and IP.
    Hence ACI Fabric1 cannot learn H2.

 

SIDE NOTE: This may work if we configure a different pMAC on the BD in each ACI Fabric. However, that means having two different MACs for the same IP address, so a VM may keep using the wrong ARP entry after a vMotion from one Fabric to the other.

 

Your next question might be: “What if we configure different pMACs and different subnet IPs on each ACI Fabric?”

Well, this means that we would need to reconfigure the default gateway IP address on the VM whenever it moves between ACI Fabrics.

 

But lo and behold, this problem is resolved by common pervasive gateway. 

 

How does Common Pervasive Gateway even work?

 

We’re glad you asked!

 

Common pervasive gateway allows us to have a virtual MAC and a virtual IP that are common (i.e., the same) across multiple ACI Fabrics.

 

All devices in a Bridge Domain with the Common Pervasive Gateway feature are supposed to point at the Common Pervasive Gateway (virtual IP) as their default gateway.

 

However, each ACI Fabric is still required to have a non-virtual IP on the BD in addition to the virtual IP. This non-virtual IP should be in the same subnet as the virtual IP and should be unique to each ACI Fabric. This is similar to the HSRP physical IP and virtual IP configuration.

 

It is also required to have a unique physical MAC address (Custom MAC address in the APIC GUI) on the stretched BD in each ACI Fabric.

Image 2.jpg

Let’s dive into how Common Pervasive Gateway really works:

 

  1. H1 ARPs for GW IP 192.168.0.254 (vIP)
    L1 responds to ARP with vMAC
  2. H1 sends the packet with DMAC as vMAC and DIP as 192.168.1.2
  3. L1 routes the packet and sends it to one of the Spines for proxy
    When vMAC is configured on BD, a packet with DMAC set to pMAC1 won’t be routed since router-mac for the BD becomes vMAC.
  4. Spine, being unable to find the entry for 192.168.1.2, sends glean packets
  5. BL1 ARPs for 192.168.1.2 with pMAC1 as ARP sender MAC and 192.168.1.251 as ARP sender IP.
    -> Both sender MAC/IP are set to physical not virtual
    BL2 learns pMAC1 and 192.168.1.251.
  6. If 192.168.1.2 is already learned:
    ARP request from BL1 is forwarded to L2. And L2 sends it to H2.
    If 192.168.1.2 is not learned yet:
    ARP request from BL1 is sent to one of the Spines for proxy and glean packets are sent for 192.168.1.2 to learn 192.168.1.2 as EP.
    Then, next ARP request from BL1 is forwarded to H2.
    If ARP flood is enabled:
    ARP request from BL1 is flooded to H2
  7. H2 responds to ARP with DMAC as pMAC1, DIP as 192.168.1.251, SMAC as H2MAC, SIP as 192.168.1.2
    L2 learns H2MAC,IP(192.168.1.2) and updates COOP on Spines in ACI Fabric2.
  8. L2 bridges the packet to BL2
  9. BL2 bridges the packet to BL1
    BL1 learns H2MAC,IP(192.168.1.2) and updates COOP on Spines in ACI Fabric1.
  10. Subsequent packets from H1 are routed to ACI Fabric2 through BL1.
    BL1 sends them out on L2OUT with SMAC as pMAC1, DMAC as H2MAC
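
To make the difference between the two walkthroughs concrete, here is a minimal Python sketch of the decision L2 makes when H2's ARP reply arrives. It is a toy model, not how the leaf is actually implemented: the `LeafBD` class, the `handle_arp_reply` function, and Fabric2's non-virtual IP 192.168.1.252 are all assumptions made up for illustration (the walkthrough above only gives Fabric1's 192.168.1.251).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeafBD:
    """Simplified view of a BD on a leaf (illustrative model only)."""
    router_mac: str       # MAC the leaf treats as its own gateway MAC
    svi_ips: frozenset    # IPs the leaf owns on this BD


def handle_arp_reply(leaf: LeafBD, dmac: str, target_ip: str) -> str:
    """Return 'consumed' if the leaf keeps the ARP reply, else 'bridged'."""
    if dmac == leaf.router_mac or target_ip in leaf.svi_ips:
        return "consumed"
    return "bridged"


# Without Common Pervasive Gateway: both fabrics own pMAC and 192.168.1.254,
# so H2's reply to BL1 looks like it is addressed to L2 itself.
fabric2_plain = LeafBD("pMAC", frozenset({"192.168.1.254"}))
without_cpg = handle_arp_reply(fabric2_plain, dmac="pMAC", target_ip="192.168.1.254")

# With Common Pervasive Gateway: Fabric2's router MAC is the vMAC, and the
# reply targets Fabric1's unique pMAC1 / 192.168.1.251.
# 192.168.1.252 is a hypothetical non-virtual IP for Fabric2.
fabric2_cpg = LeafBD("vMAC", frozenset({"192.168.1.254", "192.168.1.252"}))
with_cpg = handle_arp_reply(fabric2_cpg, dmac="pMAC1", target_ip="192.168.1.251")

print("without CPG:", without_cpg)
print("with CPG:", with_cpg)
```

In this model the reply is only bridged back toward BL1 when neither its DMAC nor its target IP belongs to the receiving fabric, which is exactly what the unique pMAC and non-virtual IP guarantee.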

What are the Configuration Components?

 

  • Multiple ACI Fabrics
  • Bridge Domain (BD) in each ACI Fabric
  • Unique physical BD MAC (Custom MAC) for each ACI Fabric
  • Unique non-virtual IP for each ACI Fabric
  • Identical virtual MAC across ACI Fabrics
  • Identical virtual IP across ACI Fabrics
  • L2Out between the BDs on each ACI Fabric

Image 3.jpg
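
The uniqueness rules in this list are easy to get wrong when planning by hand, so here is a small Python sketch that checks a plan before anything is configured. The dictionary layout, the `validate_plan` function, the /24 prefix length, Fabric2's 192.168.1.252, and all MAC values are hypothetical placeholders chosen for illustration.

```python
import ipaddress


def validate_plan(fabrics, prefix_len=24):
    """Return a list of problems found in a per-fabric BD plan (empty if none)."""
    problems = []
    if len({f["vmac"].lower() for f in fabrics}) != 1:
        problems.append("virtual MAC must be identical across fabrics")
    if len({f["vip"] for f in fabrics}) != 1:
        problems.append("virtual IP must be identical across fabrics")
    if len({f["pmac"].lower() for f in fabrics}) != len(fabrics):
        problems.append("physical (custom) MAC must be unique per fabric")
    if len({f["ip"] for f in fabrics}) != len(fabrics):
        problems.append("non-virtual IP must be unique per fabric")
    for f in fabrics:
        # The non-virtual IP has to sit in the same subnet as the virtual IP.
        subnet = ipaddress.ip_interface(f"{f['vip']}/{prefix_len}").network
        if ipaddress.ip_address(f["ip"]) not in subnet:
            problems.append(f"{f['name']}: non-virtual IP is outside the virtual IP subnet")
    return problems


# Placeholder values: only 192.168.1.251 and 192.168.1.254 come from the walkthrough.
plan = [
    {"name": "Fabric1", "pmac": "02:00:00:00:00:01", "ip": "192.168.1.251",
     "vmac": "02:00:00:00:00:FE", "vip": "192.168.1.254"},
    {"name": "Fabric2", "pmac": "02:00:00:00:00:02", "ip": "192.168.1.252",
     "vmac": "02:00:00:00:00:FE", "vip": "192.168.1.254"},
]
print(validate_plan(plan))
```

An empty list means the plan satisfies the uniqueness and subnet rules above; it says nothing about the L2Out itself.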

     

Here's some clarification about terminology.

 

An L2Out in ACI means we have a Layer 2 network attached, which could contain other switches or different networks, whereas a normal EPG is basically supposed to have only EndPoints such as servers.

 

Technically, the term “L2Out” refers to an External Bridged Network, which is associated directly with a BD. However, attaching L2 networks via an EPG (L2EPG) can accomplish the same thing, which is why a lot of documents use “L2Out” to mean either configuration.

 

Note: L2Out via EPG is the most widely deployed version of the L2Out, and is the recommended configuration.

 

Cool. Got it. But what are some design prerequisites?

 

When using Common Pervasive Gateway, we have three requirements:

 

  1. The Bridge domain that is configured to communicate across ACI fabrics must be configured for flood mode.

    This means we need to set L2 Unknown Unicast to flood on the BD for Common Pervasive Gateway.

  2. Only one EPG from a bridge domain (if the BD has multiple EPGs) should be configured on a border Leaf on the port that is connected to the second Fabric.

    This means we should have only one L2Out for each BD with Common Pervasive Gateway. Any normal EPGs in the same BD cannot use the port used for the L2Out connection between the two ACI Fabrics.

  3. Do not connect hosts directly to an inter-connected Layer 2 network that enables a pervasive common gateway among the two ACI fabrics.

    This means the L2Out connection between two ACI Fabrics for Common Pervasive Gateway must be used only for L2Out connectivity between the two ACI Fabrics.
    For example, even if we have an external L2 switch between the two ACI Fabrics for this L2Out connectivity, we cannot connect any hosts to that switch in the same VLAN as the L2Out. Nor can we connect hosts to that switch on other VLANs for any EPGs in the same BD, per the second prerequisite above.
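
As a rough illustration of how the settings discussed so far fit together on one BD, here is a Python sketch that only builds a JSON-style body and prints it. Nothing is sent to an APIC. The class and attribute names (`fvBD`, `fvSubnet`, `mac`, `vmac`, `unkMacUcastAct`, `virtual`) reflect our understanding of the APIC object model and should be treated as assumptions to verify against the Management Information Model reference for your release. The BD name, the MAC values, and the `bd_payload` helper are hypothetical.

```python
import json


def bd_payload(name, pmac, vmac, vip_cidr, ip_cidr):
    """Build a JSON-style body for one BD (attribute names are assumptions)."""
    return {
        "fvBD": {
            "attributes": {
                "name": name,
                "mac": pmac,                 # unique physical (custom) MAC
                "vmac": vmac,                # virtual MAC shared by all fabrics
                "unkMacUcastAct": "flood",   # L2 Unknown Unicast set to flood
            },
            "children": [
                # Virtual IP: identical in every fabric, used as default gateway.
                {"fvSubnet": {"attributes": {"ip": vip_cidr, "virtual": "yes"}}},
                # Non-virtual IP: same subnet, unique to this fabric.
                {"fvSubnet": {"attributes": {"ip": ip_cidr, "virtual": "no"}}},
            ],
        }
    }


# Hypothetical values for Fabric1; only the two IP addresses come from the article.
fabric1_bd = bd_payload(
    name="BD1",
    pmac="02:00:00:00:00:01",
    vmac="02:00:00:00:00:FE",
    vip_cidr="192.168.1.254/24",
    ip_cidr="192.168.1.251/24",
)
print(json.dumps(fabric1_bd, indent=2))
```

The body for the second fabric would differ only in the physical MAC and the non-virtual subnet; the L2Out itself is not covered by this sketch.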

 

Here’s a picture of what is described above for you visual folks!

Image 4.jpg

And because common pervasive gateway is also a feature for extending a BD, each BD with common pervasive gateway requires its own L2Out.

 

Here’s a visual of the GUI:

Image 5.jpg

It’s Question & Answer time!

 

What is the purpose of virtual IP?

The BD subnet marked as virtual IP should be the default gateway for servers. This IP must be identical across the ACI Fabrics.

 

Also, if a BD subnet is configured as virtual IP, that IP address won’t be used as the source of ARP requests originated from the ACI Leaf BD SVI as long as another non-virtual IP exists in the same subnet. This ensures that BL1 in ACI Fabric1 in the previous scenario generates its ARP request to H2 with a sender MAC and IP that are unique to its ACI Fabric.
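
The sender IP rule described in this answer can be sketched in a few lines of Python. This is an illustrative model only: the function name, the tuple layout, and the choice of the first non-virtual address when several exist are assumptions, not documented leaf behavior.

```python
def pick_arp_sender_ip(bd_addresses):
    """Choose the ARP sender IP from (ip, is_virtual) pairs in one subnet.

    Prefers a non-virtual IP; falls back to the virtual IP only when the
    subnet has no non-virtual address at all.
    """
    non_virtual = [ip for ip, is_virtual in bd_addresses if not is_virtual]
    if non_virtual:
        return non_virtual[0]  # assumption: first non-virtual address wins
    return bd_addresses[0][0]


# Fabric1's BD from the walkthrough: virtual .254 plus non-virtual .251.
with_non_virtual = pick_arp_sender_ip([("192.168.1.254", True), ("192.168.1.251", False)])

# Only a virtual IP configured: the leaf has nothing else to use.
virtual_only = pick_arp_sender_ip([("192.168.1.254", True)])

print(with_non_virtual)
print(virtual_only)
```

The second case is the one to avoid: the ARP request would carry the shared virtual IP as sender, so the reply would be addressed to an IP that the other fabric also owns.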

 

What is the purpose of virtual MAC?

 

When virtual MAC is configured, it becomes the router-mac for the BD. Hence a packet with DMAC set to the physical MAC (custom MAC in GUI) is no longer routed on the BD but just bridged.

 

ARP requests for any of the SVI subnets on the BD are resolved with this virtual MAC. If they weren’t resolved with the virtual MAC, they could hit CSCux73998.

 

How is physical MAC (custom MAC in GUI) used after virtual MAC is configured?

 

When traffic is routed on BD and source MAC needs to be rewritten, physical MAC is used as a source MAC for the packet.

 

Also it is used as a source MAC when traffic is generated from BD SVI such as ping from BD SVI.

 

This applies to both virtual IP and non-virtual IP. So even when ping is sourced from virtual IP, physical MAC is used as source MAC.

 

How is a packet with DMAC set to physical MAC (custom MAC in GUI) processed on ACI Leaf after virtual MAC is configured?

 

It will be bridged as normal Layer 2 traffic. It cannot be routed on a BD with virtual MAC since the router-mac for the BD has already been replaced with the virtual MAC. Hence, as long as the ACI Leaf doesn’t learn the physical MAC from outside, the packet will be processed as unknown L2 unicast.
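
Here is a minimal Python sketch of that forwarding decision, assuming a simplified leaf that knows only its router-mac and a set of learned MACs. The function name and the returned labels are hypothetical and chosen for readability.

```python
def classify_dmac(dmac, router_mac, learned_macs):
    """Classify a frame by destination MAC on a BD (simplified model)."""
    if dmac == router_mac:
        return "routed"
    if dmac in learned_macs:
        return "bridged as known unicast"
    return "bridged as unknown unicast"


learned = {"H1MAC", "H2MAC"}

# Once the virtual MAC is configured, it is the router-mac for the BD.
to_vmac = classify_dmac("vMAC", router_mac="vMAC", learned_macs=learned)
to_pmac = classify_dmac("pMAC", router_mac="vMAC", learned_macs=learned)
to_host = classify_dmac("H2MAC", router_mac="vMAC", learned_macs=learned)

print(to_vmac)
print(to_pmac)
print(to_host)
```

In this model a frame sent to the physical MAC falls through to the unknown unicast case, which matches the answer above and explains why the BD must be in flood mode.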

 

Can we still ping to BD SVI with DMAC set to physical MAC (custom MAC in GUI) after virtual MAC is configured?

 

Yes. Judging from the Layer 2 forwarding table alone, it looks like such a packet should just be bridged and not processed by the CPU. However, due to the sup-tcam entry below, ICMP packets destined to the BD SVI (router-ip for the BD) are sup-redirected.

Image 6.JPG

Can we still use non-virtual IP as default gateway for servers when virtual MAC and virtual IP are configured?

 

Technically yes, but it is not recommended. It would work because ARP is always resolved with the vMAC, even for the non-virtual IP.

 

What if virtual MAC is configured without any virtual IP?

 

Same thing still happens as described in the question “What is the purpose of virtual MAC?”

 

However, ACI Leaf switches cannot tell which BD subnet is the one shared across ACI Fabrics. Hence the ARP request for H2 from BL1 in the previous scenario may be generated with the sender IP that is supposed to be the virtual IP, in other words, the one that is also configured as BD SVI on the other ACI Fabric. If that happens, ACI Fabric1 fails to resolve ARP for H2.

 

Why do we have to configure a non-virtual IP on top of the virtual IP in the same subnet?

 

If there were only a virtual IP, the target IP of the ARP reply from H2 in the previous scenario would be the virtual IP, because BL1 would have no choice but to generate the ARP request with the sender IP set to the virtual IP. Hence the ARP reply from H2 would be sup-redirected on L2 in ACI Fabric2 due to this sup-tcam entry.

Image 7.JPG

Why do we have to configure a unique non-virtual IP for each ACI Fabric?

 

Same reason as above.

 

Why do we have to configure a unique physical MAC (custom MAC in GUI) for each ACI Fabric?

 

Basically the same reason as for the non-virtual IP. If the DMAC of the ARP reply is the same as one of the router-macs, the reply is sup-redirected and not forwarded.

Image 8.JPG

All About Verification.

 

We need to know how to verify that the current software and hardware state is correct. We can do that by checking each of the processes described in the previous section.

 

Policy Manager (Logical Object)

Image 9.JPG

Policy Manager (Concrete Object)

Image 10.JPG

Policy Element (Concrete Object)

Image 11.JPG

SVIMgr

Image 12.JPG

IPMgr

Image 13.JPG

ARP

Image 14.JPG

ELTMC

Image 15.JPG

 

Image 16.JPG

Here are some additional troubleshooting tips.

 

As described in the FAQ section, the BD SVI always uses the physical BD MAC (custom MAC in GUI) as source MAC even though it replies to ARP requests with the virtual MAC.

 

Hence the expected dst/src MACs in packets through a BD with virtual MAC are as follows.

 

(that is, assuming both BDs are using virtual MAC)

 

ICMP echo:

H1(192.168.0.1) ---> [(192.168.0.254)BD---BD(192.168.1.254)] ---> H2(192.168.1.2)

H1 to BD: DMAC: vMAC, SMAC: H1MAC

BD to H2: DMAC: H2MAC, SMAC: pMAC

ICMP reply:

H1(192.168.0.1) <--- [(192.168.0.254)BD---BD(192.168.1.254)] <--- H2(192.168.1.2)

BD to H1: DMAC: H1MAC, SMAC: pMAC

H2 to BD: DMAC: vMAC, SMAC: H2MAC

 

As described in the FAQ section, the router-mac is replaced with the virtual MAC. So if the DMAC of an incoming packet is pMAC instead of vMAC, the packet won’t be routed but just handled as L2 unicast, and most likely flooded or proxied within the BD.

 

Oh and by the way…

 

Some devices, such as NetApp with its fast path feature, don’t do ARP when replying to packets but instead use the source MAC of the incoming packet as the destination MAC of the reply. Those devices may reply to the ACI Leaf with pMAC as the destination MAC, since pMAC is used as the source MAC for packets coming out of an ACI BD with virtual MAC. If that happens, the traffic won’t be routed.
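
A short Python sketch shows why this matters. It models only the host's choice of destination MAC for a reply; the function names and the `uses_fast_path` flag are hypothetical, and real fast path implementations are more involved than this.

```python
def reply_dmac(uses_fast_path, arp_cache, gateway_ip, request_smac):
    """Pick the destination MAC a host puts on its reply to a routed packet."""
    if uses_fast_path:
        # Fast path style: reuse the source MAC of the packet just received.
        return request_smac
    # Normal style: resolve the gateway through the ARP cache.
    return arp_cache[gateway_ip]


def is_routed(dmac, router_mac):
    """A leaf only routes frames addressed to the BD's router-mac."""
    return dmac == router_mac


# The gateway ARP entry resolves to vMAC, but routed packets arrive with SMAC pMAC.
arp_cache = {"192.168.1.254": "vMAC"}
normal_reply = reply_dmac(False, arp_cache, "192.168.1.254", request_smac="pMAC")
fast_reply = reply_dmac(True, arp_cache, "192.168.1.254", request_smac="pMAC")

print("normal host routed:", is_routed(normal_reply, router_mac="vMAC"))
print("fast path host routed:", is_routed(fast_reply, router_mac="vMAC"))
```

The host that follows its ARP cache sends to vMAC and gets routed, while the fast path host sends to pMAC and its reply is only bridged.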

 

Example Product & Feature

 

 That’s all for now folks. But don’t fret, the ACI blog series will continue right here on the ACI Board on Cisco Community.

 

In an effort to make sure we’re providing you with top-notch content that’s helpful and most fitting to where you are in your current journey, drop us a comment and let us know if this deep dive into the common pervasive gateway was helpful!

 

And while you’re at it, let us know what specific ACI topics you’d like to see addressed in this blog series.

 

A special thanks to Takuya Kishida for his contribution to this blog.

 
