Let’s start at the beginning…what is it?
Common Pervasive Gateway is an older feature that was used to connect multiple ACI fabrics together via an L2 connection prior to the availability of ACI MultiPod and ACI MultiSite. While most customers will use a newer feature such as ACI MultiPod or ACI MultiSite, there are still a number of customers who use the Common Pervasive Gateway feature to support L2 connectivity between fabrics.
What will we be covering here?
We’ll be discussing the Common Pervasive Gateway feature in ACI, which has been available since release 1.2(1i). This feature addresses the requirement to extend a Bridge Domain (BD) across multiple ACI Fabrics so that servers in every ACI Fabric share the same default gateway.
A normal default gateway configured as an ACI BD subnet is called a pervasive gateway, which is why this feature is called “Common Pervasive Gateway.”
The cool thing about this feature is that it lets you move one or more virtual machines (VMs) or conventional hosts across different ACI Fabrics seamlessly, without any configuration changes or extra operations. It’s as if the VMs were moving within the same ACI Fabric.
Interested in checking out in-depth documentation? We’ve listed some out for you below!
What problem is solved by Common Pervasive Gateway?
Short answer: EndPoint (EP) learning on another ACI Fabric is what Common Pervasive Gateway resolves.
However, here’s the detailed walkthrough of the problem:
※ pMAC : physical MAC
1) H1 ARPs for GW IP 192.168.0.254
L1 responds to ARP with pMAC
2) H1 sends the packet with DMAC as pMAC and DIP as 192.168.1.2
3) L1 routes the packet and sends it to one of the Spines for proxy
4) Spine, being unable to find the entry for 192.168.1.2, sends glean packets
5) BL1 ARPs for 192.168.1.2 with pMAC as ARP sender MAC and 192.168.1.254 as ARP sender IP. BL2 doesn’t learn pMAC and 192.168.1.254 since ACI Fabric2 owns these as its own MAC and IP as well.
6) If 192.168.1.2 is already learned:
The ARP request from BL1 is forwarded to L2, and L2 sends it to H2.
If 192.168.1.2 is not learned yet:
The ARP request from BL1 is sent to one of the Spines for proxy, and glean packets are sent for 192.168.1.2 so that it is learned as an EP.
The next ARP request from BL1 is then forwarded to H2.
If ARP flood is enabled:
ARP request from BL1 is flooded to H2
7) H2 responds to ARP with DMAC as pMAC, DIP as 192.168.1.254, SMAC as H2MAC, SIP as 192.168.1.2
8) L2 receives the ARP reply and doesn’t forward it anywhere, because the DMAC and DIP are L2’s own router MAC and IP.
Hence ACI Fabric1 cannot learn H2.
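For those who like to see the logic spelled out, here is a minimal Python sketch of step 8. It is a simplified model, not how the leaf is actually implemented: the class names, the `handle_arp_reply` function, and the MAC values are all hypothetical, and the rule "keep the reply if either the DMAC or the target IP belongs to the local BD SVI" is our reading of the behavior described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpReply:
    dmac: str        # destination MAC of the Ethernet frame
    target_ip: str   # IP address the reply is addressed to
    sender_mac: str
    sender_ip: str


@dataclass(frozen=True)
class BdSvi:
    router_mac: str
    router_ips: frozenset


def handle_arp_reply(svi: BdSvi, reply: ArpReply) -> str:
    """Simplified model: the leaf keeps any reply addressed to its own SVI."""
    if reply.dmac == svi.router_mac or reply.target_ip in svi.router_ips:
        return "consumed locally"
    return "forwarded"


# Illustrative values only: both fabrics share the same pMAC and gateway IP.
PMAC = "00:22:bd:f8:19:ff"
fabric2_svi = BdSvi(router_mac=PMAC, router_ips=frozenset({"192.168.1.254"}))

# H2 answers the ARP request that BL1 sourced from pMAC / 192.168.1.254.
reply_from_h2 = ArpReply(dmac=PMAC, target_ip="192.168.1.254",
                         sender_mac="00:00:00:00:00:02", sender_ip="192.168.1.2")

# A reply addressed to a MAC and IP that Fabric2 does not own.
reply_to_unique = ArpReply(dmac="00:22:bd:f8:19:f1", target_ip="192.168.1.251",
                           sender_mac="00:00:00:00:00:02", sender_ip="192.168.1.2")

print(handle_arp_reply(fabric2_svi, reply_from_h2))    # consumed locally
print(handle_arp_reply(fabric2_svi, reply_to_unique))  # forwarded
```

The first reply never leaves ACI Fabric2, which is exactly why ACI Fabric1 cannot learn H2 in this scenario.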
SIDE NOTE: This may work if we configure a different pMAC on the BD in each ACI Fabric. However, that means having two different MACs for the same IP address, so a VM may keep using the wrong ARP entry after a vMotion from one Fabric to another.
Your next question might be: “What if we configure different pMACs and different subnet IPs on each ACI Fabric?”
Well, this means that we would need to reconfigure the default gateway IP address on the VM whenever it moves between ACI Fabrics.
But lo and behold, this problem is resolved by common pervasive gateway.
How does Common Pervasive Gateway even work?
We’re glad you asked!
Common pervasive gateway allows us to have a virtual MAC and virtual IP which is common (i.e., the same) across multiple ACI Fabrics.
All devices in a Bridge Domain with the Common Pervasive Gateway feature are supposed to point at the Common Pervasive Gateway (virtual IP) as their default gateway.
However, it is still required to have a non-virtual IP on the BD in each of the ACI Fabrics on top of the virtual IP. This non-virtual IP should be in the same subnet as the virtual IP and should be unique for each of the ACI Fabrics. This is similar to an HSRP physical IP and virtual IP configuration.
It is also required to have a unique physical MAC address (Custom MAC address in the APIC GUI) on the stretched BD in each ACI Fabric.
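If you want to sanity-check a planned design against these rules, a small script can help. The sketch below is only an illustration: the `FabricBd` class, the `check_common_pervasive_gateway` function, and every address in it are made up for this example and are not part of any Cisco tool or API.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FabricBd:
    fabric: str
    virtual_mac: str
    virtual_ip: str    # CIDR notation, e.g. "192.168.1.254/24"
    physical_mac: str  # "Custom MAC address" in the APIC GUI
    physical_ip: str   # non-virtual IP, CIDR notation


def check_common_pervasive_gateway(bds):
    """Return a list of rule violations; an empty list means the plan looks consistent."""
    problems = []
    if len({bd.virtual_mac.lower() for bd in bds}) != 1:
        problems.append("virtual MAC must be identical on every fabric")
    if len({ipaddress.ip_interface(bd.virtual_ip) for bd in bds}) != 1:
        problems.append("virtual IP must be identical on every fabric")
    if len({bd.physical_mac.lower() for bd in bds}) != len(bds):
        problems.append("physical (custom) MAC must be unique per fabric")
    if len({ipaddress.ip_interface(bd.physical_ip).ip for bd in bds}) != len(bds):
        problems.append("non-virtual IP must be unique per fabric")
    for bd in bds:
        virtual_net = ipaddress.ip_interface(bd.virtual_ip).network
        physical_net = ipaddress.ip_interface(bd.physical_ip).network
        if virtual_net != physical_net:
            problems.append(f"{bd.fabric}: non-virtual IP is not in the virtual IP subnet")
    return problems


# Illustrative addressing plan for two fabrics.
fabric1 = FabricBd("Fabric1", "00:22:bd:f8:19:fe", "192.168.1.254/24",
                   "00:22:bd:f8:19:f1", "192.168.1.251/24")
fabric2 = FabricBd("Fabric2", "00:22:bd:f8:19:fe", "192.168.1.254/24",
                   "00:22:bd:f8:19:f2", "192.168.1.252/24")

print(check_common_pervasive_gateway([fabric1, fabric2]))  # []

# Reusing Fabric1's physical MAC on Fabric2 breaks the uniqueness rule.
broken = replace(fabric2, physical_mac=fabric1.physical_mac)
print(check_common_pervasive_gateway([fabric1, broken]))
```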
Let’s dive into how Common Pervasive Gateway really works:
What are the Configuration Components?
Here's some clarification about terminology.
L2Out in ACI means we have a Layer 2 Network, which could contain other switches or different networks, while a normal EPG is basically supposed to have only EndPoints such as servers.
The term “L2Out” technically refers to an External Bridged Network, which is associated directly with the BD. However, attaching L2 networks via an EPG (L2EPG) can accomplish the same thing, which is why a lot of documents use “L2Out” to mean either configuration.
Note: L2Out via EPG is the most widely deployed version of the L2Out, and is the recommended configuration.
Cool. Got it. But what are some design prerequisites?
When using Common Pervasive Gateway, we have three requirements:
Here’s a picture of what is described above for you visual folks!
And because the common pervasive gateway is also a feature that extends a BD, each BD with common pervasive gateway requires its own L2Out.
Here’s a visual of the GUI:
It’s Question & Answer time!
What is the purpose of virtual IP?
The BD subnet marked as virtual IP should be the default gateway for servers. This IP must be identical across multiple ACI Fabrics.
Also, if a BD subnet is configured as a virtual IP, that IP address won’t be used as the source of ARP requests originated from the ACI Leaf BD SVI as long as another non-virtual IP exists in the same subnet. This makes sure that BL1 in ACI Fabric1 in the previous scenario generates its ARP request to H2 with a sender MAC and IP that are unique to its ACI Fabric.
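The sender IP selection described above can be sketched in a few lines of Python. This is a rough model under our own assumptions, not the leaf's real code: `BdSubnet`, `pick_arp_sender_ip`, and the addresses are hypothetical names and values chosen for illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BdSubnet:
    address: str   # gateway address in CIDR notation
    virtual: bool  # True when the subnet is marked as virtual IP


def pick_arp_sender_ip(subnets: List[BdSubnet], target_ip: str) -> Optional[str]:
    """Prefer a non-virtual gateway IP in the target's subnet as the ARP sender IP."""
    target = ipaddress.ip_address(target_ip)
    candidates = [s for s in subnets
                  if target in ipaddress.ip_interface(s.address).network]
    non_virtual = [s for s in candidates if not s.virtual]
    chosen = non_virtual or candidates  # fall back to the virtual IP if nothing else exists
    if not chosen:
        return None
    return str(ipaddress.ip_interface(chosen[0].address).ip)


# Illustrative BD on ACI Fabric1: shared virtual IP plus a fabric-unique IP.
fabric1_subnets = [
    BdSubnet("192.168.1.254/24", virtual=True),
    BdSubnet("192.168.1.251/24", virtual=False),
]

print(pick_arp_sender_ip(fabric1_subnets, "192.168.1.2"))      # 192.168.1.251
print(pick_arp_sender_ip(fabric1_subnets[:1], "192.168.1.2"))  # 192.168.1.254
```

The second call shows the fallback case: with only a virtual IP configured, the leaf has nothing else to source the ARP request from.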
What is the purpose of virtual MAC?
When virtual MAC is configured, it becomes router-mac for the BD. Hence the packet with DMAC set to physical MAC (custom MAC in GUI) is no longer routed on the BD but just bridged.
ARP requests for any of the SVI subnets on the BD are resolved with this virtual MAC. If they weren’t resolved with the virtual MAC, they might hit CSCux73998.
How is physical MAC (custom MAC in GUI) used after virtual MAC is configured?
When traffic is routed on BD and source MAC needs to be rewritten, physical MAC is used as a source MAC for the packet.
Also it is used as a source MAC when traffic is generated from BD SVI such as ping from BD SVI.
This applies to both virtual IP and non-virtual IP. So even when ping is sourced from virtual IP, physical MAC is used as source MAC.
How is a packet with DMAC set to physical MAC (custom MAC in GUI) processed on ACI Leaf after virtual MAC is configured?
It will be bridged as normal Layer2 traffic. It cannot be routed on BD with virtual MAC since router-mac for the BD is already replaced with virtual MAC. Hence as long as ACI Leaf doesn’t learn the physical MAC from outside, it will be processed as unknown L2 unicast.
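Here is a small Python sketch of that forwarding decision. Treat it as a mental model only: the function names and MAC values are hypothetical, and real hardware lookups involve far more than a single comparison.

```python
from typing import Optional, Set


def router_mac(physical_mac: str, virtual_mac: Optional[str]) -> str:
    """Once a virtual MAC is configured, it replaces the physical MAC as router-mac."""
    return (virtual_mac or physical_mac).lower()


def classify_frame(dmac: str, physical_mac: str, virtual_mac: Optional[str],
                   learned_macs: Set[str]) -> str:
    """Decide how the leaf treats a frame; learned_macs holds lowercase MAC strings."""
    dmac = dmac.lower()
    if dmac == router_mac(physical_mac, virtual_mac):
        return "routed"
    if dmac in learned_macs:
        return "bridged (known unicast)"
    return "bridged (unknown unicast)"


# Illustrative values only.
PMAC = "00:22:bd:f8:19:f1"
VMAC = "00:22:bd:f8:19:fe"

print(classify_frame(PMAC, PMAC, None, set()))  # routed
print(classify_frame(PMAC, PMAC, VMAC, set()))  # bridged (unknown unicast)
print(classify_frame(VMAC, PMAC, VMAC, set()))  # routed
```

Before the virtual MAC exists, a frame sent to the pMAC is routed. Afterward the same frame is only bridged, and only frames sent to the vMAC are routed.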
Can we still ping to BD SVI with DMAC set to physical MAC (custom MAC in GUI) after virtual MAC is configured?
Yes. Given Layer2 forwarding table, it looks like a packet should not be processed in CPU but just bridged. However, due to sup-tcam entry below, ICMP packets destined to BD SVI (router-ip for BD) are sup-redirected.
Can we still use non-virtual IP as default gateway for servers when virtual MAC and virtual IP are configured?
Technically yes but not recommended. It would work because ARP is always resolved with VMAC even for non-virtual IP.
What if virtual MAC is configured without any virtual IP?
Same thing still happens as described in the question “What is the purpose of virtual MAC?”
However, ACI Leaf switches cannot tell which BD subnet is the one shared across ACI Fabrics. Hence the ARP request for H2 from BL1 in the previous scenario may be generated with a sender IP that is supposed to be the virtual IP, in other words, one that is also configured as a BD SVI on the other ACI Fabric. If that happens, ACI Fabric1 fails to resolve ARP for H2.
Why do we have to configure non-virtual IP on top of virtual IP in the same subnet?
If there was only virtual IP, target IP for ARP reply from H2 in previous scenario would be virtual IP because BL1 had no choice other than generating ARP with sender IP set to virtual IP. Hence ARP reply from H2 would be sup-redirected on L2 in ACI Fabric2 due to this sup-tcam entry.
Why do we have to configure a unique non-virtual IP for each ACI Fabric?
Same reason as above.
Why do we have to configure a unique physical MAC (custom MAC in GUI) for each ACI Fabric?
Basically same reason as non-virtual IP. If DMAC of ARP reply is same as one of router-mac, it is sup-redirected and not forwarded.
All About Verification.
We need to know how to verify that the current software and hardware state is correct. We can do that by checking each of the processes described in the previous section.
Policy Manager (Logical Object)
Policy Manager (Concrete Object)
Policy Element (Concrete Object)
Here are some additional troubleshooting tips.
As described in FAQ section, BD SVI always uses physical BD MAC (custom MAC in GUI) as source MAC even though it replies to ARP request with virtual MAC.
Hence the expected dst/src MAC in packets through BD with virtual MAC are as follows.
(that is, assuming both BDs are using virtual MAC)
H1(192.168.0.1) ---> [(192.168.0.254)BD---BD(192.168.1.254)] ---> H2(192.168.1.2)
  H1 to BD: DMAC: VMAC, SMAC: H1MAC
  BD to H2: DMAC: H2MAC, SMAC: pMAC
H1(192.168.0.1) <--- [(192.168.0.254)BD---BD(192.168.1.254)] <--- H2(192.168.1.2)
  BD to H1: DMAC: H1MAC, SMAC: pMAC
  H2 to BD: DMAC: VMAC, SMAC: H2MAC
As described in the FAQ section, the router-mac is replaced with the virtual MAC. So if the DMAC of an incoming packet is the pMAC instead of the vMAC, the packet won’t be routed; it is handled as L2 unicast and most likely flooded or proxied within the BD.
Oh and by the way…
There are some devices, such as NetApp with its fast path feature, that don’t ARP when replying to packets but instead use the source MAC of the incoming packet as the destination MAC of the reply. Those devices may reply to the ACI Leaf with the pMAC as the destination MAC, since the pMAC is used as the source for packets leaving an ACI BD with a virtual MAC. If that happens, the traffic won’t be routed.
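To make the difference concrete, here is a short Python sketch that compares a host that ARPs for its gateway with a host that reuses the source MAC of the packet it received. Everything here is hypothetical and simplified: the `Frame` class, the helper functions, and the MAC values are ours, and the "fast path" behavior is modeled only as described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    dmac: str
    smac: str


# Illustrative values only.
VMAC = "00:22:bd:f8:19:fe"
PMAC = "00:22:bd:f8:19:f1"
HOST_MAC = "00:00:00:00:00:02"


def leaf_egress(host_mac: str) -> Frame:
    """Routed traffic leaves the BD with the physical MAC as source MAC."""
    return Frame(dmac=host_mac, smac=PMAC)


def host_reply(received: Frame, arp_gateway_mac: str, fast_path: bool) -> Frame:
    """A fast-path host reuses the received SMAC; other hosts use their ARP entry."""
    dmac = received.smac if fast_path else arp_gateway_mac
    return Frame(dmac=dmac, smac=received.dmac)


def leaf_ingress(frame: Frame) -> str:
    """Only frames addressed to the virtual MAC (the router-mac) are routed."""
    return "routed" if frame.dmac == VMAC else "bridged"


incoming = leaf_egress(HOST_MAC)

# ARP for the gateway is always answered with the virtual MAC.
normal = host_reply(incoming, arp_gateway_mac=VMAC, fast_path=False)
fast = host_reply(incoming, arp_gateway_mac=VMAC, fast_path=True)

print(leaf_ingress(normal))  # routed
print(leaf_ingress(fast))    # bridged
```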
That’s all for now folks. But don’t fret, the ACI blog series will continue right here on the ACI Board on Cisco Community.
In an effort to make sure we’re providing you with top-notch content that’s helpful and most fitting to where you are in your current journey, drop us a comment and let us know if this deep dive into the common pervasive gateway was helpful!
And while you’re at it, let us know what specific ACI topics you’d like to see addressed in this blog series.
A special thanks to Takuya Kishida for his contribution to this blog.