
ACI MCP with double-sided vpc to legacy network

spongqu
Level 1

Hello experts,

Since we upgraded our ACI fabric from 4.2 to 5.2, MCP fault F2533 (description: "Loop is detected on lower priority MCP interface po3 in vlan-xxx on node 3002") has been occurring almost every day.

First, let me explain our setup (diagram attached):

  • In our network, Leaf-A and Leaf-B (a vPC pair) form a double-sided vPC to the core switches N9K-A and N9K-B (also a vPC pair); the N9Ks then connect to the campus networks.
  • Leaf-A has ports 1 and 2 connecting to N9K-A port 5 and N9K-B port 7.
  • Leaf-B has ports 3 and 4 connecting to N9K-A port 6 and N9K-B port 8.
  • Every day we receive fault F2533 on Leaf-B; MCP shuts down ports 3 and 4, which are then restored by err-disable recovery after 5 minutes (most occurrences are outside business hours).

I've been researching MCP behavior. As I understand it, MCP blocks a port that receives MCP packets from its own fabric. See this document: Cisco Application Centric Infrastructure (ACI) Design Guide - Mis-Cabling Protocol

 

● Cisco ACI leaf switch ports generate MCP frames at the frequency defined in the configuration. 
When everything is normal, Cisco ACI doesn’t receive MCP frames. If Cisco ACI receives MCP frames,
it can be the symptom of a loop.

● In a port channel, MCP frames are sent only on the first port that became operational in the port channel.
● With vPCs, Cisco ACI sends MCP frames from both vPC peers.
● If a Cisco ACI leaf switch port receives an MCP frame generated by the very same fabric,
this is a symptom of a loop. Hence, after receiving N MCP frames (with N configurable),
Cisco ACI compares the MCP priority to determine which port will be shut down.

● To determine which port stays up and which one is shut down, Cisco ACI compares the fabric ID,
the leaf switch ID, the vPC information, and the port ID. The lower number has the higher priority.
If a loop is between the ports of the same leaf switch, then vPC has higher priority than port channels,
and port channels have higher priority than physical ports.
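
The "lower number has the higher priority" rule quoted above can be pictured as an ordered comparison of keys. The sketch below is only a toy model of that ordering: the field layout, the fabric ID of 1, the node ID 3001 for Leaf-A, and the port numbers are assumptions made for illustration (only node 3002 and vPC ID 343 appear in this thread). It does not reproduce how ACI encodes or compares the values internally, and it ignores the same-leaf rule about vPCs, port channels, and physical ports.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class McpKey:
    """Toy comparison key; field order follows the quoted guide text."""
    fabric_id: int
    leaf_id: int
    vpc_id: int
    port_id: int


def port_to_keep(candidates):
    """Return the key with the lowest values, i.e. the highest priority."""
    return min(candidates)


def ports_to_disable(candidates):
    """Return every key except the one that wins the comparison."""
    keep = port_to_keep(candidates)
    return [c for c in candidates if c != keep]


# Hypothetical values: node 3001 for Leaf-A and fabric ID 1 are assumptions.
leaf_a = McpKey(fabric_id=1, leaf_id=3001, vpc_id=343, port_id=2)
leaf_b = McpKey(fabric_id=1, leaf_id=3002, vpc_id=343, port_id=3)

for loser in ports_to_disable([leaf_a, leaf_b]):
    print("would be disabled:", loser)
```

Under these assumed values, the key for node 3002 loses the comparison, which is at least consistent with the fault text naming po3 on node 3002 as the lower-priority interface.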

 

Here are some questions I've been struggling with:

  • In our case, MCP is sent from the first operational port of the port channel. Does that mean Leaf-A sends MCP from port 1 and Leaf-B sends MCP from port 3?
  • Does N9K-A receive the MCP frames from Leaf-A and Leaf-B and then forward them to the campus network where this VLAN is tagged, and also to its vPC peer N9K-B?
  • Once N9K-B receives the MCP frame, does it forward the frame back to Leaf-A and Leaf-B?
  • My understanding is that MCP blocks Leaf-B ports 3 and 4 because they receive an MCP frame from their own fabric (maybe from Leaf-A?) and Leaf-B has the lower priority based on its vPC role. Is that correct?
  • In our fabric we use per-VLAN MCP PDUs, and fault F2533 is raised only in this particular VLAN, even though other VLANs are connected in the same design. Why do we observe the fault only on this VLAN?
  • If that is what is happening, could increasing the loop detect multiplication factor or the transmission frequency remediate the issue?
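
On the last question, a rough way to reason about the two timers is the arithmetic below. It assumes, as a simplification of the quoted guide text ("after receiving N MCP frames"), that a loop is declared once N looped frames have arrived at the configured transmit interval. The values 2 seconds and 3 are example inputs only, so please check the MCP instance policy on your own APIC for the real settings.

```python
def detection_window_seconds(tx_interval_s: float, loop_detect_factor: int) -> float:
    """Approximate time a loop must persist before MCP acts.

    Simplified model: N looped frames are needed, one per transmit interval.
    """
    if tx_interval_s <= 0 or loop_detect_factor < 1:
        raise ValueError("interval must be positive and factor at least 1")
    return tx_interval_s * loop_detect_factor


# Example inputs (assumed, not read from the fabric in this thread).
for interval, factor in [(2, 3), (2, 10), (5, 3)]:
    window = detection_window_seconds(interval, factor)
    print(f"tx interval {interval}s x factor {factor} -> about {window}s")
```

In this model, raising either value only lengthens how long looped frames must keep arriving before the port is disabled. It might hide a short-lived condition, but it would not remove whatever path is returning the frames to the fabric.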

Model & software version for reference:
ACI Version 5.2(8h)
Leaf switches : N9K-C93180YC-FX / 15.2(8h)
Legacy N9K: N9K-C9504 / N9K-SUP-A / NXOS 9.3(10)

//adding diagram

MCP-Diagram.jpg

Accepted Solution

spongqu
Level 1

Update: after digging further into the issue, we found that the leaf's MAC address had somehow been learned incorrectly from one of the campus switches on the VLAN that triggered the MCP err-disable.

We removed this VLAN from the trunk on that campus switch (the interface is used as a backup link and connects to the DR site's core switch, which didn't have this VLAN tagged on the trunk). Since then, we haven't observed the MCP err-disable again.
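
For anyone auditing a similar setup, one simple check is to compare the allowed-VLAN lists on both ends of each trunk and flag VLANs that are present on only one side, which is the kind of mismatch described above. The sketch below is a minimal illustration: the VLAN lists are made-up inputs in the "7,11-15,17" style that NX-OS prints, not the real configuration from this case, and the function names are hypothetical.

```python
from __future__ import annotations


def parse_vlan_list(text: str) -> set[int]:
    """Expand a list such as '7,11-15,17' into a set of VLAN IDs."""
    vlans: set[int] = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def trunk_mismatch(end_a: set[int], end_b: set[int]) -> dict[str, set[int]]:
    """Return VLANs allowed on only one end of a trunk."""
    return {"only_on_a": end_a - end_b, "only_on_b": end_b - end_a}


# Hypothetical allowed-VLAN lists for the two ends of a backup trunk.
campus_side = parse_vlan_list("7,11-15,17")
dr_core_side = parse_vlan_list("7,11-14,17")
print(trunk_mismatch(campus_side, dr_core_side))
```

A VLAN that shows up on only one end is not proof of a loop, but it is a reasonable place to start looking.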


6 Replies

AshSe
Level 1

Dear @spongqu 

Basic Suggestion:

  • On Leaf-B, can you replace interfaces 3 & 4 with 1 & 2 and check?

Some Basic Questions:

  • Was this setup working sans issue with APIC 4.2?
  • Have you created any SR for this?

Request:

  • Please share the output of these commands:
    • show vpc
    • show port-channel summary

Let's keep this discussion active.

Hi @AshSe 

Basic Suggestion:

  • On Leaf-B, can you replace interfaces 3 & 4 with 1 & 2 and check? >>> Nice idea, but unfortunately we couldn't have a MW to try this out

Some Basic Questions:

  • Was this setup working sans issue with APIC 4.2? >>> yes, it was working before with APIC 4.2
  • Have you created any SR for this? >>> haven't created the SR yet

Request:

  • Please share the output of these commands: >>> Please find it below
    • show vpc
    • show port-channel summary

 

LEAF-A // show port-channel sum
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
b - BFD Session Wait
S - Switched R - Routed
U - Up (port-channel)
M - Not in use. Min-links not met
F - Configuration failed
-------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
-------------------------------------------------------------------------------
2 Po2(SU) Eth LACP Eth1/1(P) Eth1/2(P) >>> port 1&2


LEAF-A // show vpc brief
Legend:
(*) - local vPC is down, forwarding via vPC peer-link
vPC domain id : 10
Peer status : peer adjacency formed ok
vPC keep-alive status : Disabled
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : success
vPC role : primary
Number of vPCs configured : 2
Peer Gateway : Disabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled
Auto-recovery status : Enabled (timeout = 200 seconds)
Delay-restore status : Enabled (timeout = 120 seconds)
Delay-restore SVI status : Enabled (timeout = 0 seconds)
Operational Layer3 Peer : Disabled

vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ --------------------------------------------------
1 up -

vPC status
----------------------------------------------------------------------
id Port Status Consistency Reason Active vlans
-- ---- ------ ----------- ------ ------------
343 Po2 up success success 7,11-15,17, ...

LEAF-B // show port-channel sum
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
b - BFD Session Wait
S - Switched R - Routed
U - Up (port-channel)
M - Not in use. Min-links not met
F - Configuration failed
-------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
-------------------------------------------------------------------------------
3 Po3(SU) Eth LACP Eth1/1(P) Eth1/2(P) >>> port 3&4

LEAF-B // show vpc brief
Legend:
(*) - local vPC is down, forwarding via vPC peer-link

vPC domain id : 10
Peer status : peer adjacency formed ok
vPC keep-alive status : Disabled
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : success
vPC role : secondary
Number of vPCs configured : 2
Peer Gateway : Disabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled
Auto-recovery status : Enabled (timeout = 200 seconds)
Delay-restore status : Enabled (timeout = 120 seconds)
Delay-restore SVI status : Enabled (timeout = 0 seconds)
Operational Layer3 Peer : Disabled

vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ --------------------------------------------------
1 up -

vPC status
----------------------------------------------------------------------
id Port Status Consistency Reason Active vlans
-- ---- ------ ----------- ------ ------------
343 Po3 up success success 7,11-15,17, ...


CORE-A# show port-channel summary 
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
b - BFD Session Wait
S - Switched R - Routed
U - Up (port-channel)
p - Up in delay-lacp mode (member)
M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
--------------------------------------------------------------------------------
1 Po1(SU) Eth LACP Eth1/1(P) Eth2/1(P) >>> peer-link
90 Po90(SU) Eth LACP Eth1/2(P) Eth2/2(P)  >>> port 5&6

CORE-A# show vpc brief
Legend:
(*) - local vPC is down, forwarding via vPC peer-link

vPC domain id : 11
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : failed
Type-2 inconsistency reason : SVI type-2 configuration incompatible
vPC role : primary
Number of vPCs configured : 6
Peer Gateway : Enabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled
Auto-recovery status : Enabled, timer is off.(timeout = 240s)
Delay-restore status : Timer is off.(timeout = 120s)
Delay-restore SVI status : Timer is off.(timeout = 10s)
Operational Layer3 Peer-router : Disabled
Virtual-peerlink mode : Disabled

vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ -------------------------------------------------
1 Po1 up 1,5-7,9-18,20-28,31-39,53,60-65,68-97,99-100, ...

vPC status
----------------------------------------------------------------------------
Id Port Status Consistency Reason Active vlans
----------------------------------------------------------------------------
90 Po90 up success success 7,11-15,17,23, ...


CORE-B# show port-channel summary 
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
b - BFD Session Wait
S - Switched R - Routed
U - Up (port-channel)
p - Up in delay-lacp mode (member)
M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
--------------------------------------------------------------------------------
1 Po1(SU) Eth LACP Eth1/1(P) Eth2/1(P) >>> peer-link
90 Po90(SU) Eth LACP Eth1/2(P) Eth2/2(P)  >>> port 7&8

CORE-B# show vpc brief
Legend:
(*) - local vPC is down, forwarding via vPC peer-link

vPC domain id : 11
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : failed
Type-2 inconsistency reason : SVI type-2 configuration incompatible
vPC role : secondary
Number of vPCs configured : 6
Peer Gateway : Enabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled
Auto-recovery status : Enabled, timer is off.(timeout = 240s)
Delay-restore status : Timer is off.(timeout = 120s)
Delay-restore SVI status : Timer is off.(timeout = 10s)
Operational Layer3 Peer-router : Disabled
Virtual-peerlink mode : Disabled

vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ -------------------------------------------------
1 Po1 up 1,5-7,9-18,20-28,31-39,53,60-65,68-97,99-100, ...

vPC status
----------------------------------------------------------------------------
Id Port Status Consistency Reason Active vlans
----------------------------------------------------------------------------
90 Po90 up success success 7,11-15,17,23, ... 
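
When comparing "show vpc brief" output from several switches by eye, it is easy to miss a line such as "Type-2 consistency status : failed". The sketch below is a minimal, hypothetical helper (not a Cisco tool) that splits the "key : value" lines of pasted output and lists the checks reported as failed. It assumes the one-colon-per-setting layout seen in the output above and ignores the table sections.

```python
def parse_vpc_brief(text: str) -> dict:
    """Collect 'key : value' lines from pasted 'show vpc brief' output."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        # Skip table rows, separators, and headings such as 'Legend:'.
        if sep and key and value:
            settings[key] = value
    return settings


def failed_checks(settings: dict) -> list:
    """Return the names of settings whose value is exactly 'failed'."""
    return [key for key, value in settings.items() if value.lower() == "failed"]


# Excerpt copied from the CORE-A output in this thread.
sample = """\
Legend:
vPC domain id : 11
Configuration consistency status : success
Type-2 consistency status : failed
Type-2 inconsistency reason : SVI type-2 configuration incompatible
vPC role : primary
"""

parsed = parse_vpc_brief(sample)
print(failed_checks(parsed))
print(parsed.get("Type-2 inconsistency reason"))
```

Run against the CORE-A excerpt, this flags the Type-2 consistency status, which matches what is visible in the raw output.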

 

AshSe
Level 1

Few more Qs.

  • .......unfortunately we couldn't have a MW to try this out  >> What is MW?
  • Any reason for using interfaces (in the diagram) on the switches in sequential order, like 1,2,3,4,5,6? Is this a simulated environment?
  • On Leaf-A > Allowed VLANs >> How many VLAN pools have you created? Have you called all of them in the External L3 Domain? Please share the VLAN configuration.
  • On Core-A (aka N9K-A), the diagram shows interfaces 5 & 6, but in the output they are Eth1/2 & Eth2/2. Any reason for using different numbers in the diagram and in the configuration? If not, can you please show the correct interfaces in the diagram?
  • The same applies to Core-B.
  • Why are you using different names for the core devices in the diagram and in the configuration? It is confusing. You are doing the same for the interfaces.
  • Between the core switches, Type-2 consistency is failing. Is there any VLAN mismatch? Please check and correct.
  • In total there are 6 vPCs configured on the core switches. With which devices are these vPCs configured?

 

Please find my answers below, and sorry for the confusion I caused.

@AshSe wrote:

Few more Qs.

  • .......unfortunately we couldn't have a MW to try this out  >> What is MW? // MW = Maintenance Window
  • Any reason for using interfaces (in the diagram) on the switches in sequential order, like 1,2,3,4,5,6? Is this a simulated environment? // This is production; I just used simple numbers to make the explanation easier.
  • On Leaf-A > Allowed VLANs >> How many VLAN pools have you created? Have you called all of them in the External L3 Domain? Please share the VLAN configuration. // Only 1 VLAN pool for production, connecting to the core switches via EPG static ports.
  • On Core-A (aka N9K-A), the diagram shows interfaces 5 & 6, but in the output they are Eth1/2 & Eth2/2. Any reason for using different numbers in the diagram and in the configuration? If not, can you please show the correct interfaces in the diagram? // On Core-A, interface Eth1/2 is interface 5 and Eth2/2 is interface 6.
  • The same applies to Core-B. // Likewise for Core-B: Eth1/2 is interface 7 and Eth2/2 is interface 8.
  • Why are you using different names for the core devices in the diagram and in the configuration? It is confusing. You are doing the same for the interfaces. // Sorry about that; the answers above should explain it.
  • Between the core switches, Type-2 consistency is failing. Is there any VLAN mismatch? Please check and correct. // There are some interface VLANs that must be different on each core switch, and the VLAN is allowed on the peer-link.
  • In total there are 6 vPCs configured on the core switches. With which devices are these vPCs configured? // Those are for other interfaces not related to this discussion, so I trimmed the output to keep the focus on the issue.

 

AshSe
Level 1

Do you see any Fault code for MCP or vPC under the Fault tab?

If not, please recheck the entire configuration (including the vPC domain) and clean it up wherever required. We respect the NDA (non-disclosure agreement) covering your production environment and won't ask you to share the configuration in this public forum.

My one-cent suggestion:

  • Create a TAC SR. There may be a bug in v5.2 (though I personally don't think so).
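
Besides the Faults tab, the fault history can also be pulled from the APIC REST API by querying the faultInst class with a filter on the fault code. The sketch below only builds the query URL and summarizes a response body. The host name and the sample payload are placeholders (the description string is the one from the first post, the timestamp is invented), and authentication and the HTTP call itself are left out on purpose.

```python
from __future__ import annotations

from urllib.parse import quote


def fault_query_url(apic_host: str, fault_code: str) -> str:
    """Build a class query for faultInst objects with the given code."""
    flt = f'eq(faultInst.code,"{fault_code}")'
    return (
        f"https://{apic_host}/api/node/class/faultInst.json"
        f"?query-target-filter={quote(flt, safe='(),.')}"
    )


def summarize_faults(payload: dict) -> list[tuple[str, str]]:
    """Return (created, descr) pairs from an APIC response, oldest first."""
    rows = []
    for item in payload.get("imdata", []):
        attrs = item.get("faultInst", {}).get("attributes", {})
        rows.append((attrs.get("created", ""), attrs.get("descr", "")))
    return sorted(rows)


# Placeholder host and a hand-written sample in the usual 'imdata' shape.
url = fault_query_url("apic.example.com", "F2533")
sample_payload = {
    "totalCount": "1",
    "imdata": [
        {"faultInst": {"attributes": {
            "code": "F2533",
            "created": "2024-01-01T02:00:00.000+00:00",  # invented timestamp
            "descr": "Loop is detected on lower priority MCP interface po3 "
                     "in vlan-xxx on node 3002",
        }}}
    ],
}

print(url)
for created, descr in summarize_faults(sample_payload):
    print(created, descr)
```

Sorting by the creation time makes it easier to see whether the faults cluster at particular hours, which might help correlate them with activity on the campus side.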


Save 25% on Day-2 Operations Add-On License