
Ask the expert: Virtual Port-Channel on Nexus – Configuration, Best Practices and Troubleshooting

Cisco Moderador
Community Manager

This event is a chance to discuss vPC configuration, best practices, and troubleshooting. vPC is a virtualization technology that presents a pair of Cisco Nexus devices as a single Layer 2 logical node to access layer devices or endpoints. A virtual port channel (vPC) allows links that are physically connected to two different Cisco Nexus devices to appear as a single port channel to a third device. The third device can be a switch, server, or any other networking device that supports link aggregation. A vPC provides Layer 2 multipathing, which allows you to create redundancy by increasing bandwidth, enabling multiple parallel paths between nodes, and load-balancing traffic where alternative paths exist.
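To illustrate that last point, the third (downstream) device needs no vPC-specific configuration at all: it simply bundles its uplinks, one toward each Nexus peer, into a single ordinary LACP port-channel. A minimal sketch of the downstream side, using hypothetical interface and port-channel numbers:

interface Ethernet1/1-2                  <-- one hypothetical uplink to each vPC peer
 switchport mode trunk
 channel-group 10 mode active            <-- LACP; port-channel ID 10 is only an example
interface port-channel10
 switchport mode trunk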

 

To participate in this event, please use the "Join the Discussion: Cisco Ask the Expert" button below to ask your questions.

Ask questions from Monday, August 22 to September 2nd, 2016

Featured Expert

Parminder Nath is a customer support engineer in Cisco HTTS (High Touch Technical Support). He is an expert on LAN Switching and Data Center products. His areas of expertise include Cisco Catalyst 2x00, 3x00, 4x00, 6500, Cisco Nexus 7700, Nexus 7000, Nexus 6K, Nexus 5K, Nexus 3K and N2K. He has over 11 years of industry experience working with large Enterprise and Service Provider networks. Parminder holds CCIE Data Center (#51436).



Mahabir Prasad is a customer support engineer in Cisco HTTS (High Touch Technical Support). He has around 10 years of total experience and has been working with Cisco for 4 years and 10 months across multiple domains. His areas of expertise are Nexus switches (N2K, N3K, N5K, N9K) and UCS infrastructure. He holds a Bachelor's degree in Electronics and Communication Engineering from J.I.E.T. college of Kurukshetra University. He also holds CCIE Data Center (#44060) and RHCE certifications.

Parminder Nath & Mahabir Prasad might not be able to answer every question due to the volume expected during this event. Remember that you can continue the conversation in the Other Data Center community.

Find other events and knowledge-sharing resources at https://supportforums.cisco.com/expert-corner/knowledge-sharing.

**Ratings Encourage Participation!**
Please be sure to rate the answers to questions.


15 Replies

fhgallardo
Level 1

I administer a Nexus 7010 that is the main connection point for the whole office. In fact there are two Nexus 7010s, connected through vPC to a UCS 5108 chassis via 6248UP Fabric Interconnects and 2208XP fabric extenders, and that in turn is connected to an EMC VNX5500. However, I cannot back anything up to tape on a Spectra T50e.

What can I do to back up to the T50e? At the moment I cannot see the Spectra partition or a device to back up to.

Hi Fhgallardo,

As I understand it, you have two N7Ks connected directly to FI-A and FI-B, with the UCS 5108 chassis connected downstream. You also have a VNX5500 storage array and a Spectra tape library for backups. I am not certain, but these storage devices might be connected to the N7Ks via MDS or other storage switches, and the issue is that backups are not working.

I think we need to look into this issue first from the application perspective, starting from any reported errors/logs. Based on that, we can determine whether the issue is application specific or network related and troubleshoot further.

This session is specifically about vPC, which, as you know, is a way to provide connectivity with multiple levels of redundancy, ensuring high availability and link-level resiliency. It should not affect the backup process in any way.

If you are seeing any anomalies with vPC or have any questions about vPC, please feel free to ask and I will be happy to help.

Thanks
Mahabir

Dears,

Whenever I configure vPC between two 7Ks, it does not come up easily. I have to shut / no shut and default the interfaces on all the links many times, and the logging buffer shows the error "no operational members" for the port-channel.

Is there a certain order for applying the commands when configuring vPC?

Thanks

Hi Clark,

Thanks for the question. There is no specific order, and it is almost the same as configuring a normal port-channel, but we do have some guidelines and best practices.

Parameters such as speed, duplex, MTU, allowed VLANs, etc. should match exactly on all the member ports. Since you mention that defaulting the interface configuration helps bring up the ports in the vPC, I believe there is some existing configuration on these ports that is not consistent across all the interfaces in the port-channel.

We do have the vPC consistency checker to make sure that parameters are consistent per interface and globally. Kindly review the "Checking vPC Configuration Consistency When You Build a vPC Domain" and "Recommendations for vPC Member Port Configuration" sections in the following best practices document:

http://www.cisco.com/c/dam/en/us/td/docs/switches/datacenter/sw/design/vpc_design/vpc_best_practices_design_guide.pdf
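As a quick first check, the consistency-checker output will usually point straight at the mismatched parameter. A minimal sketch, assuming a hypothetical vPC port-channel 10 (substitute your own port-channel number):

show vpc brief                                               <-- overall vPC, peer-link and keepalive status
show vpc consistency-parameters global                       <-- global parameters compared across both peers
show vpc consistency-parameters interface port-channel 10    <-- per-member-port parameters for the example vPC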

It is good to use LACP for vPC connectivity. If you can share session logs when you face such an issue, I will be happy to assist further with resolving it.


Regards,

Parminder

I have attached a topology. It does not look like a double-sided or single-sided vPC topology. What would be the pros and cons of this topology?
How would the traffic flow if we implement the attached topology?
How does the traffic flow for servers A, B and C work?

What happens if aggregator switch 1 fails?

Thanks

Hi KP,

Thanks for the question. This topology is indeed a bit unusual. Theoretically it should work, but I have not seen this kind of setup either in design/configuration documents or in the field.

Since it does not look like one of the supported topologies, let me do some more research and get back to you with my findings.

Regards,

Parminder

Taj Mohd
Level 1

Dear Parminder, Mahabir, 

We are setting up a new data centre (small size). We plan to use the Nexus 7004 as the core/distribution switch and to run vPC on it. We do not plan to use any access layer switches (server farm); instead we plan to use Nexus 2K as FEX. Our endpoint servers will be connected to the FEX (attached is a brief diagram).

Can you please let me know if the configuration (attached) follows best practice for this vPC and FEX design? Please let me know your thoughts.

Regards

Taj

Hi Taj,

Thanks for the question. I reviewed the configurations for both Nexus devices. These configurations are good and do not violate any best practice. With that said, we are missing the following configuration on Po100 and Po101, which are fex-fabric ports:

interface Po100
 switchport
 switchport mode fex-fabric
 fex associate 100
 vpc 100                        <--- As our FEX are dual-homed, we need to configure vPC under these ports.

interface Po101
 switchport
 switchport mode fex-fabric
 fex associate 101
 vpc 101                        <----

We would also need to enable the associated features, such as "feature vpc" and "feature lacp".

As we know, the sequence of events when configuring/bringing up vPC for the first time is very important: we need to make sure the peer-keepalive is up and passing traffic before any further vPC configuration. In our case we need to configure the links associated with the peer-keepalive first, so the following configuration should be applied first:

vlan 9              <-- we need to configure this vlan to be used later for peer-keepalive
name peer-keepalive

vrf context vPC-PEER-KEEPALIVE
interface e3/30
 desc vPC Peer Keepalive link
 switchport mode trunk
 switchport trunk allowed vlan 9
 spanning-tree port type edge trunk
 spanning-tree bpdufilter enable
 no shutdown

interface vlan 9
 description vPC Peer-keepalive
 vrf member  vPC-PEER-KEEPALIVE
 ip address 198.168.9.57/30
 no shutdown
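For completeness, here is a minimal sketch of the remaining pieces that tie the features above to this keepalive SVI. The domain ID and the peer address 198.168.9.58 (the other end of the /30) are assumptions; adjust them to your design and mirror the source/destination addresses on the second Nexus:

feature vpc
feature lacp

vpc domain 10                  <-- example domain ID; must match on both vPC peers
 peer-keepalive destination 198.168.9.58 source 198.168.9.57 vrf vPC-PEER-KEEPALIVE

show vpc peer-keepalive        <-- verify the keepalive is up before continuing with the rest of the vPC configuration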

Also, following is the link to the vPC best practices guide:

http://www.cisco.com/c/dam/en/us/td/docs/switches/datacenter/sw/design/vpc_design/vpc_best_practices_design_guide.pdf

Thanks,

Parminder

Hi Taj,

Correction to my last reply: I realized that you are trying to configure enhanced vPC (two-layer vPC), which has dual-homed FEX with vPC extended to the host itself. This design is not yet supported on the Nexus 7K.

Dual-homed FEX is supported on the N7K from release 7.2 onwards, but host vPC with a dual-homed FEX is not yet supported. It is, however, supported on the Nexus 5K and Nexus 6K.

You would need to modify your design and configuration as per the supported topologies on the N7K. The following link has the most recent supported/unsupported topologies:

https://www.cisco.com/c/en/us/support/docs/switches/nexus-2000-series-fabric-extenders/200363-Nexus-2000-Fabric-Extenders-Supported-Un.html

Kindly let me know if you have any further questions.

Thanks,

Parminder Nath

Hi Parminder,

Thanks for providing the link on the supported topologies. 

I will plan the design based on the "Host VPC (Dual Links) and FEX Single Homed (Port Channel Mode) Straight through VPC Design" and the "Host VPC (Single Link) and Active-Active FEX (Enhanced VPC) Design" for the dual-NIC hosts. Can you please tell me if the same design can support single-link hosts (single-NIC servers)?

Could you please tell me if NX-OS 6.2 or NX-OS 7.3 supports the above designs?

For the configuration, I used "http://www.cisco.com/c/dam/en/us/td/docs/switches/datacenter/sw/design/vpc_design/vpc_best_practices_design_guide.pdf", pages 111-113, as a guideline.

If there is a better design and configuration document, can you please share it with me?

Thanks

Taj

Hi Taj,

"Host VPC (Dual Links) and FEX Single Homed (Port Channel Mode) Straight through VPC Design" is supported on N7k but as supported designs link "Host VPC (Single Link) and Active-Active FEX (Enhanced VPC) Design"  is not yet supported on Nexus 7k.

"Host VPC (Dual Links) and FEX Single Homed (Port Channel Mode) Straight through VPC Design" is supported in 6.2 also but it will be better to go with 7.2 or latest 7.x release.

For VPC, the document you have is latest best practices and design document.

For for further references you can you other configuration and design guides for Nexus 7K switches published in below location:

http://www.cisco.com/c/en/us/support/switches/nexus-7000-series-switches/tsd-products-support-series-home.html

Hope this information is useful.

Thanks,

Parminder

John Ventura
Level 1

Hello,

We are running an N9K pair in the core running vPC. We are randomly facing an issue where many VLANs become suspended on the peer-link and then come back up, causing disruption to traffic. During the issue we get the logs below.

Can you please help us understand what the issue could be?

core1-ehp %STP-2-VPC_PEER_LINK_INCONSIST_BLOCK: vPC peer-link detected BPDU receive timeout blocking port-channel1 VLAN0152.

core1-ehp %STP-2-VPC_PEER_LINK_INCONSIST_BLOCK: vPC peer-link detected BPDU receive timeout blocking port-channel1 VLAN0157.

core1-ehp %STP-2-VPC_PEER_LINK_INCONSIST_UNBLOCK: vPC peer-link inconsistency cleared unblocking port-channel1 VLAN0152.

core1-ehp %STP-2-VPC_PEER_LINK_INCONSIST_UNBLOCK: vPC peer-link inconsistency cleared unblocking port-channel1 VLAN0141.

core1-ehp %STP-2-BRIDGE_ASSURANCE_BLOCK: Bridge Assurance blocking port port-channel1 VLAN0104.

core1-ehp %STP-2-BRIDGE_ASSURANCE_UNBLOCK: Bridge Assurance unblocking port port-channel1 VLAN0104.

core1-ehp %STP-2-VPC_PEER_LINK_INCONSIST_UNBLOCK: vPC peer-link inconsistency cleared unblocking port-channel1 VLAN0104.

core1-ehp %STP-2-BRIDGE_ASSURANCE_BLOCK: Bridge Assurance blocking port port-channel1 VLAN0124.

core1-ehp %STP-2-BRIDGE_ASSURANCE_UNBLOCK: Bridge Assurance unblocking port port-channel1 VLAN0124.

core1-ehp %STP-2-VPC_PEER_LINK_INCONSIST_UNBLOCK: vPC peer-link 

Hi John,

The VLANs on the peer-link are going into an inconsistent state because BPDUs are timing out and Bridge Assurance is kicking in.
So basically we need to check why BPDUs are intermittently not being received. Please check the below:

- Check whether the CPU goes high during the BPDU timeout. High CPU utilization can cause these BPDU timeouts.

show processes cpu sort | ex 0.0
show processes cpu history

- Check whether there are any errors on the physical interfaces of the peer-link. If yes, try replacing the cable/SFP and see whether that resolves the issue.


- If there are no errors/drops on the link, check whether there are any drops in CoPP. If you are using the strict CoPP policy and running a very high number of VLANs, there is a chance that BPDUs can be dropped because of the high volume of BPDU traffic.

show policy-map interface control-plane | i drop|violate

- Check whether STP is stable.

- Check whether you have very high traffic on the peer-link. Although data-plane traffic should not affect control-plane traffic, it is something you want to avoid.


- Also, in a similar issue, I found that a customer was not using CoPP, thinking that control-plane bandwidth is not limited when CoPP is disabled. That is not true in the case of the N9K. On the N9K, when CoPP is disabled, the CPU-bound queues are by default restricted to a 50 pps rate, which is very low bandwidth and can cause BPDUs and other control-plane traffic to drop easily.

There is already a bug filed for this behavior:

https://bst.cloudapps.cisco.com/bugsearch/bug/CSCun09035/?reffering_site=dumpcr
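To see which situation you are in, you can check whether a CoPP policy is actually attached to the control plane. This is only a sketch: "copp profile strict" (from config mode) is the usual way to re-apply the default strict profile, but verify against your platform documentation before changing CoPP in production.

show copp status               <-- shows whether a CoPP policy is applied to the control plane
copp profile strict            <-- (config mode) re-applies the default strict profile; example only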

Thanks
Mahabir

Hi Mahabir,

While searching for a solution to the same type of issue in my environment with an N5K vPC, I saw the Bridge Assurance block kick in on our vPC peer-link due to a peer hardware failure, but at the same time we lost redundancy. Could you please explain why we lost redundancy when the other switch was working fine? The issue was resolved after reloading the problematic device, and the problematic device booted with the error message below. After the hardware issue we lost all connectivity to this vPC domain.

"Testing partial write-read loop without delay"

 

Thanks & Regards,

Bharathi
