
Best Practice for VPC Domain failover with One M2 per N7K switch and 2 sups

Nick Cutting
Level 1

I have been testing some failover scenarios with four Nexus 7000 switches, each with an M2 and an F2 card. Each Nexus has two supervisor modules.

I have three VDCs: Admin, F2 and M2.

All ports on the M2 are in the M2 VDC and all ports on the F2 are in the F2 VDC.

All vPCs are connected on the M2 cards and configured in the M2 VDC.

We have 2 Nexus representing each "site"

In one site we have a vPC domain "100"

The vPC Peer link is connected on ports E1/3 and E1/4 in Port channel 100

The peer-keepalive is configured to use the management ports. These are patched from both sups into our 3750s (this will eventually be on an out-of-band management switch).

Please see the diagram (testing-ciscover.jpg).

There are two vPCs (1 and 2) connected at each site, which represent the virtual port channels that connect back to a pair of 3750Xs (the layer 2 switch icons in the diagram).

There is also a third vPC (po172) that connects the four Nexus switches together.

We are stretching VLAN 900 across the "sites" and would like to keep spanning tree out of this as much as we can, and minimise outages caused by link failures, module failures, switch failures, sup failures etc.

ONLY the management VLANs (100, 101) are allowed on the port-channel between the 3750s, so spanning tree for VLAN 900 shouldn't have to make this decision.

We are only concerned about layer two for this part of the testing.

As we are connecting the vPC peer link to only one module in each switch (a single M2), we have configured object tracking as follows:

n7k-1(config)#track 1 interface ethernet 1/1 line-protocol

n7k-1(config)#track 2 interface ethernet 1/2 line-protocol

n7k-1(config)#track 5 interface ethernet 1/5 line-protocol

n7k-1(config)# track 101 list boolean OR

n7k-1(config-track)# object 1

n7k-1(config-track)# object 2

n7k-1(config-track)# object 5

n7k-1(config-track)# exit

n7k-1(config)# vpc domain 101

n7k-1(config-vpc-domain)# track 101

The other site is the same, just 100 instead of 101.
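
To make the intent of the boolean OR list explicit, here is a minimal Python sketch of how such a list evaluates. It is only a model of the logic, not NX-OS behaviour verbatim: the function name `track_list_or` and the example link states are hypothetical, and it assumes each tracked object simply reports line-protocol up or down.

```python
def track_list_or(states):
    """Model of a boolean OR track list: up if at least one object is up."""
    return any(states.values())


# Hypothetical line-protocol states for the three tracked interfaces.
all_up = {"Eth1/1": True, "Eth1/2": True, "Eth1/5": True}
one_left = {"Eth1/1": False, "Eth1/2": False, "Eth1/5": True}
module_failed = {"Eth1/1": False, "Eth1/2": False, "Eth1/5": False}

for name, states in [("all up", all_up),
                     ("one link left", one_left),
                     ("module failed", module_failed)]:
    status = "UP" if track_list_or(states) else "DOWN"
    print(f"{name}: track 101 is {status}")
```

Under this model the list only goes down when every tracked interface is down, which is the case for a failure of the whole module but not for a single link failure.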

We are not tracking port channel 101, nor the member interfaces of this port channel, as this is the peer link and apparently tracking both upstream interfaces and the peer link is only necessary when you have ONE link and one module per switch.

As the interfaces we are tracking are member ports of a vPC, is this a chicken-and-egg scenario when checking whether these 3 interfaces are up? Or is line-protocol purely layer 1, so that the vPC isn't downing these member ports at layer 2 when it sees a local vPC domain failure and thereby causing the track to fail?

I see most people are monitoring upstream layer 3 ports that connect back to a core. What about what we are doing: monitoring upstream (the 3750s) and downstream layer 2 (the other site) links that are part of the very vPC we are trying to protect?

We wanted all 3 of these to be down before failover; for example, if the local M2 card failed, the keepalive would send the message to the remote peer to take over.

What are the best practices here? Which objects should we be tracking? Should we also track the peer-link port channel 101?

We saw minimal outages using this design. When reloading the M2 modules, usually 1-3 pings were lost between the laptops in the different sites across the stretched VLAN. Obviously there were no outages when breaking any link in a vPC.

Any wisdom would be greatly appreciated.

Nick


Jayakrishna Mada
Cisco Employee

Hi Nick,

I assume that 1/1, 1/2 and 1/5 refer to 0/1, 0/2 and 0/5 in the diagram.

Yes, you should be tracking the peer-link as well. If for some reason only these 3 interfaces go down and your peer-link is still up, a failover is unnecessary.
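
As a rough illustration of that point, the following Python sketch compares an OR list with and without the peer-link as a member. It is a simplified model under stated assumptions: the label `peer-link` stands in for the peer-link port channel, the scenario names and states are hypothetical, and "track down" is taken to mean that a failover would be triggered.

```python
def track_up(states, tracked):
    """Boolean OR over the tracked objects: up if any of them is up."""
    return any(states[name] for name in tracked)


WITHOUT_PEER_LINK = ["Eth1/1", "Eth1/2", "Eth1/5"]
WITH_PEER_LINK = WITHOUT_PEER_LINK + ["peer-link"]

# Hypothetical failure scenarios (True = line protocol up).
links_only_down = {"Eth1/1": False, "Eth1/2": False,
                   "Eth1/5": False, "peer-link": True}
module_failure = {"Eth1/1": False, "Eth1/2": False,
                  "Eth1/5": False, "peer-link": False}

for label, states in [("three links down, peer-link up", links_only_down),
                      ("whole module down", module_failure)]:
    print(label,
          "| track without peer-link up:", track_up(states, WITHOUT_PEER_LINK),
          "| track with peer-link up:", track_up(states, WITH_PEER_LINK))
```

In this model, only the list that includes the peer-link stays up when the three interfaces fail on their own, while both lists go down on a full module failure.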

The reason most people track upstream interfaces is that during a failure of a module that carries both the vPC peer-link and the uplinks, traffic is black-holed; in these scenarios the downstream links (like your vPC towards the 3750s) are on different modules.

In your case you have your peer-link, upstream link and downstream link on a single module. The thing that's saving you here is the port-channel between the two 3750 switches. It's not common to see a pair of interconnected 3750 switches that are vPC-attached to Nexus switches, as spanning tree would be blocking the port-channel between the 3750s, but in your case I guess it is needed since there is only one module here. I believe the 1-3 ping losses that you are seeing occur because the port-channel between the 3750s is going from blocking to forwarding state.

Hope this helps.

Thanks

JayaKrishna

Thank you very much for your reply,

When we bring something similar into production, the mgmt0 links that carry the keepalives will be plugged into an out-of-band management switch. I imagine the Nexus switches will be managed on these same keepalive addresses. Do you mean that STP convergence of the management VLAN (the only VLAN carried in testing) made the secondary vPC switch wait before it was told to become the primary when we restarted the module on switch 1? I thought RSTP was quicker than 4 lost pings?

The 3750s in this diagram represent 6500s, and the vPC between the Nexus and these will be trunked, but ONLY carrying a point-to-point VLAN for routing between the old and new kit. Should this not be a vPC link? Should we not mix vPC and layer 3?

For the stretched VLAN we may carry this over the vPC link, or use a separate link.

For layer 3 links, do people use vPC and HSRP on each side?

i.e.

6500        .253                                                         .2 Nexus

hsrp         .254              ------VPC------                     .1 hsrp

                .252                                                         .3

Also, with the tracking, do we only need to track the peer link plus enough "resilient tracking" that we know a module has failed, or is it recommended to track every link out of the module that is not connected to a FEX?

Nick,

I was not talking about the mgmt0 interface. For the VLAN that you are testing, the port-channel between the two 3750s will be blocked if the root is on the Nexus vPC pair.

Logically your topology is like this:

     -----------------------------

    |                             |

    |   Nexus Pair          |

    ------------------------------

       /                        \

      /                          \

     /                            \

3750-1-----------------------3750-2

Since you have this triangle setup, one of the links will be in blocking state for any VLAN configured on these devices.

When you are talking about vPC and L3, are you talking about L3 routing protocols or just inter-VLAN routing?

Inter-VLAN routing is fine. Running L3 routing protocols over the peer-link and forming an adjacency with a router upstream using L2 links is not recommended. The following link should give you an idea of what I am talking about here:
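
The triangle argument can be sketched in a few lines of Python. This is a deliberately simplified model rather than a real STP implementation: it treats the Nexus vPC pair as one logical switch and each port-channel as one link, assumes equal link costs, and ignores bridge ID and port ID tie-breaks. The node names and the `blocked_links` helper are hypothetical.

```python
from collections import deque


def blocked_links(links, root):
    """Return the links that are not on the shortest-path tree from root."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    # Breadth-first search gives each switch its shortest path to the root.
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)

    tree = {frozenset((child, par))
            for child, par in parent.items() if par is not None}
    return [link for link in links if frozenset(link) not in tree]


links = [("nexus-pair", "3750-1"),
         ("nexus-pair", "3750-2"),
         ("3750-1", "3750-2")]

print(blocked_links(links, root="nexus-pair"))
# prints [('3750-1', '3750-2')]
```

With the root on the Nexus pair, both 3750s reach the root directly, so the link between the two 3750s is the one left out of the tree. Moving the root to one of the 3750s in this model would instead leave the other 3750's link to the Nexus pair blocked.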

http://bradhedlund.com/2010/12/16/routing-over-nexus-7000-vpc-peer-link-yes-and-no/

HSRP is fine.

As mentioned, the purpose of the tracking feature is to avoid black-holing traffic. It depends entirely on your network setup. I don't think you would need to track all the interfaces.

JayaKrishna