04-12-2011 12:20 AM - edited 03-06-2019 04:33 PM
We have two 6500s in a VSS deployment in the core. Four Nexus 7018s connect to that cluster, each with uplinks to both 6500s in a port-channel. We run the port-channel as a routed port and use a subinterface for each VRF. The problem is that when we turn off one of the chassis in the VSS, the Nexus takes down the OSPF adjacency and rebuilds it. The FIB is also wiped and rebuilt. This happens immediately when one of the VSS chassis is turned off or reloaded.
Is there some special consideration we have to take care of when it comes to port-channels and OSPF? Our understanding is that when we kill one chassis, those links go down, and are removed from the port-channel bundle. The rest of the traffic is then redistributed over the remaining links. NSF should take care of forwarding until the new supervisor in the VSS has built its FIB.
NSF never kicks in: on the Nexuses, 'show ip ospf' never shows the grace period in effect.
04-15-2011 03:41 AM
At the risk of jumping to a conclusion: can you make sure that the IOS routers are configured with 'nsf ietf' and not Cisco NSF? NX-OS supports IETF NSF only.
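For reference, a minimal sketch of what that looks like on the IOS side. The OSPF process ID 1 is an assumption for illustration only; use your own process ID, and repeat the command under each OSPF process if you run one per VRF.

```
! Catalyst 6500 VSS (IOS). Process ID 1 is hypothetical.
router ospf 1
 ! Use IETF graceful restart (RFC 3623) instead of Cisco NSF
 nsf ietf
```

The exact options available under 'nsf' vary by IOS release, so check the command reference for the release you run.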
04-15-2011 03:51 AM
Thanks for your response. You're on to something: we changed it to ietf on the Nexuses from the cisco default, and it seems like NSF is working now. We verified with 'show ip ospf' on the Nexuses that the grace period was in effect. However, traffic is still being dropped for about 16 seconds. When we reload or shut down the active chassis, traffic keeps being forwarded for a while (indicating NSF is working), but after 10 seconds or so it stops for 16 seconds. We're doing some more tests now to see what happens to the forwarding tables on the Nexuses, but this isn't quite right, is it? If NSF were working 100%, [almost] no traffic should be dropped, right?
04-18-2011 03:52 AM
Indeed, with NSF there should be no drops. What grace period is configured on the Nexus? The default is 60 seconds.
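In case it helps, this is roughly where that setting lives on NX-OS. It is a sketch only; the process ID 1 is an assumption, and 60 is simply the default written out explicitly.

```
! Nexus 7000 (NX-OS). Process ID 1 is hypothetical.
router ospf 1
  ! Graceful restart is normally on by default; shown here for clarity
  graceful-restart
  ! Grace period in seconds (60 is the default)
  graceful-restart grace-period 60
```

'show ip ospf' on the Nexus should then report the graceful-restart state and the grace period.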
04-18-2011 03:57 AM
Yes, 60 seconds. It turns out that layer 3 port-channels with a subinterface for each VRF don't work. We created a layer 2 trunk with SVIs instead and it works like a charm: maybe a 1-2 second drop, but that's it.
Thanks for your help Herve!
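For anyone trying the same change, here is a rough sketch of the layer 2 trunk plus SVI layout on the Nexus side. Everything specific in it is an assumption for illustration and not taken from our actual config: the port-channel number, VLAN 101, the VRF name RED, the addressing, and the OSPF process ID and area.

```
! Nexus 7000 (NX-OS). All numbers, names and addresses are hypothetical.
feature ospf
feature interface-vlan

vrf context RED

! One VLAN per VRF, carried on the trunk towards the VSS
vlan 101

interface port-channel10
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 101

! The SVI replaces the former routed subinterface for this VRF
interface Vlan101
  vrf member RED
  ip address 10.0.101.2/30
  ip router ospf 1 area 0.0.0.0
  no shutdown

router ospf 1
  vrf RED
```

The pattern repeats with one VLAN and one SVI per VRF, and the 6500 side needs the matching trunk and SVIs.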
01-09-2012 04:00 AM
I have the same problem and unfortunately I cannot migrate to SVIs.
Do you know why there is a problem with NSF on layer 3 port-channels with subinterfaces, and what can be done to fix it?
Thanks for any help.
01-09-2012 04:06 AM
There was some talk about internal state being very slow to react to changes when using layer 3 port-channel subinterfaces, and there was no workaround at that time. Unfortunately, I don't have this from Cisco. Maybe open an informational TAC case? Sorry that I can't be more specific. :-/