We have two 6500s in a VSS deployment in the core. Four Nexus 7018s connect to that cluster, each with uplinks to both 6500s in a port-channel. We run the port-channel as a routed port and use a subinterface for each VRF. The problem is that when we turn off one of the chassis in the VSS, the Nexus tears down the OSPF adjacency and rebuilds it. The FIB is also wiped and rebuilt. This happens immediately when one of the VSS chassis is turned off or reloaded.
Are there any special considerations we need to take into account for port-channels and OSPF? Our understanding is that when we kill one chassis, its links go down and are removed from the port-channel bundle. Traffic is then redistributed over the remaining links. NSF should take care of forwarding until the new supervisor in the VSS has built its FIB.
NSF never kicks in, though: on the Nexuses, 'show ip ospf' never shows the grace period in effect.
Thanks for your response. You're on to something: we changed it from the Cisco default to IETF on the Nexuses, and it seems like NSF is working now. We verified through 'show ip ospf' on the Nexuses that the grace period was in effect. However, traffic is still being dropped for about 16 seconds. When we reload or shut down the active chassis, traffic keeps being forwarded for a while (indicating NSF is working), but after 10 seconds or so it stops for 16 seconds. We're doing some more tests now to see what happens to the forwarding tables on the Nexuses, but this isn't quite right, is it? If NSF were working 100%, [almost] no traffic should be dropped, right?
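For the follow-up tests, one way to pin down the outage window more precisely than "10 seconds or so" is to log timestamped probe results (for example, one ping per second through the Nexus) and extract the runs of lost probes. Below is a minimal Python sketch of that idea. The function name `loss_windows` and the `(timestamp, ok)` probe format are made up for illustration, and the sample data is synthetic: it only mirrors the timings described above and is not a real capture.

```python
from typing import List, Tuple


def loss_windows(probes: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return (start, duration) for each run of consecutive failed probes.

    probes: (timestamp_in_seconds, ok) pairs in chronological order.
    start is the timestamp of the first failed probe in a run; duration
    is the time from that probe to the next successful one. A run that
    never recovers before the log ends is not reported.
    """
    windows = []
    start = None  # timestamp of the first failure in the current run
    for ts, ok in probes:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            windows.append((start, ts - start))
            start = None
    return windows


if __name__ == "__main__":
    # Synthetic example: one probe per second, chassis reloaded at t=0,
    # probes answered for 10 s, lost from t=10 to t=25, answered again at t=26.
    probes = [(float(t), not (10 <= t < 26)) for t in range(30)]
    print(loss_windows(probes))  # [(10.0, 16.0)]
```

Lining up the reported start time with the log timestamps on the Nexuses (grace period start and end, adjacency changes) should show whether the drop coincides with the grace period expiring or with something else.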