VSS failure took both 6509 switches out of service

fly
Level 2

  Hi,

The standby 6509 supervisor in our VSS crashed and stayed in ROMmon. It did not come back until we reloaded it in the morning.

We found that the active 6509 had shut down all of its MEC interfaces while the standby 6509 was still in ROMmon, so we were completely out of service until we reloaded the standby chassis in the morning.

The customer lost connectivity for almost 8 hours, because the failure happened in the middle of the night.

Is there any mechanism on VSS, like vPC auto-recovery, that can be configured?

This is ridiculous, because we had one fully functional 6509 but still lost all communication in this failure.

How can I avoid this? We configured the VSL and fast-hello dual-active detection, but the system could not recover automatically from this failure.

Another question: the Cisco Support Community changed its web page and I can't find my post.

Thank you!

  Tom

2 Replies

alex.de.gracia
Level 1

Any diagram?

 

Properly working dual-active detection would prevent this split-brain scenario.

fly
Level 2

Hi,

Thank you.

The two 6509s in the VSS each have one interface connected to a Layer 3 switch (switchA) through a port channel. I found that the two interfaces on switchA were also down and never came up until we reloaded the standby 6509, and the OSPF neighbor to switchA was down as well.

We configured two TenGigabitEthernet interfaces on different line cards as the VSL, and two Gigabit interfaces as the fast-hello dual-active detection interfaces.

I found the log below on the active 6509.

However, the customer did not check the status of the active 6509 and rebooted the standby 6509 directly to recover.

I think the active 6509 shut down its interfaces while the standby 6509 remained in ROMmon, so both 6509s were out of service and everything was lost for 8 hours.

I think that while the standby 6509 was crashing (I can find a crash file due to a memory parity error), there was a short window in which the active 6509 could still detect dual-active through fast hello, so the active 6509 shut down its interfaces. Unfortunately, the standby 6509 then remained in ROMmon, so both chassis were out of service at the same time.

If the standby 6509 is completely down (crashed or powered off), does the active 6509 shut down its interfaces? (I think this would not be correct behavior; the active 6509 should keep working.)

If the active 6509 is completely down (or crashes), does the standby 6509 take over?

If my understanding is correct, this is ridiculous, because I can't prevent the whole standby 6509 from crashing or losing power.

But if the standby 6509 is down, how can the active 6509 tell whether the standby itself is down, or whether only the VSL and dual-active detection interfaces are down, so that it can avoid dual-active?

Is there any way to get the following behavior?

If the standby 6509 is down or has crashed, the active 6509 detects that this is not a dual-active condition and keeps working.

If only the VSL and dual-active detection interfaces are down, the two 6509s detect the dual-active condition and the active 6509 shuts down its interfaces.

Thank you!

Tom

 

LL to DOWN, Neighbor Down: Interface down or detached
Mar 17 23:31:27.440: SW1_SP:  Switch 2 Physical Slot 1 - Module Type LINE_CARD  removed

Mar 17 23:31:27.939: SW1_SP:  Switch 2 Physical Slot 3 - Module Type LINE_CARD  removed

Mar 17 23:31:28.175: SW1_SP:  Switch 2 Physical Slot 8 - Module Type LINE_CARD  removed
Mar 17 23:31:28: %SATVS_IBC-SW1_SP-5-VSL_DOWN_SCP_DROP: VSL inactive - dropping cached SCP packet: (SA/DA:0x4/0x4, SSAP/DSAP:0x18/0x0, OP/SEQ:0x320/0x66AF, SIG/INFO:0x1/0x501, eSA:0000.0500.0000)
Mar 17 23:31:29: %OSPF-5-ADJCHG: Process 2011, Nbr 10.218.96.51 on Vlan181 from FULL to DOWN, Neighbor Down: Interface down or detached
Mar 17 23:31:29.163: SW1_SP:  Switch 2 Physical Slot 9 - Module Type LINE_CARD  removed
Mar 17 23:31:36: %VSDA-SW1_SP-3-LINK_DOWN: Interface Gi1/1/23 is no longer dual-active detection capable
Mar 17 23:31:36: %VSDA-SW1_SP-3-LINK_DOWN: Interface Gi1/1/24 is no longer dual-active detection capable
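
To make the order of events in the log above easier to read, here is a rough Python sketch that reduces a few of those lines to (timestamp, event) pairs. The regular expressions and the `summarize` function are my own illustrative assumptions about this message layout, not a Cisco tool, and the SCP-drop line is shortened. In the excerpt, the VSL-inactive message is stamped 23:31:28 and the two fast-hello LINK_DOWN messages 23:31:36.

```python
import re

# Lines copied from the log excerpt above (the SCP drop line is shortened).
LOG = """\
Mar 17 23:31:27.440: SW1_SP:  Switch 2 Physical Slot 1 - Module Type LINE_CARD  removed
Mar 17 23:31:28: %SATVS_IBC-SW1_SP-5-VSL_DOWN_SCP_DROP: VSL inactive - dropping cached SCP packet
Mar 17 23:31:36: %VSDA-SW1_SP-3-LINK_DOWN: Interface Gi1/1/23 is no longer dual-active detection capable
Mar 17 23:31:36: %VSDA-SW1_SP-3-LINK_DOWN: Interface Gi1/1/24 is no longer dual-active detection capable
"""

# Assumed layout: "<Mon> <day> <hh:mm:ss[.ms]>: <message>"
LINE_RE = re.compile(r"^(?P<ts>\w{3} +\d+ \d\d:\d\d:\d\d(?:\.\d+)?): (?P<msg>.*)$")
# Assumed layout: "%FACILITY-[SOURCE-]SEVERITY-MNEMONIC:"
TAG_RE = re.compile(r"^%(?P<facility>[A-Z_]+)-(?:\w+-)?(?P<sev>\d)-(?P<mnemonic>[A-Z_]+):")


def summarize(log_text):
    """Return a list of (timestamp, event_kind) tuples in log order."""
    events = []
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if not line:
            continue  # skip blank or truncated lines
        msg = line.group("msg")
        tag = TAG_RE.match(msg)
        if tag:
            kind = f"{tag.group('facility')}-{tag.group('mnemonic')}"
        elif "LINE_CARD" in msg and msg.rstrip().endswith("removed"):
            kind = "MODULE_REMOVED"
        else:
            kind = "OTHER"
        events.append((line.group("ts"), kind))
    return events


if __name__ == "__main__":
    for ts, kind in summarize(LOG):
        print(ts, kind)
```

This only orders what the active switch logged; it does not show what the standby chassis was doing at the time.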