
6509 VSS crashed!

glmonarch
Level 1

Hello all!

The core of our network is a pair of Cisco 6509 switches in VSS mode. Yesterday the hot-standby switch unexpectedly crashed and restarted itself. After that, the network became unstable. We saw many collisions in the network, and some switches stopped answering pings. We then decided to power off the active switch, because it was the nearest one.

After that, the former hot-standby switch became active and the network started working correctly again. One switch is now powered off.

Logs from the switch that was active at the time:

30-07-2013         14:32:16               Local6.Debug    172.26.22.10      117801: Jul 30 14:32:13 MSK: Config Sync: Line-by-Line sync verifying failure on command:

30-07-2013         14:32:16               Local6.Debug    172.26.22.10      117802:   6420 permit ip any addrgroup users.company_domain_controllers

30-07-2013         14:32:16               Local6.Debug    172.26.22.10      117803: due to parser return error

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117804: Jul 30 14:32:14 MSK: %RF-SW2_SP-5-RF_RELOAD: Peer reload. Reason: Proxy request to reload peer

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117805: Jul 30 14:32:14 MSK: %SYS-SW1_SPSTBY-5-RELOAD: Reload requested - From Active Switch (Reload peer unit).

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117806: Jul 30 14:32:16 MSK: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/5/4, changed state to down

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117807: Jul 30 14:32:16 MSK: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/5/4, changed state to down

30-07-2013         14:32:17               Local6.Error        172.26.22.10      117808: Jul 30 14:32:16 MSK: %LINK-3-UPDOWN: Interface TenGigabitEthernet2/5/4, changed state to down

30-07-2013         14:32:17               Local6.Error        172.26.22.10      117809: Jul 30 14:32:16 MSK: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/5/4, changed state to down

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117810: Jul 30 14:32:16 MSK: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/3/7, changed state to down

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117811: Jul 30 14:32:16 MSK: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel20, changed state to down

30-07-2013         14:32:17               Local6.Error        172.26.22.10      117812: Jul 30 14:32:16 MSK: %LINK-3-UPDOWN: Interface Port-channel20, changed state to down

30-07-2013         14:32:17               Local6.Error        172.26.22.10      117813: Jul 30 14:32:16 MSK: %LINK-3-UPDOWN: Interface TenGigabitEthernet2/3/7, changed state to down

30-07-2013         14:32:17               Local6.Error        172.26.22.10      117814: Jul 30 14:32:16 MSK: %VSLP-SW2_SP-3-VSLP_LMP_FAIL_REASON: Te2/5/4: Disabled by Peer Reload Request

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117815: Jul 30 14:32:16 MSK: %VSL-SW2_SP-5-VSL_CNTRL_LINK:  New VSL Control Link Te2/3/7

30-07-2013         14:32:17               Local6.Error        172.26.22.10      117816: Jul 30 14:32:16 MSK: %VSLP-SW2_SP-3-VSLP_LMP_FAIL_REASON: Te2/3/7: Disabled by Peer Reload Request

30-07-2013         14:32:17               Local6.Critical    172.26.22.10      117817: Jul 30 14:32:16 MSK: %VSLP-SW2_SP-2-VSL_DOWN:   Last VSL interface Te2/3/7 went down

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117818: Jul 30 14:32:16 MSK: %LINEPROTO-SW2_SP-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/5/4, changed state to down

30-07-2013         14:32:17               Local6.Notice    172.26.22.10      117819: Jul 30 14:32:16 MSK: %LINEPROTO-SW2_SP-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/5/4, changed state to down

30-07-2013         14:32:17               Local6.Critical    172.26.22.10      117820: Jul 30 14:32:16 MSK: %VSLP-SW2_SP-2-VSL_DOWN:   All VSL links went down while switch is in ACTIVE role

30-07-2013         14:32:17               Local6.Error        172.26.22.10      117821: Jul 30 14:32:16 MSK: %LINK-SW2_SP-3-UPDOWN: Interface TenGigabitEthernet2/5/4, changed state to down

30-07-2013         14:32:17               Local6.Error        172.26.22.10      117822: Jul 30 14:32:16 MSK: %LINK-SW2_SP-3-UPDOWN: Interface TenGigabitEthernet1/5/4, changed state to down

30-07-2013         14:32:17               Local6.Info         172.26.22.10      117823: Jul 30 14:32:16 MSK: %PFREDUN-SW2_SP-6-ACTIVE: Standby processor removed or reloaded, changing to Simplex mode

These messages kept repeating until I powered off the active switch:

30-07-2013         14:40:03               Local6.Error        172.26.22.10      118110: Jul 30 14:40:01 MSK: %XDR-3-CLIENTISSUBADNEGOMSG: Unexpected nego msg - slot 17/0 (17), XDR client IPv6 table broker, ctxt 0

30-07-2013         14:40:14               Local6.Info         172.26.22.10      118119: Jul 30 10:40:01.601: %XDR-DFC1-6-ISSUBADRCVTFM: Failed to rcv_transform message - slot RP (63), reason: ISSU_RC_NEGO_NOT_FINISHED

30-07-2013         14:41:38               Local6.Info         172.26.22.10      118226: Jul 30 14:41:36 MSK: %XDR-SW1_DFC1-6-ISSUBADRCVTFM: Failed to rcv_transform message - slot RP (63), reason: ISSU_RC_NEGO_NOT_FINISHED

30-07-2013         14:42:38               Local6.Info         172.26.22.10      118234: Jul 30 14:42:36 MSK: %XDR-SW1_DFC1-6-ISSUBADRCVTFM: Failed to rcv_transform message - slot RP (63), reason: ISSU_RC_NEGO_NOT_FINISHED

30-07-2013         14:43:39               Local6.Info         172.26.22.10      118242: Jul 30 14:43:36 MSK: %XDR-SW1_DFC1-6-ISSUBADRCVTFM: Failed to rcv_transform message - slot RP (63), reason: ISSU_RC_NEGO_NOT_FINISHED

172.26.22.10 is the shared logical IP address of the VSS.

Module 5 in each switch is a SUP-720 supervisor.

Does anyone have an idea? I am afraid to power the second switch back on until I understand what this means.

5 Replies

Simon Leigh
Level 1

Remove the VSS links from the problem switch, make sure it is isolated from the network, and power it up.
Once it is up, compare its config with the active one.
We use two 6509s with two sups in each (VSS) in our core.


Thanks for your answer. But I want to know the reason for this crash.

I could certainly be wrong, but judging from the logs (the first 5 rows), it is the standby switch (not the active one) that reboots after a configuration change. Look at the "Config Sync: Line-by-Line sync verifying failure on command... due to parser return error"
Can you post the output of these commands:

show switch virtual role

show redundancy

show switch virtual redundancy

Which IOS release is running on the VSS? Is it from the Cisco Safe Harbor program (http://www.cisco.com/go/safeharbor)?
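The reading of the logs above (a Config Sync failure followed almost immediately by a peer reload) can also be checked mechanically across a longer syslog export. This is only a minimal Python sketch: the function name, the regex, and the 5-line window are my own assumptions, not anything Cisco provides.

```python
import re

# Cisco mnemonic part of a syslog line, e.g. %RF-SW2_SP-5-RF_RELOAD:
# groups: facility, optional sub-facility (SW2_SP), severity, mnemonic
MNEMONIC = re.compile(r"%([A-Z0-9_]+)-(?:([A-Z0-9_]+)-)?(\d)-([A-Z0-9_]+):")


def find_sync_triggered_reloads(lines, window=5):
    """Return (sync_line, reload_line) pairs where a Config Sync failure
    is followed within `window` lines by a reload message."""
    hits = []
    for i, line in enumerate(lines):
        if "Config Sync:" not in line or "failure" not in line:
            continue
        # Look only at the next few lines after the sync failure.
        for later in lines[i + 1 : i + 1 + window]:
            m = MNEMONIC.search(later)
            if m and m.group(4) in {"RF_RELOAD", "RELOAD"}:
                hits.append((line, later))
                break
    return hits
```

Feeding it the lines of the syslog file, for example `find_sync_triggered_reloads(open(path).read().splitlines())` with `path` being wherever the export lives, returns every place where a sync failure preceded a reload. It only shows that the two events occurred close together in the log; it does not prove that one caused the other.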

Yes, you are right. I realized this after I had finished my post, and I have corrected it.

DSW-VSS#show switch virtual role

Switch  Switch Status  Preempt    Priority  Role     Session ID

        Number         Oper(Conf) Oper(Conf)         Local  Remote

------------------------------------------------------------------

LOCAL    1     UP      FALSE(N )   110(110)  ACTIVE   0      0

In dual-active recovery mode: No

DSW-VSS#sho redundancy

Redundant System Information :

------------------------------

       Available system uptime = 41 weeks, 1 day, 6 hours, 21 minutes

Switchovers system experienced = 4

              Standby failures = 0

        Last switchover reason = active unit removed

                 Hardware Mode = Simplex

    Configured Redundancy Mode = sso

     Operating Redundancy Mode = sso

              Maintenance Mode = Disabled

                Communications = Down      Reason: Simplex mode

Current Processor Information :

-------------------------------

               Active Location = slot 1/5

        Current Software state = ACTIVE

       Uptime in current state = 1 day, 19 hours, 40 minutes

                Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(33)SXI6, RELEASE SOFTWARE (fc4)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2011 by Cisco Systems, Inc.

Compiled Mon 28-Mar-11 12:09 by prod_rel_team

                          BOOT = bootflash:s72033-adventerprisek9_wan-mz.122-33.SXI6.bin,1;sup-bootdisk:s72033-adventerprisek9_wan-mz.122-33.SXH4.bin,1;

        Configuration register = 0x2102

Peer (slot: unavailable) information is not available because it is in 'DISABLED' state

DSW-VSS#sho swi vi red

                  My Switch Id = 1

                Peer Switch Id = 2

        Last switchover reason = active unit removed

    Configured Redundancy Mode = sso

     Operating Redundancy Mode = sso

Switch 1 Slot 5 Processor Information :

-----------------------------------------------

        Current Software state = ACTIVE

       Uptime in current state = 1 day, 19 hours, 41 minutes

                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(33)SXI6, RELEASE SOFTWARE (fc4)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2011 by Cisco Systems, Inc.

Compiled Mon 28-Mar-11 12:09 by prod_rel_team

                          BOOT = bootflash:s72033-adventerprisek9_wan-mz.122-33.SXI6.bin,1;sup-bootdisk:s72033-adventerprisek9_wan-mz.122-33.SXH4.bin,1;

        Configuration register = 0x2102

                  Fabric State = ACTIVE

           Control Plane State = ACTIVE

Peer information is not available because

it is in 'DISABLED' state

Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(33)SXI6, RELEASE SOFTWARE (fc4)
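For anyone comparing this output before and after bringing the second chassis back, the "Key = value" lines can be pulled into a dictionary and diffed. This is a rough Python sketch under the assumption that the output keeps the `Key = value` layout shown above; the function names are made up for illustration.

```python
def parse_key_values(output):
    """Parse 'Key = value' lines from CLI output into a dict.
    If a key appears more than once, the first occurrence wins."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(" = ")
        if sep and key.strip():
            fields.setdefault(key.strip(), value.strip())
    return fields


def peer_is_missing(fields):
    """True when the parsed output reports Simplex hardware mode,
    i.e. the system is running without a standby peer."""
    return fields.get("Hardware Mode") == "Simplex"
```

With the `show redundancy` text above, `parse_key_values(...)["Hardware Mode"]` is `Simplex`, which matches the peer being reported in the 'DISABLED' state.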

Anton, I asked for the output of these commands to confirm once more that it was the standby switch that rebooted.

According to the logs, it reloaded unexpectedly after a configuration change because of a config sync error:

30-07-2013         14:32:16               Local6.Debug    172.26.22.10      117801: Jul 30 14:32:13 MSK: Config Sync: Line-by-Line sync verifying failure on command:

30-07-2013         14:32:16               Local6.Debug    172.26.22.10      117802:   6420 permit ip any addrgroup users.company_domain_controllers

30-07-2013         14:32:16               Local6.Debug    172.26.22.10      117803: due to parser return error

Unfortunately, I cannot tell you why the parser returned an error on the command "6420 permit ip any addrgroup users.company_domain_controllers" (this is beyond my knowledge).

In any case, an unexpected reload of the standby VSS peer should not affect the normal operation of the network.

In my opinion, you had a dual-active scenario, which made the network unstable.

Does your VSS have any dual-active detection methods configured?
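One quick way to answer that from a saved copy of the running config is to look for `dual-active detection` lines under the `switch virtual domain` block. The Python sketch below assumes the config is available as plain text and that sub-commands are indented by a leading space, as IOS normally prints them; the function name is hypothetical.

```python
def dual_active_methods(running_config):
    """Return the 'dual-active detection ...' lines found under the
    'switch virtual domain' block of a running-config text."""
    methods = []
    in_domain = False
    for line in running_config.splitlines():
        if line.startswith("switch virtual domain"):
            in_domain = True
        elif line and not line.startswith(" "):
            # Any other top-level line (including '!') ends the block.
            in_domain = False
        elif in_domain and line.strip().startswith("dual-active detection"):
            methods.append(line.strip())
    return methods
```

An empty list only means no detection method is written out explicitly in that block. Lines negated with `no` are deliberately not counted, and any defaults that IOS does not print in the running config will not show up here either.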
