07-31-2013 12:35 AM - edited 03-07-2019 02:41 PM
Hello all!
The core of our network is a pair of Cisco 6509 switches in VSS mode. Yesterday, the standby-hot switch unexpectedly crashed and restarted itself. After that, our network became unstable. We saw many collisions in the network, and some switches stopped answering pings. We then decided to power off the active switch, because it was the closest one.
After that, the second (standby-hot) switch became active and the network started working correctly again. One switch is now powered off.
Logs from the switch that was active at that time:
30-07-2013 14:32:16 Local6.Debug 172.26.22.10 117801: Jul 30 14:32:13 MSK: Config Sync: Line-by-Line sync verifying failure on command:
30-07-2013 14:32:16 Local6.Debug 172.26.22.10 117802: 6420 permit ip any addrgroup users.company_domain_controllers
30-07-2013 14:32:16 Local6.Debug 172.26.22.10 117803: due to parser return error
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117804: Jul 30 14:32:14 MSK: %RF-SW2_SP-5-RF_RELOAD: Peer reload. Reason: Proxy request to reload peer
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117805: Jul 30 14:32:14 MSK: %SYS-SW1_SPSTBY-5-RELOAD: Reload requested - From Active Switch (Reload peer unit).
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117806: Jul 30 14:32:16 MSK: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/5/4, changed state to down
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117807: Jul 30 14:32:16 MSK: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/5/4, changed state to down
30-07-2013 14:32:17 Local6.Error 172.26.22.10 117808: Jul 30 14:32:16 MSK: %LINK-3-UPDOWN: Interface TenGigabitEthernet2/5/4, changed state to down
30-07-2013 14:32:17 Local6.Error 172.26.22.10 117809: Jul 30 14:32:16 MSK: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/5/4, changed state to down
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117810: Jul 30 14:32:16 MSK: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/3/7, changed state to down
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117811: Jul 30 14:32:16 MSK: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel20, changed state to down
30-07-2013 14:32:17 Local6.Error 172.26.22.10 117812: Jul 30 14:32:16 MSK: %LINK-3-UPDOWN: Interface Port-channel20, changed state to down
30-07-2013 14:32:17 Local6.Error 172.26.22.10 117813: Jul 30 14:32:16 MSK: %LINK-3-UPDOWN: Interface TenGigabitEthernet2/3/7, changed state to down
30-07-2013 14:32:17 Local6.Error 172.26.22.10 117814: Jul 30 14:32:16 MSK: %VSLP-SW2_SP-3-VSLP_LMP_FAIL_REASON: Te2/5/4: Disabled by Peer Reload Request
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117815: Jul 30 14:32:16 MSK: %VSL-SW2_SP-5-VSL_CNTRL_LINK: New VSL Control Link Te2/3/7
30-07-2013 14:32:17 Local6.Error 172.26.22.10 117816: Jul 30 14:32:16 MSK: %VSLP-SW2_SP-3-VSLP_LMP_FAIL_REASON: Te2/3/7: Disabled by Peer Reload Request
30-07-2013 14:32:17 Local6.Critical 172.26.22.10 117817: Jul 30 14:32:16 MSK: %VSLP-SW2_SP-2-VSL_DOWN: Last VSL interface Te2/3/7 went down
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117818: Jul 30 14:32:16 MSK: %LINEPROTO-SW2_SP-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/5/4, changed state to down
30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117819: Jul 30 14:32:16 MSK: %LINEPROTO-SW2_SP-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/5/4, changed state to down
30-07-2013 14:32:17 Local6.Critical 172.26.22.10 117820: Jul 30 14:32:16 MSK: %VSLP-SW2_SP-2-VSL_DOWN: All VSL links went down while switch is in ACTIVE role
30-07-2013 14:32:17 Local6.Error 172.26.22.10 117821: Jul 30 14:32:16 MSK: %LINK-SW2_SP-3-UPDOWN: Interface TenGigabitEthernet2/5/4, changed state to down
30-07-2013 14:32:17 Local6.Error 172.26.22.10 117822: Jul 30 14:32:16 MSK: %LINK-SW2_SP-3-UPDOWN: Interface TenGigabitEthernet1/5/4, changed state to down
30-07-2013 14:32:17 Local6.Info 172.26.22.10 117823: Jul 30 14:32:16 MSK: %PFREDUN-SW2_SP-6-ACTIVE: Standby processor removed or reloaded, changing to Simplex mode
These messages kept repeating until I turned off the active switch:
30-07-2013 14:40:03 Local6.Error 172.26.22.10 118110: Jul 30 14:40:01 MSK: %XDR-3-CLIENTISSUBADNEGOMSG: Unexpected nego msg - slot 17/0 (17), XDR client IPv6 table broker, ctxt 0
30-07-2013 14:40:14 Local6.Info 172.26.22.10 118119: Jul 30 10:40:01.601: %XDR-DFC1-6-ISSUBADRCVTFM: Failed to rcv_transform message - slot RP (63), reason: ISSU_RC_NEGO_NOT_FINISHED
30-07-2013 14:41:38 Local6.Info 172.26.22.10 118226: Jul 30 14:41:36 MSK: %XDR-SW1_DFC1-6-ISSUBADRCVTFM: Failed to rcv_transform message - slot RP (63), reason: ISSU_RC_NEGO_NOT_FINISHED
30-07-2013 14:42:38 Local6.Info 172.26.22.10 118234: Jul 30 14:42:36 MSK: %XDR-SW1_DFC1-6-ISSUBADRCVTFM: Failed to rcv_transform message - slot RP (63), reason: ISSU_RC_NEGO_NOT_FINISHED
30-07-2013 14:43:39 Local6.Info 172.26.22.10 118242: Jul 30 14:43:36 MSK: %XDR-SW1_DFC1-6-ISSUBADRCVTFM: Failed to rcv_transform message - slot RP (63), reason: ISSU_RC_NEGO_NOT_FINISHED
172.26.22.10 is the shared logical IP address of the VSS.
Module 5 in each switch is a SUP-720 supervisor.
Does anyone have an idea? I am now afraid to power on the second switch until I understand what this means.
07-31-2013 02:49 AM
Remove the VSS links from the problem switch, make sure it's isolated from the network, and power it up.
Once it's up, compare its config with the active one.
We use two 6509s with 2 sups in each (vss) in our core.
07-31-2013 06:35 AM
Thanks for your answer. But I want to know: what was the reason for this crash?
07-31-2013 07:51 AM
I could certainly be wrong, but judging from the logs (the first five lines), it was the standby switch that rebooted (not the active one) after a configuration change. Look at the "Config Sync: Line-by-Line sync verifying failure on command... due to parser return error" message.
Can you post the output of these commands:
show switch virtual role
show redundancy
show switch virtual redundancy
Which IOS release is running on the VSS? Is it from the Cisco Safe Harbor program (http://www.cisco.com/go/safeharbor)?
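The reading of the logs above (a config sync failure followed directly by a peer reload request) can be checked mechanically. This is a minimal Python sketch, assuming the syslog lines are saved as plain text in the format pasted in this thread. The helper names `mnemonics` and `sync_failure_before_reload` are hypothetical and not part of any Cisco tool.

```python
import re

# Matches IOS message tags such as %RF-SW2_SP-5-RF_RELOAD:
# facility (optionally with a -SWx_... suffix), severity digit, mnemonic.
TAG = re.compile(r"%([A-Z0-9_]+(?:-[A-Z0-9_]+)*?)-(\d)-([A-Z0-9_]+):")

def mnemonics(lines):
    """Return (facility, severity, mnemonic) for each line carrying an IOS tag."""
    out = []
    for line in lines:
        m = TAG.search(line)
        if m:
            out.append((m.group(1), int(m.group(2)), m.group(3)))
    return out

def sync_failure_before_reload(lines):
    """True if a Config Sync failure appears before the first RF_RELOAD message."""
    saw_failure = False
    for line in lines:
        if "Config Sync: Line-by-Line sync verifying failure" in line:
            saw_failure = True
        elif "RF_RELOAD" in line:
            return saw_failure
    return False

# Three lines copied from the logs in the first post.
SAMPLE = [
    "30-07-2013 14:32:16 Local6.Debug 172.26.22.10 117801: Jul 30 14:32:13 MSK: "
    "Config Sync: Line-by-Line sync verifying failure on command:",
    "30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117804: Jul 30 14:32:14 MSK: "
    "%RF-SW2_SP-5-RF_RELOAD: Peer reload. Reason: Proxy request to reload peer",
    "30-07-2013 14:32:17 Local6.Notice 172.26.22.10 117805: Jul 30 14:32:14 MSK: "
    "%SYS-SW1_SPSTBY-5-RELOAD: Reload requested - From Active Switch (Reload peer unit).",
]

if __name__ == "__main__":
    print(sync_failure_before_reload(SAMPLE))
    for tag in mnemonics(SAMPLE):
        print(tag)
```

The facility field keeps the switch suffix (SW2_SP, SW1_SPSTBY), which shows which supervisor logged each message.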
08-01-2013 12:11 AM
Yes, you are right. I understood this after I had submitted my post. I have corrected my post.
DSW-VSS#show switch virtual role
Switch  Switch Status  Preempt     Priority    Role     Session ID
        Number         Oper(Conf)  Oper(Conf)           Local  Remote
------------------------------------------------------------------
LOCAL   1      UP      FALSE(N )   110(110)    ACTIVE   0      0
In dual-active recovery mode: No
DSW-VSS#sho redundancy
Redundant System Information :
------------------------------
Available system uptime = 41 weeks, 1 day, 6 hours, 21 minutes
Switchovers system experienced = 4
Standby failures = 0
Last switchover reason = active unit removed
Hardware Mode = Simplex
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Maintenance Mode = Disabled
Communications = Down Reason: Simplex mode
Current Processor Information :
-------------------------------
Active Location = slot 1/5
Current Software state = ACTIVE
Uptime in current state = 1 day, 19 hours, 40 minutes
Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(33)SXI6, RELEASE SOFTWARE (fc4)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2011 by Cisco Systems, Inc.
Compiled Mon 28-Mar-11 12:09 by prod_rel_team
BOOT = bootflash:s72033-adventerprisek9_wan-mz.122-33.SXI6.bin,1;sup-bootdisk:s72033-adventerprisek9_wan-mz.122-33.SXH4.bin,1;
Configuration register = 0x2102
Peer (slot: unavailable) information is not available because it is in 'DISABLED' state
DSW-VSS#sho swi vi red
My Switch Id = 1
Peer Switch Id = 2
Last switchover reason = active unit removed
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Switch 1 Slot 5 Processor Information :
-----------------------------------------------
Current Software state = ACTIVE
Uptime in current state = 1 day, 19 hours, 41 minutes
Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(33)SXI6, RELEASE SOFTWARE (fc4)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2011 by Cisco Systems, Inc.
Compiled Mon 28-Mar-11 12:09 by prod_rel_team
BOOT = bootflash:s72033-adventerprisek9_wan-mz.122-33.SXI6.bin,1;sup-bootdisk:s72033-adventerprisek9_wan-mz.122-33.SXH4.bin,1;
Configuration register = 0x2102
Fabric State = ACTIVE
Control Plane State = ACTIVE
Peer information is not available because
it is in 'DISABLED' state
Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(33)SXI6, RELEASE SOFTWARE (fc4)
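The `key = value` lines in the output above can be read programmatically to confirm that the chassis is currently running alone. This is a minimal Python sketch, assuming the output is saved as text in the form pasted here. `parse_key_values` and `peer_is_missing` are hypothetical helper names, and treating "Simplex" as "peer missing" is a heuristic based only on this thread's output.

```python
def parse_key_values(text):
    """Collect 'key = value' pairs from pasted show output; first occurrence wins."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            fields.setdefault(key.strip(), value.strip())
    return fields

def peer_is_missing(fields):
    """Heuristic: the chassis runs alone if the hardware mode is Simplex."""
    return fields.get("Hardware Mode", "").lower() == "simplex"

# Lines copied from the 'show redundancy' output in this post.
SAMPLE_OUTPUT = """\
Switchovers system experienced = 4
Standby failures = 0
Last switchover reason = active unit removed
Hardware Mode = Simplex
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Communications = Down Reason: Simplex mode
"""

if __name__ == "__main__":
    fields = parse_key_values(SAMPLE_OUTPUT)
    print(fields["Hardware Mode"], "|", fields["Communications"])
    print("peer missing:", peer_is_missing(fields))
```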
08-01-2013 01:06 AM
Anton, I asked you for the output of these commands to make sure once again that it was the standby switch that rebooted.
According to the logs, it reloaded unexpectedly after a configuration change because of a config sync error:
30-07-2013 14:32:16 Local6.Debug 172.26.22.10 117801: Jul 30 14:32:13 MSK: Config Sync: Line-by-Line sync verifying failure on command:
30-07-2013 14:32:16 Local6.Debug 172.26.22.10 117802: 6420 permit ip any addrgroup users.company_domain_controllers
30-07-2013 14:32:16 Local6.Debug 172.26.22.10 117803: due to parser return error
Why the parser returned an error on the command "6420 permit ip any addrgroup users.company_domain_controllers", I unfortunately cannot tell you (this is beyond my knowledge).
In any case, an unexpected reload of the standby VSS peer should not affect the normal operation of the network.
In my opinion, you had a dual-active scenario, which made the network unstable.
Does your VSS have any dual-active detection methods configured?
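One rough way to answer the dual-active question offline is to scan a saved copy of the running configuration for `dual-active detection` lines under `switch virtual domain`. This is a minimal Python sketch under stated assumptions: the config text, the domain number 100, and channel-group 30 are hypothetical and not taken from this thread, and `dual_active_methods` is an illustrative helper name. An empty result is only a hint, because defaults may not appear in the configuration. The show commands on the device itself remain authoritative.

```python
def dual_active_methods(config_text):
    """Return the 'dual-active detection' lines found under 'switch virtual domain'."""
    methods = []
    in_domain = False
    for raw in config_text.splitlines():
        line = raw.rstrip()
        if line.startswith("switch virtual domain"):
            in_domain = True
        elif in_domain and line.startswith(" "):
            # Indented lines belong to the domain sub-mode.
            if line.strip().startswith("dual-active detection"):
                methods.append(line.strip())
        else:
            in_domain = False
    return methods

# Hypothetical configuration fragment, for illustration only.
HYPOTHETICAL_CONFIG = """\
switch virtual domain 100
 switch mode virtual
 dual-active detection pagp trust channel-group 30
!
interface Port-channel30
 description hypothetical access-layer uplink
"""

if __name__ == "__main__":
    found = dual_active_methods(HYPOTHETICAL_CONFIG)
    print(found if found else "no explicit dual-active detection lines found")
```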