cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
10541
Views
0
Helpful
16
Replies

Cisco 6880-x vsl link failure

Jose Davlla
Level 1
Level 1

Hi Guys,

I have a head scratcher that I'm dealing with.  I have 2x (2 month) 6880-x switches configured with vss with 2 vsl links on each of its sup cards and switch 1 is the active switch on the vss.  Both 6880x chassis also have 2 additional 10g modules each.  I am dealing with 2 issues in this setup.  

The first issue is,  i have a 3750x with a 10g module thats port channeled to the vss with one 10g link going to each of the chassis.  Within the first 3 days it started dropping the link that's plugged to switch1, then comes back on right away.  It does it randomly throughout the day.  I have changed out the optics on both ends, changed the cable, and also changed the module on the 3750x and I still get the same issue.  The strange part is that when switch2 on the vss becomes the active switch, the problem goes away.  I did notice this before changing out the optics, cable, module on the 3750x.

The second issue is the one I'm been trying to figure out along with cisco TAC.  This started about a month after bring the vss online.  When switch1 was the active switch in the vss, every couple of days vsl links drop one at a time, eventually killing the vss and putting the standby unit into recovery mode because of the dual active detection.  once I reboot the standby switch, the vss comes back up normally.  it did this a couple of times until I decided to force switch2 to be the active switch on the vss.  when switch2 became active, the vss was stable for about a month then the vsl links died again, and the system failed over to switch1.  after looking at the logs with cisco tac, we see that the vsl links stop responding which causes the failover, but up until now we still can't determine what is causing the vsl links to fail.  cisco tac said that maybe the vsl links were being overloaded but we have been monitoring the bandwidth utilization on the vsl links and they never go beyond 1% utilization.  The last suggestion by cisco tac was to add another vsl link but through another module other than the sup.  This was done a couple of days ago so now i'm waiting to see if the vsl links fail again and since switch1 is the active switch on the vss, i'm having to deal with the first issue above with the 3750x.

I've included some log entries from the dropped port-channel member and also logs for when the vsl links fail.

Logs for 3750x port-channel member drops.

*Jul  5 05:27:10 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  5 05:27:10 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  5 05:27:10 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to up
*Jul  5 05:28:10 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to up
*Jul  6 11:28:52 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  6 11:28:52 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  6 11:28:53 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to up
*Jul  6 11:29:53 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to up
*Jul  7 06:27:25 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  7 06:27:25 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  7 06:27:26 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to up
*Jul  7 06:27:33 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to up
*Jul  7 06:33:43 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  7 06:33:43 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  7 06:33:43 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to up
*Jul  7 06:34:43 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to up
*Jul  7 07:38:35 PDT: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  7 07:38:35 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to down
*Jul  7 07:38:36 PDT: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet2/2/5, changed state to up

 

log for VSL link failures

*Jun 27 23:59:25 PDT: %VSLP-SW2-3-VSLP_LMP_FAIL_REASON: Te2/5/15: Link down
*Jun 27 23:59:25 PDT: %VSL-SW2-5-VSL_CNTRL_LINK:  New VSL Control Link  2/5/16 

*Jun 27 23:59:25 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/5/15, changed state to down
*Jun 27 23:59:25 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/5/15, changed state to down
*Jun 27 23:59:25 PDT: %LINK-SW2-3-UPDOWN: Interface TenGigabitEthernet2/5/15, changed state to down
*Jun 27 23:59:25 PDT: %VSLP-SW2-3-VSLP_LMP_FAIL_REASON: Te2/5/16: Link down
*Jun 27 23:59:25 PDT: %VSLP-SW2-2-VSL_DOWN:   Last VSL interface Te2/5/16 went down

*Jun 27 23:59:25 PDT: %VSLP-SW2-2-VSL_DOWN:   All VSL links went down while switch is in ACTIVE role

*Jun 27 23:59:25 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface TenGigabitEthernet2/5/16, changed state to down
*Jun 27 23:59:25 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface Port-channel2, changed state to down
*Jun 27 23:59:25 PDT: %LINK-SW2-3-UPDOWN: Interface Port-channel2, changed state to down
*Jun 27 23:59:25 PDT: %LINK-SW2-3-UPDOWN: Interface TenGigabitEthernet2/5/16, changed state to down
*Jun 27 23:59:25 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface Port-channel1, changed state to down
*Jun 27 23:59:25 PDT: %LINK-SW2-3-UPDOWN: Interface Port-channel1, changed state to down
*Jun 27 23:59:25 PDT: %OIR-SW2-6-INSREM: Switch 1 Physical Slot 5 - Module Type LINE_CARD  removed 
*Jun 27 23:59:25 PDT: %OSPF-SW2-5-ADJCHG: Process 1, Nbr 10.253.0.3 on TenGigabitEthernet1/1/1 from FULL to DOWN, Neighbor Down: Interface down or detached
*Jun 27 23:59:25 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface Port-channel24, changed state to down
*Jun 27 23:59:25 PDT: %LINK-SW2-3-UPDOWN: Interface Port-channel24, changed state to down
*Jun 27 23:59:26 PDT: %OIR-SW2-6-INSREM: Switch 1 Physical Slot 1 - Module Type LINE_CARD  removed 
*Jun 27 23:59:26 PDT: %LINK-SW2-3-UPDOWN: Interface Port-channel17, changed state to down
*Jun 27 23:59:26 PDT: %PFREDUN-SW2-6-ACTIVE: Standby processor removed or reloaded, changing to Simplex mode
*Jun 27 23:59:26 PDT: %OIR-SW2-6-INSREM: Switch 1 Physical Slot 2 - Module Type LINE_CARD  removed 
*Jun 27 23:59:27 PDT: %LINK-SW2-3-UPDOWN: Interface TenGigabitEthernet1/5/14, changed state to down
*Jun 27 23:59:27 PDT: %LINK-SW2-3-UPDOWN: Interface TenGigabitEthernet1/5/15, changed state to down
*Jun 27 23:59:27 PDT: %LINK-SW2-3-UPDOWN: Interface TenGigabitEthernet1/5/16, changed state to down
*Jun 27 23:59:27 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/5/14, changed state to down
*Jun 27 23:59:27 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/5/16, changed state to down
*Jun 27 23:59:27 PDT: %LINK-SW2-3-UPDOWN: Interface TenGigabitEthernet1/1/1, changed state to down
*Jun 27 23:59:27 PDT: %LINK-SW2-3-UPDOWN: Interface TenGigabitEthernet1/1/2, changed state to down
*Jun 27 23:59:27 PDT: %LINK-SW2-3-UPDOWN: Interface TenGigabitEthernet1/1/3, changed state to down
*Jun 27 23:59:27 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/1/1, changed state to down
*Jun 27 23:59:27 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/1/2, changed state to down
*Jun 27 23:59:27 PDT: %LINEPROTO-SW2-5-UPDOWN: Line protocol on Interface TenGigabitEthernet1/1/3, changed state to down

 

Press RETURN to get started!


*Jun 28 07:03:33.421: %USBFLASH-SW2_STBY-5-CHANGE: bootdisk has been inserted!
*Jun 28 07:03:53.297: %OIR-SW2_STBY-6-INSPS: Power supply inserted in slot 1
*Jun 28 07:03:53.301: %C6KPWR-SW2_STBY-4-PSOK: power supply 1 turned on.
*Jun 28 07:04:26.093: %FABRIC-SW2_STBY-5-FABRIC_MODULE_ACTIVE: The Switch Fabric Module in slot 5 became active.
*Jun 28 07:04:48.497: %DIAG-SW2_STBY-6-RUN_MINIMUM: Switch 2 Module 5: Running Minimal Diagnostics...
*Jun 28 07:04:48.497: %CONST_DIAG-SW2_STBY-6-DIAG_PORT_SKIPPED: Module 5 port 15 is skipped in TestLoopback due to: the port is used as a VSL link.
*Jun 28 07:04:48.497: %CONST_DIAG-SW2_STBY-6-DIAG_PORT_SKIPPED: Module 5 port 16 is skipped in TestLoopback due to: the port is used as a VSL link.
*Jun 28 07:04:55.049: %CONST_DIAG-SW2_STBY-6-DIAG_PORT_SKIPPED: Module 5 port 15 is skipped in TestFexModeLoopback due to: the port is used as a VSL link.
*Jun 28 07:04:55.049: %CONST_DIAG-SW2_STBY-6-DIAG_PORT_SKIPPED: Module 5 port 16 is skipped in TestFexModeLoopback due to: the port is used as a VSL link.
*Jun 28 07:05:00.165: %CONST_DIAG-SW2_STBY-6-DIAG_PORT_SKIPPED: Module 5 port 15 is skipped in TestL2CTSLoopback due to: the port is used as a VSL link.
*Jun 28 07:05:00.165: %CONST_DIAG-SW2_STBY-6-DIAG_PORT_SKIPPED: Module 5 port 16 is skipped in TestL2CTSLoopback due to: the port is used as a VSL link.
*Jun 28 07:05:06.049: %CONST_DIAG-SW2_STBY-6-DIAG_PORT_SKIPPED: Module 5 port 15 is skipped in TestL3CTSLoopback due to: the port is used as a VSL link.
*Jun 28 07:05:06.049: %CONST_DIAG-SW2_STBY-6-DIAG_PORT_SKIPPED: Module 5 port 16 is skipped in TestL3CTSLoopback due to: the port is used as a VSL link.
*Jun 28 07:05:23.309: %DIAG-SW2_STBY-6-DIAG_OK: Switch 2 Module 5: Passed Online Diagnostics
*Jun 28 00:05:38 PDT: %SYS-SW2_STBY-6-CLOCKUPDATE: System clock has been updated from 00:05:38 PDT Sat Jun 28 2014 to 00:05:38 PDT Sat Jun 28 2014, configured from console by console.
*Jun 28 00:05:38 PDT: %SYS-SW2_STBY-6-CLOCKUPDATE: System clock has been updated from 00:05:38 PDT Sat Jun 28 2014 to 00:05:38 PDT Sat Jun 28 2014, configured from console by console.
*Jun 28 00:05:38 PDT: %SSH-SW2_STBY-5-DISABLED: SSH 2.0 has been disabled
*Jun 28 00:05:54 PDT: %SYS-SW2_STBY-5-RESTART: System restarted --
Cisco IOS Software, c6880x Software (c6880x-ADVENTERPRISEK9-M), Version 15.1(2)SY2, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2014 by Cisco Systems, Inc.
Compiled Wed 26-Feb-14 15:30 by prod_rel_team
*Jun 28 00:05:54 PDT: %SSH-SW2_STBY-5-ENABLED: SSH 2.0 has been enabled
*Jun 28 00:05:54 PDT: %SYS-SW2_STBY-3-LOGGER_FLUSHED: System was paused for 00:03:01 to ensure console debugging output.

*Jun 28 00:05:56 PDT: %C6KENV-SW2_STBY-4-LOWER_SLOT_EMPTY: The lower adjacent slot of module 5 might be empty. Airdam must be installed in that slot to be NEBS compliant
*Jun 28 00:07:36 PDT: %DIAG-SW2_STBY-6-RUN_MINIMUM: Switch 2 Module 1: Running Minimal Diagnostics...
*Jun 28 00:07:37 PDT: %SYS-SW2_STBY-3-LOGGER_FLUSHED: System was paused for 00:01:29 to ensure console debugging output.

*Jun 28 00:07:41 PDT: %DIAG-SW2_STBY-6-RUN_MINIMUM: Switch 2 Module 2: Running Minimal Diagnostics...
*Jun 28 00:08:10 PDT: %DIAG-SW2_STBY-6-DIAG_OK: Switch 2 Module 1: Passed Online Diagnostics
*Jun 28 00:08:22 PDT: %EC-SW2_STBY-5-CANNOT_BUNDLE2: Te2/1/10 is not compatible with Te1/2/10 and will be suspended (Operational flow control send of Te2/1/10 is off, Te1/2/10 is on)
*Jun 28 00:08:31 PDT: %EC-SW2_STBY-5-COMPATIBLE: Te2/1/10 is compatible with port-channel members
*Jun 28 00:08:45 PDT: %DIAG-SW2_STBY-6-DIAG_OK: Switch 2 Module 2: Passed Online Diagnostics
*Jun 28 00:08:46 PDT: %C6KENV-SW2_STBY-4-HIGHER_SLOT_EMPTY: The higher adjacent slot of module 2 might be empty. Airdam must be installed in that slot to be NEBS compliant
ELDC1C1-AG01-1 line 0 

************************ W A R N I N G ***************************
* THIS IS A PRIVATE COMPUTER SYSTEM, AND FOR AUTHORIZED USE ONLY.*
* THIS SYSTEM IS MONITORED, AND ANY UNAUTHORIZED USE MAY BE      *
* SUBJECT TO CRIMINAL PROSECUTION.                               *
* IF YOU ARE NOT AUTHORIZED, LOG OUT IMMEDIATELY!!!!!            *
******************************************************************

 

so according to the logs, it seems like the switch was reloaded or something of the nature but even the cisco tac said that it wasnt the case but couldnt determine whats causing it either.  Tac went through the config on the vss and has said that its configured correctly.

Maybe someone else has experienced this issue or if someone can point out something that I can look at...  sorry for the very long post.

 

thanks

 

 

 

 

 

 

 

16 Replies 16

Thanks for your answer. I've requested the version you suggest.
The SR is: Cisco Service Request 634166937

I've observed some kind of such behavior when VSL down on 6509(SUP: VS-720-10G) and 6807-xl (SUP-2T).
In 6509 case we booted the primary-node after SUP replacement and VSS did not come up because of VSL down. VSS came up after we had rebooted both chassis.

In 6807-xl case VSL suddenly went down after 10 month of VSS uptime and did not come up no matter what we had been doing (changing patchcords, re-seating modules, rebooting problem-node). VSS did come up after we changed IOS and rebooted both chassis. 

Review Cisco Networking for a $25 gift card