01-15-2016 02:19 AM - last edited on 03-25-2019 04:37 PM by ciscomoderator
Hello,
A few days ago one of the VSS members, local switch 1, crashed and now local switch 2 is active. The local SPAN session (VLAN to local port) on local switch 1 kept on working but after a few days it stopped without any log message. Any ideas how to debug this or an idea what is going wrong?
Some info:
Cisco IOS Software, s2t54 Software (s2t54-ADVENTERPRISEK9-M), Version 15.0(1)SY5, RELEASE SOFTWARE (fc4)
ROM: System Bootstrap, Version 12.2(50r)SYS3, RELEASE SOFTWARE (fc1)
Session 1
---------
Type : Local Session
Status : Admin Enabled
Description : monitor
Source VLANs :
Both : 100
Destination Ports : Te1/1/7
Egress SPAN Replication State:
Operational mode : Centralized
Configured mode : Centralized (default)
Regards,
Andre
01-15-2016 05:09 AM
Hey Andre
What exactly was the crash? Did it produce a crash file, and what does show ver give as the reload reason for that switch? If the destination port is on the switch that crashed, that's most likely related to your issue. It's hard to tell why it happened without seeing some logging or debugging, but the crash may explain it. The fact that there are no logs is strange. Did you just notice it stop on the PC/server side where Wireshark is? Did you re-apply the SPAN to get it working, or is it still down?
Anyway, for now, this is how to debug SPAN on VSS:
VSS-CORE#debug monitor ?
all All SPAN debugging messages
capture Show Capture tracing
errors Show SPAN error detail
erspan Show ERSPAN tracing
idb-update Show SPAN IDB update traces
info Show SPAN Informational tracing
list Show SPAN port and VLAN list tracing
notifications Show SPAN notifications
platform Show SPAN platform tracing
redundancy Show SPAN Redundancy tracing
requests Show SPAN requests
scp Show SPAN SCP tracing
snmp Show SPAN SNMP tracing
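As a rough sketch of how the options above might be used for this case (the choice of the errors and redundancy options and session number 1 are assumptions based on the output posted in this thread, not a verified procedure), something like the following could be tried from an exec session:

```
! Assumption: run on the active VSS switch; "terminal monitor" is only needed on a vty (SSH/Telnet) session
VSS-CORE# terminal monitor
! Enable only the SPAN debugs that look relevant after a switchover
VSS-CORE# debug monitor errors
VSS-CORE# debug monitor redundancy
! Check the state of the session while the debugs are running
VSS-CORE# show monitor session 1 detail
! Turn all debugging off again when done
VSS-CORE# undebug all
```

Starting with a narrow set of debugs rather than debug monitor all keeps the output manageable on a production switch.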
01-15-2016 05:35 AM
Hey Mark,
thanks for your time.
#sh ver | in rel
Compiled Tue 20-Aug-13 08:59 by prod_rel_team
Last reload reason: Reload Command
This is a snippet from the crash file (also uploaded the complete crash file):
15:49:18 CET Wed Dec 30 2015: Unexpected exception to CPU: vector 1400, PC = 0x6CAA774 , LR = 0x6CAC2D0
-Traceback= 6CAA774 6CAC2D0 6CADA78 6CA31F0 6CA3D60 6CA3DD8 6CA3EF4 5020380 5019C34
CPU Register Context:
MSR = 0x00029200 CR = 0x42000084 CTR = 0x08D85ECC XER = 0x00000000
R0 = 0x06CAC2D0 R1 = 0x159B2E30 R2 = 0xFFF7FFF7 R3 = 0x00000000
R4 = 0x56FDFB80 R5 = 0x159B2EB4 R6 = 0x157327DC R7 = 0x00000000
R8 = 0x000000EE R9 = 0x00000000 R10 = 0x00000000 R11 = 0x00000000
R12 = 0x22000084 R13 = 0x0121F000 R14 = 0x06CA3E0C R15 = 0x00000000
R16 = 0x00000000 R17 = 0x00000000 R18 = 0x00000000 R19 = 0x00000000
R20 = 0x00000000 R21 = 0x00000000 R22 = 0x00000000 R23 = 0x00000000
R24 = 0x00000000 R25 = 0x00000000 R26 = 0x00000000 R27 = 0x0EEF0000
R28 = 0x00000000 R29 = 0x18964110 R30 = 0x00000000 R31 = 0x56FDFB80
TEXT_START : 0x04000144
DATA_START : 0x0C000000
A debug shows the following information:
Jan 15 14:14:06: SW2: SPAN-info:span_handle_egress_span_timer
Jan 15 14:14:06: SW2: SPAN-info:span_start_egress_span_timer
Jan 15 14:14:16: SW2: SPAN-info:span_handle_egress_span_timer
Jan 15 14:14:16: SW2: SPAN-info:span_start_egress_span_timer
I recreated the monitor session and used another source VLAN. I also used source interfaces instead of a source VLAN, but to no avail. Maybe a complete reload will solve the issue.
Cheers,
Andre
01-15-2016 05:47 AM
Hello
Have you checked the crashinfo log from switch 1?
res
Paul
01-15-2016 06:26 AM
Hello Paul,
I did attach the crash log file in my answer to Mark.
Thanks,
Andre
01-15-2016 06:53 AM
Hey Andre, just to confirm: this crash occurred on Dec 30th? In your earlier post you said it occurred a few days ago, so we may have the wrong file here.
Did you log in directly to switch 1 and collect the file from there, or is it just that your clocks may be set wrong and this is the actual file generated?
01-15-2016 07:00 AM
Hey Mark,
Nice catch, but the crash did indeed occur on December 30th at 15:49 CET. Afterwards SPAN kept working until Sunday, January 3rd (it stopped around 09:50 CET).
Cheers,
Andre
01-15-2016 07:37 AM
Hey, not sure if you have TAC support, but you may want to get that checked. It looks like you had a serious software failure. The memory side looks OK, but I can't see exactly what triggered it; most likely some irregular bug. TAC will be able to tell you exactly what you hit. If you don't have support with them and it was my switch, I would move to the safe harbour version for that platform, which Cisco currently recommends as the most stable and most tested IOS. That said, your switch has been stable for 2 weeks, so it could also be a transient one-off problem and you could see how it holds. But the fact that your SPAN is still not working and the switch has already reloaded itself does not look good.
Recommended by Cisco
s2t54-adventerprisek9-mz.SPA.151-2.SY6.bin
15:49:18 CET Wed Dec 30 2015: Unexpected exception to CPU: vector 1400, PC = 0x6CAA774 , LR = 0x6CAC2D0
-Traceback= 6CAA774 6CAC2D0 6CADA78 6CA31F0 6CA3D60 6CA3DD8 6CA3EF4 5020380 5019C34
01-15-2016 09:58 AM
Hey,
OK, thanks for your insights. I will contact support to create a TAC case and will report back if there is any news on the case.
Have a nice weekend.
Andre
01-18-2016 12:32 AM
Yes, if you don't mind, let us know what TAC says the cause is. Thanks.
02-01-2016 02:23 AM
Well, our support contact did not create a TAC case, but advised upgrading to s2t54-adventerprisek9-mz.SPA.150-1.SY9.bin. If any other news comes in I'll let you know.
02-02-2016 02:57 AM
Strange that they moved to that image. Even though it's in the same train, it's not the recommended image for that platform. Hopefully the bug is not in that version too; that's the risk you take without running it by TAC and knowing exactly what triggered it. Good luck.