Cisco 4500 L3 switch keeps rebooting without cause.

Hi Cisco community,

This is my first post here. We have had some really weird incidents with one of our Cisco 4500E switches: it keeps rebooting without a clear cause. The system reports "System returned to ROM by power-on". We swapped the power supplies and also had an electrician check the power feed. No UPS is involved here. We have a redundant setup with two PSUs.

This setup is powering two PoE blades.

The switch did manage to write something to the crashinfo file, which I will post below together with the "show version" output.

NDC025 uptime is 7 hours, 15 minutes
System returned to ROM by power-on
System restarted at 04:45:36 CET Mon Jun 10 2019
System image file is "bootflash:cat4500e-entservicesk9-mz.122_52_SG_136"


This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

cisco WS-C4506-E (MPC8548) processor (revision 6) with 524288K bytes of memory.
Processor board ID FOX1440G6Q5
MPC8548 CPU at 1.33GHz, Supervisor 6-E
Last reset from PowerUp
2 Virtual Ethernet interfaces
52 Gigabit Ethernet interfaces
2 Ten Gigabit Ethernet interfaces
511K bytes of non-volatile configuration memory.

Configuration register is 0x2102

Below is a crashinfo excerpt:

NDC025#more crashinfo:data

Current Time: 6/10/2019 9:46:41

Last Power Failure: 11/20/2017 07:56:48
Last Reload Status: 00002000
Last Software Reset State: 00000000

Crashdump version: 1

Last crash: 06/10/2019 04:43:36

Build: 12.2(52)SG(1.36) ENTSERVICESK9
buildversion addr: 136C6708

pc=00000000 lr=00000000 msr=00000000 vector=00000E00
cr=48000024 ctr=10AD1968 xer=00000000
r0=00000000 r1=15DA6568 r2=0000001E r3=00000000
r4=01AAB440 r5=00000000 r6=00DBBA00 r7=00000003
r8=13AF0000 r9=13AFB870 r10=15DA6550 r11=00D39A04
r12=00D39A04 r13=00000000 r14=10AD1120 r15=00000000
r16=00000000 r17=00000000 r18=00000000 r19=00000000
r20=00000000 r21=00000000 r22=13E351D0 r23=13E351D0
r24=0000000A r25=00000000 r26=13E351D0 r27=0000006C
r28=15DA6590 r29=118A50E0 r30=00000000 r31=00D39A04
dec=000054F4 tbu=000000D7 tbl=40DFF96B
dar=80210020 dsisr=80210020 hid0=80004000


Traceback: 00000000

Stack frames:

Pushed stack:
15DA6560:                   00000000 00000000
15DA6570: 217C13B8 00000001 15DA65B8 10AD1F34
15DA6580: 0000000A 00000000 00000012 00000000
15DA6590: 00000000 00000000 00000000 00000000
15DA65A0: 00000000 129E0000 13AF0000 136D0000
15DA65B0: 00000012 00000000 15DA65F8 10AD11F4
15DA65C0: 00000000 00D48464 00000003 20C5A284
15DA65D0: 00000000 00000001 00000000 FFFFFFFF
15DA65E0: FFFFFFFF 00000000 00000000 00000000
15DA65F0: 00000000 00000000 15DA6600 10A8E000
15DA6600: 00000000 10A85180

Popped stack:
15DA6360:                   00000000 15DA63FC
15DA6370: 00000000 00000000 00000000 0A7CE4C9
15DA6380: 15DA6460 00000000 15DA6390 112CAE28
15DA6390: 15DA63C0 113039DC 215DD1F0 215DD1F0
15DA63A0: 15DA63B8 00000000 0A7CE4C9 0A7CE4C9
15DA63B0: 00000000 15DA6400 00000000 15DA6460
15DA63C0: 15DA63C8 11B459C8 15DA6408 11B45A54
15DA63D0: 48000024 00000001 15DA6400 21CA522C
15DA63E0: 48000024 00000000 0A7CE4C9 0A7CE4C9
15DA63F0: 15DA6580 21CA522C 0A7CE4C9 00000000
15DA6400: 15DA6408 112C9D70 15DA6418 112CC0A8
15DA6410: 15DA6428 15DA6438 15DA6580 21CA522C
15DA6420: 21CA522C 00000000 15DA64C0 110C3140
15DA6430: 15DA6448 00000001 01046458 0A7CE4C9
15DA6440: 15DA6448 15DA6478 15DA6458 101CE95C
15DA6450: 00000000 21421848 15DA6480 00000000
15DA6460: 0A7CE4C9 00000001 15DA6488 1137CD7C
15DA6470: 15DA64E0 15DA64A8 10AD1120 00000000
15DA6480: 15DA6498 00000000 00000000 21CA522C
15DA6490: 15DA6500 00000000 15DA64A8 10AD3588
15DA64A0: 1873415C 00000000 15DA64B0 110E9768
15DA64B0: 15DA64C0 110C3998 15DA6580 00000000
15DA64C0: 15DA64D8 110C4368 1F7CB180 1F7CB17E
15DA64D0: 15DA6540 1873415C 15DA6520 11E1D17C
15DA64E0: 21CA522C 0000012C 15DA6500 1F7CB170
15DA64F0: 00000000 00000000 15DA6508 17C8047C
15DA6500: 00000000 00D3727C 15DA6598 00000000
15DA6510: 00000000 129E0000 17C8047C 17C8047C
15DA6520: 00000000 1873415C 15DA6540 11BB4DB0
15DA6530: 00000000 00000000 15DA6540 00000000
15DA6540: 15DA6570 10253064 15DA6558 00000000
15DA6550: 00000000 00D39A04 15DA6578 118A50E0
15DA6560: 00000000 01AAB440


Log buffer:

CMD: ' switchport access vlan dynamic' 13:38:25 CET Sun Jun 9 2019
CMD: ' switchport mode access' 13:38:25 CET Sun Jun 9 2019
CMD: ' no snmp trap link-status' 13:38:25 CET Sun Jun 9 2019
CMD: ' spanning-tree portfast' 13:38:25 CET Sun Jun 9 2019
CMD: 'interface Vlan1' 13:38:25 CET Sun Jun 9 2019
CMD: ' no ip address' 13:38:25 CET Sun Jun 9 2019
CMD: ' shutdown' 13:38:25 CET Sun Jun 9 2019
CMD: 'interface Vlan300' 13:38:25 CET Sun Jun 9 2019
CMD: ' ip address 10.124.228.25 255.255.255.0' 13:38:25 CET Sun Jun 9 2019

CMD: 'ip route 0.0.0.0 0.0.0.0 10.124.228.254' 13:38:25 CET Sun Jun 9 2019
CMD: 'ip http server' 13:38:25 CET Sun Jun 9 2019
CMD: 'no ip http secure-server' 13:38:25 CET Sun Jun 9 2019
CMD: 'logging 10.124.229.121' 13:38:25 CET Sun Jun 9 2019
CMD: PASSWORD statement not printed
CMD: PASSWORD statement not printed
CMD: 'snmp-server enable traps snmp linkdown linkup' 13:38:25 CET Sun Jun 9 2019
CMD: 'snmp-server host 10.124.229.6 NDC1s0n ' 13:38:25 CET Sun Jun 9 2019
CMD: 'banner login ^C' 13:38:25 CET Sun Jun 9 2019
CMD: 'line con 0' 13:38:25 CET Sun Jun 9 2019
CMD: PASSWORD statement not printed
CMD: ' logging synchronous' 13:38:25 CET Sun Jun 9 2019
CMD: ' login' 13:38:25 CET Sun Jun 9 2019
CMD: ' stopbits 1' 13:38:25 CET Sun Jun 9 2019
CMD: 'line vty 0 4' 13:38:25 CET Sun Jun 9 2019
CMD: PASSWORD statement not printed
CMD: ' logging synchronous' 13:38:25 CET Sun Jun 9 2019
CMD: ' login local' 13:38:25 CET Sun Jun 9 2019
CMD: ' transport input ssh' 13:38:25 CET Sun Jun 9 2019
CMD: 'line vty 5 15' 13:38:25 CET Sun Jun 9 2019
CMD: PASSWORD statement not printed
CMD: ' logging synchronous' 13:38:25 CET Sun Jun 9 2019

CMD: ' login local' 13:38:25 CET Sun Jun 9 2019
CMD: ' transport input telnet ssh' 13:38:25 CET Sun Jun 9 2019
CMD: 'monitor session 1 filter packet-type good rx' 13:38:25 CET Sun Jun 9 2019
CMD: 'monitor session 1 destination remote vlan 666 ' 13:38:25 CET Sun Jun 9 2019
CMD: 'ntp clock-period 17181850' 13:38:25 CET Sun Jun 9 2019
CMD: 'ntp server 10.124.228.254' 13:38:25 CET Sun Jun 9 2019
CMD: 'end' 13:38:25 CET Sun Jun 9 2019

*Jun  9 12:38:25.955: %SYS-5-CONFIG_I: Configured from memory by console
*Jun  9 12:38:26.124: %SYS-5-RESTART: System restarted --
Cisco IOS Software, Catalyst 4500 L3 Switch Software (cat4500e-ENTSERVICESK9-M), Version 12.2(52)SG(1.36), CISCO INTERNAL USE ONLY ENTSERVICES PRODUCTION VERSION
Copyright (c) 1986-2009 by Cisco Systems, Inc.
Compiled Tue 12-May-09 02:18 by fklotz
*Jun  9 12:38:26.164: %SSH-5-ENABLED: SSH 2.0 has been enabled
*Jun  9 12:38:34.849: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 10.124.229.121 Port 514 started - CLI initiated
*Jun  9 12:38:47.154: %VQPCLIENT-7-NEXTSERV: Trying next VMPS 10.124.229.101
*Jun  9 12:38:48.526: %EC-5-BUNDLE: Interface Gi3/33 joined port-channel Po2
*Jun  9 12:38:48.602: %EC-5-BUNDLE: Interface Gi3/32 joined port-channel Po2
*Jun  9 12:38:50.154: %VQPCLIENT-7-NEXTSERV: Trying next VMPS 10.124.229.102
*Jun  9 12:38:51.022: %C4K_IOSINTF-5-TRANSCEIVERINSERTED: Slot=1 Port=1: Transceiver has been inserted
*Jun  9 12:38:52.022: %C4K_IOSINTF-5-TRANSCEIVERINSERTED: Slot=1 Port=2: Transceiver has been inserted
*Jun  9 12:38:55.183: %VQPCLIENT-7-NEXTSERV: Trying next VMPS 10.124.229.101
*Jun  9 12:38:57.543: %EC-5-BUNDLE: Interface Te1/1 joined port-channel Po1
*Jun  9 12:38:58.187: %VQPCLIENT-7-NEXTSERV: Trying next VMPS 10.124.229.102
*Jun  9 12:38:58.779: %EC-5-BUNDLE: Interface Te1/2 joined port-channel Po1
Jun 10 00:54:30.810: %VQPCLIENT-7-RECONF: Reconfirming VMPS responses
Jun 10 01:54:31.234: %VQPCLIENT-7-RECONF: Reconfirming VMPS responses
Jun 10 02:54:31.652: %VQPCLIENT-7-RECONF: Reconfirming VMPS response


Malloc / Free trace:
1 . pc=1179FF18 addr=21EC5170
2 . pc=1179F544 addr=21EC5170
3 . pc=3000002C addr=21EC5170
4 . pc=11B1A034 addr=21EC5170
5 . pc=11B1A02C addr=2134F1B8
6 . pc=11B19EE4 addr=2134F1B8
7 . pc=30000014 addr=2134F1B8
8 . pc=11B19EB0 addr=21EC5170
9 . pc=3000002C addr=21EC5170
10. pc=11B1A034 addr=21EC5170
11. pc=11B1A02C addr=2134F1B8
12. pc=11B19EE4 addr=2134F1B8
13. pc=30000014 addr=2134F1B8
14. pc=11B19EB0 addr=21EC5170
15. pc=3000002C addr=21EC5170
16. pc=11B1A034 addr=21EC5170
17. pc=11B1A02C addr=2134F1B8
18. pc=11B19EE4 addr=2134F1B8
19. pc=30000014 addr=2134F1B8
20. pc=11B19EB0 addr=21EC5170
21. pc=3000002C addr=21EC5170
22. pc=11B1A034 addr=21EC5170
23. pc=11B1A02C addr=2134F1B8
24. pc=11B19EE4 addr=2134F1B8
25. pc=30000014 addr=2134F1B8
26. pc=11B19EB0 addr=21EC5170
27. pc=3000002C addr=21EC5170
28. pc=11B1A034 addr=21EC5170
29. pc=11B1A02C addr=2134F1B8
30. pc=11B19EE4 addr=2134F1B8
31. pc=30000014 addr=2134F1B8
32. pc=11B19EB0 addr=21EC5170
33. pc=3000002C addr=21EC5170
34. pc=11B1A034 addr=21EC5170
35. pc=11B1A02C addr=2134F1B8
36. pc=11B19EE4 addr=2134F1B8
37. pc=30000014 addr=2134F1B8
38. pc=11B19EB0 addr=21EC5170
39. pc=3000002C addr=21EC5170
40. pc=11B1A034 addr=21EC5170
41. pc=11B1A02C addr=2134F1B8
42. pc=11B19EE4 addr=2134F1B8
43. pc=30000014 addr=2134F1B8
44. pc=11B19EB0 addr=21EC5170
45. pc=3000002C addr=21EC5170
46. pc=11B1A034 addr=21EC5170
47. pc=11B1A02C addr=2134F1B8
48. pc=11B19EE4 addr=2134F1B8
49. pc=30000014 addr=2134F1B8
50. pc=11B19EB0 addr=21EC5170
51. pc=3000002C addr=21EC5170
52. pc=11B1A034 addr=21EC5170
53. pc=11B1A02C addr=2134F1B8
54. pc=11B19EE4 addr=2134F1B8
55. pc=30000014 addr=2134F1B8
56. pc=11B19EB0 addr=21EC5170
57. pc=3000002C addr=21EC5170
58. pc=11B1A034 addr=21EC5170
59. pc=11B1A02C addr=2134F1B8
60. pc=11B19EE4 addr=2134F1B8
61. pc=30000014 addr=2134F1B8
62. pc=11B19EB0 addr=21EC5170
63. pc=3000002C addr=21EC5170
64. pc=11B1A034 addr=21EC5170

Luke Board Specific Crash Data:
MCSR: 0x10
L1CSR0: 0x10001 L1CSR1: 0x10001
SRR0: 0x0 CSRR0: 0x118A6E6C MCSRR0: 0x10000100
MCAR: 0x18A6E80
ESR: 0x0
CISR0: 0x0 CISR1: 0x20000000
L2CTL: 0xA0000000
L2CAPTDATAHI: 0x0 L2CAPTDATALO: 0x0
L2CAPTECC: 0x0
L2ERRDET: 0x0
L2ERRDIS: 0x0
L2ERRATTR: 0x0
L2ERRADDRH: 0x0 L2ERRADDRL: 0x0
L2_ERRCTL: 0x0
DDR_CAPTURE_DATA_HI: 0x4E800020 DDR_CAPTURE_DATA_LO: 0x8421FFE8
DDR_CAPTURE_ECC: 0x404
DDR_ERR_DETECT: 0x8000000C
DDR_ERR_DISABLE: 0x0
DDR_ERR_INT_EN: 0x9
DDR_CAPTURE_ATTRIBUTES: 0x102001
DDR_CAPTURE_ADDRESS: 0x18A65B0
DDR_CAPTURE_EXT_ADDRESS: 0x0
DDR_ERR_SBE: 0xFF00F9
PCI_ERR_DR: 0x0
PCI_ERR_ATTRIB: 0x0
PCI_ERR_ADDR: 0x0
PCI_ERR_EXT_ADDR: 0x0
PCI_ERR_DH: 0x0 PCI_ERR_DL: 0x0
Machine Check Interrupt Count: 1
L1 Instruction Cache Parity Errors: 0
L1 Instruction Cache Parity Errors (CPU30): 0
L1 Data Cache Parity Errors: 0

Jawa Crash Data:
Interrupt Mask: 0xC100
Interrupt: 0x0

Can anyone deduce from the crashinfo what is causing this switch to crash?


marce1000
VIP

 

No, but try a more recent software release, if applicable, and check whether the problem persists. Also configure SNMP traps for all events and follow up on the traps at the designated SNMP trap receiver.

 M.




Hi Marce1000,

Thanks for the tip; we will try that. We have now swapped the supervisor for another blade and will see if the problem persists.

Leo Laohoo
Hall of Fame

@IT Elgiganten NDC wrote:
Build: 12.2(52)SG(1.36) ENTSERVICESK9

That is either a BETA firmware or an Engineering Special release.   

According to the logs, the supervisor card is experiencing memory allocation errors. 

Upgrade the firmware of the controller or raise a TAC Case.
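
For reference, the only allocation-related data in the posted crashinfo is the Malloc / Free trace. A minimal Python sketch for summarizing it is shown here. It uses the first ten entries copied from the trace, and the helper name `parse_trace` is made up for illustration. It only counts how often each pc and addr value appears and does not interpret what the addresses mean.

```python
import re
from collections import Counter

# First ten entries copied from the "Malloc / Free trace" in the crashinfo.
TRACE_EXCERPT = """\
1 . pc=1179FF18 addr=21EC5170
2 . pc=1179F544 addr=21EC5170
3 . pc=3000002C addr=21EC5170
4 . pc=11B1A034 addr=21EC5170
5 . pc=11B1A02C addr=2134F1B8
6 . pc=11B19EE4 addr=2134F1B8
7 . pc=30000014 addr=2134F1B8
8 . pc=11B19EB0 addr=21EC5170
9 . pc=3000002C addr=21EC5170
10. pc=11B1A034 addr=21EC5170
"""


def parse_trace(text):
    """Return a list of (pc, addr) hex strings, one per trace entry."""
    return re.findall(r"pc=([0-9A-F]{8}) addr=([0-9A-F]{8})", text)


entries = parse_trace(TRACE_EXCERPT)
pc_counts = Counter(pc for pc, _ in entries)
addr_counts = Counter(addr for _, addr in entries)

print("entries parsed:", len(entries))
for addr, count in addr_counts.most_common():
    print(f"addr {addr}: {count} entries")
for pc, count in pc_counts.most_common():
    if count > 1:
        print(f"pc {pc} repeats {count} times")
```

In the excerpt only two addr values appear, and the full trace keeps cycling through the same few entries. A repeating pattern like this does not by itself show an allocation error.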

Hi Leo Laohoo

Thank you so much for your answer! We have now swapped the supervisor module with the one from another switch to see if the problem persists.
The switch usually only reboots under load during the weekend, when we are running backups, so I think you might be correct.

What part exactly is showing you that it is a memory allocation error?
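
One way to look for memory-related evidence is to pull the error registers out of the "Luke Board Specific Crash Data" section of the crashinfo. The Python sketch below is only an illustration. The register values are copied from the crashinfo above, but the list of registers treated as "error detect" registers and the bit names for DDR_ERR_DETECT are assumptions: the masks follow the ERR_DETECT definitions used by the Linux EDAC driver for Freescale MPC85xx DDR controllers and have not been confirmed against this supervisor's documentation.

```python
import re

# Values copied from the "Luke Board Specific Crash Data" section of the post.
CRASHINFO_EXCERPT = """\
MCSR: 0x10
L2ERRDET: 0x0
DDR_ERR_DETECT: 0x8000000C
DDR_ERR_INT_EN: 0x9
DDR_CAPTURE_ADDRESS: 0x18A65B0
DDR_ERR_SBE: 0xFF00F9
PCI_ERR_DR: 0x0
Machine Check Interrupt Count: 1
"""

# ASSUMPTION: these are the registers that latch detected errors.
DETECT_REGISTERS = ("L2ERRDET", "DDR_ERR_DETECT", "PCI_ERR_DR")

# ASSUMPTION: bit masks as defined for ERR_DETECT in the Linux EDAC driver
# for Freescale MPC85xx DDR controllers; not verified for this platform.
DDR_ERR_DETECT_BITS = {
    0x80000000: "multiple memory errors",
    0x00000008: "multi-bit ECC error",
    0x00000004: "single-bit ECC error",
    0x00000001: "memory select error",
}


def parse_registers(text):
    """Return {name: value} for every 'NAME: 0x<hex>' pair in the text."""
    pairs = re.findall(r"(\w+): 0x([0-9A-Fa-f]+)", text)
    return {name: int(value, 16) for name, value in pairs}


def decode_bits(value, bit_names):
    """List the labels of all bits from bit_names that are set in value."""
    return [label for mask, label in bit_names.items() if value & mask]


regs = parse_registers(CRASHINFO_EXCERPT)
nonzero = {name: regs[name] for name in DETECT_REGISTERS if regs.get(name)}
flags = decode_bits(regs["DDR_ERR_DETECT"], DDR_ERR_DETECT_BITS)

for name, value in nonzero.items():
    print(f"{name} = 0x{value:08X}")
print("DDR_ERR_DETECT flags (assumed layout):", ", ".join(flags))
```

If that bit layout applies here, the non-zero DDR_ERR_DETECT value would point at ECC errors reported by the DDR memory controller rather than at a software allocation failure, which is worth mentioning when raising the TAC case.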