Need help with Cisco 7513: System returned to ROM by bus error at PC 0x400F1578, address 0x62B38F16

crbernabe
Level 1

Hi

We are a small internet provider running two (2) identical Cisco 7513 routers for our BGP connections, with different upstream providers on each unit. Both routers have started to crash and reload at almost the same time. The problem began about 2 weeks ago and now recurs daily, about twice a day on each unit. The units have been operating for years with the same configuration and settings, and we had never experienced an incident like this until 2 weeks ago.

This is the current state of the router, from sh ver:

 

Cisco Internetwork Operating System Software
IOS (tm) RSP Software (RSP-ISV-M), Version 12.3(1), RELEASE SOFTWARE (fc3)
Copyright (c) 1986-2003 by cisco Systems, Inc.
Compiled Thu 15-May-03 04:53 by dchih
Image text-base: 0x4001095C, data-base: 0x41DA2000

ROM: System Bootstrap, Version 11.1(8)CA1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
BOOTLDR: GS Software (RSP-BOOT-M), Version 11.1(8)CA1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)

bgpr2 uptime is 26 minutes
System returned to ROM by bus error at PC 0x400F1578, address 0x62B38F16
System image file is "slot0:rsp-isv-mz.123-1.bin"

cisco RSP4 (R5000) processor with 262144K/2072K bytes of memory.
R5000 CPU at 200Mhz, Implementation 35, Rev 2.1, 512KB L2 Cache
Last reset from power-on
G.703/E1 software, Version 1.0.
G.703/JT2 software, Version 1.0.
X.25 software, Version 3.0.0.
Bridging software.
Chassis Interface.
5 VIP2 R5K controllers (6 FastEthernet)(16 Serial).
6 FastEthernet/IEEE 802.3 interface(s)
16 Serial network interface(s)
123K bytes of non-volatile configuration memory.

20480K bytes of Flash PCMCIA card at slot 0 (Sector size 128K).
8192K bytes of Flash internal SIMM (Sector size 256K).

Slave in slot 7 is running Cisco Internetwork Operating System Software
IOS (tm) RSP Software (RSP-DW-M), Version 12.3(1), RELEASE SOFTWARE (fc3)
Copyright (c) 1986-2003 by cisco Systems, Inc.
Compiled Thu 15-May-03 05:01 by dchih
Slave: Loaded from system
Slave: cisco RSP4 (R5000) processor with 262144K bytes of memory.

Configuration register is 0x2102

 

 

As additional information: when the problem started, the console also began logging messages like the ones below. I'm not sure whether this is an attack from a single IP block aimed at random addresses across our whole network:

 

*Aug 27 04:49:33.051: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.34(59134) -> 210.16.61.142(12345), 1 packet
*Aug 27 04:49:38.435: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.75(43036) -> 210.16.63.169(12345), 1 packet
*Aug 27 04:49:44.167: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.76(51418) -> 210.16.63.170(12345), 1 packet
*Aug 27 04:49:45.319: %SEC-6-IPACCESSLOGRL: access-list logging rate-limited or missed 1 packet
*Aug 27 04:50:03.439: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.147(45838) -> 210.16.61.6(12345), 1 packet
*Aug 27 04:50:17.959: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.74(52141) -> 210.16.14.13(12345), 1 packet
*Aug 27 04:50:43.327: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.230(53566) -> 210.16.0.18(12345), 1 packet
*Aug 27 04:50:47.135: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.118(36652) -> 210.16.0.155(12345), 1 packet
*Aug 27 04:50:51.015: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.141(50522) -> 210.16.62.242(12345), 1 packet
*Aug 27 04:50:55.831: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.32(58788) -> 210.16.20.178(12345), 1 packet
*Aug 27 04:51:14.351: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.213(44437) -> 210.16.61.72(12345), 1 packet
*Aug 27 04:51:41.171: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.29(38951) -> 210.16.61.137(12345), 1 packet
*Aug 27 04:51:56.979: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.27(45956) -> 210.16.62.128(12345), 1 packet
*Aug 27 04:52:00.779: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.165(52954) -> 210.16.5.167(12345), 1 packet

 

 

Although filters are applied, I'm not sure whether I still need our upstream provider to assist us in filtering the source IP block.
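To check whether the denied traffic really comes from a single source block, the log lines can be tallied by source prefix and destination port. Below is a minimal Python sketch, assuming the console log has been saved as plain text. The function name `summarize` and the /24 grouping are illustrative choices, not anything provided by IOS.

```python
import ipaddress
import re
from collections import Counter

# Matches %SEC-6-IPACCESSLOGP lines in the format shown in the console log above.
LOG_RE = re.compile(
    r"%SEC-6-IPACCESSLOGP: list (?P<acl>\S+) denied (?P<proto>\w+) "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\((?P<sport>\d+)\) -> "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\((?P<dport>\d+)\)"
)


def summarize(lines, prefix_len=24):
    """Count denied packets per source prefix and per destination port."""
    by_prefix = Counter()
    by_dport = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if not m:
            continue  # skips rate-limit notices and unrelated messages
        # strict=False lets a host address be widened to its enclosing prefix.
        net = ipaddress.ip_network(f"{m['src']}/{prefix_len}", strict=False)
        by_prefix[str(net)] += 1
        by_dport[int(m["dport"])] += 1
    return by_prefix, by_dport


# Four lines copied from the log excerpt above (one is a rate-limit notice).
SAMPLE = [
    "*Aug 27 04:49:33.051: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.34(59134) -> 210.16.61.142(12345), 1 packet",
    "*Aug 27 04:49:38.435: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.75(43036) -> 210.16.63.169(12345), 1 packet",
    "*Aug 27 04:49:44.167: %SEC-6-IPACCESSLOGP: list 130 denied tcp 64.125.239.76(51418) -> 210.16.63.170(12345), 1 packet",
    "*Aug 27 04:49:45.319: %SEC-6-IPACCESSLOGRL: access-list logging rate-limited or missed 1 packet",
]

if __name__ == "__main__":
    prefixes, ports = summarize(SAMPLE)
    print(dict(prefixes))  # {'64.125.239.0/24': 3}
    print(dict(ports))     # {12345: 3}
```

In the excerpt above, every denied packet has a source in 64.125.239.0/24 and destination TCP port 12345. Running the same tally over a longer capture would show whether that stays true.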

 

Your kind assistance is greatly appreciated. Good day.

10 Replies

Mark Malone
VIP Alumni

Are you still having crashes? Did a crash file get written to flash when it occurred? If it's happening constantly, you need to find out whether the fault is in the hardware or the software. Since you're saying it's happening to 2 routers, I would say you have triggered a bug. You could get the crash file and give it to TAC to diagnose, or just upgrade the software.

System returned to ROM by bus error at PC 0x400F1578, address 0x62B38F16

First of all, thank you for your response, sir.

 

Anyway, yes sir, we're still having frequent crashes. Today, for example, between 5 am and 3:20 pm we have already had 4 crashes.

As for the crashinfo, I was able to retrieve one from one of the routers, which I've attached.

I'm hoping to find upgraded firmware for this model.

Hi, OK, can you post the output of show stacks from the device, please? I can't fully decipher a crash file; I don't have the tools that TAC has. If someone from Cisco is on here, maybe they can, since they have internal tools, not available to the public, which help with that. But we should be able to see something in show stacks. If we can't identify it, it should be raised with TAC as well, if you have support in place. If not, you can try upgrading the software to avoid it. You're running an ED software release, which can be unstable; there should be an MD release available. For now, post the show stacks output and we'll try to identify whether it's hardware or software. I'm still guessing software, as 2 routers were hit with the same issue simultaneously.

Also, I see interrupts in the crash output, so show stacks will hopefully help identify that.

Sir,

 

I've attached the sh stacks output from both routers. The crashinfo that I sent came from bgpr1.

 

Last night I was able to set up a service router to replace bgpr1. I moved the connections from bgpr1 over to the service router and left bgpr1 up and running. When I checked this morning, the service router had been up for about 16 hours, and I noticed that bgpr1 had been up for the same duration. Just some added info.

It's a software issue; you need to upgrade to newer software. Without TAC I can't tell you exactly which image, but I would just move up to newer software. These devices have been EOL since 2012, so you won't get any support from TAC on them. The image you're running is actually deferred on the website as well, meaning it has too many bugs to keep in production. I see a 12.4 there; I would go for that.

http://www.cisco.com/c/en/us/support/docs/ios-nx-os-software/ios-software-releases-121-mainline/7949-crashes-buserror-troubleshooting.html

ERROR: This router was last restarted by a bus error: 'bus error at PC 0x400F1578, address 0x62B38F16'. The system encounters a bus error when the processor tries to access a memory location that either does not exist (a software error) or does not respond properly (a hardware problem).

TRY THIS: Issue a 'show region' and check if the address location (the 'address' part of the bus error - 0x62B38F16) falls within an existing address range.

If the address reported by the bus error does not fall within the ranges displayed in the 'show region' output, this means that the router was trying to access an invalid address. This indicates that it is a Cisco IOS Software problem. Paste the output from the 'show stacks' command to decode the output and identify the Cisco IOS Software bug that is causing the bus error.

If the address falls within one of the ranges in the 'show region' output, it means that the router was accessing a valid memory address, but the hardware corresponding to that address is not responding properly. This indicates a hardware problem. Try reseating the hardware belonging to this address range before attempting to replace it.

Use the Troubleshooting Bus Error Crashes document for additional assistance.

I tried the suggested debugging method on bgpr1, but no region output was generated, so I tried it on bgpr2 instead, with these results:

 

System returned to ROM by bus error at PC 0x400F1578, address 0x6346C462

bgpr2#sh region
Region Manager:

      Start         End     Size(b)  Class  Media  Name
 0x40000000  0x4FFFFFFF   268435456  Local  R/W    main
 0x4001095C  0x41DA1999    31002686  IText  R/O    main:text
 0x41DA2000  0x4285B41F    11244576  IData  R/W    main:data
 0x4285B420  0x42B546DF     3117760  IBss   R/W    main:bss
 0x42B546E0  0x42B746DF      131072  Local  R/W    main:fastheap
 0x42B746E0  0x4FFFFFFF   222869792  Local  R/W    main:heap
 0x80000000  0x87FFFFFF   134217728  Local  R/W    main:(main_k0)
 0x88000000  0x88001FFF        8192  Iomem  REG    qa_k0
 0x88002000  0x881FFFFF     2088960  Iomem  R/W    memd:(memd_k0)
 0xA0000000  0xA7FFFFFF   134217728  Local  R/W    main:(main_k1)
 0xA8000000  0xA8001FFF        8192  Iomem  REG    qa_k1
 0xA8002000  0xA81FFFFF     2088960  Iomem  R/W    memd:(memd_k1)
 0xE0000000  0xE0001FFF        8192  Iomem  REG    qa
 0xE0002000  0xE01FFFFF     2088960  Iomem  R/W    memd
 0xE8000000  0xE8001FFF        8192  Iomem  REG    qa:writethru
 0xF0002000  0xF01FFFFF     2088960  Iomem  R/W    memd:(memd_bitswap)
 0xF8002000  0xF81FFFFF     2088960  Iomem  R/W    memd:(memd_uncached)


bgpr2#


bgpr2#sh stacks
Minimum process stacks:
 Free/Size   Name
 5672/6000   HPI Logger
 5500/6000   Clock Server
11200/12000  Router Init
 5128/12000  Init
10444/12000  Slave Server
 5428/6000   RADIUS INITCONFIG
 5684/6000   MDFS Reload
 4944/6000   BGP Open
 2460/3000   RSP memory size check
 9720/12000  Virtual Exec

Interrupt level stacks:
Level    Called Unused/Size  Name
  1    31908504   7972/9000  Network Interrupt
  2      111444   7836/9000  Network Status Interrupt
  3           0   8692/9000  OIR interrupt
  4           0   9000/9000  PCMCIA Interrupt
  5        7738   8644/9000  Console Uart
  6           4   9000/9000  Error Interrupt
  7     3393082   8604/9000  NMI Interrupt Handler

System was restarted by bus error at PC 0x400F1578, address 0x6346C462
RSP Software (RSP-ISV-M), Version 12.3(1), RELEASE SOFTWARE (fc3)
Compiled Thu 15-May-03 04:53 by dchih
Image text-base: 0x4001095C, data-base: 0x41DA2000


Stack trace from system failure:
FP: 0x4333FB60, RA: 0x400F1578
FP: 0x4333FB88, RA: 0x40692F1C
FP: 0x4333FBD0, RA: 0x4069EB34
FP: 0x4333FBF0, RA: 0x40687F94
FP: 0x4333FC40, RA: 0x40688C20
FP: 0x4333FC68, RA: 0x40688E20


***************************************************
******* Information of Last System Crash **********
***************************************************


The last crashinfo failed to be written.
Please verify the exception crashinfo configuration
the filesytem devices, and the free space on the
filesystem devices.
Using crashinfo_FAILED.

%Error opening crashinfo_FAILED (File not found)
bgpr2#

 

 

Seems to be a software issue as suggested.
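The 'show region' check described in the troubleshooting text can also be done mechanically. Here is a minimal Python sketch, with the ranges copied from the sh region output on bgpr2 above. `find_regions` is a hypothetical helper name used only for illustration.

```python
# (start, end, name) rows copied from the `sh region` output on bgpr2.
REGIONS = [
    (0x40000000, 0x4FFFFFFF, "main"),
    (0x4001095C, 0x41DA1999, "main:text"),
    (0x41DA2000, 0x4285B41F, "main:data"),
    (0x4285B420, 0x42B546DF, "main:bss"),
    (0x42B546E0, 0x42B746DF, "main:fastheap"),
    (0x42B746E0, 0x4FFFFFFF, "main:heap"),
    (0x80000000, 0x87FFFFFF, "main:(main_k0)"),
    (0x88000000, 0x88001FFF, "qa_k0"),
    (0x88002000, 0x881FFFFF, "memd:(memd_k0)"),
    (0xA0000000, 0xA7FFFFFF, "main:(main_k1)"),
    (0xA8000000, 0xA8001FFF, "qa_k1"),
    (0xA8002000, 0xA81FFFFF, "memd:(memd_k1)"),
    (0xE0000000, 0xE0001FFF, "qa"),
    (0xE0002000, 0xE01FFFFF, "memd"),
    (0xE8000000, 0xE8001FFF, "qa:writethru"),
    (0xF0002000, 0xF01FFFFF, "memd:(memd_bitswap)"),
    (0xF8002000, 0xF81FFFFF, "memd:(memd_uncached)"),
]


def find_regions(address, regions=REGIONS):
    """Return the names of all regions whose [start, end] range contains address."""
    return [name for start, end, name in regions if start <= address <= end]


if __name__ == "__main__":
    # Faulting addresses from the two bus errors reported on bgpr2.
    print(find_regions(0x62B38F16))  # []
    print(find_regions(0x6346C462))  # []
    # The PC at the time of the crash, for comparison.
    print(find_regions(0x400F1578))  # ['main', 'main:text']
```

Neither faulting address falls inside any listed range, while the PC sits inside main:text. Going by the rule quoted above, that points to an invalid memory access by software rather than unresponsive hardware.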

 

I will try to get hold of the 12.4 image, which is around 28 MB, and a bigger linear flash card, as my current card is just barely big enough to hold the 22 MB 12.3 image.

12.4.3i or j needs the same DRAM and flash as what you're running now. They're MD releases as well, so they will be a lot more stable; those ED editions can be dodgy.

 

I would just move off that image as soon as possible to avoid another production reboot. You could always test one router for a day, and if it's stable you should be good to change the second router.

Sir,

I'll try to work on your suggestion, but in case I can't get hold of 12.4.3i/j, is 12.4.3a stable?

It's an MD release as well, so I would say you would be a lot better off. MD releases are maintenance deployment releases and are fully tested; ED releases are early deployment releases, which can contain more bugs and are not as thoroughly tested.

Sir,

Thank you for the feedback and for taking the time to assist. I'll work on this ASAP and report back after the upgrade.
