07-22-2010 10:38 AM - edited 03-20-2019 07:53 PM
Hi,
Bug Toolkit says: "It contains proprietary information that cannot be disclosed at this time." Why?
-Sergio
07-23-2010 09:10 AM
The problem is that the bug was originally found in an internal testbed on a non-shipping branch, so it belongs to an internal group ("labtrunk") which is not externally visible.
However, the bug that the testing team uncovered seems to also be present in shipping branches (as evidenced by "sys" bugs CSCtb55433 and CSCsl02596 which declare themselves duplicates of CSCsg43532.)
Nowhere in the process of being marked as a duplicate was the bug reclassified as shipping/production ("sys"), which would make it visible in Bug Toolkit (this is a process hole). Similarly, Bug Toolkit does not recognize that this "labtrunk" bug is duplicated by "sys" bugs (a logic hole). If the Bug Toolkit team can take action here, they can either reclassify this bug as "sys" or change Bug Toolkit to recognize that this bug has a valid release note and is duplicated by a "sys" bug, and should therefore be visible. The latter is where the Bug Toolkit team has the most control.
Summary: you have uncovered a combination process/logic hole in the system. Thanks.
And, in case you care, the (current) release note for CSCsg43532 is pasted below. Bottom line: the bug is largely cosmetic and should have no operational impact; you can ignore the message.
Symptom:
The user sees a TIMERNEG message related to VTP bulk sync on the standby supervisor console as the standby supervisor is booting up. (Please note that the reason for the supervisor rebooting is not related to this symptom; the only requirement is that the standby supervisor is coming up in SSO mode.)
No traffic should be impacted by this symptom.
Conditions:
The chassis must be set to SSO mode (thus triggering a VTP bulk sync upon standby supervisor boot-up) for this TIMERNEG message to show up.
The active supervisor must have been up for some time for this effect to show up, and even then, it would appear to be "sporadic" (see below).
If the TIMERNEG message does show up on the standby console, rebooting just the standby at that time will very likely show the message again.
Workaround:
There is no traffic or service impact, so there is no need to work around this TIMERNEG message.
Further Problem Description:
Explanation of the word "sporadic" above: analysis shows that the length of time the active supervisor has been up determines whether this TIMERNEG message shows up. Internally, the standby supervisor hits an arithmetic overflow condition such that during every other 27-day period (1st, 3rd, 5th, ...) this message will show up upon boot, except that during every other 270-day period (2nd, 4th, 6th, ...) the message will not show up at all upon boot.
The root cause is a combination of two different bugs, which together result in the TIMERNEG message.
1) snmp_sysUpTime() is not being synchronized across the two supervisors because the sysuptime_sync subsystem was not incorporated into the Galaxy builds. This software subsystem was added specifically to synchronize snmp_sysUpTime(), but the GSBU-specific build control file that specifies the subsystems to be part of the build did not have this subsystem listed.
2) The SetPeriodicTimeout() function did not properly protect against negative time values passed to mgd_timer_start(). The parameter to mgd_timer_start() is unsigned, but further down the call chain the parameter turns into a signed integer and is explicitly checked for negative values at that point. There is protection in SetPeriodicTimeout(), but it only guards against the variable delta_t being negative, whereas the value passed to mgd_timer_start() is 10 * delta_t. Therefore, it is still possible to pass a negative value to mgd_timer_start() through SetPeriodicTimeout().
Ern
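To make point 2 concrete, here is a minimal sketch of a guard that checks the scaled value rather than only delta_t. The helper name centisecs_to_ms_clamped is hypothetical and this is not the IOS source or the fix that actually shipped; it only illustrates clamping before the multiplication by 10, assuming a 32-bit signed millisecond value further down the call chain.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration, not IOS code):
 * convert a delay in centiseconds to milliseconds so that the result
 * is never negative and never overflows a signed 32-bit integer.
 */
static int32_t centisecs_to_ms_clamped(int32_t delta_cs)
{
    int32_t ms;

    if (delta_cs < 0) {
        ms = 0;                    /* same clamp the original code applies */
    } else if (delta_cs > INT32_MAX / 10) {
        ms = INT32_MAX;            /* delta_cs * 10 would not fit: saturate */
    } else {
        ms = delta_cs * 10;        /* safe: the product fits in 32 bits */
    }

    assert(ms >= 0);               /* the value handed to the timer is never negative */
    return ms;
}
```

Whether saturating at INT32_MAX or rejecting the request is the better policy depends on the caller; the point is only that the range check has to cover the value after scaling.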
#SR:604741365
#Cat4507R which is used in customer environment, switched over with redundant SUP using SSO.
#After the switchover, traceback generated.
#Components
------------------ show module ------------------
Load for five secs: 88%/0%; one minute: 43%; five minutes: 23%
Time source is NTP, 05:45:17.669 JST Fri Nov 3 2006
Chassis Type : WS-C4507R
Power consumed by backplane : 40 Watts

Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1      2 Supervisor IV 1000BaseX (GBIC)         WS-X4515           JAE1001T9M4
 2      2 Supervisor IV 1000BaseX (GBIC)         WS-X4515           JAE1001T66H
 3     48 10/100/1000BaseT (RJ45)                WS-X4548-GB-RJ45   JAE09117L4T
 4     48 10/100/1000BaseT (RJ45)                WS-X4548-GB-RJ45   JAE09107ASK
 6      6 1000BaseX (GBIC)                       WS-X4306-GB        JAE0950RRLY
 7      6 1000BaseX (GBIC)                       WS-X4306-GB        JAE0950RJC0

M  MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 0016.4788.4c40 to 0016.4788.4c41 5.0 12.2(20r)EW1 12.2(31)SG       Ok
 2 0016.4788.4c42 to 0016.4788.4c43 5.0 12.2(20r)EW1 12.2(31)SG,      Ok
 3 0013.7f89.4700 to 0013.7f89.472f 2.0                               Ok
 4 0013.7f65.a7e0 to 0013.7f65.a80f 2.0                               Ok
 6 0015.630c.5d98 to 0015.630c.5d9d 4.1                               Ok
 7 0015.630c.5e16 to 0015.630c.5e1b 4.1                               Ok

Mod  Redundancy role     Operating mode      Redundancy status
----+-------------------+-------------------+-------------------
 1   Active Supervisor   SSO                 Active
 2   Standby Supervisor  SSO                 Standby hot

#Detail is below;
1. %C4K_REDUNDANCY-4-KEEPALIVE_WARNING generated on both Active SUP(slot2) and Standby SUP(slot1).
2. Active SUP(slot2) reloaded and switchover occurred. Standby SUP(slot1) became Active SUP(slot1) normally and Active SUP(slot2) became Standby SUP(slot2) normally.
3. After that, Standby SUP(slot2) generated %SYS-3-TIMERNEG with traceback.

Nov 3 04:26:08.618 JST: %SYS-3-TIMERNEG: Cannot start timer (0x179D6718) with negative offset (-724680682).
-Process= "chkpt message handler", ipl= 0, pid= 50
-Traceback= 10F846FC 10F8257C 10EA5A5C 10EA5ACC 1028575C 10284628 10284D98 102851AC 10E77990 107C0FB4 107BF988 107BCBFC 107B7998 105A7A7C 1059F3A8

#Please also refer for "related symptom (slot1 log)" and "related symptom (slot2 log)" for the log.
From a quick gdb session, I've seen a problem in the following code:

int SetPeriodicTimeout (SR_UINT32 when, SR_UINT32 period, void *info)
{
    TimerEvent *tep;
    int delta_t;
    ...
    /*
     * calculate the delta time between "now" and when the timer
     * should fire (in centiseconds)
     */
    delta_t = (int) when - snmp_sysUpTime();
    if (delta_t < 0)
        delta_t = 0;
    tep->period = period * 10;
    tep->info = info;
    mgd_timer_start(&tep->timer, delta_t * 10);
    ...
}

From gdb:
---------
(cisco-6.4-ppc-gdb) SetPeriodicTimeout (when=233931843, period=0, info=0x1b31de40) at ../snmp/snmp_timer.c:67
(cisco-6.4-ppc-gdb)p when
$18 = 233931843
(cisco-6.4-ppc-gdb)p delta_t
$19 = 233931843
(cisco-6.4-ppc-gdb)p (long) (delta_t * 10)
$20 = -1955648866
(cisco-6.4-ppc-gdb)

The value taken by delta_t is too big, so the result of (delta_t * 10) is negative, hence the error message. Not sure, though, why snmp_sysUpTime() is returning 0.
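The gdb numbers can be reproduced outside IOS with a small sketch. It assumes the multiplication is carried out in 32 bits and the result is then read as a signed two's-complement value; the function name scale_cs_to_ms_wrapping is hypothetical. Unsigned arithmetic is used for the multiply because signed overflow is undefined behavior in standard C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical illustration (not IOS code): multiply a centisecond delay
 * by 10, keep only the low 32 bits, and reinterpret them as signed.
 */
static int32_t scale_cs_to_ms_wrapping(int32_t delta_cs)
{
    assert(delta_cs >= 0);   /* the original code has already clamped negatives to 0 */

    /* Unsigned multiplication wraps modulo 2^32, which is well defined. */
    uint32_t low = (uint32_t)delta_cs * 10u;

    /* Map the upper half of the unsigned range onto negative values. */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)((int64_t)low - 4294967296LL);
}
```

With delta_cs = 233931843 (the value gdb printed for delta_t), the product 2339318430 exceeds INT32_MAX, and the function returns -1955648866, matching $20 in the session above. Any delta_t above INT32_MAX / 10 = 214748364 centiseconds is enough to push the scaled value out of the positive 32-bit range.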
07-23-2010 10:01 AM
OK, I think I solved it:
When the input is limited to the relevant information, Output Interpreter (OI) identifies CSCec51750 as the problem.
The release note is:
Symptoms: A router that is configured for HTTP secure-server may
reload unexpectedly because of an internal memory corruption.
Conditions: IOS HTTP Secure server enabled
Workaround: Disable HTTPS with "no ip http secure-server"
Also fixed in 12.2(33)SXH3
How did I do that?
The part of the crashinfo that catches my attention is the set of messages right before the restart:
Feb 3 14:28:54.120 Mexico: %SYS-2-NOPROCESS: No such process 218959117
-Process= "HTTP CORE", ipl= 0, pid= 177
-Traceback= 4105702C 4105731C 40A64E64 40A655A0 407E097C 407D2184 407D24E4 410129B4 410129A0
%ALIGN-1-FATAL: Illegal access to a low address 14:28:54 Mexico Wed Feb 3 2010
addr=0x66C, pc=0x40A64EF8, ra=0x40A655A0, sp=0x50FBDFA8
%ALIGN-1-FATAL: Illegal access to a low address 14:28:54 Mexico Wed Feb 3 2010
addr=0x66C, pc=0x40A64EF8, ra=0x40A655A0, sp=0x50FBDFA8
14:28:54 Mexico Wed Feb 3 2010: TLB (store) exception, CPU signal 10, PC = 0x40A64EF8
So, that's the cause of the crash: it looks like something is accessing the box over HTTP ("HTTP CORE") in a weird way that crashes it.
Limiting the OI input to:
-----------
Feb 3 14:28:54.120 Mexico: %SYS-2-NOPROCESS: No such process 218959117
-Process= "HTTP CORE", ipl= 0, pid= 177
-Traceback= 4105702C 4105731C 40A64E64 40A655A0 407E097C 407D2184 407D24E4 410129B4 410129A0
%ALIGN-1-FATAL: Illegal access to a low address 14:28:54 Mexico Wed Feb 3 2010
addr=0x66C, pc=0x40A64EF8, ra=0x40A655A0, sp=0x50FBDFA8
%ALIGN-1-FATAL: Illegal access to a low address 14:28:54 Mexico Wed Feb 3 2010
addr=0x66C, pc=0x40A64EF8, ra=0x40A655A0, sp=0x50FBDFA8
14:28:54 Mexico Wed Feb 3 2010: TLB (store) exception, CPU signal 10, PC = 0x40A64EF8
--------------
OI focuses on the relevant messages and correctly (I think) finds the bug.
Sometimes it is useful to manually trim the input to OI. If you give it just the show version output and the messages that seem to cause the problem, it does a much finer job of zeroing in on the issue.
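If you trim logs like this often, the manual step can be roughly automated. The sketch below is only an assumption about what counts as "relevant": the marker list is taken from the lines quoted in this post, the function names are hypothetical, and nothing here is part of Output Interpreter itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return true if a log line contains one of the markers seen in the crash excerpt. */
static bool is_relevant_line(const char *line)
{
    /* Assumed marker list, chosen from the messages quoted above. */
    static const char *const markers[] = {
        "%SYS-", "%ALIGN-", "-Process=", "-Traceback=", "addr=", "exception"
    };

    assert(line != NULL);
    for (size_t i = 0; i < sizeof markers / sizeof markers[0]; i++) {
        if (strstr(line, markers[i]) != NULL)
            return true;
    }
    return false;
}

/*
 * Copy only the relevant lines from 'in' to 'out' and return how many
 * were kept. Lines longer than the buffer are examined in chunks.
 */
static size_t copy_relevant_lines(FILE *in, FILE *out)
{
    char buf[1024];
    size_t kept = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        if (is_relevant_line(buf)) {
            fputs(buf, out);
            kept++;
        }
    }
    return kept;
}
```

Calling copy_relevant_lines(stdin, stdout) from a main() turns this into a filter for a saved crashinfo file. You would still paste the show version output in by hand, and the marker list will need adjusting for other platforms and message types.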
07-23-2010 08:35 AM
Hi Sergio
To better assist you with this case, it would be great if you could answer the questions below:
1) Are you looking for a software version that has the fix for "CSCsg43532"?
2) Can you please let me know how you became aware of "CSCsg43532" before you tried to view it in Bug Toolkit?
Thanks
Arun
Product Manager
Cisco Bug Toolkit
07-23-2010 09:28 AM
Hi,
An unexpected reset of the active supervisor caused a switchover between the active and standby Supervisor Engine 32 / MSFC 2A
on a Cisco WS-C6509-E running Cisco IOS Software, s3223_rp Software (s3223_rp-IPSERVICESK9_WAN-M), Version 12.2(33)SXH4.
I used Output Interpreter to analyze the "show logging" output, and it pointed to a possible CSCsl02596 or CSCsg43532 bug. I tried to find more details in Cisco Bug Toolkit, without success. I decided to upgrade from SXH4 to SXH6 because I found a traceback (%ALIGN-1-FATAL) in the crash_info files, but I would still like to know more about bug CSCsg43532.
I have attached the crash_info files.
Regards,
-Sergio
07-23-2010 09:34 AM
The unexpected reset is your root problem, but the messages that Output Interpreter is catching are cosmetic artifacts of the bug, seen during the boot process. So you are chasing the wrong thing. :-) That is, if you applied this patch, the nuisance messages would go away but the problem might still remain.
Unfortunately, this is the wrong forum for chasing a 6500 issue; this group is about the data handling of Bug Toolkit. You should either re-post your request in a LAN switching area or open a TAC case at http://www.cisco.com/tac
07-23-2010 09:45 AM
Hi
Thanks for your explanation. I only wanted to know more details about the bug, and I will re-post in a LAN switching area.
Regards,
-Sergio