09-30-2014 06:19 AM
I have an ACE20-MOD-K9 running version A2(3.5), and it reloaded unexpectedly.
The last boot reason shows: NP 2 Failed : SRAM Parity Error Chan 3
Is there a workaround for this problem?
Thanks
09-30-2014 06:33 AM
Hi Juan,
Looking at the last reload reason, the ACE appears to have reloaded because of an SRAM parity error. If the ACE reloads with the same issue again within a year, it should be RMA'd. You should find crash/core files in dir core:. If you send me those, I can verify that this is indeed the crash that occurred, but I am 99.99% sure it is the SRAM parity crash. Here is a brief on it:
The SRAM parity error presented in the core file is not due to a software issue.
The issue is the result of a "bit-flip" within the SRAM itself which can occur as a
result of environmental conditions. This "bit-flip" is rectified by a simple reboot of
the system, which would occur with the generation of the core file. Cisco internal
testing and customer experience have shown that these types of issues can occur
with very low frequency, but do not require an RMA of the device.
ACE is susceptible to this because of the way it uses SRAM to store control information
and packet data as opposed to scratch-pad storage. Almost any 1-bit flip will be detected
as a parity error.
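To illustrate why a single flipped bit trips the check, here is a minimal sketch of single-bit even parity in Python. It is only an illustration of the general technique; the actual parity scheme and word width used by the ACE network processors are not described here, and the names parity_bit, store and check are made up for this example.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") % 2


def store(word: int) -> tuple[int, int]:
    """Simulate a write: keep the word together with its parity bit."""
    return word, parity_bit(word)


def check(word: int, stored_parity: int) -> bool:
    """Simulate a read: True if the word still matches its stored parity."""
    return parity_bit(word) == stored_parity


# Hypothetical 8-bit value standing in for a word of control data.
word, parity = store(0b10110010)

one_flip = word ^ (1 << 5)   # a single bit changes while the word sits in memory
two_flips = word ^ 0b11      # two bits change

print(check(word, parity))       # unchanged word passes the check
print(check(one_flip, parity))   # single flip changes the parity, so it is caught
print(check(two_flips, parity))  # an even number of flips cancels out in one parity bit
```

Parity can only detect the error, not correct it, which is consistent with the module reloading to recover rather than repairing the word in place.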
CSCtc53046 is a partial software workaround that mitigates hardware-generated SRAM
parity errors by reducing how often the SRAM is accessed to collect interface
statistics. It is highly recommended that customers upgrade to A2(3.3) or later to both
lower the overall rate of SRAM parity errors and ensure failover occurs appropriately.
SRAM errors are expected to occur at a frequency of approximately one per year per ACE module.
If a particular module experiences a significantly higher failure rate and is running A2(3.3)
or later, then a proactive RMA would be in order.
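The guidance above can be sketched as a small decision helper. This is only a rough sketch under stated assumptions: the text does not define "significantly higher", so the example uses the rule of thumb of a second SRAM parity reload within a year, and the function names, return strings and version-string format (e.g. "A2(3.5)") are hypothetical, not part of any Cisco tool.

```python
import re
from datetime import date

# First release recommended in the text for the CSCtc53046 mitigation: A2(3.3).
MIN_VERSION = (2, 3, 3)


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a string such as 'A2(3.5)' into (2, 3, 5).

    Assumption: a trailing letter such as 'a' may appear and is ignored.
    """
    match = re.fullmatch(r"A(\d+)\((\d+)\.(\d+)[a-z]?\)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def suggest_action(version: str, parity_reloads: list[date], window_days: int = 365) -> str:
    """Suggest a next step from the software version and SRAM parity reload dates."""
    if parse_version(version) < MIN_VERSION:
        return "upgrade"
    dates = sorted(parity_reloads)
    # Assumed threshold: any two parity reloads within the window.
    for earlier, later in zip(dates, dates[1:]):
        if (later - earlier).days <= window_days:
            return "consider RMA"
    return "monitor"


# One reload on A2(3.5), as in the original question.
print(suggest_action("A2(3.5)", [date(2014, 9, 30)]))
```

With a single reload on A2(3.5) this suggests monitoring; a second parity reload inside the window would change the suggestion to an RMA.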
Regards,
Kanwal
Note: Please mark answers if they are helpful.