01-14-2017 12:14 AM - edited 03-08-2019 08:54 AM
Hi,
I am getting the error below in the logs. There is no impact, but I keep getting this alert.
AEAUH-INJDC-01NB01-N5KSFSW01 %SATCTRL-FEX101-4-SOHMS_DIAG_WARN: FEX-101 Module 1: Runtime diag detected minor event: Correctable ECC errors <dev=0, count=3>
Below is the show version output:
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Documents: http://www.cisco.com/en/US/products/ps9372/tsd_products_support_series_home.html
Copyright (c) 2002-2013, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained herein are owned by
other third parties and are used and distributed under license.
Some parts of this software are covered under the GNU Public
License. A copy of the license is available at
http://www.gnu.org/licenses/gpl.html.
Software
BIOS: version 3.6.0
loader: version N/A
kickstart: version 6.0(2)N2(1)
system: version 6.0(2)N2(1)
Power Sequencer Firmware:
Module 1: version v1.0
Module 2: version v1.0
Microcontroller Firmware: version v1.0.0.1
SFP uC: Module 1: v1.1.0.0
QSFP uC: Module not detected
BIOS compile time: 05/09/2012
kickstart image file is: bootflash:///n5000-uk9-kickstart.6.0.2.N2.1.bin
kickstart compile time: 7/24/2013 3:00:00 [07/24/2013 14:49:21]
system image file is: bootflash:///n5000-uk9.6.0.2.N2.1.bin
01-16-2017 02:44 AM
Hi
That usually means memory errors.
Check sh diagnostic result fex 101 and see if it returns an OK status:
xxxxxxxxxxx# sh diagnostic result fex 101
FEX-101: Fabric Extender 48x1GE + 4x10G Module SerialNo : xxxxxxxxxxx
Overall Diagnostic Result for FEX-101 : OK
Test results: (. = Pass, F = Fail, U = Untested)
TestPlatform:
0) SPROM: ---------------> .
1) Inband interface: ---------------> .
2) Fan: ---------------> .
3) Power Supply: ---------------> .
4) Temperature Sensor: ---------------> .
TestForwardingPorts:
Eth 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Port ------------------------------------------------------------------------
. . . . . . . . . . . . . . . . . . . . . . . .
Eth 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
Port ------------------------------------------------------------------------
. . . . . . . . . . . . . . . . . . . . . . . .
TestFabricPorts:
Fabric 1 2 3 4
Port ------------
. . . .
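If you save that output to a file, a short script can pull out the overall status and any numbered test that did not pass. This is only a rough Python sketch, assuming the output looks like the paste above; `summarize_diag` is a made-up helper name, not a Cisco tool, and the per-port rows are not parsed.

```python
import re

# Both patterns are modelled on the output pasted in this thread (assumption:
# other FEX models or NX-OS releases may lay the text out differently).
OVERALL_RE = re.compile(r"Overall Diagnostic Result for FEX-(\d+)\s*:\s*(\S+)")
TEST_RE = re.compile(r"^\s*\d+\)\s*(.+?):\s*-+>\s*([.FU])\s*$", re.MULTILINE)


def summarize_diag(text):
    """Return ({fex_number: overall_status}, [names of tests not marked '.'])."""
    overall = {int(fex): status for fex, status in OVERALL_RE.findall(text)}
    not_passed = [name for name, result in TEST_RE.findall(text) if result != "."]
    return overall, not_passed


if __name__ == "__main__":
    sample = (
        "Overall Diagnostic Result for FEX-101 : OK\n"
        "0) SPROM: ---------------> .\n"
        "1) Inband interface: ---------------> .\n"
    )
    print(summarize_diag(sample))
```

An empty second value means every numbered test was marked as passed.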
03-16-2017 11:51 PM
Hi,
Have you found a solution to this issue?
I have similar errors in the log and suspect memory/CPU problems.
I ran sh processes cpu history on the FEX and found high workload.
01-28-2023 02:12 PM
Hi Asim,
This is not a software issue. This error means that a single-bit ECC (error-correcting code) correction was made in the FEX SDRAM.
It is harmless because the hardware was able to correct the memory error via ECC. There is a counter that tracks these corrections:
prt> show new_ints
| SS9 : ssx_int_err_ecc1 |
|--+---------+----------------------------------+
|6 |000000050 | single-bit ECC error | main memory bank 1
|-----------------------------------------------|
If this keeps recurring, the FEX should be replaced.
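To judge whether the warning is recurring, you could tally how often it appears in a saved log. Here is a minimal Python sketch, assuming the message format matches the single line quoted at the top of this thread; `tally_ecc_events` is a hypothetical helper, and whether the `count` field is cumulative or per event is not confirmed here, so the values are only collected, not summed.

```python
import re
from collections import defaultdict

# Pattern based on the one syslog line quoted in this thread (assumption:
# other NX-OS releases may word the message differently).
ECC_RE = re.compile(
    r"%SATCTRL-FEX(?P<fex>\d+)-\d-SOHMS_DIAG_WARN:.*?"
    r"Correctable ECC errors <dev=(?P<dev>\d+), count=(?P<count>\d+)>"
)


def tally_ecc_events(lines):
    """Map FEX number -> list of reported 'count' values, one per matching line."""
    seen = defaultdict(list)
    for line in lines:
        match = ECC_RE.search(line)
        if match:
            seen[int(match.group("fex"))].append(int(match.group("count")))
    return dict(seen)


if __name__ == "__main__":
    sample = (
        "AEAUH-INJDC-01NB01-N5KSFSW01 %SATCTRL-FEX101-4-SOHMS_DIAG_WARN: "
        "FEX-101 Module 1: Runtime diag detected minor event: "
        "Correctable ECC errors <dev=0, count=3>"
    )
    for fex, counts in tally_ecc_events([sample]).items():
        print(f"FEX-{fex}: {len(counts)} warning(s), reported counts {counts}")
```

How many warnings over what period counts as "recurring" is a judgment call; the thread gives no threshold.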