Last edited by xy411381121 on 2021-3-15 22:50. Nexus 7000 C7010, running version 6.1(2).
Recently, when checking the N7K with show module, I have found that the standby supervisor frequently shows Fail:
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
2    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
5    0      Supervisor module-1X                N7K-SUP1           active *
6    0      Supervisor module-1X                N7K-SUP1           ha-standby
10   48     10/100/1000 Mbps Ethernet XL Module N7K-M148GT-11L     ok
Mod  Sw              Hw
---  --------------  ------
1    6.1(2)          1.5
2    6.1(2)          3.1
5    6.1(2)          3.2
6    6.1(2)          3.2
10   6.1(2)          2.1
Mod  Online Diag Status
---  ------------------
1    Pass
2    Pass
5    Pass
6    Fail
10   Pass
Looking at the diagnostic results, test 6, PrimaryBootROM, is the one failing:
Current bootup diagnostic level: complete
Module 6: Supervisor module-1X (Standby)
Test results: (. = Pass, F = Fail, I = Incomplete,
U = Untested, A = Abort, E = Error disabled)
1) ASICRegisterCheck-------------> .
2) USB---------------------------> .
3) CryptoDevice------------------> .
4) NVRAM-------------------------> .
5) RealTimeClock-----------------> .
6) PrimaryBootROM----------------> F
7) SecondaryBootROM--------------> .
8) CompactFlash------------------> .
9) ExternalCompactFlash----------> .
10) PwrMgmtBus--------------------> U
11) SpineControlBus---------------> .
12) SystemMgmtBus-----------------> U
13) StatusBus---------------------> U
14) StandbyFabricLoopback---------> .
15) ManagementPortLoopback--------> .
16) EOBCPortLoopback--------------> .
17) OBFL--------------------------> .
The log shows the following:
2021 Mar 11 12:12:40 N7K_01 %DIAGCLIENT-STANDBY-2-EEM_ACTION_HM_SHUTDOWN: Test has been disabled as a part of default EEM action
2021 Mar 11 12:12:40 N7K_01 %DEVICE_TEST-STANDBY-2-PRIMARY_BOOTROM_FAIL: Module 6 has failed test PrimaryBootROM 20 times on device Primary BootROM due to error (null)
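I have run into a supervisor module showing Fail before, but that time the diagnostics pointed to SpineControlBus, and the documentation showed it was a known bug. Following Cisco's CSCuc72466 document, I changed the self-test interval for test 11 on that supervisor and the problem went away. This time I followed the CSCuc72466 notes again and changed the self-test interval for test 6 on this supervisor. Right afterwards, show module reported module 6 as Pass, but a few days later it went back to Fail. That has happened about three times in the last three weeks. Today I also noticed that even when module 6 shows Pass, PrimaryBootROM is still marked F.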
So this time it is probably not a bug, right? Is it a hardware fault?
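
Since the module status and the per-test result can disagree (module 6 showing Pass while PrimaryBootROM is still F), it may help to track the per-test results directly when watching for recurrences. Below is a minimal sketch in Python, assuming the diagnostic result output has been saved as plain text in the same layout as pasted above. The function name `failed_tests` and the sample text are illustrative only and not part of any Cisco tooling.

```python
import re

# Matches result lines such as " 6) PrimaryBootROM----------------> F".
# The result codes follow the legend in the output: . F I U A E
LINE_RE = re.compile(r"^\s*(\d+)\)\s*([A-Za-z0-9]+)-+>\s*([.FIUAE])\s*$")


def failed_tests(output: str) -> list[tuple[int, str]]:
    """Return (test id, test name) for every test marked F."""
    failed = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(3) == "F":
            failed.append((int(match.group(1)), match.group(2)))
    return failed


# Hypothetical excerpt in the same layout as the output in this post.
sample = """\
 5) RealTimeClock-----------------> .
 6) PrimaryBootROM----------------> F
 7) SecondaryBootROM--------------> .
10) PwrMgmtBus--------------------> U
"""

print(failed_tests(sample))
```

Running this against output captured at regular intervals would show whether PrimaryBootROM stays failed even while show module reports Pass.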