Nexus 9504 Line card issue

russell.sage
Level 3

I have four new Nexus 9504 switches being prepared to replace Catalyst 6509s. They have been sitting in the data centers for several months, powered up with configuration in place, waiting for the migration.

A check today shows 2 of the 4 switches reporting the following:

 

Mod Ports Module-Type                           Model                 Status
--- ----- ------------------------------------- --------------------- ---------
1   52    48x10G + 4x40/100G Ethernet Module    N9K-X9788TC-FX        powered-dn
2   52    48x10/25G + 4x40/100G Ethernet Module N9K-X97160YC-EX       powered-dn
3   52    48x10/25G + 4x40/100G Ethernet Module N9K-X97160YC-EX       powered-dn

 

From the log

2022 Feb 10 11:55:19 cen-msl-sw-01-new %PLATFORM-2-MOD_DETECT: Module 3 detected (Serial number FOC244301UH) Module-Type 48x10/25G + 4x40/100G Ethernet Module Model N9K-X97160YC-EX
2022 Feb 10 11:55:19 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRUP: Module 3 powered up (Serial number FOC244301UH)
2022 Feb 10 11:55:19 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 3 current-status is MOD_STATUS_POWERED_UP
2022 Feb 10 11:55:21 cen-msl-sw-01-new %PLATFORM-2-MOD_DETECT: Module 2 detected (Serial number FOC244301X1) Module-Type 48x10/25G + 4x40/100G Ethernet Module Model N9K-X97160YC-EX
2022 Feb 10 11:55:21 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRUP: Module 2 powered up (Serial number FOC244301X1)
2022 Feb 10 11:55:21 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 2 current-status is MOD_STATUS_POWERED_UP
2022 Feb 10 11:55:21 cen-msl-sw-01-new %PLATFORM-2-MOD_DETECT: Module 1 detected (Serial number FOC244133PM) Module-Type 48x10G + 4x40/100G Ethernet Module Model N9K-X9788TC-FX
2022 Feb 10 11:55:21 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRUP: Module 1 powered up (Serial number FOC244133PM)
2022 Feb 10 11:55:21 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 1 current-status is MOD_STATUS_POWERED_UP
2022 Feb 10 11:55:33 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 3 IOFPGA booted from Primary
2022 Feb 10 11:55:33 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 3 MIFPGA booted from Primary
2022 Feb 10 11:55:33 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 3 BIOS booted from Primary
2022 Feb 10 11:55:35 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 2 IOFPGA booted from Primary
2022 Feb 10 11:55:35 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 2 MIFPGA booted from Primary
2022 Feb 10 11:55:35 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 2 BIOS booted from Primary
2022 Feb 10 11:55:36 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 1 IOFPGA booted from Primary
2022 Feb 10 11:55:36 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 1 MIFPGA booted from Primary
2022 Feb 10 11:55:36 cen-msl-sw-01-new %CARDCLIENT-5-MOD_BOOT_PRIMARY: Module 1 BIOS booted from Primary
2022 Feb 10 11:57:47 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRDN: Module 3 powered down (Serial number FOC244301UH)
2022 Feb 10 11:57:47 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 3 current-status is MOD_STATUS_CONFIGPOWERED_DOWN
2022 Feb 10 11:57:47 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 3 current-status is MOD_STATUS_POWERED_DOWN
2022 Feb 10 11:57:48 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRDN: Module 2 powered down (Serial number FOC244301X1)
2022 Feb 10 11:57:48 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 2 current-status is MOD_STATUS_CONFIGPOWERED_DOWN
2022 Feb 10 11:57:48 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 2 current-status is MOD_STATUS_POWERED_DOWN
2022 Feb 10 11:57:51 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRDN: Module 1 powered down (Serial number FOC244133PM)
2022 Feb 10 11:57:51 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 1 current-status is MOD_STATUS_CONFIGPOWERED_DOWN
2022 Feb 10 11:57:51 cen-msl-sw-01-new %PLATFORM-5-MOD_STATUS: Module 1 current-status is MOD_STATUS_POWERED_DOWN

Mod Power-Status Reason
--- ------------ ---------------------------
1   powered-dn   Reset (powered-down) because module does not boot
2   powered-dn   Reset (powered-down) because module does not boot
3   powered-dn   Reset (powered-down) because module does not boot
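
Reading the timestamps in the log above, each module is powered down roughly two and a half minutes after it was powered up, which fits the "module does not boot" reason. The Python sketch below computes that interval per module from the MOD_PWRUP and MOD_PWRDN lines. It is only an illustration: the function name boot_windows and the regular expression are my own, and the measured interval is an observation from this log, not a documented NX-OS timeout.

```python
import re
from datetime import datetime

# Power-up and power-down events copied from the syslog excerpt above.
LOG = """\
2022 Feb 10 11:55:19 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRUP: Module 3 powered up (Serial number FOC244301UH)
2022 Feb 10 11:55:21 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRUP: Module 2 powered up (Serial number FOC244301X1)
2022 Feb 10 11:55:21 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRUP: Module 1 powered up (Serial number FOC244133PM)
2022 Feb 10 11:57:47 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRDN: Module 3 powered down (Serial number FOC244301UH)
2022 Feb 10 11:57:48 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRDN: Module 2 powered down (Serial number FOC244301X1)
2022 Feb 10 11:57:51 cen-msl-sw-01-new %PLATFORM-2-MOD_PWRDN: Module 1 powered down (Serial number FOC244133PM)
"""

# Timestamp, hostname, then the PLATFORM power event and the module number.
EVENT = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) \S+ "
    r"%PLATFORM-2-MOD_(?P<kind>PWRUP|PWRDN): Module (?P<mod>\d+) "
)


def boot_windows(log_text):
    """Return {module: seconds between its power-up and power-down events}."""
    powered_up = {}
    windows = {}
    for line in log_text.splitlines():
        match = EVENT.match(line)
        if not match:
            continue  # ignore every other syslog message
        ts = datetime.strptime(match["ts"], "%Y %b %d %H:%M:%S")
        mod = int(match["mod"])
        if match["kind"] == "PWRUP":
            powered_up[mod] = ts
        elif mod in powered_up:
            windows[mod] = int((ts - powered_up.pop(mod)).total_seconds())
    return windows


for mod, seconds in sorted(boot_windows(LOG).items()):
    print(f"Module {mod}: powered down {seconds} s after power-up")
```

For the lines above this reports 150 s, 147 s and 148 s for modules 1, 2 and 3.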

All other cards in the switch are working fine.

Has anyone seen this issue?

We have tried soft resets, but they don't work.

A power cycle doesn't work either.

Accepted solution: Christopher Hart's final reply in this thread (open a Cisco TAC case with the show tech-support details output attached).


Christopher Hart
Cisco Employee

Hi Russell,

This looks like somebody manually powered down these line cards through the poweroff configuration command. You can confirm this with the show running-config | include poweroff command, which should show that all three modules are manually powered off.

To correct this, you can use the configuration commands below to bring the line cards back online:

switch# configure terminal
switch(config)# no poweroff module 1
switch(config)# no poweroff module 2
switch(config)# no poweroff module 3
switch(config)# end
switch# copy running-config startup-config
[########################################] 100%
Copy complete, now saving to disk (please wait)...
Copy complete.
switch# 

 I hope this helps - thank you!

-Christopher
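
For anyone who keeps a saved copy of the running configuration, the check described above (looking for poweroff lines) can also be done offline. The Python sketch below is a minimal illustration: the function name powered_off_modules and the sample configuration are hypothetical, and it assumes the configuration contains lines of the form "poweroff module N" as used in the commands above.

```python
import re

# Matches "poweroff module N" at the start of a line; "no poweroff ..." does not match.
POWEROFF = re.compile(r"^\s*poweroff module (\d+)\s*$", re.MULTILINE)


def powered_off_modules(running_config):
    """Return the sorted module numbers that are administratively powered off."""
    return sorted(int(number) for number in POWEROFF.findall(running_config))


# Hypothetical running-config fragment, for illustration only.
SAMPLE_CONFIG = """\
hostname switch
poweroff module 2
no poweroff module 3
interface mgmt0
"""

print(powered_off_modules(SAMPLE_CONFIG))
```

An empty list means no module is powered off by configuration, so the cause has to be looked for elsewhere.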

Hi,
Thanks for the input, but unfortunately we had already tried that before we did the power cycle.

Hi Russell,

Apologies, you're correct - I checked in my lab, and when a line card is manually powered down, a reason of "Configured Power down" is shown in the output of show module. Obviously, your switch is showing a different reason.

A few questions:

  1. Is the syslog output you provided filtered at all? If possible, can you attach the full output of show logging logfile?
  2. What NX-OS software release is running on the two switches encountering this issue?
  3. What NX-OS software release is running on the two switches not encountering this issue?
  4. Can you provide the full output of show module? I am curious what kind of fabric modules are inserted in this chassis.

Thank you!

-Christopher

Christopher,

All four switches are identical in terms of cards and software, NX-OS 9.3(5).
When we run sh module internal exceptionlog module 1, we get:
exception information --- exception instance 1 ----
Module Slot Number: 1
Device Id : 111
Device Name : 0x6F
Device Errorcode : 0x000001a8
Device ID : 00 (0x00)
Device Instance : 00 (0x00)
Dev Type (HW/SW) : 01 (0x01)
ErrNum (devInfo) : 168 (0xa8)
System Errorcode : 0x401d003e Module initialization failed (DevErr is received
syserr or failing sap)
Error Type : Informational
PhyPortLayer : Unknown
Port(s) Affected :
DSAP : 0 (0x0)
UUID : 0 (0x0)
Time : Fri Feb 11 14:05:33 2022
(Ticks: 62066D2D jiffies)
I will endeavour to get the log files you have requested
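
A note on reading the exception log above: the hexadecimal Ticks value appears to be a Unix timestamp, and the low byte of the Device Errorcode matches the ErrNum field. The Python sketch below checks both. It assumes the Ticks value is epoch seconds and that the switch clock is set to UTC. This is inferred from this one output and is not documented behaviour, and the helper name decode_ticks is my own.

```python
from datetime import datetime, timezone

# Values copied from the exception log above.
TICKS_HEX = "62066D2D"         # "(Ticks: 62066D2D jiffies)"
DEVICE_ERRORCODE = 0x000001A8  # "Device Errorcode : 0x000001a8"
ERR_NUM = 168                  # "ErrNum (devInfo) : 168 (0xa8)"


def decode_ticks(hex_value):
    """Interpret the hex value as Unix epoch seconds (an assumption, see above)."""
    return datetime.fromtimestamp(int(hex_value, 16), tz=timezone.utc)


when = decode_ticks(TICKS_HEX)
print(when.strftime("%a %b %d %H:%M:%S %Y"))  # compare with the "Time" field

# The lowest byte of the device error code equals the ErrNum value.
low_byte = DEVICE_ERRORCODE & 0xFF
print(low_byte, low_byte == ERR_NUM)
```

The decoded time comes out as Fri Feb 11 14:05:33 2022, the same as the Time field. Neither check explains the failure itself; they only confirm that the fields in the output are consistent with one another.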

Hi Russell,

I appreciate you providing that output! The "Module initialization failed (DevErr is received syserr or failing sap)" error code reported in the output of show module internal exceptionlog can indicate many different issues, all of which require some more in-depth analysis.

To move forward with this, I would recommend opening a support case with Cisco TAC so that Cisco can investigate this further and provide you with a conclusive root cause. I highly recommend proactively attaching the output of the show tech-support details from both switches encountering this issue, as that will help TAC investigate further.

Thank you!

-Christopher
