10-30-2012 10:53 PM - edited 03-07-2019 09:46 AM
Hi Team,
We are getting some error logs on a Nexus 7K switch.
Please check and let me know whether this is critical.
Logs:
----------
2012 Oct 30 22:36:07 SWITCH %CMPPROXY-STANDBY-2-LOG_CMP_UP: Connectivity Management processor(on module 6) is now UP
2012 Oct 30 22:36:40 SWITCH %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2012 Oct 30 22:36:40 SWITCH %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
2012 Oct 30 22:36:42 SWITCH %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1550ATBR)
2012 Oct 30 22:42:08 SWITCH %BOOTVAR-5-NEIGHBOR_UPDATE_AUTOCOPY: auto-copy supported by neighbor supervisor, starting...
2012 Oct 30 22:42:26 SWITCH %CMPPROXY-STANDBY-2-LOG_CMP_UP: Connectivity Management processor(on module 6) is now UP
2012 Oct 30 22:43:12 SWITCH %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2012 Oct 30 22:43:12 SWITCH %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
2012 Oct 30 22:43:15 SWITCH %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1550ATBR)
2012 Oct 30 22:47:59 SWITCH %BOOTVAR-5-NEIGHBOR_UPDATE_AUTOCOPY: auto-copy supported by neighbor supervisor, starting...
SWITCH#
Thanks.........
Regards,
Senthil
10-31-2012 01:02 AM
Hi Senthil,
This looks like it matches a known bug, CSCtt94327, but please share the output of show version so we can verify.
CSCtt94327 - Standby unable to restore - Killing ipqosmgr manually at active&standby
Symptom:
Error message on Active Sup :
Nexus7000# 2011 Oct 20 18:14:54 Nexus7000%$ VDC-1 %$ %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2011 Oct 20 18:14:55 Nexus7000 %$ VDC-1 %$ %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
2011 Oct 20 18:14:58 Nexus7000%$ VDC-1 %$ %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1516DTNP)
Error-id decode :
Nexus7000# sh system error-id 0x40480005
Error Facility: pss
Error Description: too big pss key or value size
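To check a saved log capture for this symptom, something like the Python sketch below could help. It looks for an ipqosmgr snapshot failure followed by a STANDBY_BOOT_FAILED message and reports how far apart the failures are. The function names and the sample capture (taken from the first post in this thread) are only for illustration; this is not a Cisco tool, and the timestamp parsing assumes English month abbreviations.

```python
import re
from datetime import datetime

# Timestamp at the start of an NX-OS syslog line, e.g. "2012 Oct 30 22:36:40".
TIMESTAMP = re.compile(r"^(\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) ")

def find_standby_boot_failures(log_text):
    """Return timestamps of STANDBY_BOOT_FAILED messages that come after
    an ipqosmgr GSYNC_SNAPSHOT_SRVFAILED message."""
    failures = []
    snapshot_failed = False
    for line in log_text.splitlines():
        if "GSYNC_SNAPSHOT_SRVFAILED" in line and '"ipqosmgr"' in line:
            snapshot_failed = True
        elif "STANDBY_BOOT_FAILED" in line and snapshot_failed:
            match = TIMESTAMP.match(line)
            if match:
                stamp = datetime.strptime(match.group(1), "%Y %b %d %H:%M:%S")
                failures.append(stamp)
            snapshot_failed = False
    return failures

def loop_intervals(failures):
    """Return the time between consecutive standby boot failures."""
    return [later - earlier for earlier, later in zip(failures, failures[1:])]

# Sample capture copied from the first post in this thread.
SAMPLE_LOG = """
2012 Oct 30 22:36:40 SWITCH %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2012 Oct 30 22:36:40 SWITCH %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
2012 Oct 30 22:36:42 SWITCH %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1550ATBR)
2012 Oct 30 22:43:12 SWITCH %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2012 Oct 30 22:43:12 SWITCH %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
"""

if __name__ == "__main__":
    found = find_standby_boot_failures(SAMPLE_LOG)
    print(len(found), "standby boot failures")
    for gap in loop_intervals(found):
        print("time between failures:", gap)
```

A repeating gap of a few minutes between failures is consistent with the standby supervisor reloading continuously, as described in the bug.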
Conditions:
Killing ipqosmgr manually on the active and standby supervisors, one right after the other, will cause the standby supervisor to reload continuously.
Workaround(s):
A reload of the active supervisor will resolve the issue.
The current fixes for CSCtt94327 on the 5.2(x) and 6.0(x) trains are 5.2(3) and 6.0(3), respectively.
Fix in
6.1(0.130)S0
5.2(2.71)S0
6.1(0.134)S0
5.2(2.75)S0
6.0(2)S1
6.2(0.40)S0
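If you need to check several switches, a small Python sketch like this one can compare a running release against the fixed releases mentioned above. The table contains only the 5.2(3) and 6.0(3) releases stated in this post; the function name `has_fix` and the decision to return None for any other train are my own assumptions, so treat the result as a hint and confirm against the bug details.

```python
import re

# Fixed releases as stated in this thread: 5.2(3) on the 5.2(x) train
# and 6.0(3) on the 6.0(x) train. Other trains are deliberately not listed.
FIXED_MAINTENANCE = {(5, 2): 3, (6, 0): 3}

# Matches release strings such as "5.2(1)" or "5.2(3a)".
VERSION = re.compile(r"^(\d+)\.(\d+)\((\d+)[a-z]?\)$")

def has_fix(version):
    """Return True or False for releases on the 5.2(x) and 6.0(x) trains,
    or None when the train is not covered by FIXED_MAINTENANCE."""
    match = VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized NX-OS version string: {version!r}")
    major, minor, maintenance = (int(part) for part in match.groups())
    fixed = FIXED_MAINTENANCE.get((major, minor))
    if fixed is None:
        return None
    return maintenance >= fixed

if __name__ == "__main__":
    for release in ("5.2(1)", "5.2(3)", "6.0(3)", "6.1(1)"):
        print(release, has_fix(release))
```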
Regards,
Aru
*** Please rate if the post is useful ***
11-02-2012 11:06 AM
Hi Arumugam,
Thanks for your great support.........
If we reload the supervisor, how long will it take? Why isn't the standby sup working?
Please send me the link for Cisco bug CSCtt94327.
Please advise the exact NX-OS release to upgrade to.
Logs:
------
NODCPXX-NX7K002# sh version
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Documents: http://www.cisco.com/en/US/products/ps9372/tsd_products_support_series_home.html
Copyright (c) 2002-2011, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
Software
BIOS: version 3.22.0
kickstart: version 5.2(1)
system: version 5.2(1)
BIOS compile time: 02/20/10
kickstart image file is: bootflash:///n7000-s1-kickstart.5.2.1.bin
kickstart compile time: 12/25/2020 12:00:00 [07/29/2011 04:34:35]
system image file is: bootflash:///n7000-s1-dk9.5.2.1.bin
system compile time: 6/7/2011 13:00:00 [07/29/2011 06:29:25]
Hardware
cisco Nexus7000 C7010 (10 Slot) Chassis ("Supervisor module-1X")
Intel(R) Xeon(R) CPU with 8260944 kB of memory.
Processor Board ID JAF1550ATFS
Device name: NODCPXX-NX7K002
bootflash: 2048256 kB
slot0: 0 kB (expansion flash)
Kernel uptime is 198 day(s), 19 hour(s), 46 minute(s), 44 second(s)
Last reset
Reason: Unknown
System version: 5.2(1)
Service:
plugin
Core Plugin, Ethernet Plugin
CMP (Module 5) ok
CMP Software
CMP BIOS version: 02.01.05
CMP Image version: 5.1(1) [build 5.0(0.66)]
CMP BIOS compile time: 8/ 4/2008 19:39:40
CMP Image compile time: 8/5/2011 13:00:00
CMP (Module 6) no response
NODCPXX-NX7K002#
11-02-2012 11:35 AM
Hi Senthil,
The fix is on 5.2(3) and 6.0(3)
See the bug details below:
Conditions:
Killing ipqosmgr manually on the active and standby supervisors, one right after the other, will cause the standby supervisor to reload continuously.
Workaround(s):
A reload of the active supervisor will resolve the issue.
Regards,
Aru
*** Please rate if the post is useful ***
11-02-2012 12:14 PM
Hi Arumugam,
Is an upgrade the only solution, or can we reload the current active sup? If we reload, how many hours will it take?
I need to get downtime from the customer. Please update me ASAP.
11-03-2012 11:44 PM
Hi Senthil,
A reload will fix the problem this time; an upgrade provides the permanent solution.
The Cisco Nexus 7000 platform provides 1+1 redundant supervisor modules that can perform a stateful switchover (SSO) in critical failure situations.
The time it takes to switch over from active to standby depends on the configuration (number of VDCs, IGP configuration, etc.), but it is a matter of seconds. The switchover is hitless, so you should not see any packet drops.
A supervisor switchover can be initiated manually in a chassis with two supervisor modules present. Once the switchover is performed, the previously active supervisor reloads and comes back online as the standby supervisor.
n7000# system switchover
Note:
To ensure that an HA switchover is possible, use the show system redundancy status command or the show module command. If the command output displays the ha-standby state for the standby supervisor module, you can manually initiate a switchover
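The check described in the note can also be scripted against saved command output. The Python sketch below reads the text of show system redundancy status and returns True only when the other supervisor reports an HA standby state. The function name is hypothetical, and the exact healthy string ("HA standby") is an assumption based on typical NX-OS output, so compare it with what your own switch prints.

```python
def standby_is_ha_ready(status_text):
    """Return True only if the 'Other supervisor' section of
    'show system redundancy status' reports an HA standby supervisor state."""
    in_other = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Other supervisor"):
            in_other = True
        elif stripped.startswith("This supervisor"):
            in_other = False
        elif in_other and stripped.startswith("Supervisor state:"):
            state = stripped.split(":", 1)[1].strip().lower()
            # Accept both "HA standby" and "ha-standby" spellings.
            return state.replace("-", " ") == "ha standby"
    # No 'Other supervisor' section found: treat as not ready.
    return False

if __name__ == "__main__":
    # Hypothetical output fragment with a standby stuck in an unknown state.
    sample = """
This supervisor (sup-1)
Redundancy state: Active
Supervisor state: Active
Other supervisor (sup-2)
Redundancy state: Standby
Supervisor state: Unknown
"""
    print("safe to switch over:", standby_is_ha_ready(sample))
```

Returning False when the section is missing is a deliberate choice here: it is safer to skip a switchover than to start one against a standby whose state could not be read.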
Refer:
Regards,
Aru
*** Please rate if the post is useful ***
11-13-2012 06:24 PM
Hi Aaru,
I don't see any ha-standby state......
Logs:
-----
NX7K002# sh system redundancy status
Redundancy mode
---------------
administrative: HA
operational: None
This supervisor (sup-1)
-----------------------
Redundancy state: Active
Supervisor state: Active
Internal state: Active with warm standby
Other supervisor (sup-2)
------------------------
Redundancy state: Standby
Supervisor state: Unknown
Internal state: Other
NX7K002#
NX7K002#
NX7K002# sh module
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    0      10 Gbps Ethernet XL Module                             powered-dn
2    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
3    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
5    0      Supervisor module-1X                N7K-SUP1           active *
6    0      Supervisor module-1X                                   powered-up
7    48     10/100/1000 Mbps Ethernet XL Module N7K-M148GT-11L     ok
8    48     10/100/1000 Mbps Ethernet XL Module N7K-M148GT-11L     ok
Mod Power-Status Reason
--- ------------ ---------------------------
1 powered-dn failure(powered-down) since maximum number of bringups were exceeded
Mod Sw Hw
--- -------------- ------
2 5.2(1) 1.3
3 5.2(1) 1.3
5 5.2(1) 2.3
7 5.2(1) 1.2
8 5.2(1) 1.2
Mod MAC-Address(es) Serial-Num
--- -------------------------------------- ----------
2 44-d3-ca-84-b2-e8 to 44-d3-ca-84-b3-0c JAF1529DLEF
3 28-94-0f-f8-fa-74 to 28-94-0f-f8-fa-98 JAF1550ANPG
5 64-a0-e7-45-e2-b8 to 64-a0-e7-45-e2-c0 JAF1550ATFS
7 28-94-0f-a8-48-fc to 28-94-0f-a8-49-30 JAF1549DJTS
8 28-94-0f-54-bd-fc to 28-94-0f-54-be-30 JAF1549ANGF
Mod Online Diag Status
--- ------------------
2 Pass
3 Pass
5 Pass
7 Pass
8 Pass
Xbar Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 0 Fabric Module 1 N7K-C7010-FAB-1 ok
2 0 Fabric Module 1 N7K-C7010-FAB-1 ok
3 0 Fabric Module 1 N7K-C7010-FAB-1 ok
Xbar Sw Hw
--- -------------- ------
1 NA 1.2
2 NA 1.2
3 NA 1.2
Xbar MAC-Address(es) Serial-Num
--- -------------------------------------- ----------
1 NA JAF1549ANPT
2 NA JAF1549ANMC
3 NA JAF1549AMKD
* this terminal session
NX7K002#
Thanks.........
Regards,
Senthil
11-14-2012 06:51 PM
Hi,
We also plan to upgrade another Nexus 7K, located in Redmond, which has the same bug-related issue (CSCtt94327).
Is it possible to use the ISSU method? I ask because one of the supervisor modules is showing powered-up, not ha-standby.
Thanks...
Logs:
====
NX7K002# sh module
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    0      10 Gbps Ethernet XL Module                             powered-dn
2    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
3    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
5    0      Supervisor module-1X                N7K-SUP1           active *
6    0      Supervisor module-1X                                   powered-up
7    48     10/100/1000 Mbps Ethernet XL Module N7K-M148GT-11L     ok
8    48     10/100/1000 Mbps Ethernet XL Module N7K-M148GT-11L     ok
Mod Power-Status Reason
--- ------------ ---------------------------
1 powered-dn failure(powered-down) since maximum number of bringups were exceeded
Mod Sw Hw
--- -------------- ------
2 5.2(1) 1.3
3 5.2(1) 1.3
5 5.2(1) 2.3
7 5.2(1) 1.2
8 5.2(1) 1.2
Mod MAC-Address(es) Serial-Num
--- -------------------------------------- ----------
2 44-d3-ca-84-b2-e8 to 44-d3-ca-84-b3-0c JAF1529DLEF
3 28-94-0f-f8-fa-74 to 28-94-0f-f8-fa-98 JAF1550ANPG
5 64-a0-e7-45-e2-b8 to 64-a0-e7-45-e2-c0 JAF1550ATFS
7 28-94-0f-a8-48-fc to 28-94-0f-a8-49-30 JAF1549DJTS
8 28-94-0f-54-bd-fc to 28-94-0f-54-be-30 JAF1549ANGF
Mod Online Diag Status
--- ------------------
2 Pass
3 Pass
5 Pass
7 Pass
8 Pass
Xbar Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 0 Fabric Module 1 N7K-C7010-FAB-1 ok
2 0 Fabric Module 1 N7K-C7010-FAB-1 ok
3 0 Fabric Module 1 N7K-C7010-FAB-1 ok
Xbar Sw Hw
--- -------------- ------
1 NA 1.2
2 NA 1.2
3 NA 1.2
Xbar MAC-Address(es) Serial-Num
--- -------------------------------------- ----------
1 NA JAF1549ANPT
2 NA JAF1549ANMC
3 NA JAF1549AMKD
* this terminal session
NX7K002#
NX7K002# sh version
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Documents: http://www.cisco.com/en/US/products/ps9372/tsd_products_support_series_home.html
Copyright (c) 2002-2011, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
Software
BIOS: version 3.22.0
kickstart: version 5.2(1)
system: version 5.2(1)
BIOS compile time: 02/20/10
kickstart image file is: bootflash:///n7000-s1-kickstart.5.2.1.bin
kickstart compile time: 12/25/2020 12:00:00 [07/29/2011 04:34:35]
system image file is: bootflash:///n7000-s1-dk9.5.2.1.bin
system compile time: 6/7/2011 13:00:00 [07/29/2011 06:29:25]
Hardware
cisco Nexus7000 C7010 (10 Slot) Chassis ("Supervisor module-1X")
Intel(R) Xeon(R) CPU with 8260944 kB of memory.
Processor Board ID JAF1550ATFS
Device name: NX7K002
bootflash: 2048256 kB
slot0: 0 kB (expansion flash)
Kernel uptime is 211 day(s), 4 hour(s), 34 minute(s), 41 second(s)
Last reset
Reason: Unknown
System version: 5.2(1)
Service:
plugin
Core Plugin, Ethernet Plugin
CMP (Module 5) ok
CMP Software
CMP BIOS version: 02.01.05
CMP Image version: 5.1(1) [build 5.0(0.66)]
CMP BIOS compile time: 8/ 4/2008 19:39:40
CMP Image compile time: 8/5/2011 13:00:00
NX7K002#
NX7K002# sh logging last 25
2012 Nov 14 18:12:18 NX7K002 %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1550ATBR)
2012 Nov 14 18:17:43 NX7K002 %BOOTVAR-5-NEIGHBOR_UPDATE_AUTOCOPY: auto-copy supported by neighbor supervisor, starting...
2012 Nov 14 18:18:13 NX7K002 %CMPPROXY-STANDBY-2-LOG_CMP_UP: Connectivity Management processor(on module 6) is now UP
2012 Nov 14 18:18:51 NX7K002 last message repeated 1 time
2012 Nov 14 18:18:51 NX7K002 %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2012 Nov 14 18:18:52 NX7K002 %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
2012 Nov 14 18:18:54 NX7K002 %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1550ATBR)
2012 Nov 14 18:24:22 NX7K002 %BOOTVAR-5-NEIGHBOR_UPDATE_AUTOCOPY: auto-copy supported by neighbor supervisor, starting...
2012 Nov 14 18:24:52 NX7K002 %CMPPROXY-STANDBY-2-LOG_CMP_UP: Connectivity Management processor(on module 6) is now UP
2012 Nov 14 18:25:30 NX7K002 last message repeated 1 time
2012 Nov 14 18:25:33 NX7K002 %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2012 Nov 14 18:25:33 NX7K002 %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
2012 Nov 14 18:25:37 NX7K002 %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1550ATBR)
2012 Nov 14 18:30:50 NX7K002 %BOOTVAR-5-NEIGHBOR_UPDATE_AUTOCOPY: auto-copy supported by neighbor supervisor, starting...
2012 Nov 14 18:31:19 NX7K002 %CMPPROXY-STANDBY-2-LOG_CMP_UP: Connectivity Management processor(on module 6) is now UP
2012 Nov 14 18:31:57 NX7K002 last message repeated 1 time
2012 Nov 14 18:32:02 NX7K002 %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2012 Nov 14 18:32:02 NX7K002 %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
2012 Nov 14 18:32:05 NX7K002 %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1550ATBR)
2012 Nov 14 18:36:43 NX7K002 %BOOTVAR-5-NEIGHBOR_UPDATE_AUTOCOPY: auto-copy supported by neighbor supervisor, starting...
2012 Nov 14 18:37:13 NX7K002 %CMPPROXY-STANDBY-2-LOG_CMP_UP: Connectivity Management processor(on module 6) is now UP
2012 Nov 14 18:37:51 NX7K002 last message repeated 1 time
2012 Nov 14 18:37:54 NX7K002 %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "ipqosmgr" on active supervisor failed to store its snapshot (error-id 0x40480005).
2012 Nov 14 18:37:54 NX7K002 %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
2012 Nov 14 18:37:57 NX7K002 %PLATFORM-2-MOD_REMOVE: Module 6 removed (Serial number JAF1550ATBR)
NX7K002#
Thanks...
Regards,
Senthil
11-14-2012 07:51 PM
Hi Senthil,
The standby supervisor is currently showing an unknown state:
NX7K002# sh system redundancy status
Redundancy mode
---------------
administrative: HA
operational: None
This supervisor (sup-1)
-----------------------
Redundancy state: Active
Supervisor state: Active
Internal state: Active with warm standby
Other supervisor (sup-2)
------------------------
Redundancy state: Standby
Supervisor state: Unknown
Internal state: Other
An Unknown state indicates that the system is in an invalid state and requires a support call to TAC. The standby sup is not booting correctly.
Suggestion: To fix this issue, you need to reload the complete box. This workaround often fixes the issue, after which the active and standby supervisors synchronize.
For an ISSU upgrade, please see the requirement below:
In a Nexus 7000 series chassis with dual supervisors, you can use the in-service software upgrade (ISSU) feature to upgrade the system software while the system continues to forward traffic. An ISSU uses the existing features of nonstop forwarding (NSF) with stateful switchover (SSO) to perform the software upgrade with no system downtime
In a redundant system with two supervisors, one of the supervisors is active while the other operates in the standby mode. During an ISSU, the new software is loaded onto the standby supervisor while the active supervisor continues to operate using the old software. As part of the upgrade, a switchover occurs between the active and standby supervisors, and the standby supervisor becomes active and begins running the new software. After the switchover, the new software is loaded onto the (formerly active) standby supervisor.
Please fix the standby supervisor issue first, then perform the ISSU.
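As a quick pre-check before planning an ISSU, the status column of show module can be read with a short script. The Python sketch below assumes the first table of show module output, where each row starts with the slot number and ends with the status (plus a trailing "*" for the current session). The function names and the rule "one active plus one ha-standby supervisor" are my own simplification of the requirement above, not an official Cisco check.

```python
def supervisor_states(show_module_text):
    """Map each supervisor slot number to its Status column value,
    reading rows of the first 'show module' table."""
    states = {}
    for line in show_module_text.splitlines():
        tokens = line.split()
        if len(tokens) < 4 or not tokens[0].isdigit() or "Supervisor" not in tokens:
            continue
        # The supervisor of the current terminal session carries a trailing "*".
        status = tokens[-2] if tokens[-1] == "*" else tokens[-1]
        states[int(tokens[0])] = status
    return states

def ready_for_issu(show_module_text):
    """Return True only when exactly one supervisor is active
    and one is ha-standby."""
    states = sorted(supervisor_states(show_module_text).values())
    return states == ["active", "ha-standby"]

if __name__ == "__main__":
    # Rows copied from the show module output posted in this thread.
    sample = """
5 0 Supervisor module-1X N7K-SUP1 active *
6 0 Supervisor module-1X powered-up
"""
    print(supervisor_states(sample))
    print("ready for ISSU:", ready_for_issu(sample))
```

With the output posted in this thread, slot 6 reports powered-up, so the check fails until the standby supervisor issue is resolved.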
Refer:
Understanding In-Service Software Upgrades
Cisco NX-OS Software Upgrade or Downgrade
Regards,
Aru
*** Please rate if the post is useful ***