Cisco Employee

Re: Supervisor Failed Over to Standby

Thank you for this information.

 

From this statement:

 

Initially, multiple interfaces were down. Upon checking, we noticed that the active supervisor had failed over. There are no unusually high or low input and output rates on the interfaces. The interfaces are running fine. No CRC or input errors were noticed on the interfaces. I have also checked the CPU usage, and no spikes were noticed in the history and utilization logs. No recent changes had been made at that point in time.

 

Could I just confirm that CPU and port stats that you uploaded were collected before the actual failure of the Supervisor Engine? Otherwise, is there any way that you can retrieve the same information before the failure? (e.g. via a syslog server or a monitoring NMS system?)

 

While I understand the need for a root cause, the information provided so far about the switchover is very generic (not anyone's fault, of course), so a search for a specific bug would be very broad and not very fruitful. We could spend a good amount of time looking for bugs that are just "similar", but that would be as close as we could get.

 

Kind regards,

Eduardo.

 

Beginner

Re: Supervisor Failed Over to Standby

Hi Guys, 

 

The logs were collected after the failover happened. We are now trying to get access to the supervisor to capture some logs for further investigation. At the moment, we are unable to access the supervisor due to a privilege issue, and we are sorting that out first. We will give you an update once we have received the logs.

 

Thank you. 

Contributor

Re: Supervisor Failed Over to Standby

Console into the standby Supervisor Engine in order to check whether it is in ROMmon mode or in continuous reboot. If the standby Supervisor Engine is in either of these two states, refer to Recover a Cisco IOS Catalyst 4500/4000 Series Switch from a Corrupt or Missing Image or in Rommon Mode.

4507#show module

Mod  Ports Card Type                              Model             Serial No.
----+-----+--------------------------------------+-----------------+-----------
 1      2  1000BaseX (GBIC) Supervisor(active)    WS-X4515          JAB0627065V
 2         Standby Supervisor
 3     48  10/100/1000BaseTX (RJ45)               WS-X4448-GB-RJ45  JAB053606AG
 4     48  10/100BaseTX (RJ45)V                   WS-X4148-RJ45V    JAE060800BL

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 0009.e845.6300 to 0009.e845.6301 0.4 12.1(12r)EW( 12.1(12c)EW, EAR Ok       
 2 Unknown                              Unknown      Unknown          Other
 3 0001.6443.dd20 to 0001.6443.dd4f 0.0                               Ok       
 4 0008.2138.d900 to 0008.2138.d92f 1.6                               Ok
Make sure that the Supervisor Engine module is properly seated in the backplane connector and that you have completely screwed down the Supervisor Engine installation screw. For more information, refer to the Installing and Removing the Supervisor Engine section of the document Installation and Configuration Note for the Catalyst 4000 Family Supervisor Engine.

In order to identify whether the standby Supervisor Engine is faulty, issue the redundancy reload peer command from the active Supervisor Engine, then watch the bootup sequence through the console of the standby Supervisor Engine in order to identify any hardware failures. Currently, the active Supervisor Engine cannot access the power-on diagnostics results of the standby Supervisor Engine.
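If you want to check this from a saved capture rather than by eye, here is a minimal Python sketch that reads the status table at the bottom of the "show module" output above and lists every module whose status is not "Ok". The function names and the simple "first field is the module, last field is the status" parsing are my own assumptions based only on the sample output in this post; other platforms or software versions may format the table differently.

```python
# Sample copied from the "show module" output quoted in this post.
SAMPLE_SHOW_MODULE = """\
Mod  Ports Card Type                              Model             Serial No.
----+-----+--------------------------------------+-----------------+-----------
 1      2  1000BaseX (GBIC) Supervisor(active)    WS-X4515          JAB0627065V
 2         Standby Supervisor
 3     48  10/100/1000BaseTX (RJ45)               WS-X4448-GB-RJ45  JAB053606AG
 4     48  10/100BaseTX (RJ45)V                   WS-X4148-RJ45V    JAE060800BL

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 0009.e845.6300 to 0009.e845.6301 0.4 12.1(12r)EW( 12.1(12c)EW, EAR Ok
 2 Unknown                              Unknown      Unknown          Other
 3 0001.6443.dd20 to 0001.6443.dd4f 0.0                               Ok
 4 0008.2138.d900 to 0008.2138.d92f 1.6                               Ok
"""


def parse_module_status(show_module_output: str) -> dict[int, str]:
    """Map module number to the Status column of the second table.

    Assumption: the status table starts at the header line beginning with
    "M MAC addresses", and the status is the last field on each row.
    """
    statuses: dict[int, str] = {}
    in_status_table = False
    for line in show_module_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("M MAC addresses"):
            in_status_table = True
            continue
        # Skip everything before the table, blank lines, and the dashed rule.
        if not in_status_table or not stripped or stripped.startswith("--"):
            continue
        fields = stripped.split()
        if not fields[0].isdigit():
            break  # something other than a table row: stop parsing
        statuses[int(fields[0])] = fields[-1]
    return statuses


def modules_not_ok(show_module_output: str) -> dict[int, str]:
    """Return only the modules whose status is something other than 'Ok'."""
    return {
        mod: status
        for mod, status in parse_module_status(show_module_output).items()
        if status != "Ok"
    }


if __name__ == "__main__":
    print(modules_not_ok(SAMPLE_SHOW_MODULE))
```

With the sample above, only module 2 (the standby supervisor, status "Other") is flagged, which is the module to console into.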

Contributor

Re: Supervisor Failed Over to Standby

Check the software versions on both supervisors in the output of "show module". Also, the communication seems to be "Simplex"; it should be in Duplex mode. I have a 4500 series switch with SSO. Below is the output of various commands.

L3_SW#sh redundancy
Redundant System Information :
------------------------------
Available system uptime = 6 weeks, 4 days, 18 hours, 12 minutes
Switchovers system experienced = 1
Standby failures = 0
Last switchover reason = active unit failed

Hardware Mode = Duplex
Configured Redundancy Mode = Stateful Switchover
Operating Redundancy Mode = Stateful Switchover
Maintenance Mode = Disabled
Communications = Up

Current Processor Information :
-------------------------------
Active Location = slot 2
Current Software state = ACTIVE
Uptime in current state = 6 weeks, 4 days, 18 hours, 12 minutes
Image Version = Cisco IOS Software, Catalyst 4500 L3 Switch Software (cat4500-ENTSERVICESK9-M), Version 15.0(2)SG5, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2012 by Cisco Systems, Inc.
Compiled Tue 31-Jul-12 03:44 by prod_rel_team
BOOT = bootflash:cat4500-entservicesk9-mz.150-2.SG5.bin,12;
Configuration register = 0x2102

Peer Processor Information :
----------------------------
Standby Location = slot 1
Current Software state = STANDBY HOT
Uptime in current state = 3 weeks, 1 day, 13 hours, 3 minutes
Image Version = Cisco IOS Software, Catalyst 4500 L3 Switch Software (cat4500-ENTSERVICESK9-M), Version 15.0(2)SG5, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2012 by Cisco Systems, Inc.
Compiled Tue 31-Jul-12 03:44 by p
BOOT = bootflash:cat4500-entservicesk9-mz.150-2.SG5.bin,12;
Configuration register = 0x2102


L3_SW#sh redundancy states
my state = 13 -ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Unit = Secondary
Unit ID = 2

Redundancy Mode (Operational) = Stateful Switchover
Redundancy Mode (Configured) = Stateful Switchover
Redundancy State = Stateful Switchover
Maintenance Mode = Disabled
Manual Swact = enabled
Communications = Up

client count = 60
client_notification_TMR = 240000 milliseconds
keep_alive TMR = 9000 milliseconds
keep_alive count = 1
keep_alive threshold = 18
RF debug mask = 0x0
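
For anyone who wants to compare their own output against the healthy one above, here is a rough Python sketch that parses "show redundancy states" text into key/value pairs and flags the points mentioned in this post (peer not STANDBY HOT, Mode not Duplex, Communications not Up, operational mode different from configured mode). The function names and the choice of checks are my own assumptions, not a Cisco tool, and the field names are taken only from the output pasted above.

```python
# Sample copied from the "sh redundancy states" output in this post.
SAMPLE_REDUNDANCY_STATES = """\
my state = 13 -ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Unit = Secondary
Unit ID = 2

Redundancy Mode (Operational) = Stateful Switchover
Redundancy Mode (Configured) = Stateful Switchover
Redundancy State = Stateful Switchover
Maintenance Mode = Disabled
Manual Swact = enabled
Communications = Up
"""


def parse_redundancy_states(output: str) -> dict[str, str]:
    """Split every 'key = value' line into a dictionary entry."""
    fields: dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            fields[key.strip()] = value.strip()
    return fields


def redundancy_issues(output: str) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    fields = parse_redundancy_states(output)
    issues: list[str] = []
    if "STANDBY HOT" not in fields.get("peer state", ""):
        issues.append(f"peer state is {fields.get('peer state', 'missing')!r}")
    if fields.get("Mode") != "Duplex":
        issues.append(f"Mode is {fields.get('Mode', 'missing')!r}, expected 'Duplex'")
    if fields.get("Communications") != "Up":
        issues.append(f"Communications is {fields.get('Communications', 'missing')!r}")
    operational = fields.get("Redundancy Mode (Operational)")
    configured = fields.get("Redundancy Mode (Configured)")
    if operational != configured:
        issues.append(
            f"operational mode {operational!r} differs from configured mode {configured!r}"
        )
    return issues


if __name__ == "__main__":
    for issue in redundancy_issues(SAMPLE_REDUNDANCY_STATES) or ["nothing flagged"]:
        print(issue)
```

Run against the output above it flags nothing; a capture showing Simplex mode or a peer that is not STANDBY HOT would be reported.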

Contributor

Re: Supervisor Failed Over to Standby

The following guidelines and restrictions apply to supervisor engine redundancy:

•If SSO mode cannot be established between the active and standby supervisor engines because of an incompatibility in the configuration file, a mismatched command list (MCL) is generated at the active supervisor engine and a reload into RPR mode is forced for the standby supervisor engine. Subsequent attempts to establish SSO, after removing the offending configuration and rebooting the standby supervisor engine with the exact same image, might cause the C4K_REDUNDANCY-2-IOS_VERSION_CHECK_FAIL and ISSU-3-PEER_IMAGE_INCOMPATIBLE messages to appear because the peer image is listed as incompatible. If the configuration problem can be corrected, you can clear the peer image from the incompatible list with the redundancy config-sync ignore mismatched-commands EXEC command while the peer is in a standby cold (RPR) state. This action allows the standby supervisor engine to boot in standby hot (SSO) state when it reloads.

Here are the steps:
Step 1 Clear the offending configuration (that caused an MCL) while the standby supervisor engine is in standby cold (RPR) state.

Step 2 Enter the redundancy config-sync ignore mismatched-commands EXEC command at the active supervisor engine.

Step 3 Perform write memory.

Step 4 Reload the standby supervisor engine with the redundancy reload peer command.
•RPR and SSO require Cisco IOS-XE Release 3.1.0 SG and later releases.

•SSO is not supported if the IOS-XE software is running in the LAN Base mode.

•WS-C4507R-E, WS-C4510R-E, WS-C4507R+E, and WS-C4510R+E are the only Catalyst 4500 series switches that support Supervisor Engine 7-E redundancy.

•SSO requires both supervisor engines in the chassis to have the same components (model and memory), and to use the same Cisco IOS XE software image.

•When you use WS-X45-SUP7-E in RPR or SSO mode, only the first two uplinks on each supervisor engine are available. The second two uplinks are unavailable.

•The active and standby supervisor engines in the chassis must be in slots 3 and 4 for a 7-slot chassis and slots 5 and 6 for a 10-slot chassis.

•Each supervisor engine in the chassis must have its own flash device and console port connections to operate the switch on its own.

•Each supervisor engine must have a unique console connection. Do not connect a Y cable to the console ports.

•Supervisor engine redundancy does not provide supervisor engine load balancing.

•The Cisco Express Forwarding (CEF) table is cleared on a switchover. As a result, routed traffic is interrupted until route tables reconverge. This reconvergence time is minimal because the SSO feature reduces the supervisor engine redundancy switchover time from 30+ seconds to subsecond, so Layer 3 also has a faster failover time if the switch is configured for SSO.

•Static IP routes are maintained across a switchover because they are configured from entries in the configuration file.

•Information about Layer 3 dynamic states that is maintained on the active supervisor engine is not synchronized to the standby supervisor engine and is lost on switchover.

•If configuration changes on a redundant switch are made through SNMP set operations, the changes are not synchronized to the standby supervisor engine even in SSO mode. You might experience unexpected behavior.

•After you configure the switch through SNMP in SSO mode, copy the running-config file to the startup-config file on the active supervisor engine to trigger synchronization of the startup-config file to the standby supervisor engine. Then, reload the standby supervisor engine so that the new configuration is applied on the standby supervisor engine.

•You cannot perform configuration changes during the startup (bulk) synchronization. If you attempt to make configuration changes during this process, the following message is generated:

 Config mode locked out till standby initializes
•If configuration changes occur at the same time as a supervisor engine switchover, these configuration changes are lost.

•If you remove a line card from a redundant switch, initiate an SSO switchover, and then reinsert the line card, all interfaces are shut down. The rest of the original line card configuration is preserved.
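
A few of the guidelines above are simple enough to check mechanically. The Python sketch below encodes only the rules quoted in this post (matching model, memory, and image; slots 3 and 4 on a 7-slot chassis, 5 and 6 on a 10-slot chassis; no SSO in LAN Base mode). The `Supervisor` record, the function name, and the example memory size and image filename are hypothetical and for illustration only; the slot rules are the ones quoted here and may not apply to other supervisor or chassis models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supervisor:
    """Hypothetical record of the facts needed for the checks below."""
    slot: int
    model: str
    memory_mb: int
    image: str


# Slot pairs quoted in the guidelines above, keyed by chassis slot count.
REQUIRED_SLOTS = {7: {3, 4}, 10: {5, 6}}


def sso_prerequisite_problems(
    chassis_slots: int,
    first: Supervisor,
    second: Supervisor,
    lan_base: bool = False,
) -> list[str]:
    """Return the quoted SSO guidelines that this pair of supervisors violates."""
    problems: list[str] = []
    required = REQUIRED_SLOTS.get(chassis_slots)
    if required is None:
        problems.append(f"no slot rule quoted for a {chassis_slots}-slot chassis")
    elif {first.slot, second.slot} != required:
        problems.append(f"supervisors must be in slots {sorted(required)}")
    if first.model != second.model:
        problems.append("supervisor models differ")
    if first.memory_mb != second.memory_mb:
        problems.append("supervisor memory sizes differ")
    if first.image != second.image:
        problems.append("software images differ")
    if lan_base:
        problems.append("SSO is not supported in LAN Base mode")
    return problems


if __name__ == "__main__":
    # Memory size and image name are made-up example values.
    active = Supervisor(slot=3, model="WS-X45-SUP7-E", memory_mb=2048, image="example-image-a.bin")
    standby = Supervisor(slot=4, model="WS-X45-SUP7-E", memory_mb=2048, image="example-image-b.bin")
    print(sso_prerequisite_problems(7, active, standby))
```

In this example the two supervisors run different image files, so the only problem reported is the image mismatch.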
