ASR 9010 RSP switchover without any cause?

n.t85
Level 1

Hi Community,

 

Yesterday we experienced an RSP switchover on our primary ASR 9010 router. The switchover completed successfully with low impact on traffic, but I cannot find what triggered it. The standby card took over the active role, while the formerly active card reloaded and is now in standby, ready to take over in case of failure.

 

Below is the output of "show redundancy":

 

RP/0/RSP1/CPU0:ATU_ASR_9010(admin)#show redundancy
Fri Apr 20 09:19:02.206 Tirana
Redundancy information for node 0/RSP1/CPU0:
==========================================
Node 0/RSP1/CPU0 is in ACTIVE role
Node Redundancy Partner (0/RSP0/CPU0) is in STANDBY role
Standby node in 0/RSP0/CPU0 is ready
Standby node in 0/RSP0/CPU0 is NSR-not-configured
Node 0/RSP1/CPU0 is in process group PRIMARY role
Process Redundancy Partner (0/RSP0/CPU0) is in BACKUP role
Backup node in 0/RSP0/CPU0 is ready
Backup node in 0/RSP0/CPU0 is NSR-not-configured

Group             Primary      Backup       Status
----------------  -----------  -----------  ------
v6-routing        0/RSP1/CPU0  0/RSP0/CPU0  Ready
mcast-routing     0/RSP1/CPU0  0/RSP0/CPU0  Ready
netmgmt           0/RSP1/CPU0  0/RSP0/CPU0  Ready
v4-routing        0/RSP1/CPU0  0/RSP0/CPU0  Ready
central-services  0/RSP1/CPU0  0/RSP0/CPU0  Ready
dlrsc             0/RSP1/CPU0  0/RSP0/CPU0  Ready
dsc               0/RSP1/CPU0  0/RSP0/CPU0  Ready

Reload and boot info
----------------------
A9K-RSP-4G reloaded Thu Oct 1 06:16:42 2015: 2 years, 28 weeks, 6 days, 3 hours, 2 minutes ago
Active node booted Thu Oct 1 06:21:04 2015: 2 years, 28 weeks, 6 days, 2 hours, 57 minutes ago
Last switch-over Thu Apr 19 17:28:41 2018: 15 hours, 50 minutes ago
Standby node boot Thu Apr 19 17:29:28 2018: 15 hours, 49 minutes ago
Standby node last went not ready Thu Apr 19 17:31:54 2018: 15 hours, 47 minutes ago
Standby node last went ready Thu Apr 19 17:32:58 2018: 15 hours, 46 minutes ago
There has been 1 switch-over since reload

Active node reload "Cause: MBI-HELLO reloading node on receiving reload notification"
Standby node reload "Cause: USB Failure"

 

At first check all the cards seem healthy, but I am not sure what caused the switchover in the first place.

 

Any suggestion would be highly appreciated

 

Thanks in advance

Nikola
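
For anyone who wants to script this check, here is a minimal Python sketch that pulls the last switch-over time and the two reload causes out of saved "show redundancy" text. The function name parse_redundancy and the SAMPLE excerpt are illustrative only, and the regular expressions assume the output looks exactly like the lines posted above; other releases may format it differently.

```python
import re
from datetime import datetime

# Excerpt of the "show redundancy" output posted above.
SAMPLE = """\
Last switch-over Thu Apr 19 17:28:41 2018: 15 hours, 50 minutes ago
There has been 1 switch-over since reload

Active node reload "Cause: MBI-HELLO reloading node on receiving reload notification"
Standby node reload "Cause: USB Failure"
"""


def parse_redundancy(text):
    """Return the last switch-over time and the reload cause per role."""
    result = {"last_switchover": None, "causes": {}}

    # Timestamp runs from after "Last switch-over" up to the 4-digit year.
    m = re.search(r"^Last switch-over\s+(.+?\d{4}):", text, re.M)
    if m:
        stamp = " ".join(m.group(1).split())  # collapse any padded spaces
        result["last_switchover"] = datetime.strptime(
            stamp, "%a %b %d %H:%M:%S %Y"
        )

    # One line per role: Active / Standby node reload "Cause: ..."
    pattern = r'^(Active|Standby) node reload "Cause: (.*)"\s*$'
    for role, cause in re.findall(pattern, text, re.M):
        result["causes"][role] = cause
    return result


if __name__ == "__main__":
    info = parse_redundancy(SAMPLE)
    print(info["last_switchover"], info["causes"])
```

Note that "Standby" here refers to the node that is standby now (0/RSP0/CPU0), which was the active node before the switchover.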

Aleksandar Vidakovic
Cisco Employee

hi Nikola,

 

Standby node reload "Cause: USB Failure"

 

That's why the former active went into reload. There must be more in the syslog around the "Apr 19 17:28:41" timestamp that explains what kind of error was detected and on which USB device.

 

You can also look into "admin show logging onboard error all location <location>"

 

/Aleksandar

Hi Aleksandar,

 

Thanks for the reply.

Further down in the admin log there is only one line showing an error on RSP0:

 

04/19/2018 17:27:56 sev:0 0/RSP0/CPU0 umass-enum[98]: USB: device not responding, reloading 0x41

 

This is very strange, as no other log shows any sign of a failure. The previous log entry for this card is from 2015, when there was a manual reboot. The card now appears to be working fine and is ready to become active.
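
The timing of that single log line can be checked against the switch-over time reported by "show redundancy". A minimal Python sketch follows, assuming both timestamps come from the same clock and time zone and that the onboard log date is in MM/DD/YYYY order. The field names in the regular expression are my own labels, and the meaning of the bracketed number after the process name is not confirmed here.

```python
import re
from datetime import datetime

# Log line and switch-over time exactly as posted in this thread.
LOG_LINE = ("04/19/2018 17:27:56 sev:0 0/RSP0/CPU0 umass-enum[98]: "
            "USB: device not responding, reloading 0x41")
SWITCHOVER = "Thu Apr 19 17:28:41 2018"

LOG_RE = re.compile(
    r"(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) sev:(?P<sev>\d+) "
    r"(?P<node>\S+) (?P<proc>[^\[\s]+)\[(?P<num>\d+)\]: (?P<msg>.*)"
)


def parse_onboard_line(line):
    """Split one onboard error log line into named fields."""
    m = LOG_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognised log line: %r" % line)
    entry = m.groupdict()
    entry["time"] = datetime.strptime(entry["stamp"], "%m/%d/%Y %H:%M:%S")
    return entry


def seconds_before_switchover(line, switchover):
    """Seconds between the logged error and the reported switch-over."""
    error_time = parse_onboard_line(line)["time"]
    switch_time = datetime.strptime(switchover, "%a %b %d %H:%M:%S %Y")
    return (switch_time - error_time).total_seconds()


if __name__ == "__main__":
    entry = parse_onboard_line(LOG_LINE)
    print(entry["node"], entry["proc"], entry["msg"])
    print(seconds_before_switchover(LOG_LINE, SWITCHOVER), "seconds")
```

With the values from this thread the USB error is logged 45 seconds before the switch-over, which is consistent with it being the trigger.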

 

hi Nikola,

 

can you look for the harddisk:/umass.log file on that RSP and attach it here? Can you also share the "sh install active summary"? If you're running a recent release and this USB error was reported more than once, it would be good to replace the RSP.

 

/Aleksandar

Hi again Aleksandar,

Attached is the umass.log file, which shows that this has happened twice: once in 2013 and again yesterday.

 

Below is also the output of "sh install active summary". The installed software has not changed since 2015.


Active Packages:
disk0:asr9k-services-infra-5.1.3
disk0:asr9k-mini-px-5.1.3
disk0:asr9k-optic-px-5.1.3
disk0:asr9k-doc-px-5.1.3
disk0:asr9k-9000v-nV-px-5.1.3
disk0:asr9k-bng-px-5.1.3
disk0:asr9k-fpd-px-5.1.3
disk0:asr9k-mpls-px-5.1.3
disk0:asr9k-video-px-5.1.3
disk0:asr9k-mcast-px-5.1.3
disk0:asr9k-mgbl-px-5.1.3
disk0:asr9k-asr903-nV-px-5.1.3
disk0:asr9k-services-px-5.1.3
disk0:asr9k-li-px-5.1.3
disk0:asr9k-asr901-nV-px-5.1.3
disk0:asr9k-px-5.1.3.CSCus22621-1.0.0
disk0:asr9k-px-5.1.3.CSCur84030-1.0.0
disk0:asr9k-px-5.1.3.CSCur83427-1.0.0
disk0:asr9k-px-5.1.3.CSCus64708-1.0.0
disk0:asr9k-px-5.1.3.CSCus71103-1.0.0
disk0:asr9k-px-5.1.3.CSCus79201-1.0.0
disk0:asr9k-px-5.1.3.CSCus64739-1.0.0
disk0:asr9k-px-5.1.3.CSCuv06002-1.0.0
disk0:asr9k-px-5.1.3.CSCur26760-1.0.0
disk0:asr9k-px-5.1.3.CSCuu49049-1.0.0

 

Thank you for your support

Regards
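
When comparing the install on a spare RSP with the one on the active RSP, it can help to separate the base packages from the SMUs. Below is a minimal Python sketch, assuming that any entry carrying a CSC bug ID is an SMU and everything else is a base package. The name split_packages is illustrative and the input is the package list posted above.

```python
import re

# Package list from the "sh install active summary" output posted above.
PACKAGES = """\
disk0:asr9k-services-infra-5.1.3
disk0:asr9k-mini-px-5.1.3
disk0:asr9k-optic-px-5.1.3
disk0:asr9k-doc-px-5.1.3
disk0:asr9k-9000v-nV-px-5.1.3
disk0:asr9k-bng-px-5.1.3
disk0:asr9k-fpd-px-5.1.3
disk0:asr9k-mpls-px-5.1.3
disk0:asr9k-video-px-5.1.3
disk0:asr9k-mcast-px-5.1.3
disk0:asr9k-mgbl-px-5.1.3
disk0:asr9k-asr903-nV-px-5.1.3
disk0:asr9k-services-px-5.1.3
disk0:asr9k-li-px-5.1.3
disk0:asr9k-asr901-nV-px-5.1.3
disk0:asr9k-px-5.1.3.CSCus22621-1.0.0
disk0:asr9k-px-5.1.3.CSCur84030-1.0.0
disk0:asr9k-px-5.1.3.CSCur83427-1.0.0
disk0:asr9k-px-5.1.3.CSCus64708-1.0.0
disk0:asr9k-px-5.1.3.CSCus71103-1.0.0
disk0:asr9k-px-5.1.3.CSCus79201-1.0.0
disk0:asr9k-px-5.1.3.CSCus64739-1.0.0
disk0:asr9k-px-5.1.3.CSCuv06002-1.0.0
disk0:asr9k-px-5.1.3.CSCur26760-1.0.0
disk0:asr9k-px-5.1.3.CSCuu49049-1.0.0
"""

# Assumption: an SMU entry ends in ".<CSC bug ID>-<revision>".
SMU_RE = re.compile(r"^disk0:.+\.(?P<bug>CSC[a-z]{2}\d{5})-[\d.]+$")


def split_packages(text):
    """Return (base package names, SMU bug IDs) from the package list."""
    packages, smus = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("disk0:"):
            continue  # skip headers and blank lines
        m = SMU_RE.match(line)
        if m:
            smus.append(m.group("bug"))
        else:
            packages.append(line[len("disk0:"):])
    return packages, smus


if __name__ == "__main__":
    packages, smus = split_packages(PACKAGES)
    print(len(packages), "packages,", len(smus), "SMUs")
    print(sorted(smus))
```

Running the same function on the output from both cards and comparing the two results as sets would show any package or SMU that is missing on one side.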

This is very likely a HW failure. The RSP should be replaced.

I actually have a spare RSP card with the same install. If I replace the failed card now, while it is in standby, will the replacement automatically get the latest configuration from the active one?

 

Thank you very much for your support,

Regards,

You can count on that.
