cancel
Showing results for 
Search instead for 
Did you mean: 
cancel

Why am I getting "HA-REDCON-1-STANDBY_NOT_READY" periodically?

4933
Views
0
Helpful
4
Comments

It is often asked "why are my RSPs not synced, is there a problem?"

Let's take a look at the messages received and what the router is doing.

RP/0/RSP0/CPU0:Aug  9 02:28:07.938 : rmf_svr[383]: %HA-REDCON-1-STANDBY_NOT_READY : standby card is NOT ready

RP/0/RSP0/CPU0:Aug  9 02:28:07.938 : rmf_svr[383]: %HA-REDCON-1-BACKUP_NOT_READY : backup process groups between 0/RSP0/CPU0 and 0/RSP1/CPU0 are NOT ready

RP/0/RSP1/CPU0:Aug  9 02:28:07.938 : rmf_svr[383]: %HA-REDCON-1-STANDBY_NOT_READY : standby card is NOT ready

RP/0/RSP1/CPU0:Aug  9 02:28:07.939 : rmf_svr[383]: %HA-REDCON-6-STBY_BACKUP_NOT_READY : This card is standby and its backup process groups are NOT ready

These messages may appear for many different reasons, and does not typically imply any kind of problem. The most common time this is seen is when doing configuration changes.

For example, in this router, I added an OSPF process and committed the changes.  These are the syslogs seen from that:

RP/0/RSP0/CPU0:ASR9K(config-ospf-ar-if)#commit

Fri Aug  9 02:28:07.903 UTC

RP/0/RSP0/CPU0:Aug  9 02:28:07.938 : rmf_svr[383]: %HA-REDCON-1-STANDBY_NOT_READY : standby card is NOT ready

RP/0/RSP0/CPU0:Aug  9 02:28:07.938 : rmf_svr[383]: %HA-REDCON-1-BACKUP_NOT_READY : backup process groups between 0/RSP0/CPU0 and 0/RSP1/CPU0 are NOT ready

RP/0/RSP1/CPU0:Aug  9 02:28:07.938 : rmf_svr[383]: %HA-REDCON-1-STANDBY_NOT_READY : standby card is NOT ready

RP/0/RSP1/CPU0:Aug  9 02:28:07.939 : rmf_svr[383]: %HA-REDCON-6-STBY_BACKUP_NOT_READY : This card is standby and its backup process groups are NOT ready

RP/0/RSP0/CPU0:Aug  9 02:28:07.943 : rmf_svr[383]: %HA-REDCON-1-BACKUP_READY : backup process groups between 0/RSP0/CPU0 and 0/RSP1/CPU0 are ready

RP/0/RSP0/CPU0:Aug  9 02:28:07.943 : rmf_svr[383]: %HA-REDCON-1-BACKUP_NOT_READY : backup process groups between 0/RSP0/CPU0 and 0/RSP1/CPU0 are NOT ready

RP/0/RSP1/CPU0:Aug  9 02:28:07.943 : rmf_svr[383]: %HA-REDCON-6-STBY_BACKUP_READY : This card is standby and its backup process groups are ready

RP/0/RSP1/CPU0:Aug  9 02:28:07.945 : rmf_svr[383]: %HA-REDCON-6-STBY_BACKUP_NOT_READY : This card is standby and its backup process groups are NOT ready

RP/0/RSP0/CPU0:Aug  9 02:28:07.946 : rmf_svr[383]: %HA-REDCON-1-BACKUP_READY : backup process groups between 0/RSP0/CPU0 and 0/RSP1/CPU0 are ready

RP/0/RSP1/CPU0:Aug  9 02:28:07.947 : rmf_svr[383]: %HA-REDCON-6-STBY_BACKUP_READY : This card is standby and its backup process groups are ready

RP/0/RSP0/CPU0:Aug  9 02:28:07.964 : ospf[1013]: %ROUTING-OSPF-5-HA_NOTICE_START : Starting OSPF

RP/0/RSP1/CPU0:Aug  9 02:28:07.965 : ospf[1013]: %ROUTING-OSPF-5-HA_NOTICE_START : Starting OSPF

RP/0/RSP1/CPU0:Aug  9 02:28:08.238 : ospf[1013]: %ROUTING-OSPF-6-HA_INFO : Process vpncon: OSPF process initialization complete

RP/0/RSP0/CPU0:Aug  9 02:28:08.282 : ospf[1013]: %ROUTING-OSPF-6-HA_INFO : Process vpncon: OSPF process initialization complete

RP/0/RSP0/CPU0:Aug  9 02:28:08.292 : ospf[1013]: %ROUTING-OSPF-5-HA_NOTICE : Process vpncon: Signaled PROC_AVAILABLE

RP/0/RSP0/CPU0:Aug  9 02:28:09.115 : config[65784]: %MGBL-CONFIG-6-DB_COMMIT : Configuration committed by user 'bgarland'. Use 'show configuration commit changes 1000000050' to view the changes.

RP/0/RSP0/CPU0:ASR9010-C(config-ospf-ar-if)#RP/0/RSP0/CPU0:Aug  9 02:28:17.940 : rmf_svr[383]: %HA-REDCON-1-STANDBY_READY : standby card is ready

RP/0/RSP0/CPU0:Aug  9 02:28:18.136 : rmf_svr[383]: %HA-REDCON-1-BACKUP_NOT_READY : backup process groups between 0/RSP0/CPU0 and 0/RSP1/CPU0 are NOT ready

RP/0/RSP1/CPU0:Aug  9 02:28:18.136 : rmf_svr[383]: %HA-REDCON-6-STBY_BACKUP_NOT_READY : This card is standby and its backup process groups are NOT ready

RP/0/RSP0/CPU0:Aug  9 02:28:18.139 : rmf_svr[383]: %HA-REDCON-1-BACKUP_READY : backup process groups between 0/RSP0/CPU0 and 0/RSP1/CPU0 are ready

RP/0/RSP1/CPU0:Aug  9 02:28:18.140 : rmf_svr[383]: %HA-REDCON-6-STBY_BACKUP_READY : This card is standby and its backup process groups are ready

As you can see, while the config is being commit and processes are spawned, the standby goes out of sync and thus is "NOT_READY".  Normally this is well under 100ms but may take up to 3 seconds.  You can see what where changes are made and understand what is causing this by looking further at the "show redundancy trace" output. 

For example, here is the output of that trace for this config commit.

Aug  9 02:28:07.938 rmf/db 0/RSP0/CPU0 t1  DB - Time update object - name v4-routing field 0

Aug  9 02:28:07.938 rmf/db 0/RSP0/CPU0 t1  DB - Update object - name v4-routing primary 0/RSP0/CPU0 backup 0/RSP1/CPU0

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  NR bit 0 SET node type: pid 249993 0/RSP0/CPU0 placed:

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1   Placed-VS Spawn in progress

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  RDY - Set STANDBY(0/RSP1/CPU0) not ready

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  TIMER - Starting timer NR Alarm Timer for 600 seconds

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t3  MSG - Async event received: 7 - notify NSR client value 0

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  TIMER - Starting timer NR Timer for 10 seconds

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  SYSMGR - sysmgr operation - set standby state: not ready

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  NR bit 1 SET group v4-routing: pid 249993 0/RSP1/CPU0 placed:

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1   Placed-VS Spawn in progress

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  RDY - Set process group v4-routing not ready

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  RDY - Set BACKUP(0/RSP1/CPU0) not ready

Aug  9 02:28:07.938 rmf/svr 0/RSP0/CPU0 t1  TIMER - Starting timer NR Group Alarm Timer for 600 seconds

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1  NR bit 0 CLR: pid 249993 0/RSP0/CPU0 placed cleared after 0 second(s):

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1   Placed-VS Spawn in progress

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1  RDY - Set STANDBY(0/RSP1/CPU0) ready: conn up: yes

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1  RDY - STANDBY is ready, 0 second(s) after going not ready

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1  TIMER - Stopping timer NR Alarm Timer

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1  SYSMGR - sysmgr operation - set standby state: ready

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t3  MSG - Async event received: 7 - notify NSR client value 1

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1  NR bit 1 CLR: pid 249993 0/RSP1/CPU0 placed cleared after 0 second(s):

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1   Placed-VS Spawn in progress

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1  RDY - Set process group v4-routing ready

Aug  9 02:28:07.942 rmf/svr 0/RSP0/CPU0 t1  RDY - v4-routing is ready, 0 second(s) after going not ready

Aug  9 02:28:07.943 rmf/db 0/RSP0/CPU0 t1  DB - Time update object - name v4-routing field 1

Aug  9 02:28:07.943 rmf/db 0/RSP0/CPU0 t1  DB - Update object - name v4-routing primary 0/RSP0/CPU0 backup 0/RSP1/CPU0

Aug  9 02:28:07.943 rmf/db 0/RSP0/CPU0 t1  DB - Time update object - name v4-routing field 0

Aug  9 02:28:07.943 rmf/db 0/RSP0/CPU0 t1  DB - Update object - name v4-routing primary 0/RSP0/CPU0 backup 0/RSP1/CPU0

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  RDY - Set BACKUP(0/RSP1/CPU0) ready: conn up: yes

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  TIMER - Stopping timer NR Group Alarm Timer

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  NR bit 0 SET node type: pid 249993 0/RSP0/CPU0 placed:

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1   Placed-VS Spawn in progress

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  RDY - Set STANDBY(0/RSP1/CPU0) not ready

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  TIMER - Starting timer NR Alarm Timer for 600 seconds

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  SYSMGR - sysmgr operation - set standby state: not ready

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t3  MSG - Async event received: 7 - notify NSR client value 0

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  NR bit 1 SET group v4-routing: pid 249993 0/RSP1/CPU0 placed:

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1   Placed-VS Spawn in progress

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  RDY - Set process group v4-routing not ready

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  RDY - Set BACKUP(0/RSP1/CPU0) not ready

Aug  9 02:28:07.943 rmf/svr 0/RSP0/CPU0 t1  TIMER - Starting timer NR Group Alarm Timer for 600 seconds

Aug  9 02:28:07.946 rmf/db 0/RSP0/CPU0 t1  DB - Time update object - name v4-routing field 1

Aug  9 02:28:07.946 rmf/db 0/RSP0/CPU0 t1  DB - Update object - name v4-routing primary 0/RSP0/CPU0 backup 0/RSP1/CPU0

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  NR bit 0 CLR: pid 249993 0/RSP0/CPU0 placed cleared after 0 second(s):

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1   Placed-VS Spawn in progress

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  RDY - Set STANDBY(0/RSP1/CPU0) ready: conn up: yes

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  RDY - STANDBY is ready, 0 second(s) after going not ready

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  TIMER - Stopping timer NR Alarm Timer

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  SYSMGR - sysmgr operation - set standby state: ready

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t3  MSG - Async event received: 7 - notify NSR client value 1

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  NR bit 1 CLR: pid 249993 0/RSP1/CPU0 placed cleared after 0 second(s):

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1   Placed-VS Spawn in progress

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  RDY - Set process group v4-routing ready

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  RDY - v4-routing is ready, 0 second(s) after going not ready

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  RDY - Set BACKUP(0/RSP1/CPU0) ready: conn up: yes

Aug  9 02:28:07.946 rmf/svr 0/RSP0/CPU0 t1  TIMER - Stopping timer NR Group Alarm Timer

Aug  9 02:28:17.940 rmf/svr 0/RSP0/CPU0 t1  MSG - Async event received: 2 - not ready timer value -1

Aug  9 02:28:18.136 rmf/db 0/RSP0/CPU0 t1  DB - Time update object - name v4-routing field 0

Aug  9 02:28:18.136 rmf/db 0/RSP0/CPU0 t1  DB - Update object - name v4-routing primary 0/RSP0/CPU0 backup 0/RSP1/CPU0

Aug  9 02:28:18.136 rmf/svr 0/RSP0/CPU0 t1  NR bit 0 SET group v4-routing: pid 106611040 0/RSP0/CPU0 ospf:

Aug  9 02:28:18.136 rmf/svr 0/RSP0/CPU0 t1   Checkpoint initial data transfer in progress

Aug  9 02:28:18.136 rmf/svr 0/RSP0/CPU0 t1  RDY - Set process group v4-routing not ready

Aug  9 02:28:18.136 rmf/svr 0/RSP0/CPU0 t1  RDY - Set BACKUP(0/RSP1/CPU0) not ready

Aug  9 02:28:18.136 rmf/svr 0/RSP0/CPU0 t1  TIMER - Starting timer NR Group Alarm Timer for 600 seconds

Aug  9 02:28:18.139 rmf/db 0/RSP0/CPU0 t1  DB - Time update object - name v4-routing field 1

Aug  9 02:28:18.139 rmf/db 0/RSP0/CPU0 t1  DB - Update object - name v4-routing primary 0/RSP0/CPU0 backup 0/RSP1/CPU0

Aug  9 02:28:18.139 rmf/svr 0/RSP0/CPU0 t1  NR bit 0 CLR: pid 106611040 0/RSP0/CPU0 ospf cleared after 0 second(s):

Aug  9 02:28:18.139 rmf/svr 0/RSP0/CPU0 t1   Checkpoint initial data transfer in progress

Aug  9 02:28:18.139 rmf/svr 0/RSP0/CPU0 t1  RDY - Set process group v4-routing ready

Aug  9 02:28:18.139 rmf/svr 0/RSP0/CPU0 t1  RDY - v4-routing is ready, 0 second(s) after going not ready

Aug  9 02:28:18.139 rmf/svr 0/RSP0/CPU0 t1  RDY - Set BACKUP(0/RSP1/CPU0) ready: conn up: yes

Aug  9 02:28:18.139 rmf/svr 0/RSP0/CPU0 t1  TIMER - Stopping timer NR Group Alarm Timer

RP/0/RSP0/CPU0:ASR9K#

From the traces above we can see that the RSPs begin going out of sync to startup the initial OSPF process on the standby.  Then the v4-routing checkpoint data.

Most of the time these are completely normal.  If for some reason you believe you are getting these for something not config related then start with looking at the "show redundancy trace" command for that time period and see what initiated it to go out of sync. 

Comments
Beginner
Hi Bryan,

This's normal process on IOSXR ? Do you have any information about this.

Beginner

Hi Bryan,

All expected behaviour in your example due to a new ospf process starting on the standby.

How about loss of sync in XR 5.3.1 due to ipv6_rib with no corresponding configuration change?

RP/0/RP0/CPU0:Apr 20 13:29:14.487 BST: rmf_svr[391]: %HA-REDCON-1-BACKUP_NOT_READY : backup process groups between 0/RP0/CPU0 and 0/RP1/CPU0 are NOT ready
RP/0/RP1/CPU0:Apr 20 13:29:14.488 BST: rmf_svr[391]: %HA-REDCON-6-STBY_BACKUP_NOT_READY : This card is standby and its backup process groups are NOT ready
RP/0/RP0/CPU0:Apr 20 13:29:14.493 BST: rmf_svr[391]: %HA-REDCON-1-BACKUP_READY : backup process groups between 0/RP0/CPU0 and 0/RP1/CPU0 are ready
RP/0/RP1/CPU0:Apr 20 13:29:14.493 BST: rmf_svr[391]: %HA-REDCON-6-STBY_BACKUP_READY : This card is standby and its backup process groups are ready

Apr 20 13:29:14.487 rmf/db 0/RP0/CPU0 t1 DB - Time update object - name v6-routing field 0
Apr 20 13:29:14.487 rmf/db 0/RP0/CPU0 t1 DB - Update object - name v6-routing primary 0/RP0/CPU0 backup 0/RP1/CPU0
Apr 20 13:29:14.487 rmf/svr 0/RP0/CPU0 t1 NR bit 0 SET group v6-routing: pid 725310 0/RP0/CPU0 ipv6_rib:
Apr 20 13:29:14.487 rmf/svr 0/RP0/CPU0 t1 Checkpoint initial data transfer in progress
Apr 20 13:29:14.487 rmf/svr 0/RP0/CPU0 t1 RDY - Set process group v6-routing not ready
Apr 20 13:29:14.487 rmf/svr 0/RP0/CPU0 t1 RDY - Set BACKUP(0/RP1/CPU0) not ready
Apr 20 13:29:14.487 rmf/svr 0/RP0/CPU0 t1 TIMER - Starting timer NR Group Alarm Timer for 600 seconds
Apr 20 13:29:14.493 rmf/db 0/RP0/CPU0 t1 DB - Time update object - name v6-routing field 1
Apr 20 13:29:14.493 rmf/db 0/RP0/CPU0 t1 DB - Update object - name v6-routing primary 0/RP0/CPU0 backup 0/RP1/CPU0
Apr 20 13:29:14.493 rmf/svr 0/RP0/CPU0 t1 NR bit 0 CLR: pid 725310 0/RP0/CPU0 ipv6_rib cleared after 0 second(s):
Apr 20 13:29:14.493 rmf/svr 0/RP0/CPU0 t1 Checkpoint initial data transfer in progress
Apr 20 13:29:14.493 rmf/svr 0/RP0/CPU0 t1 RDY - Set process group v6-routing ready
Apr 20 13:29:14.493 rmf/svr 0/RP0/CPU0 t1 RDY - v6-routing is ready, 0 second(s) after going not ready
Apr 20 13:29:14.493 rmf/svr 0/RP0/CPU0 t1 RDY - Set BACKUP(0/RP1/CPU0) ready: conn up: yes
Apr 20 13:29:14.493 rmf/svr 0/RP0/CPU0 t1 TIMER - Stopping timer NR Group Alarm Timer
Apr 20 13:29:14.493 rmf/lib 0/RP0/CPU0 22270# t3 ipv6_rib : Set not ready started - reason Checkpoint initial data transfer in progress
Apr 20 13:29:14.493 rmf/lib 0/RP0/CPU0 21866# t3 ipv6_rib : Set not ready success - reason Checkpoint initial data transfer in progress
Apr 20 13:29:14.499 rmf/lib 0/RP0/CPU0 20272# t3 ipv6_rib : Set ready started
Apr 20 13:29:14.499 rmf/lib 0/RP0/CPU0 22272# t3 ipv6_rib : Set ready success

RP/0/RP0/CPU0:9922#show configuration commit list
Thu Apr 21 09:36:17.128 BST
SNo. Label/ID User Line Client Time Stamp
~~~~ ~~~~~~~~ ~~~~ ~~~~ ~~~~~~ ~~~~~~~~~~
1 1000004144 user vty0:node0_RP0_CPU CLI Thu Apr 21 03:18:57 2016
2 1000004143 user vty1:node0_RP0_CPU CLI Wed Apr 20 23:42:58 2016
3 1000004142 user vty0:node0_RP0_CPU CLI Wed Apr 20 09:24:13 2016

Beginner

Have confirmed this is due to CSCux10642 "ipv6_rib fluctuating not ready to ready continuously", fixed in XR 5.3.3 and 6.0.1.

Beginner

Thank you so much, Bryan, very informative. 

CreatePlease to create content
Content for Community-Ad
July's Community Spotlight Awards