
IOS XR lost configuration on reload

andrewmcca
Level 1

Following a reload, an ASR 9001 boots up having lost all of its config and gives the error below:

cfgmgr-rp[160]: %MGBL-CONFIG-0-INIT_FAILURE : Configuration Manager was unable to initialize the Configuration Namespace Version History module. Error: 'Result too large'.

This reappears every minute. I've tried all combinations of the "clear configuration" EXEC command without luck. I've also tried everything else I could think of to resolve the error, but I'm unable to do anything with the router. When I try to configure it, it gives an error saying:

SYSTEM CONFIGURATION IS STILL IN PROGRESS

And I'm unable to commit any config. Has anyone come across this before? It's running IOS XR 4.2.3.

Thanks in advance for your help

Accepted Solutions

xthuijs
Cisco Employee

Any chance you were downgrading from 4.3.x to 4.2.x?

If so, this may apply: CSCud37497

If you were coming from XR 4.3, the vinfo file contains incorrect info: it was converted by XR 4.3, and that causes read errors when XR 4.2 reads it.

Release notes of DDTS:

Symptom:

Below logs(sample) are seen on console continuously and system configuration
never completes.

RP/0/RSP0/CPU0:Nov 22 15:37:41.386 : cfgmgr-rp[161]:
%MGBL-CONFIG-0-INIT_FAILURE : Configuration Manager was unable to initialize
the Configuration Namespace Version History module. Error: 'Result too large'.
  Initialization will be tried again after 60 seconds.

Conditions:

This is an intermittent issue but is seen during downgrade from 4.3.x version
to 4.2.x version.

Workaround:

To downgrade successfully from 4.3.x to 4.2.x, use the Disk Boot procedure and
explicitly set the rommon variable TURBOBOOT with the format option. Make sure
to back up the running configuration, if any, so that it can be applied later,
after the downgrade.

Example:
rommon> TURBOBOOT=on,disk0,format
rommon> sync


Further Problem Description:
This issue is specific to the downgrade case from 4.3.x to 4.2.x (or 4.1.x)
versions only, whereby router initialization does not complete due to a
read failure of the version file from the last booted image. The format option
passed during disk boot makes sure that all the old version files are deleted
and the router comes up fresh with the newly booted image.




Thanks, that was the problem exactly. Strangely, I couldn't find that bug listed for the IOS.

I did still receive an error saying

TURBOBOOT: *** This image is not a membooted composite, TURBOBOOT option is therefore invalid ***

So I first had to do a TFTP boot from ROMMON using the 4.3.1 image, as below:

boot tftp://x.x.x.x/asr9k-mini-px.vm-4.3.1

Only then could I get TURBOBOOT to work correctly, and when I did, it booted with 4.3.1 correctly. Glad I was doing that in a test lab and not a live environment!

Thanks greatly for your help,

Andrew

Hi Xander,

I very often run into the problem where all VRFs are missing after an upgrade.

Last time I got this message:

RP/0/RSP0/CPU0:Apr 25 02:57:02.152 : cfgmgr-rp[165]: %MGBL-CONFIG-4-VERSION : Version of existing saved configuration detected to be incompatible with the installed software. Configuration will be restored from an alternate source and may take longer than usual on this boot.
RP/0/RSP0/CPU0:Apr 25 02:57:04.919 : cfgmgr-rp[165]: %MGBL-CONFIGCLI-3-BATCH_CONFIG_FAIL : 14 config(s) failed during startup. To view failed config(s) use the command - "show configuration failed startup"
RP/0/RSP0/CPU0:Apr 25 02:57:04.926 : cfgmgr-rp[165]: %MGBL-CONFIG-3-INCONSISTENCY_ALARM : A configuration inconsistency alarm has been raised. Configuration commits will be blocked until 'clear configuration inconsistency' command has been run to synchronize persistent configuration with running configuration.

Show run vrf was empty (I was using console) and I could not SSH to it. I had to paste and commit.

show configuration failed startup:

Tue Apr 26 10:46:08.606 CET
!!02:57:01 UTC Mon Apr 25 2016
!! SYNTAX/AUTHORIZATION ERRORS: This configuration failed due to
!! one or more of the following reasons:
!! - the entered commands do not exist,
!! - the entered commands have errors in their syntax,
!! - the software packages containing the commands are not active,
!! - the current user is not a member of a task-group that has
!! permissions to use the commands.

aaa group server radius BNG_RADIUS
description "Internet"
address-family ipv4 unicast
address-family ipv6 unicast
description # IPoE_NAT #
address-family ipv4 unicast
address-family ipv6 unicast
description "Safe Internet"
address-family ipv4 unicast
description # ARBOAR #
address-family ipv4 unicast
description #Dualstack Inside VRF#
address-family ipv4 unicast
address-family ipv6 unicast
address-family ipv4 unicast

I do not see a problem with the syntax. Any ideas?

hi Smail,

not sure why it happens, we would have to investigate. However, these commands will help figure out the discrepancies between running and startup config: "[admin] show configuration persistent [diff]". If you see any, that would also help us investigate. After the reload, troubleshooting may be difficult because the only thing we can try is a reproduction.

hope this helps,

/Aleksandar

Hi Aleks,

it's empty

(admin)#show configuration persistent diff
Wed Apr 27 10:48:13.808 CET
Building configuration...
!! IOS XR Configuration 5.3.3
end

The weird thing is that only the VRFs are missing. What else can we do?

It's empty now because the running config and startup config are in sync. If you can catch them going out of sync that would help. Then we can look into the commit history and traces to figure out what went wrong.

Hi,

we will do an upgrade from 5.1.3 to 5.3.3 + SP2.

At this moment the configs are in sync.

admin show configuration persistent diff
Fri Jul 1 12:10:09.217 GMT
Building configuration...
!! IOS XR Configuration 5.1.3
end

What should I do if I see that the VRFs are missing again? Should I use the "admin show configuration persistent diff" command?

hi Smail,

unless someone has a better idea, I would suggest:

  • save current config into a file
  • confirm the config is consistent between the two RPs (unless this is asr9001)
    • sh configuration inconsistency replica location <stdby_rp>
  • log console session into a file
  • reload
  • if config is lost:
    • capture the "sh conf failed startup"
    • open a TAC SR and let me know the ID
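
If the console session is logged to a file as suggested above, a quick scan of that file can show whether any of the config-restore messages quoted in this thread appeared during the reload. The following is only a rough sketch, not a Cisco tool: the function name and the sample text are made up for illustration, and the list of message mnemonics is limited to the ones that appear in this thread.

```python
import re

# Mnemonics copied from the console logs quoted in this thread.
# This is not a complete list of IOS XR config-manager messages.
CONFIG_MESSAGES = re.compile(
    r"%(MGBL-CONFIG-0-INIT_FAILURE"
    r"|MGBL-CONFIG-4-VERSION"
    r"|MGBL-CONFIGCLI-3-BATCH_CONFIG_FAIL"
    r"|MGBL-CONFIG-3-INCONSISTENCY_ALARM)\b"
)


def scan_console_log(text):
    """Count how often each known config-restore message occurs in text."""
    counts = {}
    for match in CONFIG_MESSAGES.finditer(text):
        name = match.group(1)
        counts[name] = counts.get(name, 0) + 1
    return counts


# Hypothetical excerpt of a saved console log, built from lines in this thread.
SAMPLE_LOG = (
    "RP/0/RSP0/CPU0:Apr 25 02:57:04.919 : cfgmgr-rp[165]: "
    "%MGBL-CONFIGCLI-3-BATCH_CONFIG_FAIL : 14 config(s) failed during startup.\n"
    "RP/0/RSP0/CPU0:Apr 25 02:57:04.926 : cfgmgr-rp[165]: "
    "%MGBL-CONFIG-3-INCONSISTENCY_ALARM : A configuration inconsistency alarm "
    "has been raised.\n"
)

for name, count in sorted(scan_console_log(SAMPLE_LOG).items()):
    print(f"{name}: {count}")
```

Any hit is a cue to capture "sh conf failed startup" right away, before further commits change the state.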

hope this helps,

/Aleksandar

I got this:

sh configuration inconsistency replica location 0/RSP1/CPU0

Fri Jul 1 13:16:00.494 GMT
The replica at location 0/RSP1/CPU0 is inconsistent. Please run 'clear configuration inconsistency replica location 0/RSP1/CPU0'

Should I clear it now or right before the upgrade (July 7th)?

I would clear it now and repeat the same sanity check before the upgrade.

clear configuration inconsistency replica location 0/RSP1/CPU0
Fri Jul 1 13:30:42.070 GMT
The replica was still inconsistent despite a successful attempt to clear the inconsistency. If this command continues to fail, it may be necessary to reload the node to correct it. (Error: No error)

Not sure what is going on. I will collect all of this and open a TAC case after the upgrade.

"sh configuration inconsistency replica location 0/RSP1/CPU0 detail" should help us understand the cause of the inconsistency. What does it say in your case?

sh configuration inconsistency replica location 0/RSP1/CPU0 detail
Fri Jul 1 14:12:23.807 GMT

Status SysDB:
Replica node '0x51'
SysDB path '/cfg/'
SysDB options '0x1'
In sync 'TRUE'
'No error'

Status RDSFS:
Primary node '0x41'
Replica node '0x51'
RDSFS path '/etc/cfg/lr'
RDSFS plane '2'
RFSFS options '0x9'
In sync 'FALSE'
'No error'

Status RDSFS:
Primary node '0x41'
Replica node '0x51'
RDSFS path '/etc/cfg/alt_cfg'
RDSFS plane '2'
RFSFS options '0x9'
In sync 'TRUE'
'No error'
RDSFS inconsistencies on node 0/RSP1/CPU0:
'/etc/cfg/lr/running/commitdb/1000000555.snd':
Present on the primary but missing on the replica.
'/etc/cfg/lr/running/commitdb/1000000597.snd':
Present on the primary but missing on the replica.
'/etc/cfg/lr/running/commitdb/commitdb_info_1000000551.snf':
Present on the replica but missing on the primary.
'/etc/cfg/lr/running/commitdb/commitdb_info_1000000593.snf':
Present on the replica but missing on the primary.
The replica at location 0/RSP1/CPU0 is inconsistent. Please run 'clear configuration inconsistency replica location 0/RSP1/CPU0'.
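
For longer outputs, the file list above can be pulled apart with a few lines of script. This is just a sketch that assumes the output keeps the two-line layout shown here (a quoted path followed by a "Present on ... but missing on ..." line) and that the number in each file name is a commit ID; the function name is made up for illustration.

```python
import re

# A quoted absolute path followed by a colon, as in the output above.
FILE_LINE = re.compile(r"^'(?P<path>/[^']+)':$")
# Assumption: the digits before .snd/.snf are a commit ID.
COMMIT_ID = re.compile(r"(\d+)\.sn[df]$")


def parse_inconsistencies(output):
    """Return (path, missing_on, commit_id) for each inconsistent file."""
    results = []
    lines = [line.strip() for line in output.splitlines()]
    # Each path line is expected to be followed directly by its status line.
    for path_line, status in zip(lines, lines[1:]):
        found = FILE_LINE.match(path_line)
        if not found:
            continue
        if "missing on the replica" in status:
            missing_on = "replica"
        elif "missing on the primary" in status:
            missing_on = "primary"
        else:
            continue
        path = found.group("path")
        commit = COMMIT_ID.search(path)
        results.append((path, missing_on, commit.group(1) if commit else None))
    return results


# Two entries copied from the output above.
SAMPLE_OUTPUT = """\
RDSFS inconsistencies on node 0/RSP1/CPU0:
'/etc/cfg/lr/running/commitdb/1000000555.snd':
Present on the primary but missing on the replica.
'/etc/cfg/lr/running/commitdb/commitdb_info_1000000551.snf':
Present on the replica but missing on the primary.
"""

for path, missing_on, commit_id in parse_inconsistencies(SAMPLE_OUTPUT):
    print(f"commit {commit_id}: missing on the {missing_on} ({path})")
```

The extracted numbers are candidates to look up in the commit history.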

Can you check what were those commits?

  sh configuration commit changes 1000000551