ACI Upgrade (1.0(2j) to 1.0(4h)) issue

subashchandran
Level 1

Hi All,

The code upgrade is failing for the switches.

The APIC controller upgrade completed successfully.

!

admin@APIC1:~> firmware upgrade status node 202

Node-Id    Role            Current-Firmware     Target-Firmware      Upgrade-Status  Progress-Percent(if inprogress)

----------------------------------------------------------------------------------------------------------------------

202        spine           n9000-11.0(1c)       n9000-11.0(4h)       completenok     5

 

Legend:

notscheduled - Upgrade has NOT been scheduled

scheduled    - Upgrade has been scheduled at a future time

queued       - Node is waiting for token from scheduler(permission to upgrade)

inprogress   - Image installation is currently in progress on node

completeok   - Upgrade successful

completenok  - Upgrade failed

unknown      - Node unreachable

admin@APIC1:~> firmware upgrade status node 201

Node-Id    Role            Current-Firmware     Target-Firmware      Upgrade-Status  Progress-Percent(if inprogress)

----------------------------------------------------------------------------------------------------------------------

201        spine           n9000-11.0(2j)       n9000-11.0(4h)       completenok     0

 

Legend:

notscheduled - Upgrade has NOT been scheduled

scheduled    - Upgrade has been scheduled at a future time

queued       - Node is waiting for token from scheduler(permission to upgrade)

inprogress   - Image installation is currently in progress on node

completeok   - Upgrade successful

completenok  - Upgrade failed

unknown      - Node unreachable

!

admin@APIC1:~> firmware upgrade status
Node-Id    Role            Current-Firmware     Target-Firmware      Upgrade-Status  Progress-Percent(if inprogress)
----------------------------------------------------------------------------------------------------------------------
1          controller      apic-1.0(4h)         apic-1.0(4h)         completeok      100
2          controller      apic-1.0(4h)         apic-1.0(4h)         completeok      100
3          controller      apic-1.0(4h)         apic-1.0(4h)         completeok      100
101        leaf            n9000-11.0(2j)       n9000-11.0(4h)       completenok     0
201        spine           n9000-11.0(2j)       n9000-11.0(4h)       completenok     0
202        spine           n9000-11.0(1c)                            notscheduled    5

Legend:
notscheduled - Upgrade has NOT been scheduled
scheduled - Upgrade has been scheduled at a future time
queued - Node is waiting for token from scheduler(permission to upgrade)
inprogress - Image installation is currently in progress on node
completeok - Upgrade successful
completenok - Upgrade failed
unknown - Node unreachable
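
(For reference, the per-node status that "firmware upgrade status" prints above can also be read over the APIC REST API. Below is a minimal Python sketch; the APIC address, credentials, and the exact maintUpgJob attribute names are assumptions for illustration only, not an official procedure.)

# Minimal sketch: read the switch/controller upgrade status via the APIC REST API.
import requests

APIC = "https://10.8.1.1"                      # hypothetical APIC address
AUTH = {"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}

s = requests.Session()
s.verify = False                               # lab only: skip TLS validation
s.post(APIC + "/api/aaaLogin.json", json=AUTH).raise_for_status()

# One maintUpgJob object exists per node; print its status and progress.
r = s.get(APIC + "/api/class/maintUpgJob.json")
r.raise_for_status()
for item in r.json().get("imdata", []):
    a = item["maintUpgJob"]["attributes"]
    print(a["dn"], a.get("upgradeStatus"), a.get("instlProgPct"))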


12 Replies

Tomas de Leon
Cisco Employee

Please log in to the ACI devices as admin and:

# Please Gather the following from the CLI of an APIC in the Cluster:

version
acidiag fnvread
controller
moquery -c infraWiNode
moquery -c infraCont

# Please Gather the following from the CLI of EACH APIC in the Cluster:

firmware list
ls /firmware/fwrepos/fwrepo


# Please Gather the following from each Leaf/Spine that is not upgrading:

mount | grep plog
cat /mit/sys/summary
cat /proc/cmdline
cat /mnt/cfg/0/boot/grub/menu.lst.local
cat /mnt/cfg/1/boot/grub/menu.lst.local
ls /bootflash
tail -100 /mnt/pss/installer_detail.log

cd /var/log/dme/log
zgrep "Upgrade" svc_ifc_policyelem.log*

Thanks

T.

Hi Tom,

Thanks for the quick reply.

Attached are all the logs.

Thanks ..

Subash

TASK#1

From the CLI of leaf1:

cd /bootflash
rm aci-n9000-dk9.11.0.4h.bin
rm aci-n9000-dk9.11.0.4h.bin-isan
rm auto-k


From the CLI of Spine1 & Spine2:

cd /bootflash
rm aci-n9000-dk9.11.0.4h.bin
rm aci-n9000-dk9.11.0.4h.bin-isan

TASK#2

From the APIC GUI:

ADMIN-> FIRMWARE->

* Click on FABRIC NODE FIRMWARE

* Change the FIRMWARE DEFAULT POLICY
Default Firmware Version: to "ANY"

* Expand FABRIC NODE FIRMWARE
* expand "Firmware Groups"
* delete all existing "Firmware Groups"
* create a new "Firmware Group" and select all NODEs. Set the Target firmware to "n9000-dk9.11.0.4h" for all nodes.

* expand "Maintenance Groups"
* delete all existing "Maintenance Groups"
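
(Side note: the firmware groups edited in this task are firmwareFwGrp objects under uni/fabric in the MIT, so the same clean-up can be scripted over the REST API if the GUI is unavailable. A rough Python sketch, reusing the authenticated requests session "s" and APIC URL from the earlier example; the group name Leaf_Spine is just the name that appears later in this thread.)

# Hedged sketch: remove an existing firmware group object via the REST API.
payload = {"firmwareFwGrp": {"attributes": {"name": "Leaf_Spine",
                                            "status": "deleted"}}}
r = s.post(APIC + "/api/mo/uni/fabric/fwgrp-Leaf_Spine.json", json=payload)
r.raise_for_status()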



TASK#3

From the CLI of leaf1:

* cd /bootflash
* ls -al and check to see if "aci-n9000-dk9.11.0.4h.bin" is there
* tail -f /mnt/pss/installer_detail.log


From the CLI of an APIC:

* firmware upgrade switch aci-n9000-dk9.11.0.4h.bin nodes 101



TASK#4

After the upgrade fails or succeeds, gather the same information as before for leaf1:
mount | grep plog

cat /mit/sys/summary
cat /proc/cmdline
cat /mnt/cfg/0/boot/grub/menu.lst.local
cat /mnt/cfg/1/boot/grub/menu.lst.local
ls /bootflash
tail -100 /mnt/pss/installer_detail.log

cd /var/log/dme/log
zgrep "Upgrade" svc_ifc_policyelem.log*

Also capture the console or terminal output of "tail -f /mnt/pss/installer_detail.log".

Thanks

T.

Completed Task #1 and Task #2.

TASK#3

From the CLI of leaf1:

* cd /bootflash
* ls -al and check to see if "aci-n9000-dk9.11.0.4h.bin" is there
* tail -f /mnt/pss/installer_detail.log


From the CLI of an APIC:

* firmware upgrade switch aci-n9000-dk9.11.0.4h.bin nodes 101

I am getting this error:

admin@APIC1:~> firmware upgrade switch node 101 aci-n9000-dk9.11.0.4h.bin
The node 101 is already part of some other firmware group.
Please remove the node from other firmware group and re-try.
Error = URL: http://127.0.0.1:7777/api//mo/uni/fabric/.xml
Code: 400
Output: <?xml version="1.0" encoding="UTF-8"?><imdata><error code="100" text="Validation failed: NodeGrp: uni/fabric/fwgrp-Leaf_Spine overlaps with node group: Dn0=uni/fabric/fwgrp-cliFwFG-101, "/></imdata>
Data Posted:
<firmwareFwGrp name='cliFwFG-101' status='created,modified'><fabricNodeBlk from_='101' name='cliFwNode-101' status='created,modified' to_='101'></fabricNodeBlk></firmwareFwGrp>
Firmware Installation on Switch Failed
admin@APIC1:~>

The firmware group is Leaf_Spine, with all the nodes in it.
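
(For reference, one way to see which firmware groups claim node 101 is to list each firmwareFwGrp and its fabricNodeBlk children, the same classes that appear in the error output above. A rough Python sketch reusing the session from the first example; not an official troubleshooting step.)

# Hedged sketch: show every firmware group and the node ranges it covers,
# to spot the overlap reported in the error above.
r = s.get(APIC + "/api/class/firmwareFwGrp.json",
          params={"query-target": "subtree",
                  "target-subtree-class": "firmwareFwGrp,fabricNodeBlk"})
r.raise_for_status()
for item in r.json().get("imdata", []):
    for cls, obj in item.items():
        a = obj["attributes"]
        if cls == "firmwareFwGrp":
            print("group:", a["dn"])
        elif cls == "fabricNodeBlk":
            print("  nodes:", a["from_"], "to", a["to_"])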

OK, delete the firmware groups from Task #2, but make sure the default firmware version is set to n9000-dk9.11.0.4h,

and then try Tasks #3 and #4.

Hi Tom,

Same error:

admin@APIC1:~> firmware upgrade status node 101
Node-Id Role Current-Firmware Target-Firmware Upgrade-Status Progress-Percent(if inprogress)
----------------------------------------------------------------------------------------------------------------------
101 leaf n9000-11.0(2j) n9000-11.0(4h) inretryqueue 0

Legend:
notscheduled - Upgrade has NOT been scheduled
scheduled - Upgrade has been scheduled at a future time
queued - Node is waiting for token from scheduler(permission to upgrade)
inprogress - Image installation is currently in progress on node
completeok - Upgrade successful
completenok - Upgrade failed
unknown - Node unreachable
+++++++++++++++++++++++++++++++++++++++++++
admin@APIC1:~> firmware upgrade status node 101
Node-Id Role Current-Firmware Target-Firmware Upgrade-Status Progress-Percent(if inprogress)
----------------------------------------------------------------------------------------------------------------------
101 leaf n9000-11.0(2j) n9000-11.0(4h) inprogress 5

Legend:
notscheduled - Upgrade has NOT been scheduled
scheduled - Upgrade has been scheduled at a future time
queued - Node is waiting for token from scheduler(permission to upgrade)
inprogress - Image installation is currently in progress on node
completeok - Upgrade successful
completenok - Upgrade failed
unknown - Node unreachable
+++++++++++++++++++++++++++++++++++
admin@APIC1:~> firmware upgrade status node 101
Node-Id Role Current-Firmware Target-Firmware Upgrade-Status Progress-Percent(if inprogress)
----------------------------------------------------------------------------------------------------------------------
101 leaf n9000-11.0(2j) n9000-11.0(4h) completenok 0

Legend:
notscheduled - Upgrade has NOT been scheduled
scheduled - Upgrade has been scheduled at a future time
queued - Node is waiting for token from scheduler(permission to upgrade)
inprogress - Image installation is currently in progress on node
completeok - Upgrade successful
completenok - Upgrade failed
unknown - Node unreachable

Attached logs.

From the logs it shows:

<Mon Nov 2 16:18:08 2015> check_freespace: /var/tmp total blocks: 393216, free blocks: 107650, space being used: 72%
<Mon Nov 2 16:18:08 2015> fsm_action_pre_verification: Free space in "/var/tmp" partition is below threshold: 70
<Mon Nov 2 16:18:08 2015> notify_job_status_change: Sending job status notification.
<Mon Nov 2 16:18:08 2015> installer_state_change: old_state: INSTALLER_STATE_PRE_VERIFICATION, new_state: INSTALLER_STATE_ABORTING
<Mon Nov 2 16:18:08 2015> installer_state_change: State changed from INSTALLER_STATE_PRE_VERIFICATION to INSTALLER_STATE_ABORTING.
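
(In other words, pre-verification is aborting on free space in /var/tmp. A quick check of the numbers in the log above, for illustration only; the exact meaning of the "threshold: 70" message is an assumption here.)

# The numbers from the log above: /var/tmp is about 72.6% used, which is past
# the 70% mark reported in the log, so pre-verification aborts the install.
total_blocks, free_blocks = 393216, 107650
used_pct = 100.0 * (total_blocks - free_blocks) / total_blocks
print(round(used_pct, 1))   # -> 72.6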

What does the "mount | grep plog" CLI command show on leaf1?

We need to run "df -h /var/tmp" on leaf1 to see what is using the space, but unfortunately that requires "root" access. This will require a Cisco TAC case so a TAC engineer can check.

One other option to try prior to opening a TAC case is to run the commands below and then reboot leaf1. After it comes back up, retry upgrading the leaf.

From the leaf CLI:

* setup-clean-config.sh aci-n9000-dk9.11.0.2j.bin
* reload
* Once the leaf comes back up and rejoins the fabric, try upgrading leaf from APIC CLI:
"firmware upgrade switch aci-n9000-dk9.11.0.4h.bin nodes 101"

From the leaf CLI:

* setup-clean-config.sh aci-n9000-dk9.11.0.2j.bin
* reload
* Once the leaf comes back up and rejoins the fabric, try upgrading leaf from APIC CLI:
"firmware upgrade switch aci-n9000-dk9.11.0.4h.bin nodes 101"

Do I need to back up the configs first, or will this erase all my running configs?

You can think of setup-clean-config.sh aci-n9000-dk9.11.0.2j.bin as erasing the running configuration of the node, but the actual configuration, i.e. the policies, is stored on the APIC and will be pushed back to the node once it reloads. The APIC will see that the node has a serial number that was already a member of the fabric and automatically rejoin it to the existing fabric. You can check whether the node has joined the fabric by issuing the command acidiag fnvread from the APIC CLI.

ACI has the ability to export your configuration, but for what you are trying to accomplish it is almost unnecessary. I would evaluate the possible impact of this upgrade before proceeding to export your configuration.
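
(If you prefer to script that check, acidiag fnvread is essentially reading the fabricNode objects. A rough Python sketch of the equivalent REST query, reusing the authenticated session from the first example; attribute names are based on the standard fabricNode class.)

# Hedged sketch: confirm the node has rejoined the fabric (state "active").
r = s.get(APIC + "/api/class/fabricNode.json")
r.raise_for_status()
for item in r.json().get("imdata", []):
    a = item["fabricNode"]["attributes"]
    print(a["id"], a["name"], a["serial"], a["fabricSt"])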

Hi Tom,

From the leaf CLI:

* setup-clean-config.sh aci-n9000-dk9.11.0.2j.bin
* reload

After the reload, the leaf booted with the new code (11.0(4h)) automatically from the APIC.

admin@APIC1:~>  acidiag fnvread

      ID             Name    Serial Number         IP Address    Role        State   LastUpdMsgId

-------------------------------------------------------------------------------------------------

     101            Leaf1      SAL1816QLD9       10.8.1.95/32    leaf       active   0

     201           Spine1      SAL18432Y26       10.8.1.94/32   spine       active   0

     202           Spine2      SAL18391DXH       10.8.1.93/32   spine       active   0

 

Total 3 nodes

+++++++++++++++++++++++++++++++++++++++++

admin@APIC1:~> firmware upgrade status

Node-Id    Role            Current-Firmware     Target-Firmware      Upgrade-Status  Progress-Percent(if inprogress)

----------------------------------------------------------------------------------------------------------------------

1          controller      apic-1.0(4h)         apic-1.0(4h)         completeok      100

2          controller      apic-1.0(4h)         apic-1.0(4h)         completeok      100

3          controller      apic-1.0(4h)         apic-1.0(4h)         completeok      100

101        leaf            n9000-11.0(4h)       n9000-11.0(4h)       completeok      100

201        spine           n9000-11.0(4h)       n9000-11.0(4h)       completeok      100

202        spine           n9000-11.0(4h)       n9000-11.0(4h)       completeok      100

 

Legend:

notscheduled - Upgrade has NOT been scheduled

scheduled    - Upgrade has been scheduled at a future time

queued       - Node is waiting for token from scheduler(permission to upgrade)

inprogress   - Image installation is currently in progress on node

completeok   - Upgrade successful

completenok  - Upgrade failed

unknown      - Node unreachable

Now I am planning to upgrade from 1.0(4h) to 1.1(2h).

I am glad this fixed the issue. As Alec already mentioned, the configuration backup was not necessary for what I advised.

That said, it is best practice to always export the configuration prior to any upgrade. This provides a safety net just in case you need to roll back.
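
(For a scripted way to take that pre-upgrade snapshot, the on-demand backup is driven by a configExportP policy. A rough Python sketch below, again reusing the session from the earlier example; the policy name and attribute values here are assumptions, so check the APIC Configuration Export documentation before relying on it.)

# Hedged sketch: trigger an on-demand configuration snapshot before upgrading.
payload = {"configExportP": {"attributes": {"name": "preUpgradeBackup",
                                            "format": "json",
                                            "snapshot": "true",
                                            "adminSt": "triggered"}}}
r = s.post(APIC + "/api/mo/uni/fabric/configexp-preUpgradeBackup.json",
           json=payload)
r.raise_for_status()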

Cheers!

T.

Hi Tom,

FYI.

I haven't faced any buffer issues while upgrading from 1.0(4h) to 1.1(2h).

Can you please share the document for backing up the APIC configuration before an upgrade?

Thanks,

Subash
