v1.5.1.1054 upgrade - status 'harvested'

Seb Rupik
VIP Alumni

Hi there,

I attempted to upgrade to v1.5.1.1054 from v1.5.0.1368 early this morning. It initially failed due to NTP sync issues. I resolved these and attempted the upgrade again. After waiting over two hours for it to complete, I refreshed the page and eventually it 404'd.

grapevine doesn't have any recollection of the second failed upgrade:

$ grape update history
UPDATE                                 PROPERTY             VALUE
----------------------------------------------------------------------
63c88792-9eaa-11e7-887d-005056bc0ab3   finished             Thu Sep 21, 2017 08:55:12 AM (2 hrs ago)
63c88792-9eaa-11e7-887d-005056bc0ab3   reason               Unable to sync with any of the configured NTP servers.
63c88792-9eaa-11e7-887d-005056bc0ab3   status               failed
63c88792-9eaa-11e7-887d-005056bc0ab3   task_id              63c88792-9eaa-11e7-887d-005056bc0ab3
63c88792-9eaa-11e7-887d-005056bc0ab3   update_from          1.5.0.1368
63c88792-9eaa-11e7-887d-005056bc0ab3   update_to
63c88792-9eaa-11e7-887d-005056bc0ab3   username             admin
#

I restarted the service hoping that might bring it back to life, but now it looks even worse. I can get to the web GUI login page, but no further. The instance status looks like this:

$ grape instance status
SERVICE                              VERSION      STATE        CLIENT                                IP              UPTIME
-----------------------------------------------------------------------------------------------------------------------------
APIC-EM-CAA-SERVICE                  5.1.31       harvested    None                                  None
access-policy-programmer-service     5.1.31.1368  harvested    None                                  None
apic-em-event-service                5.1.31.1368  harvested    None                                  None
apic-em-inventory-manager-service    5.1.31.1368  harvested    None                                  None
apic-em-jboss-ejbca                  5.1.31.1368  harvested    None                                  None
apic-em-network-programmer-service   5.1.31.1368  harvested    None                                  None
apic-em-pki-broker-service           1.5.0.1368   harvested    None                                  None
cas-service                          5.0.31.4021  harvested    None                                  None
cassandra                            1.0.0        running      8ad77cb0-530a-43f4-b952-3e2bf2a2eb1c  169.254.0.1     86 days, 0:12:56
election-service                     5.0.31.4021  harvested    None                                  None
file-service                         5.1.31.1368  harvested    None                                  None
grapevine                            1.0.0        running      8ad77cb0-530a-43f4-b952-3e2bf2a2eb1c  169.254.0.1     86 days, 0:12:52
grapevine-coordinator-service        1.0.0        running      8ad77cb0-530a-43f4-b952-3e2bf2a2eb1c  169.254.0.1     86 days, 0:12:45
grapevine-log-collector              1.0.0        running      8ad77cb0-530a-43f4-b952-3e2bf2a2eb1c  169.254.0.1     86 days, 0:12:41
grouping-service                     5.1.31.1368  harvested    None                                  None
identity-manager-pxgrid-service      5.1.31.1368  harvested    None                                  None
nbar-policy-programmer-service       5.1.31.1368  harvested    None                                  None
network-poller-service               5.1.31.1368  harvested    None                                  None
node-ui                              1.0.0        running      8ad77cb0-530a-43f4-b952-3e2bf2a2eb1c  169.254.0.1     86 days, 0:12:49
pnp-service                          5.17.32.35   harvested    None                                  None
policy-analysis-service              5.1.31.1368  harvested    None                                  None
policy-manager-service               5.1.31.1368  harvested    None                                  None
postgres                             5.1.31.1368  harvested    None                                  None
qos-lan-policy-programmer-service    5.1.31.1368  harvested    None                                  None
qos-monitoring-service               5.1.31.1368  harvested    None                                  None
qos-policy-programmer-service        5.1.31.1368  harvested    None                                  None
rabbitmq                             1.0.0        running      8ad77cb0-530a-43f4-b952-3e2bf2a2eb1c  169.254.0.1     86 days, 0:12:59
rbac-service                         5.0.31.4021  harvested    None                                  None
reverse-proxy                        5.0.31.4021  running      f27f3a98-a069-4eda-a640-79e0a3c29800  169.254.0.1     0:20:03
router                               5.0.31.4021  running      f27f3a98-a069-4eda-a640-79e0a3c29800  169.254.0.1     0:20:04
scheduler-service                    5.1.31.1368  harvested    None                                  None
task-service                         5.1.31.1368  harvested    None                                  None
telemetry-service                    5.1.31.1368  harvested    None                                  None
topology-service                     1.5.0.1368   harvested    None                                  None
(grapevine)
[Thu Sep 21 10:56:08 UTC] grapevine@10.209.120.40 (grapevine-root-1) ~

Apart from the services listed as 'running', all the others cycle between 'deploying' and 'harvested'.
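To keep track of which services are stuck between repeated runs, a small script can summarise the status output. The following is only a sketch: it assumes the column layout shown above and recognises only the three states mentioned in this thread (running, harvested, deploying). The function names and the shortened sample are illustrative, not part of any Cisco tooling.

```python
# Sketch: list services that are not 'running' in pasted
# "grape instance status" output. Assumes whitespace-separated
# columns in the order SERVICE, VERSION, STATE as shown in the post.

# Only the states that appear in this thread; other states may exist.
KNOWN_STATES = {"running", "harvested", "deploying"}


def parse_instance_status(text):
    """Return (service, version, state) tuples for each table row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Header, rule, prompt and blank lines have no known state
        # in the third column, so they are skipped here.
        if len(parts) >= 3 and parts[2] in KNOWN_STATES:
            rows.append((parts[0], parts[1], parts[2]))
    return rows


def not_running(rows):
    """Names of services whose state is anything other than 'running'."""
    return [name for name, _version, state in rows if state != "running"]


# Shortened sample taken from the output in the post.
SAMPLE = """\
$ grape instance status
SERVICE                              VERSION      STATE        CLIENT  IP  UPTIME
---------------------------------------------------------------------------------
cassandra                            1.0.0        running      8ad77cb0-530a-43f4-b952-3e2bf2a2eb1c  169.254.0.1     86 days, 0:12:56
postgres                             5.1.31.1368  harvested    None                                  None
task-service                         5.1.31.1368  harvested    None                                  None
(grapevine)
"""

if __name__ == "__main__":
    for name in not_running(parse_instance_status(SAMPLE)):
        print(name)
```

Feeding it the full output captured at intervals would show whether the same set of services keeps cycling or whether some of them eventually stay up.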

Any ideas?

cheers,

Seb.

6 Replies

aradford
Cisco Employee

Did you do a "reset_grapevine" (saying "N" to all the questions)?

That should get you back to a sane state.

Hi Adam,

I've tried that but get:

2017-09-21 13:02:41,955 | Attempting to sync with time server pool.ntp.org...
2017-09-21 13:02:50,786 | Unable to sync with time server pool.ntp.org
2017-09-21 13:02:55,791 | Configuring NTP (attempt #2)...
2017-09-21 13:02:55,792 | Attempting to sync with time server pool.ntp.org...
2017-09-21 13:03:04,623 | Unable to sync with time server pool.ntp.org
2017-09-21 13:03:09,629 | Configuring NTP (attempt #3)...
2017-09-21 13:03:09,630 | Attempting to sync with time server pool.ntp.org...
2017-09-21 13:03:18,462 | Unable to sync with time server pool.ntp.org
2017-09-21 13:03:23,468 | Configuring NTP (attempt #4)...
2017-09-21 13:03:23,468 | Attempting to sync with time server pool.ntp.org...
2017-09-21 13:03:32,302 | Unable to sync with time server pool.ntp.org
2017-09-21 13:03:37,308 | Configuring NTP (attempt #5)...
2017-09-21 13:03:37,310 | Attempting to sync with time server pool.ntp.org...
2017-09-21 13:03:46,092 | Unable to sync with time server pool.ntp.org
2017-09-21 13:03:51,098 | Unable to configure NTP after 5 attempts
2017-09-21 13:03:51,266 | [configure_ntp:2872] Unable to configure NTP. Please confirm NTP server connectivity and settings.
2017-09-21 13:03:51,266 | Config wizard completed with errors

I've edited /etc/ntp.conf to use a stratum 1 source that I know works:

root@grapevine-root-1:/home/grapevine# ntpq -pn
    remote          refid      st t when poll reach  delay  offset  jitter
==============================================================================
*10.xx.xx.xx  .GPS.            1 u  24  64  377  83.351  -0.224  6.530

However, it looks like the reset_grapevine script must be hardcoded to use pool.ntp.org. I've even tried editing /etc/hosts so that pool.ntp.org resolves to my NTP source, but that has had no effect.
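As a way of confirming that the local ntpd really is synchronised, the ntpq -pn peer table above can be checked programmatically. The sketch below assumes the standard ntpq peers layout with numeric addresses (the -n flag), where the first character is the tally code ('*' marks the selected system peer) and the reach column is an octal shift register (377 means the last eight polls all succeeded). The function names are illustrative.

```python
# Sketch: parse "ntpq -pn" output and report the selected peer.
# Assumes numeric remotes (-n), so a leading tally character cannot
# be confused with the first letter of a hostname.

TALLY_CODES = "*+-x#o."


def parse_ntpq_peers(text):
    """Return one dict per peer row in ntpq peers output."""
    peers = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, the column header and the '=' rule.
        if not stripped or stripped.startswith(("remote", "=")):
            continue
        if stripped[0] in TALLY_CODES:
            tally, fields = stripped[0], stripped[1:].split()
        else:
            tally, fields = " ", stripped.split()
        if len(fields) < 10:
            continue  # not a peer row (for example a shell prompt)
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "refid": fields[1],
            "stratum": int(fields[2]),
            "reach": int(fields[6], 8),   # reach is printed in octal
            "offset_ms": float(fields[8]),
        })
    return peers


def selected_peer(peers):
    """Return the peer marked '*' (current sync source), or None."""
    for peer in peers:
        if peer["tally"] == "*":
            return peer
    return None


# The peer row from the post, with the address redacted as posted.
SAMPLE = """\
root@grapevine-root-1:/home/grapevine# ntpq -pn
    remote          refid      st t when poll reach  delay  offset  jitter
==============================================================================
*10.xx.xx.xx  .GPS.            1 u  24  64  377  83.351  -0.224  6.530
"""

if __name__ == "__main__":
    peer = selected_peer(parse_ntpq_peers(SAMPLE))
    if peer is None:
        print("no peer selected; ntpd is not synchronised")
    else:
        print(peer["remote"], "stratum", peer["stratum"], "reach", oct(peer["reach"]))
```

A selected peer with full reach only shows that the operating system's ntpd is in sync. It says nothing about which server the reset script itself will try to contact.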

Looking at /opt/cisco/grapevine/bin/grapevine_factory_reset shows that it is trying to run the config wizard for 1.5.1.4018:

load_entry_point('grapevine-config-wizard==1.5.1.4018.dev1083-gf84d517', 'console_scripts', 'grapevine_factory_reset')()

...surely if the upgrade didn't successfully complete this should be trying to run a v1.5.0.1368 reset script?

cheers,

Seb.

I think you must have a "bad" NTP server in your grapevine config file.

What is the NTP setting in /etc/grapevine/controller-config.json?

That is the setting that will be used on a reset_grapevine.
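One way to answer that without reading the whole file by eye is to search the JSON for anything NTP-related. The thread does not show the layout of the controller config file, so the sketch below makes no assumption about key names beyond "contains ntp" and walks the document recursively. The sample data is a hypothetical stand-in, not the real file contents.

```python
# Sketch: find every key containing "ntp" in a JSON document.
# The real structure of the controller config is not shown in this
# thread, so the search is deliberately generic.
import json


def find_ntp_settings(node, path=""):
    """Return (dotted_path, value) pairs for keys whose name contains 'ntp'."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if "ntp" in key.lower():
                found.append((child, value))
            else:
                found.extend(find_ntp_settings(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_ntp_settings(item, f"{path}[{index}]"))
    return found


# Hypothetical sample; key names here are invented for illustration.
SAMPLE = json.loads(
    '{"network": {"ntp_servers": ["pool.ntp.org"]},'
    ' "hosts": [{"name": "grapevine-root-1"}]}'
)

if __name__ == "__main__":
    for where, value in find_ntp_settings(SAMPLE):
        print(where, "=", value)
```

To run it against the real file, load it with json.load on an open file handle for the path given above and pass the result to find_ntp_settings.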

I forgot to mention: if you need to change your NTP setting in controller_config.json, you can do it through the config_wizard.

Again, at the end of the config_wizard, make sure you say "N" to destroy disks, to maintain state.

I tried the config_wizard method and it looks to fail on the same step:

  The configuration wizard has encountered the following error:

  Timeout of 3600 seconds has been exceeded while growing services. The following services are not yet in RUNNING state: scheduler-service, identity-manager-pxgrid-service, policy-manager-service, task-service, topology-service, policy-analysis-service, telemetry-service

  Use the "back" button to revisit previous wizard screens to correct any errors...

It was going so well. I manually edited controller_config.json, as the two options in config_wizard didn't look applicable :/

After editing the file reset_grapevine was progressing nicely until:

2017-09-22 08:04:46,794 |Running [26/34]: apic-em-inventory-manager-service apic-em-network-programmer-service
2017-09-22 08:05:01,932 |Running [27/34]: network-poller-service
2017-09-22 08:57:32,234 | [grow_all_services:1161] Timeout of 3600 seconds has been exceeded while growing services. The following services are not yet in RUNNING state: scheduler-service, identity-manager-pxgrid-service, policy-manager-service, task-service, topology-service, policy-analysis-service, telemetry-service
2017-09-22 08:57:32,235 | Config wizard completed with errors

(grapevine)

[Fri Sep 22 08:57:32 UTC] grapevine@10.209.120.40 (grapevine-root-1) ~
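Since the same timeout message appears in both the config_wizard and reset_grapevine attempts, it can help to pull the list of stuck services out of the log and compare runs. This is a minimal sketch that relies only on the message wording shown above. The function name is illustrative.

```python
# Sketch: extract the timeout value and the list of services that
# never reached RUNNING from a reset_grapevine / config wizard log.
import re

TIMEOUT_RE = re.compile(
    r"Timeout of (\d+) seconds has been exceeded while growing services\. "
    r"The following services are not yet in RUNNING state: (.+)$"
)


def stuck_services(log_text):
    """Return (timeout_seconds, [service, ...]) or None if no timeout line."""
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line.strip())
        if match:
            names = [name.strip() for name in match.group(2).split(",")]
            return int(match.group(1)), names
    return None


# The relevant lines from the log in the post.
SAMPLE = (
    "2017-09-22 08:05:01,932 |Running [27/34]: network-poller-service\n"
    "2017-09-22 08:57:32,234 | [grow_all_services:1161] Timeout of 3600 "
    "seconds has been exceeded while growing services. The following "
    "services are not yet in RUNNING state: scheduler-service, "
    "identity-manager-pxgrid-service, policy-manager-service, task-service, "
    "topology-service, policy-analysis-service, telemetry-service\n"
    "2017-09-22 08:57:32,235 | Config wizard completed with errors\n"
)

if __name__ == "__main__":
    result = stuck_services(SAMPLE)
    if result is not None:
        timeout, names = result
        print(f"{len(names)} services still not running after {timeout}s:")
        for name in names:
            print(" ", name)
```

Here the same seven services are named in both attempts, which suggests that the failure is consistent and not a one-off timing problem.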

Is there a way to get the reset_grapevine to reset to the previous version and not the new v1.5.1.1054 version which didn't install correctly?

cheers,

Seb.