Controller fails to start services after upgrade

mehdi.rashidi
Level 1

Hello,

After upgrading to the latest version, the controller fails to start all of its services. I listed the running services:

$ grapectl status
[sudo] password for grapevine:
grapevine_capacity_manager              RUNNING   pid 32743, uptime 1:44:24
grapevine_capacity_manager_lxc_plugin   RUNNING   pid 32737, uptime 1:44:24
grapevine_cassandra                     RUNNING   pid 32210, uptime 1:44:41
grapevine_client                        RUNNING   pid 32207, uptime 1:44:41
grapevine_coordinator_service           RUNNING   pid 388, uptime 1:44:21
grapevine_dlx_service                   RUNNING   pid 32212, uptime 1:44:41
grapevine_log_collector                 RUNNING   pid 32222, uptime 1:44:41
grapevine_root                          RUNNING   pid 32211, uptime 1:44:41
grapevine_supervisor_event_listener     RUNNING   pid 32206, uptime 1:44:41
grapevine_ui                            RUNNING   pid 32208, uptime 1:44:41
reverse-proxy=5.0.0.4270                RUNNING   pid 9167, uptime 1:38:00
router=5.0.0.4270                       RUNNING   pid 9168, uptime 1:38:00

I then ran ./bin/reset_grapevine, but it stopped at the point shown below:

2017-05-03 14:09:12,192 | Growing services (this can take several minutes). Please wait...
2017-05-03 14:09:12,312 |    Running [12/37]: reverse-proxy postgres cas-service grapevine node-ui rabbitmq grapevine-log-collector election-service router grapevine-coordinator-service rbac-service cassandra
2017-05-03 14:11:23,345 |    Running [13/37]: ipgeo-service
2017-05-03 14:12:23,924 |    Running [14/37]: file-service
2017-05-03 14:15:05,597 |    Running [15/37]: ip-pool-manager-service
2017-05-03 14:16:16,286 |    Running [16/37]: app-vis-policy-programmer-service
2017-05-03 14:16:51,661 |    Running [17/37]: pnp-service
2017-05-03 14:18:07,442 |    Running [19/37]: nbar-policy-programmer-service access-policy-programmer-service
2017-05-03 14:19:08,031 |    Running [20/37]: qos-policy-programmer-service
2017-05-03 14:19:18,137 |    Running [21/37]: visibility-service
2017-05-03 14:19:48,429 |    Running [22/37]: apic-em-pki-broker-service
2017-05-03 14:19:58,526 |    Running [24/37]: apic-em-jboss-ejbca pfr-policy-programmer-service
2017-05-03 14:20:59,123 |    Running [25/37]: grouping-service
2017-05-03 14:21:09,227 |    Running [26/37]: qos-lan-policy-programmer-service
2017-05-03 14:21:39,535 |    Running [27/37]: apic-em-event-service
2017-05-03 14:24:21,155 |    Running [28/37]: network-poller-service
2017-05-03 14:26:27,395 |    Running [29/37]: apic-em-inventory-manager-service
2017-05-03 14:26:47,605 |    Running [30/37]: apic-em-network-programmer-service
2017-05-03 15:09:13,564 | [grow_all_services:1159] Timeout of 3600 seconds has been exceeded while growing services. The following services are not yet in RUNNING state: scheduler-service, identity-manager-pxgrid-service, policy-manager-service, task-service, topology-service, policy-analysis-service, telemetry-service
2017-05-03 15:09:13,565 | Config wizard completed with errors

Any idea how I can resolve this issue?

Thanks
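
For anyone reading a similar reset_grapevine log, here is a minimal Python sketch (not part of the original post) that pulls the list of stuck services out of the timeout line and measures how long each "Running [n/m]" step took. It assumes the log keeps the "timestamp | message" layout shown above. The helper names (parse_line, stuck_services, stage_gaps) are made up for illustration, and the sample text is just the last three relevant lines of the log quoted above.

```python
from datetime import datetime

# Sample copied from the log in the post above (last two progress lines
# plus the timeout line). In practice you would read the real log file.
LOG = """\
2017-05-03 14:26:27,395 |    Running [29/37]: apic-em-inventory-manager-service
2017-05-03 14:26:47,605 |    Running [30/37]: apic-em-network-programmer-service
2017-05-03 15:09:13,564 | [grow_all_services:1159] Timeout of 3600 seconds has been exceeded while growing services. The following services are not yet in RUNNING state: scheduler-service, identity-manager-pxgrid-service, policy-manager-service, task-service, topology-service, policy-analysis-service, telemetry-service
"""

# Assumption: every line starts with "YYYY-MM-DD HH:MM:SS,mmm | ".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
STUCK_MARKER = "not yet in RUNNING state:"


def parse_line(line):
    """Split one 'timestamp | message' line into (datetime, message)."""
    stamp, _, message = line.partition(" | ")
    return datetime.strptime(stamp.strip(), TS_FORMAT), message.strip()


def stuck_services(log_text):
    """Return the service names listed after the timeout marker, if any."""
    for line in log_text.splitlines():
        if STUCK_MARKER in line:
            tail = line.split(STUCK_MARKER, 1)[1]
            return [name.strip() for name in tail.split(",") if name.strip()]
    return []


def stage_gaps(log_text):
    """Return (seconds since previous progress line, message) pairs."""
    entries = [parse_line(l) for l in log_text.splitlines() if "Running [" in l]
    return [
        ((cur[0] - prev[0]).total_seconds(), cur[1])
        for prev, cur in zip(entries, entries[1:])
    ]


if __name__ == "__main__":
    print("Stuck services:", ", ".join(stuck_services(LOG)))
    for seconds, message in stage_gaps(LOG):
        print(f"{seconds:8.1f} s  {message}")
```

Run against the full log, the gaps show which steps were slow before the 3600-second timeout hit, and the first function gives the seven services to check by hand.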

1 Accepted Solution

aradford, I resolved the issue by installing the latest upgrade, 1.4.2.

7 Replies

aradford
Cisco Employee

This is usually an indication that the disk I/O is not fast enough.

I assume you have enough RAM and vCPUs?

Can you run the following command on the grapevine console:

dd if=/dev/zero of=/tmp/foo bs=1M count=512 conv=fdatasync


That will give you an indication of the disk I/O throughput.

You should see output similar to this:

512+0 records in
512+0 records out

536870912 bytes (537 MB) copied, 23.7006 s, 22.7 MB/s

Disk I/O looks fine:

$ dd if=/dev/zero of=/tmp/foo bs=1M count=512 conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 2.40962 s, 223 MB/s

(grapevine)

vCPU: 8

Memory: 64G
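
If you want to compare several dd runs without eyeballing them, here is a small Python sketch (not from the thread) that recomputes throughput from the byte count and elapsed time in the dd summary line. It assumes the summary format shown in the two outputs above. The function name dd_throughput_mb_s is made up for illustration, and no pass/fail threshold is built in, since none was given in this thread.

```python
import re

# Matches the summary line printed by dd in the outputs above, e.g.
# "536870912 bytes (537 MB) copied, 23.7006 s, 22.7 MB/s".
# Assumption: other dd versions or locales may word this line differently.
DD_SUMMARY = re.compile(r"(\d+) bytes .*?copied, ([\d.]+) s")


def dd_throughput_mb_s(summary_line):
    """Return throughput in MB/s (1 MB = 10**6 bytes) from a dd summary line."""
    match = DD_SUMMARY.search(summary_line)
    if match is None:
        raise ValueError(f"not a dd summary line: {summary_line!r}")
    n_bytes = int(match.group(1))
    seconds = float(match.group(2))
    return n_bytes / seconds / 1e6


if __name__ == "__main__":
    # The two summary lines quoted in this thread.
    example_run = "536870912 bytes (537 MB) copied, 23.7006 s, 22.7 MB/s"
    reported_run = "536870912 bytes (537 MB) copied, 2.40962 s, 223 MB/s"
    for label, line in (("example", example_run), ("reported", reported_run)):
        print(f"{label}: {dd_throughput_mb_s(line):.1f} MB/s")
```

For the two lines quoted here this gives roughly 22.7 MB/s and 223 MB/s, matching what dd printed, so the reported system is about ten times faster than the example output.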

aradford, I resolved the issue by installing the latest upgrade, 1.4.2.

Very strange. The specs looked fine.

Have you also upgraded the IWAN app?

Adam

Yes, I did, although it wasn't complaining about the old version, 1.4.0.417.

Do you have a case open regarding this issue?

No. I'm using the free version, which has no official support, so I didn't open a ticket for this.