
CloudCenter Suite installer 5.0 always fails on VMware

mschedrin
Level 1

I have attempted to install CCS 5.0 on VMware vSphere 6.7 several times, but it always fails at the Suite configuration step:

Annotation 2019-04-05 112801.png

On other attempts I have also seen a different error message:

time="2019-04-04T10:26:27Z" level=info msg=". Message: Pod, common-framework-suite-auth-5b89b7b65c-cqxq4, FailedMount, MountVolume.SetUp failed for volume \"suite-auth-tls\" : secrets \"suite-auth-tls\" not found"

I am following the guide: https://docs.cloudcenter.cisco.com/display/INSTALL/VMware+vSphere+Installation

I am deploying a new Kubernetes cluster on VMware.

Has anyone experienced the same issue?

1 Accepted Solution

I was able to figure out the root cause of the issue: I was missing firewall rules to allow the master nodes to communicate with the vSphere address. I had a firewall rule for the CCS installer, but I didn't expect the master nodes to need one too during installation; that is not very apparent, it seems to me. It is probably worth adding to the VMware installation guide (https://docs.cloudcenter.cisco.com/display/INSTALL/VMware+vSphere+Installation). I did find it mentioned on the troubleshooting page, though.
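
For anyone hitting the same problem, the firewall condition described here can be checked from each master node before re-running the installer. The following is only a minimal sketch: the function name check_tcp, the host vcenter.example.com, and port 443 (the usual HTTPS port for the vSphere API) are assumptions for illustration and do not come from the CCS documentation. It relies on bash and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# check_tcp HOST PORT
# Succeeds (exit 0) if a TCP connection to HOST:PORT opens within 5 seconds,
# fails otherwise. Uses bash's built-in /dev/tcp redirection, so no extra
# tools such as nc or telnet are required on the node.
check_tcp() {
  local host="$1" port="$2"
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Hypothetical usage on a master node (replace the host with your own
# vSphere address; 443 is an assumed port):
#   check_tcp vcenter.example.com 443 && echo "reachable" || echo "blocked"
```

If the check reports "blocked" from a master node but "reachable" from the installer VM, that matches the situation described above.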

Thanks for help, I appreciate that.

5 Replies

Shaun Roberts
Cisco Employee

You have unbound PVCs (Persistent Volume Claims). CCS needs persistent volumes (storage) to be able to store things in the DB.

 

Before you allow the cluster to be deleted (if the installation fails), you can run the command below to see the status and names of the PVCs.

 

[root@cx-ccs-prod-master-d7f34f25-f524-4f90-9037-7286202ed13a1 ~]# kubectl -n cisco get pvc
NAME                                                 STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
arangodb-data-claim                                  Bound     pvc-e4ddddb6-39c7-11e9-8c91-fa163e920fe4   40Gi       RWO            standard       37d
common-framework-grafana                             Bound     pvc-f42e024e-3949-11e9-8c91-fa163e920fe4   10Gi       RWO            standard       38d
common-framework-prometheus-server                   Bound     pvc-f42ea2f3-3949-11e9-8c91-fa163e920fe4   8Gi        RWO            standard       38d
common-framework-suite-license                       Bound     pvc-f42d36f8-3949-11e9-8c91-fa163e920fe4   10Gi       RWO            standard       38d
common-framework-suite-postgresql                    Bound     pvc-f42f5f09-3949-11e9-8c91-fa163e920fe4   50Gi       RWO            standard       38d
kafka-logs-claim                                     Bound     pvc-e4deb231-39c7-11e9-8c91-fa163e920fe4   4Gi        RWO            standard       37d
redis-data-action-orchestrator-pers-redis-master-0   Bound     pvc-e5596ee6-39c7-11e9-8c91-fa163e920fe4   5Gi        RWO            standard       37d
storage-common-framework-elasticsearch-data-0        Bound     pvc-f4d13d6c-3949-11e9-8c91-fa163e920fe4   50Gi       RWO            standard       38d
storage-common-framework-elasticsearch-data-1        Bound     pvc-0347c65e-394a-11e9-8c91-fa163e920fe4   50Gi       RWO            standard       38d
storage-common-framework-elasticsearch-master-0      Bound     pvc-f4d88f9d-3949-11e9-8c91-fa163e920fe4   30Gi       RWO            standard       38d
storage-common-framework-elasticsearch-master-1      Bound     pvc-039c9530-394a-11e9-8c91-fa163e920fe4   30Gi       RWO            standard       38d
storage-common-framework-elasticsearch-master-2      Bound     pvc-125354da-394a-11e9-8c91-fa163e920fe4   30Gi       RWO            standard       38d

As you can see, they should all be "Bound". If any are unbound, look into those and check whether there are errors in their events. You can inspect each one using the describe command.

[root@cx-ccs-prod-master-d7f34f25-f524-4f90-9037-7286202ed13a1 ~]# kubectl -n cisco describe pvc arangodb-data-claim
Name:          arangodb-data-claim
Namespace:     cisco
StorageClass:  standard
Status:        Bound
Volume:        pvc-e4ddddb6-39c7-11e9-8c91-fa163e920fe4
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed=yes
               pv.kubernetes.io/bound-by-controller=yes
               volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/cinder
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      40Gi
Access Modes:  RWO
Events:        <none>
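
The PVC check above can be narrowed to just the problem claims. This is a rough sketch, assuming the default column layout of kubectl get pvc (NAME first, STATUS second); the helper name unbound_pvcs is made up for illustration.

```shell
# unbound_pvcs
# Reads `kubectl get pvc --no-headers` output on stdin and prints the name
# and status of every claim whose STATUS column is not "Bound".
# Lines with fewer than two fields (for example blank lines) are ignored.
unbound_pvcs() {
  awk 'NF >= 2 && $2 != "Bound" { print $1, $2 }'
}

# Usage on a master node:
#   kubectl -n cisco get pvc --no-headers | unbound_pvcs
```

Any claim it prints can then be examined with kubectl -n cisco describe pvc followed by the claim name, as shown above.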

You need to make sure that you have your storage class defined. It is a prerequisite. Run kubectl get sc to check.

 

[root@cx-ccs-prod-master-d7f34f25-f524-4f90-9037-7286202ed13a1 ~]# kubectl get sc
NAME                 PROVISIONER            AGE
standard (default)   kubernetes.io/cinder   38d
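
As a small sketch of automating that check, the helper below (hypothetical name default_storage_classes) prints the storage classes that kubectl marks as default. It assumes the output format shown above, where the marker "(default)" appears as the second field. Whether CCS strictly needs a default class, rather than just a defined one, is an assumption here and is not stated in this thread.

```shell
# default_storage_classes
# Reads `kubectl get sc` output (including its header line) on stdin and
# prints the name of each storage class marked "(default)".
default_storage_classes() {
  awk 'NR > 1 && $2 == "(default)" { print $1 }'
}

# Usage on a master node:
#   kubectl get sc | default_storage_classes
```

Empty output would mean no class is marked as default.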

If you still have issues and cannot get past it, open a TAC case against your product.

--Shaun Roberts
Principal Engineer, CX
shaurobe@cisco.com

Thanks for your input! As a CloudCenter Suite administrator, do I have root access to the VMs that the installer deploys on my VMware cluster? The installation wizard does not prompt for any passwords during installation, so I'm not sure where to get credentials to log in and troubleshoot the Kubernetes cluster.

When you deploy the OVA, in the advanced options (last screen, I think), you need to change the default-instance-id and hostname to something other than their defaults.

Then (on that same screen) you can input a password and/or a public key. This will give you access. The default user is "cloud-user".

 

The VMs in the K8S clusters are key-based only; there is no password. If you are not prompted to download that key, you should be able to hit the API at:

 

https://<master-ip>/suite-k8s-mgmt/api/v1/clusters/default/ssh-key

 

If not, you can log in to the installer and run

curl -k https://<INSTALLER-IP>/suite-k8s-mgmt/api/v1/clusters/default/ssh-key

Either way, you just need to get the key so you can log in to one of the masters:

 

ssh -i <key_file> cloud-user@<master-ip>
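
One practical detail when saving the key: ssh refuses private key files that are readable by other users, so the file should be written with owner-only permissions. The sketch below assumes the ssh-key endpoint returns the private key as plain text, which is not confirmed in this thread; the helper name save_ssh_key and the file name ccs_key are made up for illustration.

```shell
# save_ssh_key FILE
# Writes whatever arrives on stdin to FILE and restricts it to owner
# read/write (mode 600), which ssh requires for private keys.
save_ssh_key() {
  local file="$1"
  # Create the file under a restrictive umask so it is never world-readable.
  ( umask 077 && cat > "$file" )
  # Also fix the mode in case FILE already existed with looser permissions.
  chmod 600 "$file"
}

# Usage (placeholders exactly as in the commands above):
#   curl -k https://<INSTALLER-IP>/suite-k8s-mgmt/api/v1/clusters/default/ssh-key | save_ssh_key ccs_key
#   ssh -i ccs_key cloud-user@<master-ip>
```

If the endpoint turns out to wrap the key in JSON or another format, the key would need to be extracted before saving.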

 

You should also be able to download your log files after a failure and look into what the error is or what leads up to it. (I think you did that above.)

 

--Shaun Roberts
Principal Engineer, CX
shaurobe@cisco.com

Emawalker73268
Level 1

Is this a safe method to save data on a website cloud?