diondohmen · ‎11-13-2019

Dear community,

We are having an issue activating the HA cluster between two on-prem satellites running version 7.x.

Running provision_standby works just fine on the standby node.

But when running ha_deploy from the active node, we see the errors below, and the cluster setup fails:

NOTICE: It is strongly recommended that you perform a backup of your
database before proceeding. Please see the documentation for details.

Proceed with the above configuration? Enter 'yes' to continue: yes
Adjusting firewall...
success
success
success
Stopping services...
Removed symlink /etc/systemd/system/multi-user.target.wants/satellite.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
Authenticating cluster user...
x.x.x.x: Authorized
y.y.y.y: Authorized
Setting up cluster...
74cabb7f3440f8e5b71f9d1aa9340d99996e67ae697ae50fa6009186eecded6e
Error: unable to destroy cluster
x.x.x.x: Unable to connect to x.x.x.x, try setting higher timeout in --request-timeout option (Operation timed out after 60001 milliseconds with 0 out of -1 bytes received)
y.y.y.y: Unable to connect to y.y.y.y, try setting higher timeout in --request-timeout option (Operation timed out after 60001 milliseconds with 0 out of -1 bytes received)
Destroying cluster on nodes: x.x.x.x, y.y.y.y...
x.x.x.x: Stopping Cluster (pacemaker)...
y.y.y.y: Stopping Cluster (pacemaker)...
x.x.x.x: Unable to connect to x.x.x.x, try setting higher timeout in --request-timeout option (Operation timed out after 60001 milliseconds with 0 out of -1 bytes received)
y.y.y.y: Unable to connect to y.y.y.y, try setting higher timeout in --request-timeout option (Operation timed out after 60001 milliseconds with 0 out of -1 bytes received)
Configuring cluster...
Error: unable to get cib
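
Since authentication succeeded on both nodes but the later requests timed out, one thing worth checking is whether pcsd still answers on each node after the failure. Below is a minimal sketch of such a check. It assumes pcsd listens on its default TCP port 2224, which may not match this appliance, and the helper name `can_connect` is hypothetical. It only tests TCP reachability, not whether pcsd itself is healthy.

```python
import socket
import sys


def can_connect(host: str, port: int = 2224, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 2224 is assumed to be the pcsd port (its usual default).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Node addresses are given on the command line, one per argument.
    for node in sys.argv[1:]:
        state = "reachable" if can_connect(node) else "NOT reachable"
        print(f"{node}:2224 {state}")
```

Run it from each node against the other node's address. If the port is not reachable, the firewall or the pcsd service would be the first things to look at.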

After sifting through the logs and reviewing the steps in deploy_ha.sh, I found multiple lines in /var/log/pcsd/pcsd.log mentioning:

Cannot read config 'corosync.conf' from '/etc/corosync/corosync.conf': No such file

In /etc/corosync I only have:

-rw-r--r--. 1 root root 2881 Oct 30 2018 corosync.conf.example
-rw-r--r--. 1 root root 767 Oct 30 2018 corosync.conf.example.udpu
-rw-r--r--. 1 root root 3278 Oct 30 2018 corosync.xml.example
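
To compare both nodes quickly, the same check can be scripted. This is only a sketch: the function name `corosync_conf_status` is hypothetical, and the default directory is taken from the pcsd log message above. As far as I understand, corosync.conf is normally written during cluster setup, so its absence may be a symptom of the timeouts rather than their cause. That is an assumption on my part.

```python
from pathlib import Path


def corosync_conf_status(conf_dir: str = "/etc/corosync") -> dict:
    """Report whether corosync.conf exists and which files are present."""
    directory = Path(conf_dir)
    # An absent directory is treated the same as an empty one.
    files = sorted(p.name for p in directory.iterdir()) if directory.is_dir() else []
    return {"has_conf": "corosync.conf" in files, "files": files}


if __name__ == "__main__":
    status = corosync_conf_status()
    print("corosync.conf present:", status["has_conf"])
    for name in status["files"]:
        print("  ", name)
```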

I don't think I can solve this within the current release. The only thing I could do is extend the timeout as suggested in the error output (the --request-timeout option), but that won't create an /etc/corosync/corosync.conf file either :)

Has anyone succeeded in installing an HA cluster on version 7.x?

Problem setting up HA cluster within Cisco Smart on-prem satellite