Meddane
VIP
Meddane_0-1675154510236.png

Cluster Database Configuration between cms1, cms2 and cms3

For the server certificate, you can use a single multi-SAN certificate that lists the FQDNs of all database servers in the SAN attribute. This reduces certificate management, because one certificate covers all nodes and all services.

For the client certificate, the Common Name (CN) must be set to postgres. When a database server receives a client certificate, it checks that the CN field equals postgres to validate the authentication.

For the database, we need to generate two CSRs with their corresponding private keys: one for the client and one for the server.

Use one CMS to generate these two CSRs. Once you get the server and client certificates from your CA, copy the two certificates with their private keys to all nodes using WinSCP.

For the server certificate, use the following command and give it a name, for example dbcert. It is important to set the CN to the FQDN of the primary database node (cms1) and to put the FQDNs of the replica nodes in the SAN. For example: cms1.lab.local in the Common Name, cms2.lab.local and cms3.lab.local in the SAN.

cms1>pki csr dbcert CN:cms1.lab.local OU:CCNP O:Collaboration L:lab ST:local C:US subjectAltName:cms2.lab.local,cms3.lab.local

For the client certificate, use the following command and give it a name, for example dbclt.

cms1>pki csr dbclt CN:postgres
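The two CSR commands above follow a fixed pattern, so they can be assembled from a list of node names. The Python sketch below is only an illustration: the helper name build_server_csr is hypothetical, it merely reproduces the pki csr syntax shown in this article, and it does not connect to a Meeting Server.

```python
def build_server_csr(name, primary_fqdn, other_names, subject_fields):
    """Assemble a `pki csr` command line in the form used in this article.

    Hypothetical helper for illustration only; it just builds a string.
    """
    parts = ["pki", "csr", name, f"CN:{primary_fqdn}"]
    # Remaining subject attributes, kept in the order they are given.
    parts += [f"{key}:{value}" for key, value in subject_fields]
    if other_names:
        # All additional names go into one comma-separated subjectAltName field.
        parts.append("subjectAltName:" + ",".join(other_names))
    return " ".join(parts)


subject = [("OU", "CCNP"), ("O", "Collaboration"), ("L", "lab"), ("ST", "local"), ("C", "US")]
server_cmd = build_server_csr(
    "dbcert", "cms1.lab.local", ["cms2.lab.local", "cms3.lab.local"], subject
)
# The client CSR only needs the fixed CN value postgres.
client_cmd = "pki csr dbclt CN:postgres"

print(server_cmd)
print(client_cmd)
```

Generating the command from a list makes it harder to forget a node when the SAN list grows.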

On CMS1.

Assign the database server and client certificates created previously (dbcert and dbclt). The Root-CA certificate is included so that the client and server certificates can be validated. Then specify which interface to use for database clustering (interface a in this example) and initialize the primary database.

cms1>database cluster certs dbcert.key dbcert.cer dbclt.key dbclt.cer Root-CA.cer

cms1>database cluster localnode a

cms1>database cluster initialize

On CMS2 and CMS3.

Assign the same database server and client certificates (dbcert and dbclt) together with the Root-CA certificate used to validate them, specify the interface to use, and join cms2 and cms3 to the primary database on cms1.

cmsx>database cluster certs dbcert.key dbcert.cer dbclt.key dbclt.cer Root-CA.cer

cmsx>database cluster localnode a

cmsx>database cluster join cms1.lab.local

 

Meddane_1-1675154510238.png

 

Meddane_2-1675154510239.png

On cms1, cms2 and cms3, verify the status of the database cluster.

On CMS2 and CMS3, the database status shows ERROR: Cannot find primary node in cluster.

 

Meddane_3-1675154510241.png

 

Meddane_4-1675154510243.png

CMS2 and CMS3 fail to connect to the primary node CMS1. So what’s wrong here? Let’s check the logs using the syslog follow command on CMS2 and CMS3.

The CMS2 log shows the message: CMS1: error could not translate host name “cms1” to address.

CMS2 tries to resolve the server name CMS1 but cannot find its IP address, 10.1.5.61.

 

Meddane_5-1675154510246.png

The CMS3 log shows the message: CMS1: error could not translate host name “cms1” to address.

CMS3 tries to resolve the server name CMS1 but also cannot find its IP address, 10.1.5.61.

 

Meddane_6-1675154510249.png

To solve the issue, we need a DNS A record that resolves the name CMS1 to the IP address 10.1.5.61. Fortunately, Cisco Meeting Server lets you create a DNS resource record (RR) locally to resolve the server name.

On CMS2 and CMS3, add a DNS RR using the following commands.

 

Meddane_7-1675154510250.png

 

Meddane_8-1675154510250.png

Now let’s verify the database status on CMS2 and CMS3.

The output shows that both nodes are in the Connected state toward the primary node CMS1 (10.1.5.61).

However, CMS2 and CMS3 fail to connect to each other.

 

Meddane_9-1675154510253.png

 

Meddane_10-1675154510255.png

Let’s verify the logs on CMS2 and CMS3.

CMS2 cannot find the IP address 10.1.5.63 of the hostname CMS3.

 

Meddane_11-1675154510260.png

CMS3 cannot find the IP address 10.1.5.62 of the hostname CMS2.

 

Meddane_12-1675154510263.png

On the primary node CMS1, verify the database status. The primary node fails to connect to CMS2 and CMS3.

 

Meddane_13-1675154510265.png

Let’s check the logs on CMS1. The same issue appears: CMS1 cannot resolve the server names CMS2 and CMS3 to their IP addresses.

 

Meddane_14-1675154510268.png

To solve the issue, we need to add two DNS RRs on CMS1 to resolve the server names of CMS2 and CMS3. Configure them as shown below.

 

Meddane_15-1675154510269.png

On CMS2, configure one DNS RR Record to resolve the server’s name of CMS3.

 

Meddane_16-1675154510269.png

On CMS3, configure one DNS RR Record to resolve the server’s name of CMS2.

 

Meddane_17-1675154510269.png

On CMS1, verify the DNS record entries. There must be two DNS RRs, resolving the server names of CMS2 and CMS3.

 

Meddane_18-1675154510270.png

On CMS2, verify the DNS record entries. There must be two DNS RRs, resolving the server names of CMS1 and CMS3.

 

Meddane_19-1675154510271.png

On CMS3, verify the DNS record entries. There must be two DNS RRs, resolving the server names of CMS1 and CMS2.

 

Meddane_20-1675154510272.png
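The resulting layout is a full mesh: every node holds one local record for each of the other two nodes. As a rough Python sketch of that pattern (the names NODES and records_needed are illustrative, and the addresses are the ones quoted in the log messages above), the per-node requirements can be derived like this:

```python
# Hostname to IP address mapping, as quoted in this article's log messages.
NODES = {"cms1": "10.1.5.61", "cms2": "10.1.5.62", "cms3": "10.1.5.63"}


def records_needed(nodes):
    """For each node, list (hostname, ip) pairs for every other cluster member."""
    return {
        local: [(peer, ip) for peer, ip in nodes.items() if peer != local]
        for local in nodes
    }


plan = records_needed(NODES)
for local, peers in plan.items():
    for peer, ip in peers:
        # Only describes what must be resolvable; it does not emit CLI syntax.
        print(f"{local} needs a local record: {peer} -> {ip}")
```

With three nodes this yields six records in total, two per node, which matches the verification steps above.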

Finally, the Database status on CMS1, CMS2 and CMS3 shows that the status is Connected for all nodes.

 

Meddane_21-1675154510275.png

 

Meddane_22-1675154510277.png

Note: From version 3.5, Cisco Meeting Server can perform additional hostname/IP address validation. If you are deploying or upgrading to version 3.6 (or 3.5), the database cluster nodes might fail to connect to each other. The reason is that each server tries to resolve the server name (hostname) of each node in the cluster, a behavior that does not exist in versions prior to 3.5.

The database cluster verifymode <full/ca> command lets you choose the level of validation. If it is set to full, the Meeting Server validates the certificates and also verifies that the server identity name (hostname) matches a name stored in the server certificate. If it is set to ca, the Meeting Server validates only the Certificate Authority.
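To make the difference between the two modes concrete, here is a deliberately simplified Python model. It is an assumption-laden sketch, not the Meeting Server's actual logic: the function name peer_accepted is hypothetical, names are compared as plain case-insensitive strings, and wildcard or typed SAN entries are ignored.

```python
def peer_accepted(verifymode, signed_by_trusted_ca, cert_names, connect_name):
    """Simplified model of the ca and full settings described above."""
    if not signed_by_trusted_ca:
        # Both modes require a certificate issued by the trusted CA.
        return False
    if verifymode == "ca":
        # CA validation only; the name used to connect is not compared.
        return True
    if verifymode == "full":
        # The name used to reach the peer must also appear in the certificate.
        wanted = connect_name.lower()
        return any(wanted == name.lower() for name in cert_names)
    raise ValueError(f"unknown verifymode: {verifymode!r}")


# Names carried by the first server certificate in this article (CN plus SAN).
names = ["cms1.lab.local", "cms2.lab.local", "cms3.lab.local"]

print(peer_accepted("ca", True, names, "cms1"))
print(peer_accepted("full", True, names, "cms1"))
print(peer_accepted("full", True, names, "cms1.lab.local"))
```

Under this model, a certificate that works in ca mode can be rejected in full mode simply because a peer is reached by a short hostname or an IP address that the certificate does not list.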

By default, the verifymode is set to ca as shown below.

 

Meddane_23-1675154510279.png

 

Meddane_24-1675154510282.png

 

Meddane_25-1675154510283.png

To enable the verifymode full, the nodes need to be removed from the cluster database.

On all CMS, execute the database cluster remove command.

 

Meddane_26-1675154510284.png

 

Meddane_27-1675154510285.png

 

Meddane_28-1675154510286.png

On all CMS, enable the verifymode full using the database cluster verifymode full command.

 

Meddane_29-1675154510287.png
 
Meddane_30-1675154510287.png

 

Meddane_31-1675154510287.png

On the primary node CMS1, run the database cluster initialize command.

 

Meddane_32-1675154510289.png

On CMS2 and CMS3, run the database cluster join cms1.lab.local command.

 

Meddane_33-1675154510290.png

 

Meddane_34-1675154510292.png

On CMS1, verify the database status. The verifymode full is now enabled, but the replicas CMS2 and CMS3 are not yet connected to the primary node.

 

Meddane_35-1675154510294.png

On CMS2 and CMS3, verify the database status, the verifymode full is now enabled.

The output of the database cluster status command displays ERROR: Cannot find primary in cluster.

 

Meddane_36-1675154510296.png

 

Meddane_37-1675154510299.png

To identify the problem, let’s run the syslog follow command on CMS2 and CMS3.

On CMS2, the logs report that the server certificate for cms1.lab.local does not match host name “cms1”. In other words, the hostname cms1 of the primary node is missing from the Subject Alternative Name (SAN) of the server certificate.

 

Meddane_38-1675154510303.png

cms2>

Sep 26 13:49:07.774 user.info cms2 sfpool:  Health check cms1: error (up = 1): server certificate for "cms1.lab.local" (and 4 other names) does not match host name "cms1"|

Sep 26 13:49:10.128 user.info cms2 sfpool:  Failover Monitor: Unexpected roll call discrepancy; failover saw 0 connected node(s) (), 1 node(s) up (10.1.5.61)

Sep 26 13:49:12.871 user.info cms2 sfpool:  Health check cms1: error (up = 1): server certificate for "cms1.lab.local" (and 4 other names) does not match host name "cms1"|

Sep 26 13:49:13.138 user.info cms2 sfpool:  Failover Monitor: Unexpected roll call discrepancy; failover saw 0 connected node(s) (), 1 node(s) up (10.1.5.61)

Sep 26 13:49:16.150 user.info cms2 sfpool:  Failover Monitor: Unexpected roll call discrepancy; failover saw 0 connected node(s) (), 1 node(s) up (10.1.5.61)

Sep 26 13:49:17.961 user.info cms2 sfpool:  Health check cms1: error (up = 1): server certificate for "cms1.lab.local" (and 4 other names) does not match host name "cms1"|

cms2>

The same error message is displayed on CMS3.

 

Meddane_39-1675154510307.png

cms3>

Sep 26 13:51:49.978 user.info cms3 sfpool:  Health check cms1: error (up = 1): server certificate for "cms1.lab.local" (and 4 other names) does not match host name "cms1"|

Sep 26 13:51:50.887 user.info cms3 sfpool:  Failover Monitor: Unexpected roll call discrepancy; failover saw 0 connected node(s) (), 1 node(s) up (10.1.5.61)

Sep 26 13:51:53.913 user.info cms3 sfpool:  Failover Monitor: Unexpected roll call discrepancy; failover saw 0 connected node(s) (), 1 node(s) up (10.1.5.61)

Sep 26 13:51:55.097 user.info cms3 sfpool:  Health check cms1: error (up = 1): server certificate for "cms1.lab.local" (and 4 other names) does not match host name "cms1"|

cms3>
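When the log scrolls quickly, the mismatch lines are easy to miss among the Failover Monitor messages. The following Python sketch filters them out of captured syslog text. It assumes the message wording stays exactly as in the excerpts above; the regular expression and the function name find_mismatches are illustrative only.

```python
import re

# Matches the certificate mismatch message as it appears in the excerpts above.
MISMATCH = re.compile(
    r'server certificate for "(?P<subject>[^"]+)"'
    r'(?: \(and (?P<others>\d+) other names?\))?'
    r' does not match host name "(?P<host>[^"]+)"'
)


def find_mismatches(lines):
    """Return the set of (certificate subject, rejected host name) pairs."""
    found = set()
    for line in lines:
        match = MISMATCH.search(line)
        if match:
            found.add((match.group("subject"), match.group("host")))
    return found


sample = [
    'Sep 26 13:49:07.774 user.info cms2 sfpool:  Health check cms1: error (up = 1): '
    'server certificate for "cms1.lab.local" (and 4 other names) does not match host name "cms1"|',
    'Sep 26 13:49:10.128 user.info cms2 sfpool:  Failover Monitor: Unexpected roll call '
    'discrepancy; failover saw 0 connected node(s) (), 1 node(s) up (10.1.5.61)',
]

mismatches = find_mismatches(sample)
print(mismatches)
```

Each rejected host name reported this way is a name that has to be added to the SAN of the server certificate.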

To solve the issue, we need to generate another server certificate that includes the FQDNs of all nodes (cmsx.lab.local) and the hostnames of all nodes (cms1, cms2 and cms3).

cms1>pki csr dbcert36 CN:cms1.lab.local OU:CCNP O:Collaboration L:lab ST:local C:US subjectAltName:cms2.lab.local,cms3.lab.local,cms1,cms2,cms3

cms1>

 

Meddane_40-1675154510308.png

Copy the CSR and the private key to your PC, submit the CSR to your CA, and save the issued server certificate as, for example, dbcert36.cer.

 

Meddane_41-1675154510315.png

Verify that the new dbcert36.cer certificate includes the hostnames CMS1, CMS2 and CMS3 along with the FQDNs. Ensure that dbcert36.cer and the corresponding key are copied to all nodes.

 

Meddane_42-1675154510323.png

Before reconfiguring the cluster database with the new certificate, we need to remove all nodes from the cluster. Execute the database cluster remove command on all CMS.

cmsx> database cluster remove

On CMS1, configure the database server and client certificates named dbcert36 and dbclt. Run the database cluster initialize command.

cms1>database cluster certs dbcert36.key dbcert36.cer dbclt.key dbclt.cer Root-CA.cer

cms1>database cluster initialize

 

Meddane_43-1675154510323.png

 

Meddane_44-1675154510324.png

On CMS2 and CMS3, configure the database server and client certificates named dbcert36 and dbclt. Connect CMS2 and CMS3 to the master database CMS1 using the database cluster join cms1.lab.local command.

cmsx>database cluster certs dbcert36.key dbcert36.cer dbclt.key dbclt.cer Root-CA.cer

cmsx>database cluster join cms1.lab.local

 

Meddane_45-1675154510325.png

 

Meddane_46-1675154510325.png

Connect CMS2 and CMS3 to the master database CMS1.

 

Meddane_47-1675154510326.png

 

Meddane_48-1675154510327.png

Verify the database status on CMS1. The nodes CMS2 and CMS3 are still not connected to the primary node CMS1.

 

Meddane_49-1675154510328.png

On CMS2 and CMS3, the database status displays ERROR: postgresql has failed to start.

 

Meddane_50-1675154510330.png

 

Meddane_51-1675154510332.png

On CMS2 and CMS3, let’s run the syslog follow command.

The output tells us that the IP address 10.1.5.61 of the primary node CMS1 is missing from the server certificate. In other words, the IP address is not found in the SAN of the server certificate.

 

Meddane_52-1675154510335.png

Sep 26 14:45:20.119 local0.err cms2 postgres[72323]:  [6-1] 2022-09-26 14:45:20 UTC [local] FATAL:  the database system is starting up

Sep 26 14:45:20.119 user.warning cms2 host:server:  WARNING : database connection failure (FATAL:  the database system is starting up)

Sep 26 14:45:20.337 local0.err cms2 postgres[72326]:  [6-1] 2022-09-26 14:45:20 UTC  FATAL:  could not connect to the primary server: server certificate for "cms1.lab.local" (and 5 other names) does not match host name "10.1.5.61"

Sep 26 14:45:21.120 local0.err cms2 postgres[72337]:  [6-1] 2022-09-26 14:45:21 UTC [local] FATAL:  the database system is starting up

Sep 26 14:45:21.120 user.warning cms2 host:server:  WARNING : database connection failure (FATAL:  the database system is starting up)

 

Meddane_53-1675154510337.png

Sep 26 14:46:11.124 local0.err cms3 postgres[53171]:  [6-1] 2022-09-26 14:46:11 UTC [local] FATAL:  the database system is starting up

Sep 26 14:46:11.124 user.warning cms3 host:server:  WARNING : database connection failure (FATAL:  the database system is starting up)

Sep 26 14:46:12.125 local0.err cms3 postgres[53174]:  [6-1] 2022-09-26 14:46:12 UTC [local] FATAL:  the database system is starting up

Sep 26 14:46:12.125 user.warning cms3 host:server:  WARNING : database connection failure (FATAL:  the database system is starting up)

Sep 26 14:46:12.240 local0.err cms3 postgres[53175]:  [6-1] 2022-09-26 14:46:12 UTC  FATAL:  could not connect to the primary server: server certificate for "cms1.lab.local" (and 5 other names) does not match host name "10.1.5.61"

To solve the issue, we need to generate another server certificate that includes the FQDNs of all nodes (cmsx.lab.local), their hostnames, and the IP address 10.1.5.61 of the primary node CMS1.

cms1>pki csr dbcert3636 CN:cms1.lab.local OU:CCNP O:Collaboration L:lab ST:local C:US subjectAltName:cms2.lab.local,cms3.lab.local,cms1,cms2,cms3,10.1.5.61

 

Meddane_54-1675154510338.png
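Before sending a CSR to the CA, it can help to check that it carries every name the replicas have been seen using for the primary node in the logs: the FQDN, the short hostname, and the IP address. The Python sketch below parses a pki csr command line as written in this article and reports what is missing. The helper names are hypothetical, and the check is a plain string comparison.

```python
def names_in_csr(command):
    """Collect the CN and subjectAltName entries from a `pki csr` command line."""
    names = []
    for field in command.split():
        key, _, value = field.partition(":")
        if key == "CN":
            names.append(value)
        elif key == "subjectAltName":
            names.extend(value.split(","))
    return names


def missing_names(command, required):
    """Return the required names that the CSR command does not include."""
    present = {name.lower() for name in names_in_csr(command)}
    return [name for name in required if name.lower() not in present]


# Names by which the primary node was addressed in the join command and the logs.
required = ["cms1.lab.local", "cms1", "10.1.5.61"]

dbcert36 = (
    "pki csr dbcert36 CN:cms1.lab.local OU:CCNP O:Collaboration L:lab ST:local C:US "
    "subjectAltName:cms2.lab.local,cms3.lab.local,cms1,cms2,cms3"
)
dbcert3636 = (
    "pki csr dbcert3636 CN:cms1.lab.local OU:CCNP O:Collaboration L:lab ST:local C:US "
    "subjectAltName:cms2.lab.local,cms3.lab.local,cms1,cms2,cms3,10.1.5.61"
)

print(missing_names(dbcert36, required))
print(missing_names(dbcert3636, required))
```

Running the same check against a CSR up front would avoid one full remove, initialize and join cycle per missing name.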

Copy the CSR and the private key to your PC, submit the CSR to your CA, and save the issued server certificate as, for example, dbcert3636.cer.

Verify that the new dbcert3636.cer certificate includes the IP address 10.1.5.61 along with the FQDNs. Ensure that dbcert3636.cer and the corresponding key are copied to all nodes.

 

Meddane_55-1675154510344.png

Before reconfiguring the cluster database with the new certificate, we need to remove all nodes from the cluster. Execute the database cluster remove command on all CMS.

cmsx> database cluster remove

On CMS1, configure the database server and client certificates named dbcert3636 and dbclt. Run the database cluster initialize command.

cms1>database cluster certs dbcert3636.key dbcert3636.cer dbclt.key dbclt.cer Root-CA.cer

cms1>database cluster initialize

 

Meddane_56-1675154510345.png

On CMS2 and CMS3, configure the database server and client certificates named dbcert3636 and dbclt. Connect CMS2 and CMS3 to the master database CMS1 using the database cluster join cms1.lab.local command.

cmsx>database cluster certs dbcert3636.key dbcert3636.cer dbclt.key dbclt.cer Root-CA.cer

cmsx>database cluster join cms1.lab.local

 

Meddane_57-1675154510346.png

 

Meddane_58-1675154510347.png
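The same three-step cycle (remove, assign certificates, then initialize or join) was repeated for every new certificate in this article. As a hedged summary, the Python sketch below lists the commands per node for a given certificate name. The function rebuild_plan is a hypothetical helper that only builds strings from the commands shown above; it does not execute anything.

```python
def rebuild_plan(primary, replicas, primary_fqdn, server_cert, client_cert, ca_cert):
    """Return the per-node command sequence used in this article to rebuild the cluster."""
    certs = (
        f"database cluster certs {server_cert}.key {server_cert}.cer "
        f"{client_cert}.key {client_cert}.cer {ca_cert}"
    )
    # Every node is first removed from the existing cluster.
    plan = {node: ["database cluster remove"] for node in [primary, *replicas]}
    # The primary is initialized before any replica attempts to join it.
    plan[primary] += [certs, "database cluster initialize"]
    for node in replicas:
        plan[node] += [certs, f"database cluster join {primary_fqdn}"]
    return plan


plan = rebuild_plan(
    "cms1", ["cms2", "cms3"], "cms1.lab.local", "dbcert3636", "dbclt", "Root-CA.cer"
)
for node, commands in plan.items():
    for command in commands:
        print(f"{node}>{command}")
```

Keeping the certificate name in one variable avoids mixing files from different certificate generations across nodes.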

On CMS1, CMS2 and CMS3, verify the database status. Replication is now working.

 

Meddane_59-1675154510349.png

 

Meddane_60-1675154510351.png

 

Meddane_61-1675154510353.png

 

 

 

 
