
Issue with VCS cluster breaking after reboot

I've got a weird issue with a VCS-C cluster.  I have two nodes running X12.6.2.  Node 1 is a "cluster of 1" with itself defined as peer 1 (using IP address, everything set to permissive).

When I add peer 2, the two nodes cluster OK, peer 2 pulls all its config from peer 1, and everything works fine (calls, registrations etc).  However, whenever I reboot peer 2, the clustering breaks (it reports a partitioned state) and won't come back up again, even when I run xcommand forceconfigupdate.  The only way to fix it is to remove peer 2 from the cluster, factory reset it and re-add it to the cluster, after which it works until the next reboot.

 

Settings, SRVs etc I assume are correct, otherwise it wouldn't cluster in the first place.


Log entries on peer 2 show "Message rejected" for the clustering service, coming from Peer 1, but I can't find much info about that.

Any ideas?


amehla
Cisco Employee

Hi Nick,

 

This looks like strange behavior. Assuming the clustering configuration is correct and identical on both peers, do you see any errors on the second node after the reboot, or any related alarms on either peer?

 

Amit Mehla

Hi Amit, the cluster configuration is the same on both nodes.  After I reboot the second node and lose clustering, I see cluster errors on both nodes.  It was the second node where I saw the error in the logs, although I didn't actually check the log messages on node 1.

JJ77
Level 1

Hello,

You may want to look at the following release note (p. 17): https://www.cisco.com/c/dam/en/us/td/docs/voice_ip_comm/expressway/release_note/Cisco-Expressway-Release-Note-X12-6-2.pdf

 

Please check whether the peer IP address changes after a reboot (this may be the first thing to check).

Also, check whether the certificates on the peer are still intact after the reboot.

 

As a recommendation, a peer VCS should always be put into maintenance mode before it is rebooted (I'm not sure whether this is done in your case, hence bringing it to your attention).

 

Thanks for the response - the peer IP addresses and certificates remain intact, so it doesn't look like the peer is being removed from the cluster.  And yes, we do enable maintenance mode before rebooting.

 

JJ77
Level 1

Have you checked the cluster pre-shared key on the slave peer after the reboot?

The VCS uses IPsec to enable secure communication between each cluster peer.
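
One way to run the "did anything change after the reboot" checks suggested in this thread is to save a text dump of the peer's settings before and after the reboot and compare the two. The sketch below is only an illustration, not a Cisco tool: it assumes each dump has one `name: value` setting per line, and the function names (`parse_settings`, `diff_settings`) and the sample setting names are hypothetical. Secrets such as the pre-shared key may be masked or encrypted in a dump, so this only helps with plain-text settings like peer addresses.

```python
def parse_settings(dump):
    """Parse 'name: value' lines into a dict (assumed dump format)."""
    settings = {}
    for line in dump.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not a name/value pair.
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        settings[name.strip()] = value.strip()
    return settings


def diff_settings(before, after):
    """Return {name: (before_value, after_value)} for settings that differ.

    A setting missing from one dump is reported with None on that side.
    """
    old = parse_settings(before)
    new = parse_settings(after)
    changed = {}
    for name in sorted(old.keys() | new.keys()):
        if old.get(name) != new.get(name):
            changed[name] = (old.get(name), new.get(name))
    return changed


if __name__ == "__main__":
    # Hypothetical sample data, not real VCS output.
    before = "Peer 1 Address: 10.0.0.1\nPeer 2 Address: 10.0.0.2\n"
    after = "Peer 1 Address: 10.0.0.1\nPeer 2 Address: 10.0.0.9\n"
    for name, (old, new) in diff_settings(before, after).items():
        print(f"{name}: {old!r} -> {new!r}")
```

An empty result means the two dumps agree on every setting they contain, which would point away from a config change and towards something not visible in the dump (for example the key material used for IPsec between peers).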