06-18-2021 07:13 AM - edited 06-18-2021 08:24 AM
Upgraded ESX, so had to shut down the Subscribers. Had an issue with MGCP gateways not failing back to the primary, and while registered to the secondary they were not processing calls.
CUCM 11.5
Site A
Sub A-1
Sub A-2
Datacenter
Pub
Sub D-1
When we shut down Sub A-1, everything failed over to Sub D-1 like it was supposed to. Upgraded ESX and moved to the next host. When we shut down Sub A-2, everything failed over to Sub D-1 like it was supposed to. Yes, I forgot to power on Sub A-1, so all devices were now trying to register to D-1. I noticed that many of the phones that normally register to A-1 were not registered; that was when I realized I had forgotten to power A-1 back on. Fast-forward: I upgraded ESX and powered all CUCMs back up. I verified that the phones registered to the appropriate CUCM server. Gateways were still registered to D-1, but calls should have been working. I did not test calls; I only verified that everything was registered.
The next morning, 2 gateways at Site A were still registered to D-1 and not processing calls. Inbound and outbound calls were failing. On the gateways, all PRIs were up and registered. debug isdn q931 showed PRI channels restarting. I forced both gateways to fail back to the primary CCM by rebooting D-1, and calls started working. Now the gateways show registered to the primary, and the secondary is "ready".
Why would the gateways not process calls while registered to the secondary? This has worked properly in the past. Unfortunately, that morning we also got hit with the lowwatermark alert and ended up deleting logs before trying to pull traces from RTMT.
Why would the gateways not recognize that the primary was restored and automatically fail back? Switchback mode is set to graceful.
Just looking for some ideas of what could have caused the issue. Right now the only thing I can think of is that with everything trying to register to D-1, it was too much for it to handle and it got hung up.
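If this happens again, the gateway state could be captured before forcing a failback, since the CUCM traces may not be available. A rough sketch of standard IOS commands that might help, run on an affected gateway (nothing here is output from this incident):

```
! Which CUCM hosts the gateway knows about, their status, and the switchback mode
show ccm-manager
show ccm-manager hosts
! State of the PRI backhaul link to the active CUCM
show ccm-manager backhaul
! MGCP endpoint state and ISDN layer status for the PRIs
show mgcp endpoint
show isdn status
! Q.931 message trace (already used above to see the channel restarts)
debug isdn q931
```

With MGCP PRI backhaul the Q.931 signaling is handled by CUCM rather than the gateway, so comparing the backhaul state with the ISDN layer status may show whether the gateway or the subscriber was the side that stopped responding.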
06-18-2021 08:21 AM
When leaving the site, gateways were still registered to D-1 but calls should have been working.
But the next morning, 2 gateways at Site A were still registered to D-2 and not processing calls. What is this D-2?
06-18-2021 08:23 AM
D-2 is a typo.
The next morning both gateways at Site A were still registered to D-1...
06-18-2021 08:26 AM
Can you share your VG MGCP configuration?
06-23-2021 12:58 PM
06-23-2021 09:29 PM
When MGCP is manually configured, configure ccm-manager switchback immediate.
When using ccm-manager config, enable the below on the CUCM gateway page.
Try the above and see the behavior.
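For the manually configured case, the gateway side might look roughly like the following. This is only a sketch: the addresses 192.0.2.10 and 192.0.2.20 are placeholders for the primary and backup subscribers, not values from this thread, and the rest of the MGCP, controller, and dial-peer configuration is omitted.

```
! Placeholder addresses: 192.0.2.10 = primary subscriber, 192.0.2.20 = backup
mgcp
mgcp call-agent 192.0.2.10 2427 service-type mgcp version 0.1
!
ccm-manager mgcp
ccm-manager redundant-host 192.0.2.20
! Return to the primary as soon as the link to it is re-established,
! instead of waiting for outstanding calls to complete (graceful)
ccm-manager switchback immediate
```

After the change, show ccm-manager on the gateway lists the primary and backup hosts with their current status and the active switchback mode, which is a quick way to confirm which subscriber the gateway is actually using.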
06-24-2021 10:15 AM
My CUCM is set to graceful fallback. After seeing your suggestion I did some reading, and the documentation I read indicated that the switchback setting was more for SRST. We are not using SRST; this failover is just between CUCM subscribers. Does the switchback setting impact CUCM failover as well?
It has been set to graceful forever and we have never had this problem.
I suspect that some service in Subscriber D-1 got hung and didn't release the gateways. I am hesitant to blame gateways because it happened to all of them.
I actually have a test scheduled for the weekend, so I will definitely give your suggestion another look and try it as well.