
CUCM failed failover for MGCP Gateways

Chris Austin
Level 1

We upgraded ESX, so I had to shut down the Subscriber. We then had an issue with MGCP gateways not failing back to the primary, and while registered to the secondary they were not processing calls.

CUCM 11.5

Site A: Sub A-1, Sub A-2

Datacenter: Pub, Sub D-1

 

When we shut down Sub A-1, everything failed over to Sub D-1 as it was supposed to. We upgraded ESX and moved to the next host. When we shut down Sub A-2, everything again failed over to Sub D-1 as expected. Yes, I forgot to power on Sub A-1, so all devices were now trying to register to D-1. I noticed that many of the phones that normally register to A-1 were not registered, and that was when I realized I had forgotten to power A-1 back on. Fast-forward: I upgraded ESX and powered all CUCMs back up. I verified that the phones had registered to the appropriate CUCM server. The gateways were still registered to D-1, but calls should have been working. I did not test calls; I only verified that everything was registered.

 

The next morning, 2 gateways at Site A were still registered to D-1 and not processing calls. Inbound and outbound calls were failing. On the gateways, all PRIs were up and registered. Debug q931 showed the PRI channels restarting. I forced both gateways to fail back to the primary CCM by rebooting D-1, and calls started working. Now the gateways show as registered to the primary, and the secondary is "ready".

 

Why would the gateways not process calls while registered to the secondary? This has worked properly in the past. Unfortunately, that morning we also got hit with the lowwatermark alert and ended up deleting logs before trying to pull traces from RTMT.

 

Why would the gateways not recognize that the primary had been restored and automatically fail back? The switchback mode is set to graceful.

 

I am just looking for some ideas of what could have caused the issue. Right now the only thing I can think of is that, with everything trying to register to D-1, it was too much for it to handle and it got hung up.

6 Replies

When you left the site, the gateways were still registered to D-1 but calls should have been working.

But the next morning, 2 gateways at Site A were still registered to D-2 and not processing calls. What is this D-2?

 

 





D-2 is a typo. 

 

The next morning both gateways at Site A were still registered to D-1...

Can you share your VG MGCP configuration?

 

 





I have attached the config.

When MGCP is manually configured on the gateway, configure ccm-manager switchback immediate.

 

When using ccm-manager config, enable the setting shown below on the CUCM gateway page.

 

[Attached screenshot: Capture.PNG]

 

 

Try the above and see the behavior.
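
For the manually configured case, the suggestion might look roughly like the sketch below. The addresses 192.0.2.11 and 192.0.2.21 are placeholders standing in for the primary and backup subscribers, and the call-agent line is an assumption about a typical setup rather than something taken from the attached config. Only the switchback line is the actual change being proposed.

```
! Global configuration on the MGCP gateway (addresses are placeholders)
mgcp
mgcp call-agent 192.0.2.11 2427 service-type mgcp version 0.1
ccm-manager redundant-host 192.0.2.21
! Proposed change: return to the primary as soon as it is reachable again,
! instead of "graceful", which waits for outstanding calls to complete
ccm-manager switchback immediate
ccm-manager mgcp
!
! Exec-mode checks after the change, to see which server is active
show ccm-manager
show ccm-manager hosts
```

If the gateway pulls its configuration from CUCM with ccm-manager config, a manually entered switchback setting may be overwritten at the next download, so in that case the change belongs on the CUCM gateway page instead.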





My CUCM is set to graceful switchback. After seeing your suggestion I did some reading, and the documentation I found indicated that the switchback setting is more for SRST. We are not using SRST; this failover is just between CUCM subscribers. Does the switchback setting impact CUCM failover as well?

 

It has been set to graceful forever and we have never had this problem.

 

I suspect that some service on Subscriber D-1 got hung and didn't release the gateways. I am hesitant to blame the gateways because it happened to all of them.

 

I actually have a test scheduled for the weekend, so I will definitely give your suggestion another look and try it as well.