goranpilat
Participant

MRA not working properly while CUCM cluster is in partial state

Hi,

 

As the title says, a smaller part of the CUCM cluster is down (2 of 5 servers), and MRA is working significantly slower and less consistently.

 

Symptoms:

 

-Users trying to log in get a message that the network is down; after a couple more attempts they manage to log in.

-I rebooted the EXPC and reconfigured the CUCM servers, but although two CUCMs are down, I still see active TCP connections on the EXPC to all of the CUCMs, and all UC zones are active with SIP reachable status.

-I have two cisco-uds records: the first points to a server that is UP and the second to a server that is DOWN.
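
For anyone wanting to check which cisco-uds targets actually accept connections, here is a minimal sketch, not an official Cisco tool. It assumes the records point at the usual UDS port 8443 (verify against your own SRV records), and the hostnames in the usage comment are hypothetical. A successful TCP connect only shows that the port is open, not that the UDS service behind it is healthy.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def probe_nodes(hosts, port=8443, timeout=3.0):
    """Map each host to its TCP reachability on the given port.

    port=8443 is an assumption; use the port from your own SRV records.
    """
    return {host: tcp_reachable(host, port, timeout) for host in hosts}


# Example call (hypothetical hostnames, replace with your SRV targets):
# probe_nodes(["cucm-pub.example.com", "cucm-sub1.example.com"])
```

Running this from a host in the same network segment as the EXPC gives the most comparable view, since firewalls may treat other sources differently.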

 

My questions:

-Why does EXPC see dead CUCMs as active?

-Is it trying the user authentication on those servers and failing, thus prolonging the authentication period?

-Why is the user not authenticated on a live server (at least not every time; authentication only goes through every third attempt or so)?

-Could it be a client problem, with the client not waiting long enough for authentication to finish?

 

Thanks a lot for any suggestions.

 

Goran

4 REPLIES

Hi,

Could you please remove the offline call manager from the CM group assigned to the user and give it a try?

I am not sure it is related, but there is an open bug related to call manager node failure that leads to a Jabber login issue: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCuq79072

 

Regards,

Shalid

Jaime Valencia
Hall of Fame Cisco Employee

Have you refreshed the servers in EXP-C?

HTH

java

if this helps, please rate

Yup, I refreshed the CUCMs, removed and re-added them, and rebooted the EXPC.

Adam Pawlowski
VIP Advocate

I don't believe this solution is supported for operation when servers are down and out of service.

 

I just went through this myself as we tested such a thing. In the past, without SSO, you could get away with reorganizing the CM group to set the primary to an available UCM; then you click sign in a bunch of times and eventually you get in.

 

What's happening there is that the Expressway still believes that those other nodes are available for UDS, and tries to query them. Those queries time out and sign-in fails, but you can press sign in again and maybe hit another node.
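
To put rough numbers on the "press sign in again" effect, here is a purely illustrative model. It assumes each UDS request lands on a node chosen uniformly at random and that a sign-in only succeeds if every request hits a live node. Neither assumption is documented Expressway behaviour, and `requests_per_sign_in` is a hypothetical parameter.

```python
def sign_in_success_probability(total_nodes, down_nodes, requests_per_sign_in=1):
    """Chance that one sign-in attempt only ever hits live nodes.

    Simplified model: nodes are picked uniformly at random per request,
    which is an assumption and not confirmed Expressway behaviour.
    """
    if total_nodes <= 0 or not 0 <= down_nodes <= total_nodes:
        raise ValueError("need total_nodes > 0 and 0 <= down_nodes <= total_nodes")
    if requests_per_sign_in < 1:
        raise ValueError("requests_per_sign_in must be at least 1")
    live_fraction = (total_nodes - down_nodes) / total_nodes
    return live_fraction ** requests_per_sign_in


# The cluster from this thread: 2 of 5 nodes down, two requests per sign-in.
print(sign_in_success_probability(5, 2, requests_per_sign_in=2))
```

Under this model the success rate drops quickly as a sign-in needs more requests, which would be consistent with needing several attempts before getting in.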


With SSO this is even worse, as the same mechanism is used to validate tokens. Validation faults because the host isn't up, and that leads to token revocation/expiry.

 

I'd be more than happy to hear that there's a way to support this, which maybe there will be in the future, but as far as I can tell when you have a UCM offline this is what happens with Jabber and MRA at least.
