
CSCvz27796 - MRA Registration Flaps

Adam Pawlowski
VIP Alumni
I spent a few hours on this one myself the other day in our lab and wanted to make this post in case it helps someone else in this situation.
 
After upgrading our UCM cluster to 14.0SU1, Jabber would register but then stop working, and IP phones over MRA behaved the same way. Occasionally a call could be placed, but it would last about 30 seconds before being disconnected, after which the endpoint stopped working.
 
The UCM logs show the endpoint unregistering with a reason code of 6:
 
ccm02test:Jan 27 11:24:00 ccm02test Jan 27 2022 16:24:00.191 UTC :  %UC_CALLMANAGER-3-EndPointUnregistered: %[DeviceName=SEP2834A2FFFFA1][IPAddress=1.1.2.2][Protocol=SIP][DeviceType=685][Description=Adam Pawlowski 8861][Reason=6][IPAddrAttributes=0][LastSignalReceived=SIPConnControlInd][MRAStatus=0][AppID=Cisco CallManager][ClusterID=CCM01TEST-Cluster][NodeID=ccm02test]: An endpoint has unregistered
ccm02test.log:Jan 27 11:24:00 ccm02test : : 762: ccm02test Jan 27 2022 16:24:00.189 UTC :   %UC_CALLMANAGER-3-EndPointUnregistered: %[DeviceName=SEP2834A28FFFA1][IPAddress=1.1.2.2][Protocol=SIP][DeviceType=685][Description=Adam Pawlowski 8861][Reason=6][IPAddrAttributes=0][LastSignalReceived=SIPConnControlInd][MRAStatus=0][AppID=Cisco CallManager][ClusterID=CCM01TEST-Cluster][NodeID=ccm02test]: An endpoint has unregistered
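
If you have a lot of these alarms to sift through, a small script can pull the bracketed fields out of each line. This is only a rough sketch based on the alarm format shown above; the function name `parse_alarm` and the shortened sample line are my own for illustration, not anything provided by UCM.

```python
import re

# Matches each [Key=Value] pair in a UCM alarm line.
FIELD_RE = re.compile(r"\[([A-Za-z]+)=([^\]]*)\]")


def parse_alarm(line):
    """Return the bracketed fields of an EndPointUnregistered alarm as a dict.

    Returns None for lines that are not EndPointUnregistered alarms.
    (Hypothetical helper, written against the log format pasted above.)
    """
    if "EndPointUnregistered" not in line:
        return None
    return dict(FIELD_RE.findall(line))


# Shortened copy of the alarm above, for illustration only.
sample = (
    "%UC_CALLMANAGER-3-EndPointUnregistered: %[DeviceName=SEP2834A2FFFFA1]"
    "[IPAddress=1.1.2.2][Protocol=SIP][Description=Adam Pawlowski 8861]"
    "[Reason=6][MRAStatus=0]: An endpoint has unregistered"
)

fields = parse_alarm(sample)
print(fields["DeviceName"], "unregistered with reason", fields["Reason"])
```

Filtering on `Reason` equal to `"6"` should make it easier to see whether every MRA device is affected or just a few.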
The UCM shows a socket error about the connection being closed before it tears down the device registration:
UCM_Trace.png
 
If you enable developer logging in the Expressway, you can see the socket being closed:
tvcs: UTCTime="2022-01-27 16:11:55,757" Module="developer.sip.transport" Level="DEBUG" CodeLocation="ppcmains/sip/siptrnsp/SipSockMap.cpp(384)" Method="SipSockMap::closeSocketAndFreeEntry" Thread="0x7f5e10fbb640": LocalId="459358807" LocalAddr="['IPv4''TCP''1.1.2.2:26979']" RemoteAddr="['IPv4''TCP''1.1.2.60:5060']" Type="SIP_SOCKTYPE_TCP_OUTG" Detail="Closing Socket" Reason="Manually forced disconnect"
In a Wireshark trace, this is a FIN coming from the Expressway towards the UCM.
 
Just before this, the Expressway logging shows that the close is due to a STUN reply timeout:
 
2022-01-27T10:07:38.996-05:00 tvcs: UTCTime="2022-01-27 15:07:38,995" Module="developer.sip.transport" Level="INFO" CodeLocation="ppcmains/sip/siptrnsp/siptrnspsfsm.cpp(4334)" Method="::SIPTRNSP_doStunTimeoutSocketClose" Thread="0x7f5e10fbb640": freeing cucm socket connection
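
To spot these two events quickly in a large Expressway diagnostic log, something like the following could work. It is a minimal sketch that assumes the `key="value"` layout of the developer log lines quoted above; the `classify` function and its labels are hypothetical names of my own, and the sample lines are shortened copies of the ones in this post.

```python
import re

# Matches key="value" pairs as they appear in Expressway developer logging.
KV_RE = re.compile(r'(\w+)="([^"]*)"')


def classify(line):
    """Label a log line as a STUN timeout close, a socket close, or None.

    (Hypothetical helper; labels are illustrative only.)
    """
    fields = dict(KV_RE.findall(line))
    method = fields.get("Method", "")
    if "doStunTimeoutSocketClose" in method:
        return "stun_timeout"
    if method.endswith("closeSocketAndFreeEntry"):
        return "socket_closed"
    return None


# Shortened copies of the log lines above, for illustration only.
stun_line = (
    'tvcs: UTCTime="2022-01-27 15:07:38,995" Module="developer.sip.transport" '
    'Level="INFO" Method="::SIPTRNSP_doStunTimeoutSocketClose": '
    "freeing cucm socket connection"
)
close_line = (
    'tvcs: UTCTime="2022-01-27 16:11:55,757" Module="developer.sip.transport" '
    'Level="DEBUG" Method="SipSockMap::closeSocketAndFreeEntry" '
    'Detail="Closing Socket" Reason="Manually forced disconnect"'
)

for line in (stun_line, close_line):
    print(classify(line))
```

A STUN timeout entry followed shortly by a socket close towards the UCM on port 5060 would match the pattern described here.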
 
 

Disabling STUN keepalive on the Expressway-C does not fix this. The workaround in the bug does: in your UCM, under Devices -> Expressway-C, change the name of the Expressway-C entry from the hostname to the FQDN. This assumes you let the entry auto-populate.
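
If you have exported the Expressway-C entry names and want to check which ones are still bare hostnames, a rough check like this can flag them. It is only a sketch: `looks_like_fqdn` is a hypothetical helper, the names below are made-up examples, and it simply treats any non-IP name with at least two dot-separated labels as an FQDN.

```python
import ipaddress


def looks_like_fqdn(name):
    """Return True if name has at least two dot-separated labels and is not an IP.

    (Hypothetical helper; a rough heuristic, not a full DNS name validator.)
    """
    try:
        ipaddress.ip_address(name)
        return False  # an IP address is not an FQDN
    except ValueError:
        pass
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


# Made-up example names, for illustration only.
for entry in ("expc01", "expc01.example.com", "1.1.2.60"):
    status = "FQDN" if looks_like_fqdn(entry) else "needs changing to FQDN"
    print(entry, "->", status)
```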

 

The phones and Jabber keep their socket open to the Expressway-E, so they don't show that they've lost registration. Placing a call from them may work briefly if the device is currently registered; otherwise you get a fast busy.

 

As others are moving to CSR 14, I figured I'd post this to save some trouble and hair pulling if this comes up in a search.

Sorry for posting this in edit blocks but the community kept giving me blob conversion errors (?) and eating my post.

20 Replies

Can you please elaborate on this step: "2- set DNS + domain name to CUCM + CUP (ssh to them >> set network dns.. + set network domain..) (I forgot to do this..)"? Where should this be done, on CUCM or on Expressway?

 

Dear @Techmtimvideo,
@MxHadi's post means you should do these steps:

1. SSH to CUCM and enter these commands:
set network dns primary <dns-server-ip>
set network domain <domain-name>
>> then reboot

2. SSH to CUP and enter these commands:
set network dns primary <dns-server-ip>
set network domain <domain-name>
>> then reboot

He forgot to set these options on his two servers (CUCM + CUP).

>> My Telegram ID: @morez_hadi. If this helped, please rate by clicking (Accept as solution) or (Helpful).

My CUCM was on v14SU3 and my Expressway on v14.0.1. All servers were deployed and communicating by FQDN with CA-signed certificates. To fix a smart licensing issue, I had to upgrade Expressway to 14.0.11. After the upgrade, MRA calls always dropped after a few seconds. I checked and found that the CUCM > Device > Expressway-C configuration was empty. I re-configured the FQDN of ExpC and the X509, and the issue was resolved.

JonHile
Level 1

Does anyone have any further insight on this issue? I have tried all of the above and am still experiencing MRA re-registration every 40-60 seconds. It only occurs for devices connected off-net; anything connected on-net seems to be stable.

Turning off "STUN keepalive" helped me (CUCM 15SU1a, Expressway 15.0.3).
It is located in Configuration/Unified Communications/Configuration at the very bottom.

We've been on the same combination as you but did not experience the issue you had. For us it is enough to have all the Cs added with the FQDN by copying the item created by the Cs via AXL, so we end up with two records in CM per C node.

Turning off STUN keepalives disables CM high availability, so clients lose their connection if there is any issue with the CM they registered with at connect time.

image.png

With STUN keepalives, the clients should be able to make a new connection to another CPE node listed in the CMG of the device pool the device belongs to.


