CSCvz27796 - MRA Registration Flaps

Adam Pawlowski
VIP Alumni
I spent a few hours on this one myself the other day in our lab and wanted to make this post in case it helps someone else in this situation.
 
After upgrading our UCM cluster to 14.0SU1, Jabber would register but then stop working, and the same happened with IP phones over MRA. Occasionally you could place a call that lasted ~30 seconds before it was disconnected and the endpoint stopped working again.
 
Logging from the UCM shows the endpoint unregistering, with a reason code of 6:
 
ccm02test:Jan 27 11:24:00 ccm02test Jan 27 2022 16:24:00.191 UTC :  %UC_CALLMANAGER-3-EndPointUnregistered: %[DeviceName=SEP2834A2FFFFA1][IPAddress=1.1.2.2][Protocol=SIP][DeviceType=685][Description=Adam Pawlowski 8861][Reason=6][IPAddrAttributes=0][LastSignalReceived=SIPConnControlInd][MRAStatus=0][AppID=Cisco CallManager][ClusterID=CCM01TEST-Cluster][NodeID=ccm02test]: An endpoint has unregistered
ccm02test.log:Jan 27 11:24:00 ccm02test : : 762: ccm02test Jan 27 2022 16:24:00.189 UTC :   %UC_CALLMANAGER-3-EndPointUnregistered: %[DeviceName=SEP2834A28FFFA1][IPAddress=1.1.2.2][Protocol=SIP][DeviceType=685][Description=Adam Pawlowski 8861][Reason=6][IPAddrAttributes=0][LastSignalReceived=SIPConnControlInd][MRAStatus=0][AppID=Cisco CallManager][ClusterID=CCM01TEST-Cluster][NodeID=ccm02test]: An endpoint has unregistered
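
If you need to pick these alarms out of a large syslog, here is a minimal Python sketch that pulls the bracketed Key=Value fields out of an alarm line and flags Reason=6 unregistrations. The function names are my own for illustration, and the field layout is assumed only from the two lines above, not from any Cisco-documented format.

```python
import re

# Matches one bracketed [Key=Value] field, as seen in the alarm lines above.
FIELD_RE = re.compile(r"\[(\w+)=([^\]]*)\]")


def parse_alarm_fields(line: str) -> dict:
    """Return the bracketed Key=Value fields of one alarm line as a dict."""
    return dict(FIELD_RE.findall(line))


def is_reason_6_unregister(line: str) -> bool:
    """True if the line is an EndPointUnregistered alarm with Reason=6."""
    if "EndPointUnregistered" not in line:
        return False
    return parse_alarm_fields(line).get("Reason") == "6"


# Alarm text copied from the first log line in this post (syslog prefix omitted).
sample = (
    "%UC_CALLMANAGER-3-EndPointUnregistered: %[DeviceName=SEP2834A2FFFFA1]"
    "[IPAddress=1.1.2.2][Protocol=SIP][DeviceType=685]"
    "[Description=Adam Pawlowski 8861][Reason=6][IPAddrAttributes=0]"
    "[LastSignalReceived=SIPConnControlInd][MRAStatus=0]"
    "[AppID=Cisco CallManager][ClusterID=CCM01TEST-Cluster][NodeID=ccm02test]"
    ": An endpoint has unregistered"
)

fields = parse_alarm_fields(sample)
print(fields["DeviceName"], fields["Reason"], is_reason_6_unregister(sample))
```

Running `is_reason_6_unregister` over each line of an exported log would give a quick list of affected devices.
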
The UCM shows a socket error, regarding the connection being closed, before it tears down the device registration:
[Screenshot: UCM_Trace.png]
 
If you enable developer logging in the Expressway, you can see the socket being closed:
tvcs: UTCTime="2022-01-27 16:11:55,757" Module="developer.sip.transport" Level="DEBUG" CodeLocation="ppcmains/sip/siptrnsp/SipSockMap.cpp(384)" Method="SipSockMap::closeSocketAndFreeEntry" Thread="0x7f5e10fbb640": LocalId="459358807" LocalAddr="['IPv4''TCP''1.1.2.2:26979']" RemoteAddr="['IPv4''TCP''1.1.2.60:5060']" Type="SIP_SOCKTYPE_TCP_OUTG" Detail="Closing Socket" Reason="Manually forced disconnect"
In a Wireshark trace, this shows up as a FIN sent from the Expressway to the UCM.
 
Just before the close, the Expressway logging shows that it is caused by a STUN reply timeout:
 
2022-01-27T10:07:38.996-05:00 tvcs: UTCTime="2022-01-27 15:07:38,995" Module="developer.sip.transport" Level="INFO" CodeLocation="ppcmains/sip/siptrnsp/siptrnspsfsm.cpp(4334)" Method="::SIPTRNSP_doStunTimeoutSocketClose" Thread="0x7f5e10fbb640": freeing cucm socket connection
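
To check whether your own Expressway logs show the same pattern, a rough Python sketch like the one below can classify lines by their Method attribute. It assumes the key="value" layout seen in the two developer log lines quoted in this post; the helper names and the labels it returns are made up for illustration, and the second sample line is abridged.

```python
import re

# Matches key="value" attributes as they appear in the Expressway lines above.
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

# Method names taken from the two developer log lines quoted in this post.
STUN_CLOSE_METHOD = "::SIPTRNSP_doStunTimeoutSocketClose"
SOCKET_CLOSE_METHOD = "SipSockMap::closeSocketAndFreeEntry"


def parse_attrs(line: str) -> dict:
    """Return the key="value" attributes of one log line as a dict."""
    return dict(ATTR_RE.findall(line))


def classify(line: str):
    """Label a line as a STUN timeout close, a socket close, or None."""
    method = parse_attrs(line).get("Method")
    if method == STUN_CLOSE_METHOD:
        return "stun_timeout_close"
    if method == SOCKET_CLOSE_METHOD:
        return "socket_closed"
    return None


stun_line = (
    'tvcs: UTCTime="2022-01-27 15:07:38,995" Module="developer.sip.transport" '
    'Level="INFO" CodeLocation="ppcmains/sip/siptrnsp/siptrnspsfsm.cpp(4334)" '
    'Method="::SIPTRNSP_doStunTimeoutSocketClose" Thread="0x7f5e10fbb640": '
    "freeing cucm socket connection"
)
# Abridged copy of the socket close line (address fields left out).
close_line = (
    'tvcs: UTCTime="2022-01-27 16:11:55,757" Module="developer.sip.transport" '
    'Level="DEBUG" Method="SipSockMap::closeSocketAndFreeEntry" '
    'Detail="Closing Socket" Reason="Manually forced disconnect"'
)

print(classify(stun_line), classify(close_line))
```
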
 
 

Disabling STUN keepalive on the Expressway-C does not fix this, but the workaround in the bug does: in your UCM, under Devices -> Expressway-C, change the name of the Expressway-C entry from the hostname to the FQDN. This assumes you let the entry auto-populate.
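
If you have several Expressway-C entries to review, a quick check like this can flag names that look like a bare hostname rather than an FQDN. This is only a sketch: `looks_like_fqdn` is a hypothetical helper, the example names are invented, and "contains at least two dot-separated labels and is not an IP address" is my own rough assumption of what counts as an FQDN here.

```python
import ipaddress


def looks_like_fqdn(name: str) -> bool:
    """Rough check: at least two non-empty dot-separated labels, not an IP."""
    name = name.strip().rstrip(".")
    try:
        ipaddress.ip_address(name)
        return False  # an IP address is not an FQDN
    except ValueError:
        pass
    labels = name.split(".")
    return len(labels) >= 2 and all(labels)


# Invented example names, not taken from any real cluster.
for entry in ["expc01", "expc01.example.com", "1.1.2.60"]:
    print(entry, looks_like_fqdn(entry))
```
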

 

The phones and Jabber keep their socket open to the Expressway-E, so they don't show that they've lost registration. A call attempt from them may work briefly if the device happens to be registered at that moment; otherwise it gets a fast busy.

 

As others are moving to CSR 14 I figured I'd post this to save some trouble and hair pulling if this comes up in search.

Sorry for posting this in edit blocks but the community kept giving me blob conversion errors (?) and eating my post.
