07-24-2014 06:07 AM - edited 03-19-2019 08:25 AM
Hello,
We have IM&P pub and sub on 9.1.1-41900 and we have just moved the machines from physical hardware to a virtual environment.
Everything went fine apart from one issue: when users are assigned to the primary node, the "on a call" presence status does not work. When we move a user to the secondary server it works fine.
After a thorough analysis I can see that the SIP PUBLISH leaves CUCM for the primary node (SIP proxy), and the SIP proxy then tries to forward it to the Presence Engine on port 5070 on the same server (primary presence), but it cannot establish the TCP connection. It turns out we cannot telnet to port 5070 on that server, although we can on the secondary (which is why "on a call" presence works when a user is assigned to the secondary).
The server does not listen on that port even after a full server restart, or a restart of the service itself. The service appears as running in both the GUI and the CLI, and it does listen on port 6603 and the other ports in the 66XX range, which means it does have a connection to the presence datastore.
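For anyone wanting to repeat the telnet check without a telnet client, the same test is just a plain TCP connect. A minimal sketch (the IP and port are the ones from the SIP proxy logs below; everything else is generic Python):

```python
import socket

def is_port_open(host, port, timeout=3.0):
    """Attempt a plain TCP connect, like the telnet test.

    Returns True if the connection is accepted, False on refusal/timeout.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical usage against the primary node seen in the logs:
#   is_port_open("10.204.65.9", 5070)  # False while the Presence Engine is broken
```

While the Presence Engine is in this state the call returns False for port 5070 on the primary but True on the secondary, matching the telnet behaviour described above.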
The SIP proxy logs show this:
17:55:46.207 |[Tue Jul 22 17:55:46 2014] PID(3306) sip_tcp.c(1053) Creating connection with 10.204.65.9:5070, connid 4795 sock_fd 31
17:55:46.207 |[Tue Jul 22 17:55:46 2014] PID(3306) sip_tcp.c(2562) setting timer for 10000 ms on connection connid: 4795, sock_fd 31
17:55:46.207 |[Tue Jul 22 17:55:46 2014] PID(3306) sip_tcp.c(3944) sip_tcp received auth state as: 0 for connid: 4795 sockfd 31 flags 0 from sip_sm
17:55:46.207 |[Tue Jul 22 17:55:46 2014] PID(3306) sip_tcp: epoll event error on connected socket with connid 4795, sock_fd 31 remote_addr 10.204.65.9:5070, State Connect pending flags 0, 2 No such file or directory
17:55:46.207 |[Tue Jul 22 17:55:46 2014] PID(3306) sip_tcp.c(1084) sip_tcp : Hard close/destroy of tcp connid 4795 sock_fd 31 flags 0
17:55:46.207 |[Tue Jul 22 17:55:46 2014] PID(3306) sip_tcp.c(964) Freeing connection with connid 4795, sock_fd 31 remote_addr 10.204.65.9:5070, State Connect pending flags 0
17:55:46.208 |[Tue Jul 22 17:55:46 2014] PID(3306) sip_tcp.c(928) sip_tcp : close() sock_fd 31:34
17:55:46.208 |[Tue Jul 22 17:55:46 2014] PID(3306) sip_tcp.c(778) sip_tcp is now sending failure pdu connid 4795, sock_fd 0 1 msgs
As I said, restarting the server or the service does not help. For now we have all the users on the secondary node, but we need to bring the primary back up.
The presence engine logs do not show any noticeable error.
Can somebody from Cisco tell me if this is a known bug and whether there is a recovery procedure?
Edit: I also see this in the logs, which might be relevant:
"PE is currently disabled, will not process CN's until it is re-enabled"
Regards
08-05-2014 01:07 AM
For the record, the issue was due to this error when the Presence Engine starts up:
--------------------------------------------------------------------------------------------
11:38:43.223 |system.tls.config 1075611 WARNING Error loading private key from file: error code: 185073780 in x509_cmp.c line 406.
11:38:43.223 |debug.oam.fault.faultservice 1075611 INFO faultnotification: constructing notification : PETlsConfigError
11:38:43.223 |system.oam.faults 1075611 DEBUG CCMFaultModule::notify: Alarm name = PETlsConfigError
11:38:43.223 |system.oam.faults 1075611 DEBUG CCMFaultModule::notify: param 1 = TlsErrorMessage : Error loading private key from file: error code: 185073780 in x509_cmp.c line 406.
11:38:43.223 |GenAlarm: AlarmName = UNKNOWN_ALARM:PETlsConfigError, subFac = KeyParam = , severity = 3, AlarmMsg = TlsErrorMessage : Error loading private key from file: error code: 185073780 in x509_cmp.c line 406.
AppID : Cisco Presence Engine
ClusterID : StandAloneCluster36f38
NodeID : dc10pvups01
11:38:43.223 |createFile: fileName = GEN_ALARM_MAPFILE.8001000 _ElemTableSize =500 totalsize = 698016
11:38:43.224 |GenAlarm: Push_back offset 1 seq 1
11:38:43.224 |UNKNOWN_ALARM:PETlsConfigError - TlsErrorMessage:Error loading private key from file: error code: 185073780 in x509_cmp.c line 406.
App ID:Cisco Presence Engine Cluster ID:StandAloneCluster36f38 Node ID:dc10pvups01
11:38:43.224 |system.oam.faults 1075611 DEBUG CCMFaultModule::notify: retval= 0
11:38:43.224 |debug.oam.fault.faultservice 1075611 INFO faultnotification: destructing notification : PETlsConfigError
----------------------------------------------------------------------------
I regenerated the cup and ipsec certificates, restarted the presence server, and the issue is now fixed.
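If someone wants to confirm they are hitting the same failure before regenerating certificates, the tell-tale lines can be filtered out of a collected Presence Engine trace. A quick sketch (the alarm and error strings are taken verbatim from the excerpt above; the file name is hypothetical, use whatever path RTMT saved the trace to):

```python
import re

# Strings taken from the PE log excerpt in this thread
PATTERNS = [
    re.compile(r"PETlsConfigError"),
    re.compile(r"Error loading private key from file"),
]

def find_tls_errors(lines):
    """Return the log lines that indicate the PE TLS/private-key failure."""
    return [line for line in lines if any(p.search(line) for p in PATTERNS)]

# Hypothetical usage on a collected trace file:
#   with open("pe_trace.log") as f:
#       for hit in find_tls_errors(f):
#           print(hit.rstrip())
```

If this returns any hits, the Presence Engine could not load its private key at startup, which matches the symptom of the service showing as running while port 5070 never opens.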
01-14-2015 01:14 PM
This resolved my issue. Thank you