cancel
Showing results for 
Search instead for 
Did you mean: 
cancel

Who Me Too'd this topic

Jabber issue with IM & P Server in Failover

Ali Amir
Level 1
Level 1

Hello!

 
I have the below environment:

 
CUCM & IM/Presence 10.5.2
Jabber 11.5.2

IM/Presence Cluster with 1 Publisher and 1 Subscriber in the Site same site.
 

In CUCM we have "Presence Redundancy Group" which contains both IM/Presence nodes the "Enable High Availability" checkbox checked. The status of High Availability could be confirmed on Presence Topology Details.

Regarding load balancing, users are divided to groups and assigned to host03 and host04. Every user on every server could login without problem to the related assigned server.

When the failover option is active on a server, the server gets idle state and the other server go to Running in Backup Mode. The status of servers on topology details could confirm this.

But users on idle server could not get the service from the other server and stay in offline mode. It seems as Jabber sends authentication request to the second server "host04.domain.test.biz", but this request could process and saved on the second server for failover time.


In Jabber Logs I could find below logs:

################
2016-09-18 21:15:30,855 INFO  [0x00001234] [erx\jwcpp\LoginMgr\LoginContext.cpp(654)] [csf.jwcpp] [CLoginContext::SaveHASoapServers] - @LoginMgr: #0, CLoginContext::SaveHASoapServers ha, login, cache HA Soap servers:host03.domain.test.biz;host04.domain.test.biz for key:host03.domain.test.biz
2016-09-18 21:15:30,855 INFO  [0x00001234] [erx\jwcpp\LoginMgr\LoginContext.cpp(894)] [csf.jwcpp] [CLoginContext::InitCupHAHosts] - @LoginMgr: #0, CLoginContext::InitCupHAHosts ha, soap-servers:host03.domain.test.biz
2016-09-18 21:15:30,855 INFO  [0x00001234] [erx\jwcpp\LoginMgr\LoginContext.cpp(894)] [csf.jwcpp] [CLoginContext::InitCupHAHosts] - @LoginMgr: #0, CLoginContext::InitCupHAHosts ha, soap-servers:host04.domain.test.biz
2016-09-18 21:15:30,855 INFO  [0x00001234] [rx\jwcpp\LoginMgr\LoginCUPState.cpp(124)] [csf.jwcpp] [CLoginCup::connectOnRetrievedServerInfo] - @LoginMgr: #0, CLoginCup::connectOnRetrievedServerInfo login, cup:host03.domain.test.biz
...
2016-09-18 21:33:24,110 ERROR [0x00001514] [upSoapClient\CupSoapClientImpl.cpp(1413)] [csf.jwcpp] [CupSoapClientImpl::GetAllConfig] - @CupSoapCli: soap, get-all-config:12
2016-09-18 21:33:24,110 ERROR [0x00001514] [upSoapClient\CupSoapClientImpl.cpp(1421)] [csf.jwcpp] [CupSoapClientImpl::GetAllConfig] - @CupSoapCli: soap, soap::get_all_config fails:12, reason: SOAP 1.2 fault: SOAP-ENV:Sender[no subcode]
"Invalid session-key element in SOAP request header."
Detail: [no detail]

2016-09-18 21:33:24,110 INFO  [0x00001234] [rx\jwcpp\LoginMgr\LoginCUPState.cpp(508)] [csf.jwcpp] [CLoginCup::OnGetAllConfig] - @LoginMgr: #0, CLoginCup::OnGetAllConfig err-code: 12, err-string: SOAP 1.2 fault: SOAP-ENV:Sender[no subcode]
"Invalid session-key element in SOAP request header."
Detail: [no detail]
...
2016-09-18 21:33:24,110 INFO  [0x00001234] [rx\jwcpp\LoginMgr\LoginCUPState.cpp(217)] [csf.jwcpp] [CLoginCup::HandleCupHA] - @LoginMgr: #0, CLoginCup::HandleCupHA ha, HandleCupHA
2016-09-18 21:33:24,110 WARN  [0x00001234] [erx\jwcpp\LoginMgr\LoginContext.cpp(905)] [csf.jwcpp] [CLoginContext::GetHASoapServer] - @LoginMgr: ha, invalid HA soap server index:2
2016-09-18 21:33:24,110 INFO  [0x00001234] [rx\jwcpp\LoginMgr\LoginCUPState.cpp(235)] [csf.jwcpp] [CLoginCup::HandleDnsSrv] - @LoginMgr: #0, CLoginCup::HandleDnsSrv dns, HandleDnsSrv
2016-09-18 21:33:24,110 INFO  [0x00001234] [erx\jwcpp\LoginMgr\LoginContext.cpp(267)] [csf.jwcpp] [CLoginContext::ChangeState] - @LoginMgr: #0, CLoginContext::ChangeState now:0 auto:1
2016-09-18 21:33:24,110 INFO  [0x00001234] [rx\jwcpp\LoginMgr\LoginStopState.cpp(43)] [csf.jwcpp] [CLoginStop::OnStateChanged] - @LoginMgr: #0, CLoginStop::OnStateChanged CLoginStop::OnStateChanged11
2016-09-18 21:33:24,110 INFO  [0x00001234] [rwerx\jwcpp\xmppsdk\XmppClient.cpp(2028)] [csf.jwcpp] [CXmppClient::FinalCleanData] - @XmppSDK: CXmppClient::FinalCleanData, bReconnect:1
2016-09-18 21:33:24,110 INFO  [0x00001234] [erx\jwcpp\LoginMgr\LoginContext.cpp(627)] [csf.jwcpp] [CLoginContext::NextReconnectAttempt] - @LoginMgr: #0, CLoginContext::NextReconnectAttempt conn, next attempt. count:1, remain:4294967111
2016-09-18 21:33:24,110 INFO  [0x00001234] [erx\jwcpp\LoginMgr\LoginContext.cpp(574)] [csf.jwcpp] [CLoginContext::GetReconnectInterval] - @LoginMgr: #0, CLoginContext::GetReconnectInterval conn, choose a random interval:181 between 20 and 205.
2016-09-18 21:33:24,110 INFO  [0x00001234] [erx\jwcpp\LoginMgr\LoginContext.cpp(296)] [csf.jwcpp] [CLoginContext::StartReconnectTimer] - @LoginMgr: #0, CLoginContext::StartReconnectTimer conn, timer, seconds:181
...
################


In authentication logs from the backup server I could see:


################
21:30:23.173 | error| TokenAuthUtils::executeAuthQuery(): error: 0 Rows Found
21:30:23.173 | error| TokenAuthUtils::authencateUsingOTP(): ExecuteAuthQuery() return authentication failure for userid: [test.user]
21:30:23.183 | error| SXComponent::authenticateVTGToken(): Authentication error: AUTH_SYSTEM_ERROR for userid: [test.user@test.com]
21:30:23.183 | error| SXComponent::_processRequest(): Authentication failed due to internal issue
21:30:23.184 | error| SXComponent::_sendAuthFailure(): Auth component sending Failure
################


In database Logs from backup server I find this:


################
21:33:24.835 HDR|09/18/2016 ttlogin,StandAloneCluster,host04.domain.test.biz,Special,10.5.2.23900-3
21:33:24.835 |SqlParserBase::processSqlReq SELECT
SELECT * FROM clientsessions WHERE sessionid='6cad03a2-4bdf-bb43-ab9a-f74d4861bbbb';
21:33:24.835 HDR|09/18/2016 ttlogin,StandAloneCluster,host04.domain.test.biz,Special,10.5.2.23900-3
21:33:24.835 |{
    sqlstmt(SELECT * FROM clientsessions WHERE sessionid='6cad03a2-4bdf-bb43-ab9a-f74d4861bbbb';)
    m_count(0), errmsg(), rc(success)
    TYPESQLRSP(<msg><type>TTLOGIN</type><table>typesqlrsp</table><action>Q</action><time>0</time><old><1>0</1><5></5><6></6><2>0</2><4>success</4><3></3><0></0></old></msg>
)
}

21:31:40.350 HDR|09/18/2016 ttlogin,StandAloneCluster,host04.domain.test.biz,Special,10.5.2.23900-3
21:31:40.349 |SqlParserBase::processSqlReq SELECT
SELECT * FROM clientsessions WHERE sessionid='1990a498-80a9-3008-8767-a096328af1b7';
21:31:40.350 HDR|09/18/2016 ttlogin,StandAloneCluster,host04.domain.test.biz,Special,10.5.2.23900-3
21:31:40.349 |{
    sqlstmt(SELECT * FROM clientsessions WHERE sessionid='1990a498-80a9-3008-8767-a096328af1b7';)
    m_count(0), errmsg(), rc(success)
    TYPESQLRSP(<msg><type>TTLOGIN</type><table>typesqlrsp</table><action>Q</action><time>0</time><old><1>0</1><5></5><6></6><2>0</2><4>success</4><3></3><0></0></old></msg>
)
}

21:33:39.194 HDR|09/18/2016 ttlogin,StandAloneCluster,host04.domain.test.biz,Special,10.5.2.23900-3
21:33:39.192 |SqlParserBase::processSqlReq SELECT
SELECT * FROM ONETIMEPASSWORD WHERE userid='user1';
21:33:39.194 HDR|09/18/2016 ttlogin,StandAloneCluster,host04.domain.test.biz,Special,10.5.2.23900-3
21:33:39.192 |{
    sqlstmt(SELECT * FROM ONETIMEPASSWORD WHERE userid='user1';)
    m_count(0), errmsg(), rc(success)
    TYPESQLRSP(<msg><type>TTLOGIN</type><table>typesqlrsp</table><action>Q</action><time>0</time><old><1>0</1><5></5><6></6><2>0</2><4>success</4><3></3><0></0></old></msg>
)
}

21:33:39.283 HDR|09/18/2016 ttlogin,StandAloneCluster,host04.domain.test.biz,Special,10.5.2.23900-3
21:33:39.283 |SqlParserBase::processSqlReq SELECT
SELECT * FROM clientsessions WHERE sessionid='baf481d4-60e0-331a-c188-b3acba3e069e';
21:33:39.283 HDR|09/18/2016 ttlogin,StandAloneCluster,host04.domain.test.biz,Special,10.5.2.23900-3
21:33:39.283 |{
    sqlstmt(SELECT * FROM clientsessions WHERE sessionid='baf481d4-60e0-331a-c188-b3acba3e069e';)
    m_count(0), errmsg(), rc(success)
    TYPESQLRSP(<msg><type>TTLOGIN</type><table>typesqlrsp</table><action>Q</action><time>0</time><old><1>0</1><5></5><6></6><2>0</2><4>success</4><3></3><0></0></old></msg>
)
}
################


On the Idle server I could see for authentication these database query:


################
21:20:13.354 HDR|09/18/2016 ttlogin,StandAloneCluster,host03.domain.test.biz,Special,10.5.2.23900-3
21:20:13.337 |SqlParserBase::processSqlReq SELECT
SELECT * FROM onetimepassword WHERE userid='user1';
21:20:13.354 HDR|09/18/2016 ttlogin,StandAloneCluster,host03.domain.test.biz,Special,10.5.2.23900-3
21:20:13.337 |{
    sqlstmt(SELECT * FROM onetimepassword WHERE userid='user1';)
    m_count(0), errmsg(), rc(success)
    TYPESQLRSP(<msg><type>TTLOGIN</type><table>typesqlrsp</table><action>Q</action><time>0</time><old><1>0</1><5></5><6></6><2>0</2><4>success</4><3></3><0></0></old></msg>
)
}

21:20:13.354 HDR|09/18/2016 ttlogin,StandAloneCluster,host03.domain.test.biz,Special,10.5.2.23900-3
21:20:13.337 |SqlParserBase::processSqlReq INSERT
INSERT INTO oneTimePassword (userid,password,tmstamp) VALUES('user1', '54640222', '1476359383733');
21:20:13.354 HDR|09/18/2016 ttlogin,StandAloneCluster,host03.domain.test.biz,Special,10.5.2.23900-3
21:20:13.337 |{
    sqlstmt(INSERT INTO oneTimePassword (userid,password,tmstamp) VALUES('user1', '54640222', '1476359383733');)
    m_count(1), errmsg(), rc(success)
    TYPESQLRSP(<msg><type>TTLOGIN</type><table>typesqlrsp</table><action>Q</action><time>0</time><old><1>0</1><5></5><6></6><2>1</2><4>success</4><3></3><0></0></old></msg>
)
}


Do you know how does Jabber authentication work(in failover solution)? Do I need to consider of user assignment to servers?

Who Me Too'd this topic