08-03-2013 11:10 PM - edited 03-14-2019 12:10 PM
Hi Guys,
My current setup:
UCCE 8.5.4
Side A:
Rogger A, PGA (AgentPG),VRU PGA, AW/HDS A, pub, sub1,sub2,CVPA,CVPB
Side B:
Rogger B, PGB (AgentPG),VRU PGB, AW/HDS B, sub3,sub4,CVPC,CVPD
I am getting the following errors on Rogger B:
1. Connectivity with duplexed partner has been lost due to a failure of the private network, or duplexed partner is out of service.
2. MDS is out of service.
3. MDS has reported failure to the router that it is out of service.
4. Message Delivery Service (MDS) feed from the Router to the Logger has failed.
5. Central Controller service is unavailable.
6. Requesting MDS termination due to error.
7. Application Gateway has been taken out of service. Application Gateway ID - 5000
8. Client rtr stopping due to error.
9. Client hlgr stopping due to error.
10. Client clgr stopping due to error.
11. Synchronizer is unable to establish connection to peer.
12. Process rtr on ICM\suly\RouterB has detected a failure. Node Manager is restarting the process.
13. Process rtr on ICM\suly\RouterB is down after running for 366 seconds. It will restart after delaying 1 second for related operations to complete.
14. ICM\suly\LoggerB node process hlgr exited cleanly and requested that it be restarted by the Node Manager.
But when I checked with the network team, they confirmed that there is no issue on the network side.
I checked the router, MDS, and ccagent logs: none of them show anything other than "stopping due to error". What might be causing all these processes on Rogger B to stop?
Could you please advise me on how to troubleshoot this issue?
Regards,
Shalid K.C
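To see whether the restarts follow a pattern, the "is down after running for N seconds" events can be pulled out of the event text and compared. The following is only a minimal sketch: the regular expression is modelled on the wording of message 13 above and may need adjusting for other ICM releases, and the names `parse_restart`, `short_lived`, and the 600-second threshold are illustrative assumptions, not part of any Cisco tool.

```python
import re
from typing import NamedTuple, Optional

# Pattern modelled on message 13 quoted in this thread; the exact wording
# in other ICM releases is an assumption and may differ.
DOWN_RE = re.compile(
    r"Process (?P<proc>\S+) on (?P<node>\S+) is down after running for "
    r"(?P<uptime>\d+) seconds"
)


class Restart(NamedTuple):
    process: str
    node: str
    uptime_seconds: int


def parse_restart(message: str) -> Optional[Restart]:
    """Return the process, node and uptime from a 'is down' event, else None."""
    match = DOWN_RE.search(message)
    if match is None:
        return None
    return Restart(match.group("proc"), match.group("node"),
                   int(match.group("uptime")))


def short_lived(messages, threshold_seconds=600):
    """Return restarts whose uptime was below the (hypothetical) threshold."""
    hits = []
    for message in messages:
        restart = parse_restart(message)
        if restart is not None and restart.uptime_seconds < threshold_seconds:
            hits.append(restart)
    return hits
```

Feeding it the exported event messages from Rogger B and the PGs shows which processes restart and how long each ran beforehand. Uptimes that cluster around the same value would point toward a periodic cause.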
08-04-2013 06:04 PM
Does the rtr service ever go active? If so, have you reviewed the startup log to ensure you're not seeing these same errors? If you're not seeing any errors most of the time, then it seems that you have a network issue, or perhaps a route change that the network is fixing, but not fast enough for the router's liking. My thought would be to not play the cat and mouse game with the network team. Have them put a network sniffer on your segment and wait for the next time this happens. This will be the fastest and easiest way to solve this.
david
08-04-2013 11:40 PM
Kindly check the speed of the network cards.
Ashfaque
08-05-2013 08:56 PM
Hello Shalid,
As David suggested, a network sniffer would really help in this scenario.
Additionally, there are a few best practices to follow for the UCCE Windows environment. I would suggest you cross-verify those as well.
http://www.cisco.com/en/US/products/sw/custcosw/ps1001/products_tech_note09186a00808160f4.shtml
http://docwiki.cisco.com/wiki/Contact_Center_Networking:_Offload,_Receive_Side_Scaling_and_Chimney
Regards,
Senthil
08-05-2013 10:00 PM
Thank you guys..
Let me check as you suggested and keep you posted....
Regards,
Shalid K.C
08-06-2013 03:38 PM
Hi Senthil and others,
One more thing I noticed:
The Rogger, Agent PG, and VRU PG of side B are installed on a single VMware machine.
Today I noticed that all three show similar behavior, intermittently restarting their essential processes.
So I suspect there might be an issue with the system itself,
but due to my limited knowledge of VMware I don't know what to check here.
Could you please suggest what checks need to be done to isolate the issue?
Regards,
Shalid K.C
08-06-2013 11:49 PM
I think you have the deployment on UCS servers, with 3 VMs, one for each component (Rogger/Agent PG/VRU PG).
Do you use B-series or C-series servers? I would suggest you check the network recommendations for UCCE on UCS servers.
Do you have Windows 2008 or 2003? Can you check the NIC recommendations given at the link below?
http://docwiki.cisco.com/wiki/Contact_Center_Networking:_Offload,_Receive_Side_Scaling_and_Chimney
Regards,
Senthil
Rate if it helps
08-07-2013 12:09 AM
Hi Guys,
I have checked the offload settings; please find them below:
C:\Users\icmadmin>netsh int ip sh offload
Interface 1: Loopback Pseudo-Interface 1
udp transmit checksum supported.
tcp transmit checksum supported.
udp receive checksum supported.
tcp receive checksum supported.
Interface 11: Public
ipv4 transmit checksum supported.
udp transmit checksum supported.
tcp transmit checksum supported.
tcp large send offload supported.
ipv4 receive checksum supported.
udp receive checksum supported.
tcp receive checksum supported.
Interface 13: Private
ipv4 transmit checksum supported.
udp transmit checksum supported.
tcp transmit checksum supported.
tcp large send offload supported.
ipv4 receive checksum supported.
udp receive checksum supported.
tcp receive checksum supported.
C:\Users\icmadmin>netsh int sh offload
C:\Users\icmadmin>netsh int tcp show global
Querying active state...
TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State : enabled
Chimney Offload State : automatic
NetDMA State : enabled
Direct Cache Acess (DCA) : disabled
Receive Window Auto-Tuning Level : normal
Add-On Congestion Control Provider : ctcp
ECN Capability : disabled
RFC 1323 Timestamps : disabled
I also checked the offload settings for both the private and public NICs, and offload is enabled for both RX and TX.
So I am going to disable these offloads as per the document.
Please let me know if you have any suggestions.
Regards,
Shalid K.C
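When the same check has to be repeated on several servers, the `netsh int tcp show global` output above can be parsed instead of read by eye. This is a rough sketch, not an official check: the list of settings in `REVIEW` is an assumption based on the title of the docwiki article (offload, Receive Side Scaling, Chimney), and the function names are illustrative. Confirm the recommended values against that article before changing anything.

```python
def parse_tcp_global(text: str) -> dict:
    """Parse 'Name : value' lines from `netsh int tcp show global` output."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" : ")
        # Skip banner and separator lines, which have no ' : ' delimiter.
        if sep and name.strip() and value.strip():
            settings[name.strip()] = value.strip()
    return settings


# ASSUMPTION: settings worth reviewing; verify against the docwiki guidance.
REVIEW = (
    "Receive-Side Scaling State",
    "Chimney Offload State",
    "NetDMA State",
)


def not_disabled(settings: dict, names=REVIEW) -> dict:
    """Return the reviewed settings whose value is anything but 'disabled'."""
    return {
        name: settings[name]
        for name in names
        if name in settings and settings[name] != "disabled"
    }
```

Saving the command output to a text file on each server and passing its contents to `parse_tcp_global` gives a dictionary that can be compared between side A and side B.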
08-07-2013 09:04 AM
Hi Shalid,
This is happening only because of private link flaps; your network/WAN team should look into this issue.
Set up Wireshark on the private IPs and collect MDS logs, or enable Performance Monitor on the Rogger and PGs.
Ensure that when you set all this up your network is not carrying too much traffic, or it could heavily impact the production environment.
thanks
08-14-2013 12:21 AM
Hi Bala,
I haven't performed the steps mentioned above yet. I will work with the network team to get this done.
@Senthil:
I have done the complete configuration as mentioned in the doc, but no luck. The critical processes on PGB and Rogger B are still getting restarted.
Regards,
Shalid K.C
08-31-2013 06:02 AM
Hi Shalid,
The error message itself shows that the issue is with your private network:
Connectivity with duplexed partner has been lost due to a failure of the private network, or duplexed partner is out of service.
You should check your private network connectivity.
To cross-verify, do as below.
Ping from the side A server to the side B server on both IP addresses (private and visible),
and check on which IP address you observe packet drops. Whenever you observe packet drops on the private NIC, check the processes at the same time.
Thanks & Regards,
Hardik B Kansara
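The ping check above is easier to correlate with process restarts if every probe is timestamped. Below is a minimal sketch that assumes the Windows `ping` command (`-n` for count, `-w` for timeout in milliseconds). The function names are illustrative, and the host list passed to `monitor` must be replaced with the real private and visible addresses of the peer.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host: str, timeout_ms: int = 1000) -> bool:
    """Send one ICMP echo with the Windows ping CLI; True if a reply arrived."""
    result = subprocess.run(
        ["ping", "-n", "1", "-w", str(timeout_ms), host],
        capture_output=True, text=True,
    )
    # Windows ping can exit 0 on "Destination host unreachable", so look
    # for the TTL field that only a real echo reply carries.
    return "TTL=" in result.stdout


def loss_windows(samples):
    """Collapse (timestamp, ok) samples into (first_lost, last_lost, count) runs."""
    windows = []
    start = last = None
    count = 0
    for timestamp, ok in samples:
        if not ok:
            if start is None:
                start = timestamp
            last = timestamp
            count += 1
        elif start is not None:
            windows.append((start, last, count))
            start, last, count = None, None, 0
    if start is not None:
        windows.append((start, last, count))
    return windows


def monitor(hosts, duration_s=60, interval_s=1.0, probe=ping_once):
    """Probe each host once per interval; return {host: [(timestamp, ok), ...]}."""
    samples = {host: [] for host in hosts}
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        for host in hosts:
            samples[host].append((datetime.now(), probe(host)))
        time.sleep(interval_s)
    return samples
```

Running `monitor` from side A against both side B addresses and then calling `loss_windows` on each host's samples gives the start and end time of every loss burst, which can be lined up against the restart times in the event log.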
01-02-2014 05:13 AM
Thank you all for the reply.
I have done all of the above and cross-verified; everything seems to be fine.
Here is what I found, which I believe might be causing the issue:
We noticed that all the switches connected to our UCCE servers at site B are Fast Ethernet, with a maximum capacity of 100 Mbps, but Cisco recommends 1000 Mbps.
This might be the major reason for the outage, which happens only at site B. At site A we have 1000 Mbps switches.
Please refer to the Cisco doc below regarding the network requirements for UCS C-series servers.
Network Requirements for UCS C Series Servers
The below design is the default and recommended for all Unified CCE deployments on UCS C series servers. Exampled in Figures 10 and 11 are two possible network side implementations of the same vSphere Hypervisor vSwitch design. This design calls for using the VMware NIC Teaming (without load balancing) of vmnic interfaces in an Active/Standby configuration through alternate and redundant hardware paths to the network, thereby preventing any single point of failure from affecting the Visible or Private network communications.
The network side implementation does not have to exactly match either of those illustrated below, but it must allow for redundancy and not allow for single points of failure affecting both Visible and Private network communications. There are more possible ways that this could be implemented in a supportable design than can be covered here.
Requirements:
Reference: http://docwiki.cisco.com/wiki/UCS_Network_Configuration_for_Unified_CCE
So we believe we have to replace the switches for the contact center to function smoothly. We are waiting for the switch replacement before going further.
Regards,
Shalid K.C