04-18-2019 04:10 AM
We migrated from version 10.6 to 11.6.2, and the new version cannot handle the load. We have daily occurrences of having to restart services, and agents and supervisors lose their Live Data connection. Several TAC cases later, Cisco only cites that we have too many configured supervisors and too many agents on teams. We are instructed to build our business around the product, not the other way around. We have a couple of 24x7 teams, so building teams based on shifts is a bit ridiculous, but the CPU on the VM pegs at 100%, where it affects everything. The UCCX product is no longer a viable product for medium-sized customers. We are reviewing whether version 12 performs any better and are also looking at outsourcing. None of these issues came with CAD or even version 10.6 on Finesse, but the newer, better version cannot handle it.
04-18-2019 04:59 AM - edited 04-18-2019 05:25 AM
What are your agents/supervisors using for browsers? Anthony Holloway was the first to uncover that if the browsers lack WebSocket support, they fall back to a polling mechanism that pegs the CPU even at relatively low scale. It effectively creates a DDoS condition.
04-18-2019 05:06 AM
We have a mix of browsers, whatever the agent/supervisor prefers, but IE11, Firefox, and Chrome are all used, and all are listed as compatible with the product.
04-23-2019 04:43 AM
Our system has been losing Live Data on a daily basis for over two weeks since the upgrade. Cisco is indicating the cause to be too many supervisors or agents on a team in the configuration. We are now starting to see CPU on the primary server run at approximately 50% beginning around 3:00 a.m. with no agents logged in. This is heading toward being the worst version of this application I have ever seen, and I have been with it for about 12 years. Multiple server restarts and poor Cisco TAC response to the problems.
04-23-2019 08:34 AM
I can't imagine that the team/supervisor configuration is the culprit of your issues, but if you're above the documented limit you should change it. In my experience 11.6 has been very solid across the half dozen customers I have on it. I've actually told customers to wait on the 12.x upgrade until later because 11.6 has been so good.
david
04-23-2019 09:11 AM
Just wondering, were any of those customers large, with around 250-260 logged-in agents? We have not had this run well at full logon since installing it; it freezes up daily. We do not currently have any of the ES patches applied.
04-23-2019 03:22 PM
05-06-2019 07:19 AM
socket.io pegs the CPU; we know that is the runaway process.
We actually have multiple TAC cases open.
04-23-2019 03:18 PM - edited 04-30-2019 08:20 AM
Thanks for the mention Jon, that was an interesting time in my life!
I would suggest OP use the show process cpu load num 5 CLI command to see if the SocketIO process is pegged (or which process, if any, is).
Also, just because the browser name and/or version is listed as supported, that doesn't guarantee you any success.
Just ask the people facing the current Chrome Tab Discarding issue, or the previous Chrome 67 issue, or the Firefox weak Diffie-Hellman issue a few years back.
Also, IE can be a little sneaky about when and how it implements Compatibility mode. Unfortunately, there's no straightforward way to know, centrally from UCCX, whether or not the Agents are connecting with a particular browser, version, or setting enabled. You could get every Agent to click the Send Error Report button in Finesse, and then scrape the data from the server, or you could write a custom gadget to do it automatically. Outside of that, I'm not sure how you would figure out what people are using to connect to Finesse.
EDIT: Actually, I just noticed the following command was introduced in 11.5(1)SU1 "show uccx livedata connections" which shows total long polling connections to Live Data. And for some reason, in the same version they introduced that command, they also disabled HTTP long polling, so the value should always be 0.
Looks like this:
admin:show uccx livedata connections
Server Status: Active
Client Count: 51 (polling: 0)
Command successful.
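If you want to watch that polling value over time, the output above can be picked apart with a small script. This is only a sketch, assuming the output keeps the "Server Status" / "Client Count ... (polling: N)" shape shown in this thread; the function and class names are made up for illustration, and how you capture the CLI text (SSH session, copy/paste) is left out.

```python
import re
from typing import NamedTuple


class LiveDataConnections(NamedTuple):
    """Values pulled from 'show uccx livedata connections' output (hypothetical helper type)."""
    status: str
    client_count: int
    polling_count: int


# Matches the two data lines seen in this thread, whether the output
# arrives on separate lines or flattened onto one line.
_PATTERN = re.compile(
    r"Server Status:\s*(?P<status>\w+)\s+"
    r"Client Count:\s*(?P<clients>\d+)\s*\(polling:\s*(?P<polling>\d+)\)"
)


def parse_livedata_connections(output: str) -> LiveDataConnections:
    """Extract server status, total clients, and long-polling clients."""
    match = _PATTERN.search(output)
    if match is None:
        raise ValueError("unrecognized 'show uccx livedata connections' output")
    return LiveDataConnections(
        status=match["status"],
        client_count=int(match["clients"]),
        polling_count=int(match["polling"]),
    )


if __name__ == "__main__":
    sample = (
        "admin:show uccx livedata connections\n"
        "Server Status: Active\n"
        "Client Count: 51 (polling: 0)\n"
        "Command successful.\n"
    )
    result = parse_livedata_connections(sample)
    # A non-zero polling count would mean some clients are not on WebSockets.
    print(result.status, result.client_count, result.polling_count)
```

Anything other than 0 for the polling count would be worth chasing down, given that long polling is supposed to be disabled.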
05-06-2019 07:24 AM
run sql delete from cuic_data:cuicdatasourcefailover where id='D7D7E1A610000132363635BD3F57F543'
utils service restart Cisco Unified Intelligence Center Reporting Service
utils service restart Cisco Unified Intelligence Center Serviceability Service
After doing this, the socket.io service went down, which caused the Live Data data source (DS) to go down. However, the DS now recovers on its own when the socket.io service comes back up. That was not the case earlier, when you had to manually restart the Live Data DS after the socket.io service went down and came back up.
Checked Hardware config:
CPU 4
RAM 16GB
Disk 2: each 146 GB
We disabled "VoiceCSQDetailsStats" topic from root and that improved the situation a lot:
Step 1: Need to take root access of UCCX publisher.
After disabling the report, socket.io CPU climbed once again until it crashed, then remained stable for one week, at which time we did a cluster restart. Now, today, CPU is climbing with socket.io again; if it continues, we will likely crash again today.
05-10-2019 04:47 AM
As an update on the UCCX 11.6.2 upgrade: Cisco TAC will not pursue assisting with our UCCX application failing on a daily basis until we pare the teams down to 5 supervisors and 50 agents per team. This is affecting our business operations and is not an easy thing to change. There were no warnings in the documentation (i.e., release notes, install and upgrade guides) that excess configurations are no longer supported, yet we were told that we should have reviewed the SRND prior to upgrading... Lots of help there... We are reviewing options to replace UCCX after 11 years of use; the original install was version 6 or 7, and every upgrade until now met our needs. It seems that Cisco may be wanting to get out of the call center market. We are trying as quickly as possible to make our business fit the application, not the other way around...
05-10-2019 05:25 AM
Consider this before an 11.6 upgrade, or your system will fail:
The following table provides a selected list of capacity limits when deploying Unified CCX.
Maximum number of teams | 8
Maximum number of supervisors in a team | 5
Maximum number of inbound agents | 400
Maximum number of preview outbound agents | 150
Maximum number of remote agents | 100
Maximum number of concurrent supervisors | 42
Maximum number of teams that a supervisor can be assigned | 5
Maximum number of agents in a team | 50
Maximum number of IVR ports | 400
Maximum number of outbound IVR ports | 150
Maximum number of progressive and predictive outbound agents | 150
This table shows absolute limits. Reaching the limits for multiple criteria in a specific configuration might not be possible. Use the Cisco Unified Communications Sizing Tool to validate your configuration. This tool is available at:
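For anyone who wants to audit their own team setup against the per-team rows in that table, here is a rough sketch. It only checks configured counts (which is what TAC asked to be pared down in this thread), not concurrent logins. The Team class, find_violations function, and any team data you feed it are hypothetical; how you export teams from UCCX is not covered here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Per-team limits copied from the capacity table quoted in this thread.
MAX_SUPERVISORS_PER_TEAM = 5
MAX_AGENTS_PER_TEAM = 50
MAX_TEAMS_PER_SUPERVISOR = 5


@dataclass
class Team:
    """A configured team (hypothetical structure, not a UCCX API object)."""
    name: str
    supervisors: Set[str] = field(default_factory=set)
    agents: Set[str] = field(default_factory=set)


def find_violations(teams: List[Team]) -> List[str]:
    """Return one message per configured value that exceeds a table limit."""
    problems: List[str] = []
    teams_per_supervisor: Dict[str, int] = {}

    for team in teams:
        if len(team.supervisors) > MAX_SUPERVISORS_PER_TEAM:
            problems.append(
                f"{team.name}: {len(team.supervisors)} supervisors "
                f"(limit {MAX_SUPERVISORS_PER_TEAM})"
            )
        if len(team.agents) > MAX_AGENTS_PER_TEAM:
            problems.append(
                f"{team.name}: {len(team.agents)} agents "
                f"(limit {MAX_AGENTS_PER_TEAM})"
            )
        # Count how many teams each supervisor is assigned to.
        for supervisor in team.supervisors:
            teams_per_supervisor[supervisor] = (
                teams_per_supervisor.get(supervisor, 0) + 1
            )

    for supervisor, count in sorted(teams_per_supervisor.items()):
        if count > MAX_TEAMS_PER_SUPERVISOR:
            problems.append(
                f"{supervisor}: assigned to {count} teams "
                f"(limit {MAX_TEAMS_PER_SUPERVISOR})"
            )
    return problems
```

An empty list back from find_violations means the configured teams fit the per-team rows; it says nothing about the system-wide rows such as concurrent supervisors.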
05-10-2019 07:04 AM
I'm curious to understand how many teams you have, and how many agents and supervisors. I don't remember off the top of my head, but CCX only supports about 40 supervisors and 400 agents. That's 10 agents per supervisor, and I believe the 50 agents per team max has always been there. It sounds like it was not enforced before but is now. One thing I would tell you is that you could create CUIC dashboards so your supervisors can keep track of what's going on and act accordingly if they need to take action. This way you could have a few reports and have the supervisors only log in to CCX when they need to take some sort of action.
david
05-10-2019 11:13 AM
Hi David,
What we are trying to figure out is why does it matter how many are configured, rather than how many are concurrent?
We know that this shows active connections:
admin:show uccx livedata connections
Server Status: Active
Client Count: 191 (polling: 0)
Command successful.
But how do you measure: "Maximum number of concurrent supervisors" which is in the SRND ?
And this in the SRND: "Maximum number of agents in a team" is that concurrent?
And this in the SRND: "Maximum number of supervisors in a team" is that concurrent? That would make sense if you run three shifts. We can see that we are over the limit as configured, but we know we are not over it concurrently, and we cannot measure that...
05-12-2019 04:08 AM
I can understand your frustration, especially if it is unclear whether the limit means concurrent or configured; that's a huge difference. The table above says that supervisors are concurrent, so at least that's clear. I guess I'm trying to pinpoint what exactly TAC is saying is the issue. Is it the number of supervisors configured?