
Prime Collaboration Deployment - PROBLEMS

Clifford McGlamry
Spotlight

I have been working with the Prime Collaboration Deployment tool in the lab trying various things with it.  I like the concept of what it is supposed to do, but in practice it fails far more than it succeeds.  Of note:

 

1.  If a presence server is hooked to CUCM, PCD insists on attempting to discover it.  Doesn't give you the option not to, even if you NEED it not to.

2.  No option at all to address the possibility that the platform OS username and password for the presence server might be different than the CUCM platform OS username and password.  This is a huge miss by Cisco.

3.  If you run a discovery, I find that it often "fails" with no explanation at all.  Why did it fail?  How do I figure out what to fix?  The admin guide references a set of logs to pull, but after reviewing those, they are no help at all.  

4.  Troubleshooting is maddening.  There is nothing in the tool, in the documentation, or online to help with most of the issues I'm seeing.  The issues that are documented are not the ones I'm actually encountering.

5. No dedicated forum for addressing issues with a new tool with very skimpy documentation.

 

I would welcome a pointer at additional resources on this tool.  It could be such a huge help.  As is, it's almost too hard to use due to lack of resources to back it up.  We cannot put this into production deployments if we cannot depend on it to behave correctly.

 

TIA 

 

Cliff

 

 

 

24 Replies

Anthony Gerbic
Cisco Employee

Cliff,

I have passed the link to this thread to the PM for PCD for response.  I believe the forum for PCD is this forum:  https://supportforums.cisco.com/community/5971/unified-communications-applications .

 

Regards

Please reference the original post.

 

1.  The PCD discovery feature tries to find all nodes in the same cluster.  Discovery failure is usually due to a node being down or unreachable.  If there are nodes you don't want discovered regardless, you can temporarily remove them from the cluster.

2. If discovery fails with the default OS username/password, you can edit these credentials.  Most of the time, especially now with native IM&P in 9.x and 10.x, all nodes (UCM and CUP/IM&P) in the same cluster use the same credentials.

3. A wide range of issues can cause discovery failure.  Many of them are on the application side, but the symptom may falsely manifest as a PCD issue.  Always look at both the application log and the PCD log when troubleshooting a failure with TAC.  The same goes for PCD upgrade failures.

 

4.  See the comments on #3.  We also have several suggestions in the backlog for improving the serviceability experience.  PCD doesn't eliminate app-side issues that may fail an upgrade, migration, install or discovery.  We can certainly improve whether, and to what extent, PCD tells you what broke and where to look, but right now you need to look at both the app side and the PCD side.  If you have TAC SRs where failures were resolved, we are interested in those, both to improve what we capture in the Troubleshooting guidance and to guide the roadmap for potential improvements.

5. See previous guidance.  PCD doesn't have a separate forum - PCD questions on install/upgrade/migrate are usually posted to Prime Collaboration, Drive to Collaboration or Servers, OS and Virtualization communities.

Hope that helps.

I'm attempting to use PCD for another migration, and discovering another maddening set of errors in the log that provide NO CLUE as to what's wrong:

 

2015-04-01 20:38:36,983 INFO  [DefaultQuartzScheduler_Worker-2] scheduler.DiscoveryJob.executeJob - discovery node 622 failed with errorcode 103

 

Ummm ...  what does errorcode 103 mean? WHERE IS THE DECODER RING???  Why provide a log to troubleshoot with if it is going to be so blasted cryptic?  The worst part is that I KNOW this has to exist, so why isn't it published and readily accessible?

Guess it is back to once again opening a TAC case that should never have been required because the documentation is incomplete.  :(
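For anyone else sifting through these logs, here is a minimal Python sketch for pulling the failing node IDs and error codes out of a PCD log so they can at least be listed in a TAC case.  The pattern is modelled only on the single log line quoted above, so other PCD log formats may need a different expression, and the function names are made up for illustration.  It does not decode what an error code means; that mapping is not published anywhere I know of.

```python
import re
from collections import Counter

# Assumption: the pattern is derived from the one log line quoted in this
# thread. Other PCD versions or log files may format lines differently.
FAILURE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+\w+\s+"
    r"\[(?P<thread>[^\]]+)\]\s+(?P<source>\S+)\s+-\s+"
    r"discovery node (?P<node>\d+) failed with errorcode (?P<code>\d+)"
)


def find_discovery_failures(lines):
    """Return one dict per 'discovery node N failed with errorcode M' line."""
    failures = []
    for line in lines:
        match = FAILURE_RE.search(line.strip())
        if match:
            failures.append(
                {
                    "timestamp": match["ts"],
                    "node": int(match["node"]),
                    "errorcode": int(match["code"]),
                }
            )
    return failures


def count_by_errorcode(failures):
    """Count how often each error code appears."""
    return Counter(f["errorcode"] for f in failures)


# The only sample used here is the line quoted in the post above.
SAMPLE_LINES = [
    "2015-04-01 20:38:36,983 INFO  [DefaultQuartzScheduler_Worker-2] "
    "scheduler.DiscoveryJob.executeJob - discovery node 622 failed with errorcode 103",
    "",
]

if __name__ == "__main__":
    for failure in find_discovery_failures(SAMPLE_LINES):
        print(failure)
```

To run it against a real log, pass an open file object instead of `SAMPLE_LINES`, since the function accepts any iterable of lines.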

Thanks for continuing to provide actionable specific situations to help execute improvements.

As noted in my earlier post, in a future release Cisco will be implementing some UCM-side and PCD-side work to improve problem handling, what the user is told when a task fails, and the text in the Troubleshooting section of the PCD Admin Guide, to reduce dependency on TAC to interpret logs (http://www.cisco.com/c/en/us/td/docs/voice_ip_comm/cucm/pcdadmin/10_5_2/CUCM_BK_P7081B13_00_pcd-administration-guide-1052/CUCM_BK_P7081B13_00_pcd-administration-guide-1052_chapter_01000.html).  We're targeting a set of the most frequent items that TAC and escalation engineering have seen since PCD 10.0 inception.  I've shared your specific example with them.

Development backlog priorities have been on plugging feature gaps that prevented PCD consideration for certain kinds of installed base scenarios.  With most of those base feature gaps filling as we enter CSR 11.0, we are prioritizing some "bulletproofing" work for the most common UCM root causes and PCD root causes that (sometimes falsely) manifest as a PCD problem with unclear indication of what the true root cause was.  

Please continue to open cases with TAC - I know it's inconvenient but it makes it easier to ID the most frequent issues since Serviceability/Usability improvements are always challenging to implement. 

We did.  The issue was one server would NEVER complete discovery and was throwing that error.

The workaround was remarkably simple: log into the OS Administration GUI on CUCM, select Software Updates (it will say one is in progress and ask if you want to take control; answer YES), then click through to tell it to keep going.  Problem solved.

This really needs to hit the user documentation...

Additional items uncovered in most recent uses:

 

1.  In upgrades to 10.5(1), CUCM TFTP directory permissions are set wrong.  This requires opening a TAC case to get root access to fix it.  Not sure if it's fixed in later versions, but since it's a known problem, I'm surprised there isn't a COP or something available to handle this.

2. Upgrades from pre-10.x directly to 10.5(2) are problematic for some reason, and the bug note suggests going to 10.5(1) first (stair step).  While that certainly addresses the issue, it kind of defeats the purpose of using the tool to speed things up.

3.  The method of distribution of the tool itself is problematic.  While it's not really that hard to place an order through PUT for an upgrade....why in the world do we force people to do this to get access to the tool?  I've heard there are updates available for the tool itself, but I have no clue where on CCO they would be (I cannot find them, and PLEASE don't tell me I have to open a TAC case to get them.  That's really not a good answer.  It's time consuming on both my part and Cisco's, and it doesn't need to be this hard.)

#1 - fixed in PCD 10.5(3).  That version is done but still going thru the posting process (if you already have a PCD installed, the upgrade-to-10.5.3 will be around here soon: https://software.cisco.com/portal/pub/download/portal/select.html?&mdfid=285963825&flowid=50402&softwareid=282074295 )

 

#2 - Do you have CDETS?  I remember a migration issue for 10.0 to 10.5 last year, but not one for pre-10.0 to 10.5.

 

#3 - For the "installable" PCD image, we ship that with UCM to avoid requiring a separate SKU, license, contract, etc. for PCD.  This means you can only get PCD with a UCM upgrade from PUT, or with a newly purchased UCM.  If you already have a PCD installed and need to update it, the "upgrade-only" PCD image is an .ISO posted in the UCM Updates area (see the link above; you can find PCD 10.5.2 in the UCM 10.5.2 area).  We are researching how to make this show up more easily in Software Download searches for "PCD" or "Prime".

Re: #1 - Good.  Not sure how big of a deal it would be, but it would be awesome if the tool had a button to check for available upgrades (Jabber has this).  It would solve a LOT of problems if you could just make sure you (meaning me) were on the latest code.

 

Re: #2 - I had not previously heard of CDETS.  Googled it up, and it's something I'd be interested in having access to.  Is this internal to Cisco only, or can partners get access?  If so, how? 

The specific bug I was hitting that forced me into a stair-step upgrade is CSCur57116.  You can see the notes on support case #634362655 if it's helpful.

 

Re: #3 - Yes, making the searches for PCD in software download would make things much better!  I tried that, and couldn't figure out that magic combination. 


 

Once again ATTEMPTING to use PCD to run an upgrade/migration for a customer.  Customer is on 8.6(2) and going to 10.5(2). 

PCD had serious issues just getting the COP file installed to update the certificates.  I had to install it manually, as PCD was complaining that the PAWS service wasn't running on a subscriber.  There is no PAWS service that I can find on 8.6(2).  Tried rebooting....nada.  Ended up installing the COP manually.

Once installed, PCD still would not see 10.5(2) as a valid upgrade.  The little wizard only showing me what it *thinks* is okay/valid really needs some work, or preferably, REMOVE IT.  Let me pick the image and trust me to know it will work (or fail). 

Ended up having to run two upgrades of 7 servers each MANUALLY because PCD simply will not do the job. 

This tool holds so much promise....I continue to be disappointed in the inability to get it to work correctly when running major upgrades. 

Clifford,

There is some hardening work occurring in PCD 11.5.  Would you be interested in talking live to the BU so we can review what you have been experiencing vs. the work to date on BU/TAC top N?

I would welcome the opportunity to do so.  

Thanks for the info Clifford. Seeing some of the same issues using pcd_vApp_UCOS_11.0.1.20000-2_vmv7_v1.2 trying to get from 8.62 to 11. Please keep us updated.

Currently running 3 large upgrades using PCD.  I have versions 11.5(2) and 11.5(3).

Following issues are noted:

1. STILL cannot get updates for the PCD tool on CCO.  This should be so incredibly simple that I cannot believe we're still fighting this battle.  

2. I have an 11.5(2) build out there and was forced to change the IP address.  After doing so and rebooting the system, it continuously misbehaves.  I cannot build a new migration task.  The "finish" button doesn't work.  Tried both Firefox and Chrome, and dumped the browser cache.  Doesn't make a difference.  Being forced to rebuild it to address this.  This is really silly.  Why would something so simple completely break the tool?

3. If the browser is resized, the dialog windows within the browser do NOT resize.  If I have the browser running half screen when I start and then go to full screen, the dialog boxes within the pane remain half sized.  What's worse is that there are no scroll bars to get to the content that is cut off.  

4. Renamed ISO files won't work.  Not sure why this change was made, but please ditch it.  There are a TON of very good reasons that ISO files might need to be renamed.  

5. Attempted a UCCX build with PCD.  The UCCX documentation doesn't address doing a build with PCD, and the PCD documentation doesn't address building a UCCX HA cluster.  The steps laid down by the tool are flawed.  While creating the install task, you must manually insert a pause in the build after the pub is built to be able to log into it, do the initial configuration, and add the sub to the cluster.  CUCM builds do this automatically.  I believe UCxN builds do as well.  PCD-specific task documentation by product has GOT to be maintained and owned by someone in a centralized spot.  The docwiki format leads to things being spread all over the place with no logical indexing (think of the internet before there were real solid search engines available).  We know better than this at this point.  Please get the PMs together.  Someone HAS to own this, and it really needs to be in the PCD documentation regardless of who maintains it.

6.  Attempted an upgrade of UCCX using PCD from 11.0 to 11.5.  It hangs.  Interestingly, if you do it manually, you will note on the command line that even after you complete the upgrade task, the output of the UTILS UPGRADE STATUS command still shows "INSTALLING".  Because of this, PCD never sees the install complete.  This is most likely something the Contact Center BU will need to take a look at, but it's breaking the PCD tool functionality for this type of upgrade.
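To illustrate the hang in item 6: any tool that polls a status until it leaves "INSTALLING" will wait forever if the status never changes, unless it also enforces a deadline.  Below is a rough Python sketch of polling with a timeout.  It is not how PCD is actually implemented (I have no visibility into that); `get_status` is a hypothetical callable standing in for however the status is read, and any status value other than "INSTALLING" is assumed to mean the upgrade is no longer in progress.

```python
import time


class UpgradeStalled(Exception):
    """Raised when the status is still 'busy' after the deadline."""


def wait_for_completion(
    get_status,                    # hypothetical: returns the current status string
    timeout_s=3600.0,              # give up after this many seconds
    poll_s=30.0,                   # wait this long between checks
    busy_states=("INSTALLING",),   # the only state named in this thread
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Poll get_status() until it leaves the busy states or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status.upper() not in busy_states:
            return status
        if clock() >= deadline:
            raise UpgradeStalled(
                f"status still {status!r} after {timeout_s} seconds"
            )
        sleep(poll_s)
```

With a deadline like this, a status that is stuck at "INSTALLING" produces an explicit error that can be reported, rather than a task that never finishes.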

Another issue noted in 11.5(2) and 11.5(3)

When running a migration, when the migration steps are summarized and there is a requirement to edit them (i.e. not turn off the source machines), there is no scroll bar if the number of steps is off the bottom of the pane.  If you are not extremely careful, you can leave a step turning off machines that need to stay on because you can't see the additional steps.  You have to click on one of the visible steps and then arrow down to be able to get to the hidden steps.