Re: Prime 3.3 very slow

RudiJ · ‎01-15-2018

Hi,

I have updated our Cisco Prime Infrastructure from 3.1 to 3.3 and am now facing very slow performance, from login to switching between tabs and pages. If I restart the server, everything works fine for about 15 to 30 minutes.

Can somebody help me please?

Cisco Prime Infrastructure
********************************************************
Version : 3.3.0
Build : 3.3.0.0.342

disk:

disk: 1% used (192316 of 138271784)
temp. space 2% used (36396 of 2031952)

memory:

total memory: 16332120 kB
free memory: 263716 kB
cached: 3377140 kB
swap-cached: 2280 kB

CPU statistics:

user time: 21043962
kernel time: 8943886
idle time: 33607924
i/o wait time: 22437766
irq time: 116814
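
A rough way to read the CPU counters above is to convert each one into a share of the total. This is only a sketch: it assumes all five values are cumulative ticks in the same unit (the output does not say), so the result is an average over the whole uptime and not a picture of the slow periods alone.

```python
# Counters copied from the "CPU statistics" output above.
# Assumption: all five are cumulative ticks in the same unit.
counters = {
    "user": 21043962,
    "kernel": 8943886,
    "idle": 33607924,
    "iowait": 22437766,
    "irq": 116814,
}


def cpu_shares(counters):
    """Return each counter as a percentage of the summed counters."""
    total = sum(counters.values())
    return {name: 100.0 * value / total for name, value in counters.items()}


shares = cpu_shares(counters)
# Print the largest share first.
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:>7}: {pct:5.1f}%")
```

On these numbers, i/o wait accounts for roughly a quarter of all ticks, which is why the disk is worth a look alongside memory.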

marce1000 · ‎01-16-2018

- Was your original installation sized according to the network sizing parameters:

https://www.cisco.com/c/en/us/td/docs/net_mgmt/prime/infrastructure/3-1/quickstart/guide/cpi_qsg.html#pgfId-67786

- Also, if you are using a VM-based installation, check the resource parameters and any alerts for the VM in vCenter, and pay attention to the memory allocated to the VM.

M.

-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
When the mirror will then always repond to me with ' The only thing that exceeds your brilliance is your beauty! '

RudiJ · ‎01-16-2018

Hi Marce1000,

Thanks for your reply. The original VM-based installation was sized according to the "Standard" network sizing parameters.

I also checked the VM in vCenter. The attached file shows the CPU and memory usage.

Since I installed PI 3.3, the usage has been permanently high. With PI 3.1, I only saw high usage when backups were taken.

In the PI GUI I can also see that there are a lot of jobs that have not started at their scheduled time. Could this be the reason why the CPU and memory usage is so high?

marce1000 · ‎01-16-2018

>installed from Standard

And does this 'comply' with your network size (according to the link sent earlier)?

>jobs not started

I would describe this as a deadlock, where both parameters can influence each other. Basically, I suspect a lack of memory resources. Try doubling the available memory (as a test) and check whether these issues persist.

M.


lubkenmsd · ‎01-16-2018

I have the same problem after an in-place upgrade from 3.1 to 3.3. Logging into the CLI and running "ncs status" just hangs. I can get to the web login page but no further.

RudiJ · ‎01-16-2018

Hi,

Yes, the Standard installation complies with our network size. We only use PI to manage up to 1500 lightweight APs.

I will try doubling the available memory and check whether the problem persists.

RudiJ · ‎01-18-2018

Hi Marce,

I doubled the available memory of the VM. PI now works better, but still not normally. There are also still a lot of scheduled jobs that cannot be started, and the CPU usage is still very high (see attached file).

Doubling the memory is also only a temporary solution, because I need the resources for other projects.

Is there a workaround that fixes this problem permanently? If not, is there any way to downgrade to PI 3.1?

marce1000 · ‎01-18-2018

>Is there a workaround that fixes this problem permanently?

Contact Cisco TAC.

>If not, is there any way to downgrade to PI 3.1?

You cannot downgrade Prime. If you have a 3.1 (or compatible) backup, you can re-initialize a VM, do a fresh install of 3.1, and restore your backup.

M.


m.hegeraat · ‎01-24-2018

You may wish to drop to the Linux root shell and find out what is keeping Prime busy.

Use top to get the PIDs hogging the CPU or consuming all the memory, and post the output of ps -ef | grep <pid>.
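
If a snapshot is easier to post than an interactive top session, the same information can be pulled from ps and sorted. The sketch below assumes a Python 3 interpreter and a procps-style ps are available on the box, which may not be the case on the PI appliance. The sample text and process names are made up for illustration and are not real PI output.

```python
import subprocess


def top_processes(ps_output, n=5):
    """Parse `ps -eo pid,pcpu,pmem,args` text; return the n rows with the highest %CPU."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # ignore malformed or truncated lines
        pid, pcpu, pmem, args = parts
        rows.append((int(pid), float(pcpu), float(pmem), args))
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


def snapshot():
    """Run ps on a Linux host and return its text output (not called in this demo)."""
    return subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,args"],
        capture_output=True, text=True, check=True,
    ).stdout


# Hypothetical sample, only to show the expected input shape.
SAMPLE = """\
  PID %CPU %MEM COMMAND
 1234 12.5  3.1 /usr/bin/example-a --flag
 4321 97.0 41.2 java -Xmx8g example.Main
   77  0.3  0.1 sshd: admin
"""

for pid, pcpu, pmem, args in top_processes(SAMPLE, n=2):
    print(f"{pid:>7} {pcpu:5.1f}% cpu {pmem:5.1f}% mem  {args[:80]}")
```

On a live system, pass `snapshot()` instead of `SAMPLE`. Sorting on the third field instead of the second gives the memory view.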

You never know if it rings a bell for someone.

Good luck

winston.barrett1 · ‎01-24-2018

I am facing the same issue. Please open a TAC case so they can see that others are having the same problem.

I have broken up my HA pair: one node is running version 3.2MR1 and the other node is running 3.3, both using the same DB restored from 3.1.2.

The node with 3.2MR1 works perfectly fine, while 3.3 runs for about 10-15 minutes and then slows to a crawl. So far TAC/BU are saying that I am over the interface limit.

lubkenmsd · ‎01-24-2018

I am working with TAC currently, and they are telling me it's a disk IOPS issue. TAC ran "ncs run test iops" and it measured ~30 MB/s when the sizing chart says it should be 200. I have 10 other VMs on this host and none of them have issues. It could be a problem with my iSCSI storage on my NetApp, but I am doubtful. I am going to downgrade back to 3.1 and see if I get my performance back, and run the same IOPS tests to see what the metrics are when it's working properly.

winston.barrett1 · ‎01-24-2018

They were telling me the same. Luckily, I have the exact same appliance three times over.

m.hegeraat · ‎01-24-2018

The command ncs run test iops can be run while PI is active, but it should be run while PI is stopped!

TAC doesn't always stop PI to make the test :-)

lubkenmsd · ‎01-29-2018

With NCS stopped, I ran the IOPS test several times. Results:

1: 128 MB/s
2: 92.8 MB/s
3: 88.5 MB/s
4: 160 MB/s
5: 94.5 MB/s

Those are widely varied results... Next I am going to downgrade to 3.1 which was working fine before. Hopefully it will go back to normal even with my less-than-ideal disk write speeds. Maybe if I can show TAC that the web interface behaves normally under similar write times in 3.1 and chokes in 3.3 they will be more helpful.
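
To put a number on that spread, here is a small sketch that summarizes the five runs and compares them with the 200 figure from the sizing chart. It assumes runs 2 to 5 are in MB/s like run 1, and that 200 is in the same unit.

```python
from statistics import mean, pstdev

# Throughput from the five runs above, taken with NCS stopped.
# Assumption: every run is in MB/s, as stated for run 1.
runs = [128.0, 92.8, 88.5, 160.0, 94.5]
TARGET = 200.0  # figure quoted from the sizing chart earlier in the thread

avg = mean(runs)
print(f"mean  : {avg:.1f} MB/s")
print(f"range : {min(runs):.1f} to {max(runs):.1f} MB/s")
print(f"stdev : {pstdev(runs):.1f} MB/s")
print(f"share : {100 * avg / TARGET:.0f}% of the sizing figure")
```

The mean comes out at about 113 MB/s, a bit over half of the sizing figure, and even the best run stays below 200.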

tcl.itnetwork · ‎08-16-2018

We are facing a similar issue after upgrading Prime from 3.1 to 3.3. Did you find a solution for this issue?

Please update.