I recently made a fresh install of LMS 4.1 and added all of our devices (about 400 devices). After configuring all jobs and services everything ran fine.
After a few days a colleague from the server team called and told me that the CPU usage had been increasing over the days. I investigated and found out which process is using the CPU resources. Whenever ANIServer is running and collecting data, the CPU runs at approx. 100%. At first a collection takes only a few minutes to complete, but after some cycles it takes more and more time; after a week it takes more than 4 hours. After restarting the ANIServer process it again takes only a few minutes, and then the duration increases again.
Is this normal behaviour or could it be a bug?
I've attached a screenshot which shows the CPU usage.
Windows Server 2008 R2 64-bit
4 CPUs @ 2.93 GHz
6 GB RAM
To find out what is causing the issue, you need to identify the processes that are consuming CPU. You can use "Process Explorer" from Sysinternals / MS TechNet to get details on each running process:
The Sysinternals suite (with a lot more useful tools):
or Process Explorer alone:
DataCollection is under the control of ANIServer, and it is indeed a CPU- and RAM-consuming process. This alone should not be a problem.
The first graph you posted is missing the labels for the x- and y-axes, so I cannot see how often the CPU spikes appear or what the numbers on the right side mean.
You said your problem is that the CPU usage is constantly rising. You should see this if you right-click the process you mentioned in Process Explorer and bring up the "Properties" view, or if you expand the general view to also show the CPU history for each process.
Leave Process Explorer open for a while so you can see whether there is one process that does not free up all the resources it took in the first instance.
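If you would rather have a number than a graph, you could note down one value per collection cycle for the suspect process (for example its peak CPU % or the duration of the cycle) and fit a straight line through the values. Below is a minimal Python sketch of that idea. The function name and the sample values are made up for illustration and are not taken from this server.

```python
def trend_per_sample(values):
    """Least-squares slope of the values against their sample index.

    A clearly positive result means the measured value grows from
    sample to sample; a result near zero means it stays flat.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to see a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical cycle durations in minutes, one entry per collection run.
sample_durations = [5, 9, 20, 45, 90, 240]
print(f"average growth: {trend_per_sample(sample_durations):.1f} min per cycle")
```

A slope that stays positive across several days, and drops back after a process restart, would match the behaviour described in the original post.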
DEP usually causes problems during installation, and processes and tasks can also fail because of it. I wouldn't have connected DEP with your problem, but you should disable it anyway; you will see if it helps.
I am still thinking about what could be the reason for the permanent increase in CPU usage. In fact, I have seen DataCollection consume less CPU and RAM on the first run after a dmgtd restart. I think this is because subsequent collections look for changes relative to previous runs.
What are the hardware specs of this server?
Are the CPUs strictly assigned to the server, or is there a dynamic option active?
How many devices are managed with this server?
Check whether the pagefile is statically assigned. It should have a size of at least 12 GB.
If you also have to change the pagefile settings, you'll have to reboot the server. Always stop dmgtd first (net stop crmdmgtd), wait until this has finished, and only then reboot the server.
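The stop-then-reboot order can be scripted so that the reboot is never issued while dmgtd is still shutting down. This is only a sketch: it assumes Python is available on the server, that it runs from an elevated prompt, and that "shutdown /r /t 0" is how you want to reboot. Only "net stop crmdmgtd" comes from this thread; the function name and the runner parameter are hypothetical.

```python
import subprocess

# Commands in the order they must run. "net stop" returns only after
# the service control manager reports the result of the stop request.
STOP_THEN_REBOOT = [
    ["net", "stop", "crmdmgtd"],      # stop the LMS daemon manager first
    ["shutdown", "/r", "/t", "0"],    # assumption: immediate reboot
]


def stop_then_reboot(runner=subprocess.run):
    """Run each command in order and abort on the first failure."""
    for cmd in STOP_THEN_REBOOT:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so the reboot is skipped if stopping dmgtd failed.
        runner(cmd, check=True)


if __name__ == "__main__":
    stop_then_reboot()
```

The runner argument exists only so the sequence can be tried out with a dummy function before it is pointed at a real server.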
Also check whether antivirus software is running on the server. If so, make sure that the CSCOpx directory and the location where the database backups are written are excluded from on-access scanning.
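If your antivirus product can export its exclusion list, a quick way to double-check it is to test each path that matters against that list. Here is a small Python sketch of such a check. All paths in it are placeholders (I do not know where CSCOpx or the backups live on your server), and the function name is made up for illustration.

```python
import ntpath


def is_excluded(path, exclusions):
    """Return True if path is one of the excluded folders or lies below one.

    Comparison follows Windows rules: case-insensitive, and "/" is
    treated the same as "\\".
    """
    target = ntpath.normcase(ntpath.normpath(path))
    for excl in exclusions:
        root = ntpath.normcase(ntpath.normpath(excl))
        # Append a separator so "CSCOpx" does not match "CSCOpx2".
        if target == root or target.startswith(root.rstrip("\\") + "\\"):
            return True
    return False


# Placeholder values: replace with the real install and backup locations.
exclusions = [r"C:\Program Files (x86)\CSCOpx"]
to_check = [r"C:\Program Files (x86)\CSCOpx\log", r"D:\LMSBackup"]

for p in to_check:
    state = "excluded" if is_excluded(p, exclusions) else "NOT excluded"
    print(f"{p}: {state}")
```

Any path reported as not excluded would be a candidate for the on-access scanner to interfere with.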
Can you provide the ani.log?
It seems that you have port-channels configured. There is a bug, first found in LMS 4.0.1 by Marvin, concerning data collection and port channels (BugId CSCto06189). The log entries look very similar, though there are other errors which shouldn't be there...
See this thread:
I cannot find the bug mentioned in the release notes for LMS 4.1, neither as still open nor as resolved. A patch is available on CCO only for LMS 4.0.1:
You could open a TAC case and attach the ani.log to confirm whether this is the bug (which is what I assume) and to get an official patch.
...perhaps Joe could confirm it also...