Zombie processes in LMS 3.2.1

aslyuch
Level 1

Hi,

I am seeing an increasing number of zombie processes on our Solaris-based CiscoWorks server. The parent process and its zombie children are shown below:

root@cw:/usr/ucb# ./ps -alxwww | grep 26915
0   101 14737 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 15986 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16052 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16065 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16161 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16187 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16265 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16283 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16321 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16377 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16457 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16486 26915  0   7 20  416  376          O ?         0:00 /bin/sh /opt/CSCOpx/bin/pdshow DCRDevicePoll
0   101 26915 24170  1   7 208297650648 30016d88366 S ?        224:20 CSCO.1494 -cw:jre lib/jre -cp lib/classpath:www/classpath:MDC/tomcat/webapps/cwhp/WEB-INF/classes:MDC/tomcat/webapps/cwhp/WEB-INF/lib/rocksaw-0.6.2.jar:MDC/tomcat/webapps/cwhp/WEB-INF/lib/dcrdevpoll.jar:MDC/tomcat/webapps/cwhp/WEB-INF/lib/ctm.jar:MDC/tomcat/webapps/cwhp/WEB-INF/lib/log4j.jar:MDC/tomcat/webapps/cwhp/WEB-INF/lib/iText.jar:MDC/tomcat/shared/lib/MICE.jar:MDC/tomcat/shared/lib/NATIVE.jar:MDC/tomcat/shared/lib/jdom.jar:MDC/tomcat/common/lib/servlet.jar:MDC/tomcat/common/lib/mail.jar:MDC/tomcat/common/lib/activation.jar -Dnm.jrm.jobid=1494 com.cisco.nm.dcr.devpoll.DevPollJobInvoker 1494
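
To see how quickly the zombies pile up under one parent, the listing above can be tallied per parent PID. The following is only a minimal Python sketch and not part of LMS: the function name `count_zombies_by_parent` is made up for illustration, and it assumes the column order seen in the `/usr/ucb/ps -alxwww` output above (PID in the third field, PPID in the fourth, `<defunct>` as the command of a zombie).

```python
from collections import Counter


def count_zombies_by_parent(ps_output: str) -> Counter:
    """Tally <defunct> rows per parent PID from BSD-style `ps -alx` text.

    Assumption: fields are ordered F, UID, PID, PPID, ... as in the
    listing above, so the parent PID is the fourth whitespace-separated field.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # A zombie row has state "Z" and "<defunct>" as its last field.
        if len(fields) >= 4 and fields[-1] == "<defunct>" and "Z" in fields:
            counts[fields[3]] += 1
    return counts


# Three rows copied from the listing above: two zombies and the live pdshow child.
sample = """\
0   101 14737 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 15986 26915  0   0       0    0                   Z  0:00  <defunct>
0   101 16486 26915  0   7 20  416  376          O ?         0:00 /bin/sh /opt/CSCOpx/bin/pdshow DCRDevicePoll
"""
zombies = count_zombies_by_parent(sample)
print(zombies)  # Counter({'26915': 2})
```

Running this against the real `ps` output at regular intervals and comparing the counts would show whether the zombies keep accumulating or some of them get reaped.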

This isn't the first time this has happened. Last time I restarted the server and the problem disappeared for a while.

Do you have any idea how to fix this?

Akos

5 Replies

Martin Ermel
VIP Alumni

The underlying job is DCRDevicePoll, which does the availability polling in Common Services / DCR. Is the number of these zombies accumulating over time, or do some of them disappear?

There were some issues when this function was introduced, but I thought they were fixed. Do you see any problems with the availability polling, i.e. is the job hanging so that you do not get any updated results?

The number of zombie processes is increasing constantly, at a rate of approximately 3 processes per 10 minutes. Numerous devices are shown as unreachable in CS although they are actually reachable, so something is not working properly.

What are your settings in

    Common Services > Device and Credentials > Admin > Device Polling

Have you set up the job to notify you by e-mail? If so, do you still get the mails at the end of each cycle?

Can you post the DCRDevicePoll.log?

As I said, I have seen similar issues in LMS 3.2, but they went away with LMS 3.2.1.

I have attached a screenshot of the Device Polling settings and the DCRDevicePoll.log.

E-mail notification wasn't set, and when I tried to configure it, the change was denied because the job was currently running.

I was out of office for the last 2 weeks - sorry for the delay.

Do you still have the issue? If so, please post the output of pdshow. Are you using SNMPv3 to access your devices? Would ICMP also be an option, or would it be blocked by an ACL or firewall?