LMS 4.1 - Solaris 10: file descriptor limit reached by Tomcat process

lo.mueller
Level 1

Hi all,

Environment: Oracle/Sun T5140 hardware with Solaris 10 u9

Last Friday we got an HTTP 500 error when trying to connect to the web interface of LMS 4.1.

After some troubleshooting (/opt/CSCOpx/MDC/tomcat/logs/stdout.log) we found that Tomcat was unable to open sockets:

WARNING: Exception executing accept

java.net.SocketException: Too many open files

        at java.net.PlainSocketImpl.socketAccept(Native Method)

        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)

        at java.net.ServerSocket.implAccept(ServerSocket.java:462)

        at java.net.ServerSocket.accept(ServerSocket.java:430)

        at org.apache.jk.common.ChannelSocket.accept(ChannelSocket.java:312)

        at org.apache.jk.common.ChannelSocket.acceptConnections(ChannelSocket.java:667)

        at org.apache.jk.common.ChannelSocket$SocketAcceptor.runIt(ChannelSocket.java:878)

        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)

        at java.lang.Thread.run(Thread.java:662)

Searching through the Sun tuning docs, I found the following kernel parameters for file descriptor tuning:

rlim_fd_cur and rlim_fd_max

Settings on an untuned Solaris 10 system:

</proc/5142/fd># echo rlim_fd_max/D | mdb -k
rlim_fd_max:
rlim_fd_max:    65536

</proc/5142/fd># echo rlim_fd_cur/D | mdb -k
rlim_fd_cur:
rlim_fd_cur:    256
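
The kernel values above only show the defaults. To see how close a running process actually is to its limit, the entries under /proc/PID/fd can be counted. The following is a minimal sketch, not a Cisco-supplied tool: the function names count_fds and fd_check and the 80% warning threshold are assumptions chosen for illustration. It uses backticks and expr so that it should also run under the older Bourne shell.

```shell
# count_fds PID: print how many file descriptors PID currently has open.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# fd_check PID LIMIT: report usage and succeed only while usage is
# below 80% of LIMIT (the 80% threshold is an arbitrary assumption).
fd_check() {
    open=`count_fds "$1"`
    threshold=`expr "$2" \* 80 / 100`
    echo "pid $1: $open of $2 descriptors in use"
    [ "$open" -lt "$threshold" ]
}
```

For example, assuming 5142 (the PID visible in the shell prompt above) is the Tomcat process, `fd_check 5142 256` could be run periodically. A count that keeps climbing over hours would point to a descriptor leak rather than a limit that is simply too low.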

I tried to tune Tomcat's behavior by raising rlim_fd_cur to 1024 in /etc/system.
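
For reference, a change like this is normally written as a `set` line in /etc/system. The fragment below is a sketch that uses the value 1024 mentioned above. The comment lines are illustrative, and /etc/system is only read at boot, so a reboot is needed before the new default applies.

```
* /etc/system fragment: raise the default soft limit on open
* file descriptors per process (value taken from the post above)
set rlim_fd_cur=1024
```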


Are there any known but undocumented best-practice kernel parameter settings for CiscoWorks LMS 4.1 running on Solaris 10 SPARC systems?

Thanks for any feedback

Lothar

1 Reply

Martin Ermel
VIP Alumni

I have had this three times with LMS 3.2 installed on Solaris 9.

I am wondering whether your DCRDevicePoll service is still running correctly. This is the process that checks the reachability (ICMP and/or SNMP) of the DCR devices. I found that this process hangs from time to time, and I assume that it interferes with some other process (I think IC polling/collection or config polling/collection).

I was unable to get the poll job running again without stopping, deleting, and recreating it.

Have you observed similar issues with the poll job while the "file descriptor" issue was occurring?

By the way, I doubt that adjusting these parameters will prevent the problem...