07-16-2009 01:51 AM
Hi,
I have an issue where the archive poll job usually hangs (it still shows as running). This also stops all other archive jobs from running until LMS is restarted. The only stack traces are xdi related. Are all the known xdi issues fixed in RME 4.3?
Thanks
[ Wed Jul 15 21:09:08 BST 2009 ],ERROR,[Thread-339],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,error,19,Unexpected Ssh2Exception stacktrace:
[ Wed Jul 15 21:09:08 BST 2009 ],DEBUG,[Thread-339],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,printStackTrace,51,stacktracecom.cisco.nm.lib.cmdsvc.ssh2.Ssh2Exception: Disconnected from remote host
at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.readBytes(StreamPair.java:332)
at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.readPacket(StreamPair.java:183)
at com.cisco.nm.lib.cmdsvc.ssh2.Ssh2Engine.run(Ssh2Engine.java:234)
[ Wed Jul 15 21:09:08 BST 2009 ],DEBUG,[Thread-45],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,printStackTrace,51,stacktracejava.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.flush(StreamPair.java:341)
at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.write(StreamPair.java:164)
at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.write(StreamPair.java:128)
at com.cisco.nm.lib.cmdsvc.ssh2.Ssh2Engine.write(Ssh2Engine.java:119)
at com.cisco.nm.lib.cmdsvc.ssh2.Ssh2Engine.disconnect(Ssh2Engine.java:375)
at com.cisco.nm.lib.cmdsvc.SSH2Session.disconnect(SSH2Session.java:180)
at com.cisco.nm.lib.cmdsvc.SSH2Session.close(SSH2Session.java:169)
at com.cisco.nm.lib.cmdsvc.OpConnect.revert(OpConnect.java:74)
at com.cisco.nm.lib.cmdsvc.SessionContext.revert(SessionContext.java:587)
at com.cisco.nm.lib.cmdsvc.SessionContext.invoke(SessionContext.java:216)
at com.cisco.nm.lib.cmdsvc.Engine.process(Engine.java:57)
at com.cisco.nm.lib.cmdsvc.LocalProxy.process(LocalProxy.java:22)
at com.cisco.nm.lib.cmdsvc.CmdSvc.close(CmdSvc.java:591)
at com.cisco.nm.xms.xdi.pkgs.LibDcma.persistor.CliOperator.cleanupOperator(CliOperator.java:1219)
at com.cisco.nm.xms.xdi.pkgs.SharedDcmaPIX.transport.PIXCliOperator.cleanupOperator(PIXCliOperator.java:844)
at com.cisco.nm.xms.xdi.pkgs.SharedDcmaPIX.transport.PIXConfigOperator.cleanupOperator(PIXConfigOperator.java:252)
at com.cisco.nm.xms.xdi.pkgs.LibDcma.persistor.OperatorCacheManager.clearCache(OperatorCacheManager.java:95)
at com.cisco.nm.xms.xdi.pkgs.SharedDcmaPIX.transport.PIXConfigOperator.operationDone(PIXConfigOperator.java:259)
at com.cisco.nm.rmeng.dcma.configmanager.ConfigManager.updateArchiveForDevice(ConfigManager.java:840)
at com.cisco.nm.rmeng.dcma.configmanager.ConfigManager.performCollection(ConfigManager.java:1646)
at com.cisco.nm.rmeng.dcma.configmanager.CfgUpdateThread.run(CfgUpdateThread.java:27)
07-16-2009 06:19 AM
All of the known lock-up bugs have been fixed in RME 4.3. In order to troubleshoot this, you will need to get a full Java thread dump from the ConfigMgmtServer process. If this is on Windows, the procedure can be somewhat involved, and you should contact TAC to have them walk you through it.
07-16-2009 06:46 AM
Ok Thanks. It's Solaris, but I will open a TAC case. Will post back anything informative.
07-16-2009 06:58 AM
Solaris is much easier. You can send a SIGQUIT to the ConfigMgmtServer PID. The thread dump will be written to daemons.log.
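For anyone who wants to script this step, here is a minimal sketch of sending SIGQUIT from Python. It assumes you have already looked up the ConfigMgmtServer PID yourself; the function name `request_thread_dump` is made up for illustration and is not part of LMS.

```python
import os
import signal


def request_thread_dump(pid: int) -> None:
    """Send SIGQUIT to the given process.

    A JVM that receives SIGQUIT prints a full thread dump and keeps
    running. Per the post above, for ConfigMgmtServer the dump ends up
    in daemons.log.
    """
    # os.kill only delivers the signal; it does not terminate the JVM.
    os.kill(pid, signal.SIGQUIT)
```

Calling `request_thread_dump(pid)` with the real PID has the same effect as running `kill -QUIT` against that PID from a shell. It needs to run as a user allowed to signal the process.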
07-16-2009 11:30 PM
07-17-2009 09:12 AM
Looks like you're hitting a Solaris bug. To work around this, edit /opt/CSCOpx/lib/jre/lib/security/java.security, and change the line:
security.provider.1=sun.security.pkcs11.SunPKCS11 ${java.home}/lib/security/sunpkcs11-solaris.cfg
to:
security.provider.1=sun.security.provider.Sun
Then restart dmgtd.
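If you have several servers to change, the edit above can be sketched in Python as follows. This is only an illustration under assumptions: the helper names `switch_first_provider` and `patch_file` and the `.bak` backup suffix are my own choices, and the script does not restart dmgtd for you.

```python
import shutil

# The line to look for and its replacement, taken from the post above.
OLD_PREFIX = "security.provider.1=sun.security.pkcs11.SunPKCS11"
NEW_LINE = "security.provider.1=sun.security.provider.Sun"


def switch_first_provider(text: str) -> str:
    """Return java.security content with the PKCS11 provider line replaced."""
    out = []
    for line in text.splitlines(keepends=True):
        # Commented-out lines start with '#', so they never match the prefix.
        if line.strip().startswith(OLD_PREFIX):
            ending = "\n" if line.endswith("\n") else ""
            out.append(NEW_LINE + ending)
        else:
            out.append(line)
    return "".join(out)


def patch_file(path: str) -> None:
    """Back up the file to path + '.bak', then rewrite it in place."""
    shutil.copy2(path, path + ".bak")
    with open(path, "r", encoding="latin-1") as fh:
        original = fh.read()
    with open(path, "w", encoding="latin-1") as fh:
        fh.write(switch_first_provider(original))
```

You would call `patch_file("/opt/CSCOpx/lib/jre/lib/security/java.security")` as a user with write access to that file. Every other line in the file is left untouched, and the backup copy lets you revert if needed.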
07-17-2009 10:28 AM
Thanks, I've made the change. Should know by Monday.
07-19-2009 11:53 PM
All looks good. Do you by any chance have a Solaris patch or bug ID for this?
Either way, I will mark it resolved.
Thanks.
07-20-2009 08:14 AM
There are a lot of bug IDs associated with this (e.g. 6533630). If you apply the latest Solaris recommended patch cluster, you should be okay. I'm running it on my servers, and I have not seen this hang.