02-10-2006 08:57 AM - edited 03-10-2019 02:28 PM
I'm experiencing a problem where my Cisco ACS 112-K9 appliance seems to hang, and the only way to access it again is to reboot it. (I can't get to the web GUI or to a telnet prompt.)
It's configured as the failover device to another appliance so I'm not too stressed at the moment.
On the Appliance Upgrade page the setup is as follows:
Cisco Secure ACS 3.3.1.16
Acs EAP-TLS PSIRT fix (Patch: 3.3.1.16 Thu 09/09/2004 9:55:30.72)
Appliance Management Software 3.3.1.16
Appliance Base Image 3.3.1.6
CSA build 4.0.1.543.2 (Patch: 4_0_1_543)
From the System Configuration page > Diag Log > CSAlog file I get the following. (I'm not sure this is the right log file to be looking at. If there's another log file, please let me know and I'll post it.)
[2006-02-09 08:40:02.812] [PID=520] [Csamanager]: Agent initialization complete.
[2006-02-09 08:40:09.687] [PID=520] [Csamanager]: Warning: It took 4 seconds to process the last batch (1) of events. Last event: type=EVTM EvSrcComp=13 EvDst=1 EvDstComp=7 EvCode=APCR_ALLOW EvPInt=601 EvPString:30:C:\WINNT\system32\services.exe EvPString2:55:D:\Program Files\CiscoSecure ACS v3.3\CSAuth\CSAuth.exe Evtime=45.5 (seconds since boot) Evtype=FILE EvFileOp=OPEN EvFilePrivateOp=IRP_MJ_CREATE EvFileOpFlags=EXECUTE|NOISY|NEW_FILEID EvFileCacheOp=INSERT EvFileAccess=EXECUTE EvFileAccessToken=SYSTEM EvFileId=-2058248312 EvFilePath:30:C:\WINNT\system32\NETAPI32.dll EvFileDrive:1:C EvFileDriveType=FIXED EvFileName:12:NETAPI32.dll EvProcessId=620 EvCredentials=¨=
[2006-02-09 22:59:59.703] [PID=500] [CsaCtrl]: .
[2006-02-09 22:59:59.703] [PID=500] [CsaCtrl]: Service CSAgent starting...
[2006-02-09 22:59:59.843] [PID=500] [CsaCtrl]: Started process leventmgr.exe pid=520
[2006-02-09 23:00:00.312] [PID=520] [Csamanager]: Csamanager starting ...
[2006-02-09 23:00:00.453] [PID=520] [Csamanager]: Agent version=V4.0-1 build 543, os='Windows 2000', os version=5.0.4.2195
The actual failure occurred at 22:57; that's when the device needed a reboot.
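One way to spot where a log goes quiet before a hang is to scan it for large jumps between consecutive timestamps. The sketch below is only an illustration, not an ACS or CSA tool: it assumes each entry starts with a bracketed timestamp in the same format as the CSAlog excerpt above, and the function name find_gaps and the one-hour threshold are made up for this example.

```python
import re
from datetime import datetime, timedelta

# Matches the leading "[YYYY-MM-DD HH:MM:SS.mmm]" stamp seen in the CSAlog excerpt.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]")


def find_gaps(lines, threshold=timedelta(hours=1)):
    """Return (previous, next) timestamp pairs that are more than `threshold` apart.

    Lines without a leading timestamp (for example wrapped continuation
    lines) are skipped. `find_gaps` and `threshold` are hypothetical names.
    """
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps
```

Running it over the lines of a saved log file (for example `find_gaps(open(path, errors="replace"))`, where path is wherever the log was exported to) gives a list of quiet periods to compare against the time of the hang.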
I guess I need a starting point for troubleshooting this. Any suggestions would be great.
The fix is frustrating, especially if I have to reboot the device every time.
02-13-2006 06:53 AM
Hi
Most likely the problem is in CSAuth - usually is!
Try looking in the CSMon csv (system monitoring) to see if test authentications were still working. If there are what appear to be failures, look in the csauth log file (auth.log) for the same time period.
Could be an exception inside a critical section causing internal deadlock. So try searching all the logs for "exception", "too busy", "worker thread" and see what you get.
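If the logs have been downloaded from the appliance to a workstation, the keyword search could be scripted roughly like this. This is just a sketch under assumptions: the function name search_logs, the *.log / *.csv file patterns, and the idea that all logs sit under one local folder are mine, not anything ACS provides.

```python
from pathlib import Path

# Phrases suggested above; matching is case-insensitive.
KEYWORDS = ("exception", "too busy", "worker thread")


def search_logs(log_dir, keywords=KEYWORDS, patterns=("*.log", "*.csv")):
    """Return (file name, line number, line text) for every line containing a keyword.

    `log_dir` is a local folder holding exported log files; the default
    file patterns are an assumption about how those files are named.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for pattern in patterns:
        for path in sorted(Path(log_dir).rglob(pattern)):
            # errors="replace" keeps the scan going over odd bytes in the logs.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    lowered = line.lower()
                    if any(k in lowered for k in wanted):
                        hits.append((path.name, lineno, line.rstrip()))
    return hits
```

Each hit can then be lined up by time with the CSMon entries to see what the service was doing just before it stopped responding.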
Good luck!
Darran
02-13-2006 09:40 AM
Hi Darran,
Thanks for the reply.
I searched for "exception", "too busy", and "worker thread" and couldn't find anything.
I checked the following logs:
CSAuthlog.csv *** Nothing indicating the above ***
CSMonlog.csv *** The logs jump from 02/07/2006 to 02/10/2006 ***
Keep in mind that this device is redundant with another appliance, so no authentication failures would have been logged on it, as this particular device is the "failover" device.
Any other suggestions? I can always open a TAC ticket, but I wanted to explore this method of help first.
Thanks again!
Reuben
02-13-2006 11:43 AM
hmmm, sounds familiar. I remember an escalated case once where there was "missing time" in the logs. Sounds like an X Files case...
I don't have access to the ACS source code anymore, so I can't help much more. You need to open a TAC case and ask for escalation to the UK ACS team (aka Doug).
Darran