ACS 5.3 and Windows AD account lockout

Andy Johnson · ‎03-21-2012

Currently on 5.3.0.40.2, when an invalid password is attempted via TACACS or RADIUS against the AD identity store, it locks the account out on the first failed attempt. The AD policy is lockout after three attempts. Is there a way to fix this so the account is not locked out after only one failed attempt? I see options for local password policies in ACS but nothing for the identity store. For what it's worth, this also happened with our ACS 4.X deployment before we moved to ACS 5.3.

Just wanted to see if this is the expected behavior or if I should open a TAC case to see what is causing this.

Thanks.

Travis Hysuick · ‎03-21-2012

I've encountered this exact issue twice since the patch 2 upgrade. If you have a case open with TAC, I would appreciate it if you could post back and let me know what the resolution involves.

Jerry Cao · ‎06-04-2012

Any updates on this? I'm having the same issue here. I checked the AD log and there are 3 failed attempts from the ACS at the exact same time.

Andy Johnson · ‎06-05-2012

We were hitting a bug: CSCty60915. TAC first had me run:

1) SSH to ACS

2) acs stop adclient

3) acs-config (enter your GUI credentials when prompted)

4) ad-agent-configuration adclient.force.salt.lookup true

5) exit

6) acs start adclient

This seemed to fix the issue, but the bug is fixed in patch 4 on ACS version 5.3.

Jerry Cao · ‎06-05-2012

Thanks Andy, I will try the patch.

scrye · ‎09-20-2012

Hi;

Same horrible problem. It struck earlier in the week and we are dead in the water. Patched up to:

Version : 5.3.0.40.6

Internal Build ID : B.839

Patches :

5-3-0-40-6

Had some clock drift and UTC vs Local problems, already fixed those. The "salt" thing did not work:

ecb-acs1/bubba(config-acs)# ad-agent-configuration adclient.force.salt.lookup true

Performing AD agent internal setting modification is only allowed with ACS support approval. continue (y/n)?

Unable to restart AD agent. Define AD configuration or check current AD configuration settings

Below is the tail of the log, some other info. Really need help, thousands of users locked out.

Steve

ecb-acs1/admin# sh application status acs

ACS role: PRIMARY

Process 'database' running

Process 'management' running

Process 'runtime' running

Process 'adclient' Execution failed

Process 'view-database' running

Process 'view-jobmanager' running

Process 'view-alertmanager' running

Process 'view-collector' running

Process 'view-logprocessor' running

ecb-acs1/admin# sh logging system tail

ADEOS Platform log:

-----------------

Sep 20 22:30:23 ecb-acs1 ACS adclient INFO: adclient monitoring already enabled

Sep 20 22:30:45 ecb-acs1 debugd[2754]: [13824]: application:operation cars_install.c[785] [admin]: Application opr initiated for app name - acs

Sep 20 22:30:45 ecb-acs1 debugd[2754]: [13824]: application:operation cars_install.c[789] [admin]: Verifying app (acs) is installed ...

Sep 20 22:30:45 ecb-acs1 debugd[2754]: [13824]: application:operation cars_install.c[797] [admin]: Reading the manifest init param

Sep 20 22:30:45 ecb-acs1 debugd[2754]: [13824]: application:operation cars_install.c[806] [admin]: Executing init tag

Sep 20 22:30:46 ecb-acs1 admin: [MGMT-active-test] starting

Sep 20 22:30:48 ecb-acs1 admin: [MGMT-active-test] GUI is active

Sep 20 22:30:48 ecb-acs1 admin: [MGMT-active-test] finished

Sep 20 22:30:48 ecb-acs1 debugd[2754]: [13824]: application:operation cars_install.c[814] [admin]: Operation of application complete - out = ACS role: PRIMARY Process 'database' running Process 'management' running Process 'runtime' running Process 'adclient' Execution failed Process 'view-database' running Process 'view-jobmanager' running Process 'view-alertmanager' running Process 'view-collector' running Process 'view-logprocessor' running

Sep 20 22:30:54 ecb-acs1 debugd[2754]: [13910]: logging: logutils_cli.c[1202] [admin]: Got cfg: Server localhost location /var/log/ade/ADE.log loglevel 6 islocal 1

Sep 20 22:31:10 ecb-acs1 monit[4836]: 'adclient' process is not running

Sep 20 22:31:10 ecb-acs1 monit[4836]: 'adclient' trying to restart

Sep 20 22:31:10 ecb-acs1 monit[4836]: 'adclient' start: /opt/CSCOacs/bin/exec_wrapper.sh

Sep 20 22:31:11 ecb-acs1 ACS adclient INFO: Run, Initializing DB query...

Sep 20 22:31:11 ecb-acs1 ACS adclient ERROR: log4j:WARN No appenders could be found for logger (org.hibernate.cfg.Environment).

Sep 20 22:31:11 ecb-acs1 ACS adclient ERROR: log4j:WARN Please initialize the log4j system properly.

Sep 20 22:31:11 ecb-acs1 monit[4836]: 'adclient' failed to start

Tarik Admani · ‎09-20-2012

Hi,

Please delete the ACS computer account in Active Directory and then reboot the ACS appliance to force it to rejoin the domain. Also make sure the account credentials that are used to connect to the domain are accurate and saved before the reboot.

Thanks,

Tarik Admani
*Please rate helpful posts*

Tarik Admani · ‎09-20-2012

Here is a document that I created which will help troubleshoot AD-related issues. You can provide the log output to TAC, or you can post the pertinent information here:

https://supportforums.cisco.com/docs/DOC-26787

thanks,

Tarik Admani

scrye · ‎09-20-2012

Hi Tarik;

If this most recent reboot does not fix it, I'll try your DOC-26787. Thanks!

Steve

scrye · ‎09-20-2012

Hi Tarik;

Thanks, but that is something we already tried. I guess we can try again ... already deleted it a half-dozen times but one more try won't hurt. Have not rebooted the ACS since the last delete.

We can "test" the credentials and they pass the test, but if we try to hit the save button it says they are invalid. I suspect that happens because the adclient process will not run. In any event, we have tried with two different sets of credentials, and we know that both work ... one of them is an account I use every day.

Stand by ...

Steve

Tarik Admani · ‎09-20-2012

Steve,

Can you check to see if the DNS record is still present? Try "nslookup ecb-acs1.domain.com" and "nslookup ipaddofacs" and see if both resolve correctly.
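
The forward and reverse lookups can also be scripted so they are easy to repeat after each rejoin attempt. What follows is only a minimal Python sketch: the function name `check_dns` and its injectable `forward`/`reverse` arguments are hypothetical names chosen for illustration, and the defaults are the standard library's `socket.gethostbyname` and `socket.gethostbyaddr`, which use whatever resolver the machine running the script is configured with (not necessarily the one the ACS uses).

```python
import socket


def check_dns(hostname, expected_ip,
              forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Return a list of problems with the forward (A) and reverse (PTR) records.

    An empty list means both lookups succeeded and agree with each other.
    """
    problems = []

    # Forward lookup: name -> address.
    try:
        resolved_ip = forward(hostname)
        if resolved_ip != expected_ip:
            problems.append(
                f"forward: {hostname} -> {resolved_ip}, expected {expected_ip}")
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        problems.append(f"forward: {hostname} did not resolve ({exc})")

    # Reverse lookup: address -> name (first element of the returned tuple).
    try:
        ptr_name = reverse(expected_ip)[0]
        if ptr_name.rstrip(".").lower() != hostname.rstrip(".").lower():
            problems.append(
                f"reverse: {expected_ip} -> {ptr_name}, expected {hostname}")
    except OSError as exc:
        problems.append(f"reverse: {expected_ip} did not resolve ({exc})")

    return problems
```

Call it with the ACS hostname and the address you expect it to have; anything in the returned list is worth fixing before the next join attempt.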

Also, in ACS 5.3 the force SALT lookup option is already enabled, so we need to see what errors show up on the DC side. Please also provide the entries from the ACSADAgent.log file from around the time the ACS services are started. And when you delete the computer account, do you search for it afterwards to make sure it is deleted across the entire domain?

Thanks,

Tarik Admani

scrye · ‎09-21-2012

Hi;

Well, we got it working. Not sure of the exact fix, but allow me to ramble, perhaps it will help someone else.

We think that a combination of factors caused the problem. First, we had clock drift, and that resulted in clock skew messages in the logs like these:

Sep 20 18:06:03 ecb-acs1 adclient[8322]: INFO base.adagent start: Problem connecting to domain controller (KDC refused skey: Clock skew too great), will try again later.

and

ecb-acs1 adclient[1163]: WARN <27 capigetobjectbyname=""> base.bind.cache LDAP fetch CN=bubba,OU=staff,OU=edcenter,OU=edcenterarea,OU=episd,DC=episd,DC=org threw unexpected exception: SASL bind to ldap/ecb-dc-domain3.episd.org@EPISD.ORG - GSSAPI Mechanism with Kerberos error ": Clock skew too great"

Somehow the ACS lost the NTP config, which is very disturbing, because I know that one of the first things I did was set up NTP. So I re-did the NTP config and confirmed the time was accurate. Still failed. Then, because I was annoyed by the log entries coming out in UTC, I used "clock timezone" to set it to local. That made the logs come out in local time, but might have caused other problems (I saw another forum entry for that), so I set it back to UTC.

This raises the question: how do I leave the timezone at UTC but fix the timestamps in the logs? This is easy on Cisco switches.

Various reboots of the ACS after deleting the object in AD did not fix the problem. During these reboots I continued to use the original userid and password to authenticate. At all times, the "test connection" button showed that the credentials were OK.

Because we had recently added our first Win2008 domain controller to our world (all the other DCs are Win2k3), we started worrying about this:

http://support.microsoft.com/kb/978055/en-us

But, after some checking, it seems as if we already had the fix applied.

Next, we created a dedicated user in AD for the ACS to use when authenticating. Deleted the ACS object, restarted the ACS, applied those new credentials. Still broken.

Our AD admin looked in various logs and found some things, here is his summary:

----------- from Danny --------

Checked the domain controller log under system. Found the following:

While processing an AS request for target service krbtgt, the account ecb-acs1$ did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 1). The requested etypes : 17. The accounts available etypes : 23 -133 -128 3 1. Changing or resetting the password of ecb-acs1$ will generate a proper key.

and

While processing an AS request for target service krbtgt, the account stcrye did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 2). The requested etypes : 18. The accounts available etypes : 23 -133 -128 3 1. Changing or resetting the password of stcrye will generate a proper key.

This may be related to either clock skew between the ACS and the domain or introducing Server 2008 domain controllers into an existing Server 2003 domain.

-----------
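
The two KDC events above can be read mechanically: each one lists the encryption types (etypes) the client asked for and the etypes for which the account actually holds keys, and the request fails when there is no overlap. The Python sketch below decodes such a message. The function names are hypothetical, the regular expressions assume the exact wording shown in the events above, and the name table covers only the IANA-registered numbers that appear in this thread; negative values are Microsoft-private etypes and are reported as such rather than named.

```python
import re

# IANA-registered Kerberos encryption type numbers seen in this thread.
ETYPE_NAMES = {
    1: "des-cbc-crc",
    3: "des-cbc-md5",
    17: "aes128-cts-hmac-sha1-96",
    18: "aes256-cts-hmac-sha1-96",
    23: "rc4-hmac",
}


def _to_ints(text):
    """Turn a space-separated run such as '23 -133 -128 3 1' into integers."""
    return [int(token) for token in text.split()]


def parse_etypes(message):
    """Extract (requested, available) etype lists from a 'no suitable key' event."""
    requested = re.search(r"requested etypes\s*:\s*([-\d ]+)", message)
    available = re.search(r"available etypes\s*:\s*([-\d ]+)", message)
    if not (requested and available):
        raise ValueError("message does not contain both etype lists")
    return _to_ints(requested.group(1)), _to_ints(available.group(1))


def describe(etype):
    """Name an etype, flagging anything outside the small table above."""
    return ETYPE_NAMES.get(etype, f"unregistered/vendor etype {etype}")


def missing_etypes(message):
    """Names of etypes that were requested but that the account has no key for."""
    requested, available = parse_etypes(message)
    return [describe(e) for e in requested if e not in available]
```

Fed the first event, `missing_etypes` returns `['aes128-cts-hmac-sha1-96']`; the second event yields the AES-256 equivalent. In both cases the account only held RC4 and DES keys, which fits the event text saying that a password change or reset will generate the missing key.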

On a desperate hunch, after yet again deleting the ACS object in AD and reloading the ACS, I used the new dedicated ACS user account, but gave it a wrong password. Hit save, watched it fail. Then I put in the correct password, hit save, and it worked! Finally we have re-joined and are connected to the domain.

BUT ... I have now lost all confidence in ACS 5.3. We are in the middle of a major rollout of WiFi clients using 802.1x authentication, replacing our previous pre-shared WPA setup. We are talking > 20,000 WiFi clients. If ACS <--> AD is not rock-solid, I need to try something else. Should we consider using LDAPS instead?

Steve

Travis Hysuick · ‎09-21-2012

Hi Steve, please keep in mind that AD integration with any third-party service, including ACS, ISE, etc., absolutely requires accurate network time. If you're experiencing clock differences between the AAA appliance and the domain controller, that would largely account for why the authentication is failing. Kerberos is a ticket-based authentication mechanism whose tickets are time-stamped, and as such it relies heavily on time synchronization between the KDC and all client systems.
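
To make the time dependency concrete: a KDC rejects requests whose timestamps differ from its own clock by more than a configured tolerance, which is where the "Clock skew too great" errors come from. The sketch below assumes the common default tolerance of five minutes (your domain policy may set a different value) and uses a hypothetical helper name, `skew_ok`; the sample times are illustrative, not taken from the logs.

```python
from datetime import datetime, timedelta, timezone

# Five minutes is the usual default tolerance; check your own domain policy.
DEFAULT_MAX_SKEW = timedelta(minutes=5)


def skew_ok(client_time, kdc_time, max_skew=DEFAULT_MAX_SKEW):
    """Return (ok, skew), where skew is the absolute difference between the clocks.

    Both arguments must be timezone-aware, or both naive, datetimes.
    """
    skew = abs(client_time - kdc_time)
    return skew <= max_skew, skew


if __name__ == "__main__":
    kdc = datetime(2012, 9, 20, 18, 6, 3, tzinfo=timezone.utc)
    drifted = kdc + timedelta(minutes=7)   # illustrative drift only
    print(skew_ok(drifted, kdc))           # outside a five-minute tolerance
```

Comparing in UTC, as here, also shows why the display timezone by itself should not matter to Kerberos: only the absolute instant is compared.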

Where are your domain controllers pulling their time from (hopefully the same place that your ACS instance is pointing)? If you don't already have a dedicated NTP appliance(s), it would be very much worth your while to look into procuring a hardware-based NTP appliance (such as the Symmetricom SyncServer units) which can take an external clock reference from a GPS signal, 1PPS, Sysplex, etc.

We have been running ACS 5.3 for a while, with both RADIUS and TACACS-based policies, and 802.1x WLAN client and VPN client authentication; since patch 5 it has been absolutely rock-solid stable. The inconsistencies you note above could also possibly be due to the AD domain and/or forest functional level settings (personally I've never been a fan of mixed version domain controller deployments). Also, please make sure that the AD account you are using for integration has the "Read All Properties" right on all user and computer objects in your AD structure. (The permission for all computer accounts is required if you are going to permit Computer Authentication as part of your Dot1X service policy.)

Hope this helps.

scrye · ‎09-21-2012

Hi Travis;

My post was long, you might have missed where I mentioned that the ACS had lost the NTP config. All our devices and servers get time from a Tempus GPS-based NTP server:

Primary NTP : 10.254.8.123

synchronised to NTP server (10.254.8.123) at stratum 2

time correct to within 54 ms

polling server every 1024 s

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     LOCAL(0)        10 l   33   64  377    0.000    0.000   0.004
*10.254.8.123    .GPS.            1 u  968 1024  377   11.330    7.343   2.548
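
For anyone who wants to watch this output automatically rather than by eye, the peer rows can be parsed with a few lines of Python. This is a sketch under assumptions: it takes the ten-column layout shown above (the same layout `ntpq -p` prints, where delay, offset and jitter are in milliseconds), the function name `parse_peer_line` is hypothetical, and only the `*`, `+`, `-` and `#` tally characters are recognised.

```python
TALLY_CODES = "*+-#"  # '*' marks the peer currently selected for sync


def parse_peer_line(line):
    """Parse one peer row of ntpq-style output into a dict."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")

    remote = fields[0]
    tally = remote[0] if remote[0] in TALLY_CODES else ""
    return {
        "remote": remote[len(tally):],
        "selected": tally == "*",
        "refid": fields[1],
        "stratum": int(fields[2]),
        # 'when', 'poll' and 'reach' are left as strings; 'when' can be '-'.
        "when": fields[4],
        "poll": fields[5],
        "reach": fields[6],
        "delay_ms": float(fields[7]),
        "offset_ms": float(fields[8]),
        "jitter_ms": float(fields[9]),
    }
```

Applied to the GPS row above, it reports the peer as selected with an offset of 7.343 ms, which is far inside any Kerberos clock tolerance measured in minutes.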

I hope the ACS has an SNMP trap that will warn us in the future if it has problems. I hate being notified of a failure via help desk complaints.

I'm pretty sure the AD integration account has Read All Properties, but will double-check. Never had problems with it until the recent troubles. What was scary was even after time was perfectly in sync, it took a full day of crystal-dangling and goat-entrails inspection before it would re-connect to the domain.

Regarding mixed OS in DCs, yeah, it would be nice to have the kind of budget/staff that MS expects the world to have, but we have to live with what we got...

Do you have any thoughts on logging timestamps? I hate UTC timestamps on logs (even though I wear a watch with T2 set to Zulu), but I'm afraid to set the clock to the local timezone now ... I can't find the equivalent of "service timestamps log datetime localtime" in the ACS.

Thanks,

Steve

Travis Hysuick · ‎09-21-2012

Hi Steve,

You're right, I missed that section of your post. I've never experienced an appliance losing a section of its configuration.

Regarding the logging, timestamps are recorded in whatever timezone is configured with the clock timezone command in the CLI:

http://www.cisco.com/en/US/docs/net_mgmt/cisco_secure_access_control_system/5.3/command/reference/cli_app_a.html#wp1894584

'show timezones' will give you the exhaustive list of zones you can use (e.g. America/Chicago).
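
If the appliance is left on UTC, another option is to convert the timestamps when reading exported logs instead of changing the clock. The following is a rough Python sketch, not an ACS feature: the function name `utc_syslog_to_local` is hypothetical, it assumes the "Mon DD HH:MM:SS" prefix seen in the log excerpts earlier in this thread, it needs the year supplied because syslog stamps omit it, and the fixed UTC-5 offset in the example is an arbitrary illustration that ignores daylight saving.

```python
from datetime import datetime, timedelta, timezone


def utc_syslog_to_local(line, year, local_tz):
    """Rewrite the leading 'Mon DD HH:MM:SS' UTC stamp of a syslog line in local_tz."""
    stamp, rest = line[:15], line[15:]
    # Syslog stamps carry no year, so the caller has to supply one.
    utc_time = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    utc_time = utc_time.replace(tzinfo=timezone.utc)
    local_time = utc_time.astimezone(local_tz)
    return local_time.strftime("%b %d %H:%M:%S") + rest


if __name__ == "__main__":
    fixed_offset = timezone(timedelta(hours=-5))  # illustrative offset only
    sample = "Sep 20 22:31:10 ecb-acs1 monit[4836]: 'adclient' process is not running"
    print(utc_syslog_to_local(sample, 2012, fixed_offset))
```

On Python 3.9 or later, passing `zoneinfo.ZoneInfo("America/Chicago")` (or your own zone name) as `local_tz` gives daylight-saving-aware conversion, provided the system has a timezone database installed.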