
ACS loses connection with AD occasionally after upgrade from 5.2 to 5.3.0.40

vciric
Level 1

ACS was integrated with Active Directory before the upgrade to 5.3. Since the ACS 5.3 upgrade, users are occasionally unable to log in to AAA devices. The error message is:

{AuthenticationResult=Error; Type=Authentication; Authen-Reply-Status=Error; }

24429 Could not establish connection with Active Directory

When this issue occurs, the ACS connection to AD still tests fine (checked with Users and Identity Stores > External Identity Stores > Active Directory “Test Connection”).


bperez_ti
Level 1

Something seems to have changed with how AD integration works from 5.2 to 5.3

I had trouble with my lab server after going from 5.2-6 to 5.3.  Nothing would work and it kept saying LDAP UDP Status error.

I reinstalled ACS at 5.2, deleted the computer object and rejoined without issue.  Stepped from 5.2 to 5.2-4 and then -6 without any problem.  Once I upgrade to 5.3, the domain status is disconnected and I can't seem to get it to connect.

I deleted the computer object and cleared the config in ACS and tried again without success.

I'll try to start with 5.3 from scratch and see what happens.

Nope.  Fresh install of 5.3 and I get this:

Connection test to 'domain.local' failed.

Further information on status:

   - LDAP UDP status error.

This is the same service account that I used for 5.2.

I see the ACS server connecting to the domain controller via UDP 389 as well as TCP 389.  I don't see anything blatant in the Wireshark dump, nor is there anything telling in the DC's security log.

Could you try this:

Configure ACS 5.3 to use only a specific name server that has the required Active Directory configuration. Use the ACS 5.3 CLI to do this.

The ACS administrator should:

1. Log into the ACS configuration mode using the command acs-config.

2. Use ad-agent-configuration dns.servers to set the IP address of the correct name server to use.

For example, if the name server to use is 10.56.60.150, enter the following commands in the ACS 5.3 CLI:

cd-acs5-13-50/admin# acs-config

Escape character is CNTL/D.

Username: acsadmin

Password:

cd-acs5-13-50/acsadmin(config-acs)# ad-agent-configuration dns.servers 10.56.60.150

Performing AD agent internal setting modification is only allowed with ACS support approval. continue (y/n)?

cd-acs5-13-50/acsadmin(config-acs)# show ad-agent-configuration dns-servers

dns-servers: 10.56.60.150

cd-acs5-13-50/acsadmin(config-acs)# exit

Perform this operation on each server in the deployment, while the ACS machine is joined to the required domain.

We have found a similar issue that occurred when DNS had two entries for the AD server, only one of which was responsive. That setup seems to work for most connections, but not for UDP.

We are still trying to understand why this no longer works with the same AD configuration that worked in the previous release. However, when we removed the second entry for the AD server from the DNS server, the issue was resolved.

VERY interesting.  I have a network setup that could easily cause the ACS server to not be able to reach the two DNS servers via one of their IPs.

I basically have a storage network (/24 w/o gateway) using a second NIC on the relevant VMs, including the two AD controllers.  Since the ACS server isn't on that network, it won't reach the AD controllers if the DNS query response is the AD controller's storage network IP.

This is still weird because the ACS server does get some response back from the AD controller.  I can test this by putting in a bad password and getting a response that the password is wrong.

I wonder if the method of obtaining AD controller information has changed, i.e., multiple DNS queries where there used to be fewer or just one, etc.

I'm going to try temporarily deleting the DNS entries belonging to the AD controllers for the 'storage network' and see what happens.
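The multi-homed situation described above can be checked offline once you know which A records DNS returns for a domain controller. The following is only a minimal sketch: the addresses and the storage subnet are hypothetical placeholders, not values from this thread, and the function name is made up for illustration.

```python
import ipaddress


def split_by_reachability(dc_addresses, unreachable_networks):
    """Split DNS-returned DC addresses into reachable and unreachable lists.

    unreachable_networks lists subnets the ACS server has no route to,
    for example a storage network without a gateway.
    """
    nets = [ipaddress.ip_network(n) for n in unreachable_networks]
    reachable, unreachable = [], []
    for addr in dc_addresses:
        ip = ipaddress.ip_address(addr)
        if any(ip in net for net in nets):
            unreachable.append(addr)
        else:
            reachable.append(addr)
    return reachable, unreachable


# Hypothetical example data: one production IP and one storage-network IP.
dc_records = ["10.1.1.10", "192.168.50.10"]
storage_networks = ["192.168.50.0/24"]

ok, bad = split_by_reachability(dc_records, storage_networks)
print("reachable:", ok)
print("candidates to remove from DNS:", bad)
```

Any address in the second list is one that ACS could pick up from DNS but never reach, which matches the symptom of a DC that has two entries with only one responsive.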

My previous response related to a case where a single DNS server had two entries for the AD server. If you have two DNS servers, you can direct ACS to a specific DNS server with the commands that vciric posted.

The following is copied from the release notes for ACS 5.3. In ACS 5.3, the DNS server that responds the fastest is selected by default. The commands below can override this selection in cases where the selected DNS server is not the correct one.

CSCts31991 AD join may fail when there are multiple DNS entries in ACS

ACS fails to join to AD

This problem occurs when there are multiple IP name-server entries configured in an ACS configuration CLI, but not all of the IP name-server entries are configured with Active Directory DNS Records.

It occurs where the AD DNS responds slower than the corporate DNS or if there is a DNS that does not resolve in AD DC/GC SRVs

Workaround 1:

Ensure that all IP name-server entries have the required configuration for Active Directory. This way, the fastest responding name server will have the required Active Directory configuration.

Workaround 2:

Configure ACS 5.3 to use only a specific name server that has the required Active Directory configuration. Use the ACS 5.3 CLI to do this.

The ACS administrator should:

1. Log into the ACS configuration mode using the command acs-config.

2. Use ad-agent-configuration dns.servers to set the IP address of the correct name server to use.

For example, if the name server to use is 10.56.60.150, enter the following commands in the ACS 5.3 CLI:

cd-acs5-13-50/admin# acs-config

Escape character is CNTL/D.

Username: acsadmin

Password:

cd-acs5-13-50/acsadmin(config-acs)# ad-agent-configuration dns.servers 10.56.60.150

Performing AD agent internal setting modification is only allowed with ACS support approval. continue (y/n)?

cd-acs5-13-50/acsadmin(config-acs)# show ad-agent-configuration dns-servers

dns-servers: 10.56.60.150

cd-acs5-13-50/acsadmin(config-acs)# exit

Perform this operation on each server in the deployment, while the ACS machine is joined to the required domain.
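For Workaround 1, one way to confirm that every configured name server carries the Active Directory records is to query each one for the standard AD SRV names. The sketch below only builds the nslookup command lines to run from a workstation; it does not run them. The function name is hypothetical, and it assumes a single-domain forest unless a forest name is passed.

```python
def srv_check_commands(domain, name_servers, forest=None):
    """Build nslookup commands that query each name server for AD SRV records.

    forest defaults to the domain name (assumption: single-domain forest).
    """
    forest = forest or domain
    srv_names = [
        f"_ldap._tcp.dc._msdcs.{domain}",      # domain controller locator
        f"_kerberos._tcp.dc._msdcs.{domain}",  # KDC locator
        f"_ldap._tcp.gc._msdcs.{forest}",      # global catalog locator
    ]
    return [
        f"nslookup -type=SRV {name} {server}"
        for server in name_servers
        for name in srv_names
    ]


# 'domain.local' and 10.56.60.150 are the example values used in this thread.
for cmd in srv_check_commands("domain.local", ["10.56.60.150"]):
    print(cmd)
```

A name server that answers these queries with no SRV records is a candidate for the problem described in CSCts31991.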

Still no dice.  I went into the regular config and removed one of the name servers.  Restarted ACS, then went into acs-config and ran:

show ad-agent-configuration dns.servers

The output showed one server: domain controller #1.

I wasn't able to run ad-agent-configuration dns.servers successfully.  I get an error:

Unable to restart AD agent.  Define AD configuration or check current AD configuration settings.

I've also disabled the windows firewall on both DCs and am using the built-in administrator account.

I can't save the AD configuration in the GUI.  When I try that, I get a response that the credentials are invalid, although they're most certainly correct.

It's worth noting that all test and save attempts have a 30-60 second delay before the little popup tells me the error.

Hi everybody

Funny, I have two ACS: On the first it works correctly:

SAMUCD0002/acsadmin(config-acs)#

SAMUCD0002/acsadmin(config-acs)# ad-agent-configuration dns-servers XXXX

Performing AD agent internal setting modification is only allowed with ACS support approval. continue (y/n)?
SAMUCD0002/acsadmin(config-acs)# show ad-agent-configuration dns-servers
dns-servers: XXXX

SAMUCD0002/acsadmin(config-acs)#

On the second, it doesn't work:

SAMUCD0003/acsadmin(config-acs)# ad-agent-configuration dns-servers XXXX

Performing AD agent internal setting modification is only allowed with ACS support approval. continue (y/n)?

SAMUCD0003/acsadmin(config-acs)#

SAMUCD0003/acsadmin(config-acs)# sh ad-agent-configuration dns-servers

Performing AD agent internal setting modification is only allowed with ACS support approval. continue (y/n)?

and no output comes...

I have a primary and two secondary ACS servers in the deployment. The AD connection seems to be lost regularly. The only way I can get it working again is to delete all AD-related configuration, clear all AD settings, and then reconnect.

This is of course not acceptable for a live environment :-(

matthew.stohr
Level 1

Is there any update on a fix for this significant issue?  I have been seeing it frequently since upgrading to 5.3.0.40, and the workaround does not appear to work.  Restarting the ACS application from the CLI fixes the issue temporarily, but as one of the earlier posters stated, this is not acceptable for a production environment.

Hi,

in our case, it is definitely caused by http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCte92062

We have a domain with 30 domain controllers, where at least one is usually down for patching et cetera. So this is pretty severe for us.

I had the same problem, I opened a Cisco TAC case and my issue was resolved.

Sent: Tuesday, 14 August 2012 9:58 AM
Subject: RE: 622739355 HelpDesk#SVR328332-2 : Troubleshoot Cisco ACS 1121 v5.3 With Windows Active Directory

Hi Ramraj,

Thanks for the link to the article, but from what I’ve seen in the logs I’m not sure that we’ve got the same root cause to the issue.

From the ACSADAgent.log files I can see log messages like:

Aug 11 11:10:56 CSSC-TPM-DC-ACS-1 adclient[5524]: DEBUG network.state NST: SniffList: postfailsort=mykulad11p.cssc.dksh.net

Aug 11 11:10:56 CSSC-TPM-DC-ACS-1 adclient[5524]: DEBUG base.kerberos.adhelpers Encryption (id 1) is not supported by KDC. Try next in the list

Aug 11 11:10:56 CSSC-TPM-DC-ACS-1 adclient[5524]: DEBUG base.osutil Module=Kerberos : KDC refused skey: KDC has no support for encryption type (reference base/adhelpers.cpp:216 rc: -1765328370)

Aug 11 11:10:56 CSSC-TPM-DC-ACS-1 adclient[5524]: DEBUG base.adagent Unable to refresh computer credentials: KDC refused skey: KDC has no support for encryption type

This lines up with the error message that we see in the TACACS+ Authentication logs:

24493 ACS has problems communicating with Active Directory using its machine credentials.
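The rc value in the adclient log line can be cross-checked against the Kerberos protocol error numbers. This is a rough sketch that assumes adclient reports MIT krb5 style error codes, which are offset from a fixed table base; that assumption is not confirmed anywhere in this thread. The helper names are made up for illustration.

```python
import re

# Base of the MIT krb5 error table (assumption: adclient uses these codes).
KRB5_ERROR_TABLE_BASE = -1765328384


def extract_rc(log_line):
    """Return the integer after 'rc:' in a log line, or None if absent."""
    match = re.search(r"rc:\s*(-?\d+)", log_line)
    return int(match.group(1)) if match else None


def krb5_protocol_error_number(rc):
    """Convert a krb5 library return code to the protocol error number."""
    return rc - KRB5_ERROR_TABLE_BASE


line = (
    "DEBUG base.osutil Module=Kerberos : KDC refused skey: KDC has no support "
    "for encryption type (reference base/adhelpers.cpp:216 rc: -1765328370)"
)
rc = extract_rc(line)
print(rc, krb5_protocol_error_number(rc))  # prints: -1765328370 14
```

Error number 14 is KDC_ERR_ETYPE_NOSUPP in RFC 4120, which agrees with the "KDC has no support for encryption type" text in the log.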

I have come across a NETBIOS limitation (it’s not an ACS bug, but a bug has been filed for tracking and documentation purposes) that prevents two ACSs from being connected to Active Directory at the same time if the first 15 characters of their hostnames are the same. The bug ID is CSCtj62342 and its externally visible details are available here: http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtj62342

The hostname of the primary ACS is : MYMY-TPM-DC-ACS-1

The hostname of the secondary ACS is: MYMY-TPM-DC-ACS-2

From the hostnames, we can see that the first 16 characters are the same. This means that once the primary is connected to AD, after some time passes (depending on when the secondary goes and talks to AD), the secondary will lose its connection to AD, and any authentications hitting the secondary will fail with the same error: 24493 ACS has problems communicating with Active Directory using its machine credentials.

To resolve this issue, the hostnames of the ACSs will need to be changed so that the first 15 characters of their respective hostnames are not the same. Please keep in mind that this is a NETBIOS limitation and not a software bug.

Warm regards,
Ramraj Sivagnanam Sivajanam
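The 15-character NetBIOS limitation described in the email above is easy to check before joining several ACS nodes to AD. Here is a minimal sketch, with hypothetical function names, that truncates each hostname to 15 characters and reports pairs that collide:

```python
from itertools import combinations

NETBIOS_NAME_LENGTH = 15  # usable characters in a NetBIOS computer name


def netbios_name(hostname):
    """Return the hostname as NetBIOS would see it: first 15 chars, uppercased."""
    return hostname[:NETBIOS_NAME_LENGTH].upper()


def find_collisions(hostnames):
    """Return pairs of hostnames that map to the same NetBIOS name."""
    return [
        (a, b)
        for a, b in combinations(hostnames, 2)
        if netbios_name(a) == netbios_name(b)
    ]


# Hostnames quoted in the email above.
hosts = ["MYMY-TPM-DC-ACS-1", "MYMY-TPM-DC-ACS-2"]
print(find_collisions(hosts))
# prints: [('MYMY-TPM-DC-ACS-1', 'MYMY-TPM-DC-ACS-2')]
```

Both names truncate to MYMY-TPM-DC-ACS, so renaming the nodes so that they differ within the first 15 characters (for example, by moving the node number to the front) avoids the collision.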