"External DB Not Operational" or "Internal Error" - NT Auths

mmletzko
Level 1

Does anyone know what would cause an NT remote agent to start failing NT authentications for domains trusted by the agent server's domain, with no changes to ACS and no known changes to AD?

Here is the scenario:

ACS

Domain A - Remote Agent 1

Domain A - Remote Agent 2

Domain B

Domain C

Domain A has 2-way trusts with B and C.

ACS authenticates accounts from Domains A, B and C for months or years without issue. Then one day, without any known changes to ACS or AD, Remote Agent 1 stops authenticating accounts from Domain B and/or C, and ACS shows "External DB Not Operational" or "Internal Error" errors. Remote Agent 1 continues to authenticate accounts from Domain A without issues. Remote Agent 2 continues to authenticate users from A, B and C without issues.

Here are a couple of things that have caused this before, but are not the cause in this case:

1) The NT remote agent didn't have the DNS suffixes of Domains B and C defined in the TCP/IP properties of the NIC.

2) A problem with the DC in the trusted domain, that is, the DC the remote agent contacts for authentications in that domain. A reboot of that destination DC sometimes fixes the issue.

Here is an example of one of these failed attempts from a package.cab auth.log file:

AUTH 03/30/2013 01:02:31 I 1267 0580 pvAuthenticateUser: authenticate 'userxxxx' against Windows Database (358865680, 40)

AUTH 03/30/2013 01:02:31 I 0365 0580 External DB [NTAuthenDLL.dll]: Starting MSCHAP authentication for user [userxxxx]

AUTH 03/30/2013 01:02:31 I 0365 0580 External DB [NTAuthenDLL.dll]: AgentLib: Attempting to connect to agent manager at x.x.x.x:2004

AUTH 03/30/2013 01:02:35 I 0365 0580 External DB [NTAuthenDLL.dll]: AgentLib: Connection established, handle 0x15b29520

AUTH 03/30/2013 01:02:36 I 0365 0580 External DB [NTAuthenDLL.dll]: AgentLib: AgentService_Connect failed - no CSWinAgent agent found on that host

AUTH 03/30/2013 01:02:36 I 0365 0580 External DB [NTAuthenDLL.dll]: AgentLib: Disconnecting from agent manager, handle 0x15b29520

AUTH 03/30/2013 01:02:36 E 0365 0580 External DB [NTAuthenDLL.dll]: Cannot connect to agent service at x.x.x.x:2004

AUTH 03/30/2013 01:02:36 E 0365 0580 External DB [NTAuthenDLL.dll]: Cannot connect to any agent

AUTH 03/30/2013 01:02:36 I 0365 0580 External DB [NTAuthenDLL.dll]: AgentLib: Failed to get connection to agent

AUTH 03/30/2013 01:02:36 I 5081 0580 Done RQ1026, client 50, status -2129

The highlighted error doesn't make any sense, since the agent is successfully authenticating users from Domain A.
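To see how often these agent-connection failures occur and when, the auth.log from package.cab can be scanned with a small script. The following is only a sketch: the field layout (service, date, time, severity, two four-character fields, message) is inferred from the excerpt above rather than from any ACS documentation, and the names `parse_line`, `summarize`, `field1` and `field2` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, inferred only from the log excerpt in this post:
# SERVICE MM/DD/YYYY HH:MM:SS SEVERITY XXXX XXXX message
LINE_RE = re.compile(
    r"^(?P<service>\S+) (?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<severity>[A-Z]) (?P<field1>\w{4}) (?P<field2>\w{4}) (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["timestamp"] = datetime.strptime(
        record["date"] + " " + record["time"], "%m/%d/%Y %H:%M:%S"
    )
    return record


def summarize(lines):
    """Parse lines; return (records, error message counts, agent connect failures)."""
    records = [r for r in map(parse_line, lines) if r is not None]
    errors = Counter(r["message"] for r in records if r["severity"] == "E")
    agent_failures = [
        r for r in records if "AgentService_Connect failed" in r["message"]
    ]
    return records, errors, agent_failures


# Lines copied from the excerpt above (addresses already masked as x.x.x.x).
SAMPLE = """
AUTH 03/30/2013 01:02:31 I 0365 0580 External DB [NTAuthenDLL.dll]: AgentLib: Attempting to connect to agent manager at x.x.x.x:2004
AUTH 03/30/2013 01:02:35 I 0365 0580 External DB [NTAuthenDLL.dll]: AgentLib: Connection established, handle 0x15b29520
AUTH 03/30/2013 01:02:36 I 0365 0580 External DB [NTAuthenDLL.dll]: AgentLib: AgentService_Connect failed - no CSWinAgent agent found on that host
AUTH 03/30/2013 01:02:36 E 0365 0580 External DB [NTAuthenDLL.dll]: Cannot connect to agent service at x.x.x.x:2004
AUTH 03/30/2013 01:02:36 E 0365 0580 External DB [NTAuthenDLL.dll]: Cannot connect to any agent
"""

if __name__ == "__main__":
    records, errors, agent_failures = summarize(SAMPLE.splitlines())
    for message, count in errors.most_common():
        print(count, message)
    for failure in agent_failures:
        print("agent connect failure at", failure["timestamp"])
```

Running the same functions over the full auth.log (for example `summarize(open("auth.log"))`) would show whether the failures cluster at particular times, which could then be compared against events on the remote agent server or the trusted-domain DCs.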

The agent services have always used the local system account.  AD is v2003.


I should add that this is ACS v3.3.4.  I know, it's out of support - long story.  Hoping someone has some insight or suggestions anyway, other than "upgrade". 

Any ideas?


Thanks!

3 Replies

Amjad Abdullah
VIP Alumni

Well, I would say that because 3.3.4 is out of support, it is also highly probable that you are hitting a bug. Many fixes have been made to ACS since that version (up to the latest version, 4.2.1.15). Let's hope what you see can be fixed without an upgrade.

I would suggest some points:

- Make sure your RA version and patch level are same as the ACS version and patch level.

- Make sure the system requirements and AD requirements for the RA and for AD authentication are met for your ACS version (see the installation and configuration guides for ACS and RA for your version).

- Some Windows Server updates may have been installed and caused the issue.

- You mentioned the RA is using a local account? How will it handle the auth if it has no valid access to the domain? I don't remember the installation requirements, but I would say it must have domain user privileges to be able to process auth requests. Don't you agree?

- One step that can always be taken is to uninstall one RA, reload, then install it again to try to resolve the problem. If you are not already on the latest patch, you can upgrade to the latest patch before applying this step.

HTH

Amjad

Rating useful replies is more useful than saying "Thank you"


Thanks for the reply, Amjad.


I did verify the agents and SEs are the correct/same version.

With respect to the RA and AD, there are a LOT of configuration possibilities here. For example, we've always used the local system account and things have worked fine, with all kinds of cross-forest/child-domain authentication. We have other infrastructures that haven't had any problems using the local system account this way either. Strangely, it is sometimes months or years before we see this issue. I may, however, try a global domain admin account with the appropriate rights to see if it makes any difference.

I know upgrading is really the right way to go. I was just hoping there was something I was overlooking.

Thanks!

Hi mmletzko,

It's very late, but just wanted to add my inputs:

The service account only needs to be a member of the Domain Users group.

http://www.cisco.com/en/US/docs/net_mgmt/cisco_secure_access_control_server_for_solution_engine/3.3/installation/guide/remote_agent/rawi.html#wp289256

One change I can think of is a Windows update on the server.

Jatin Katyal
- Do rate helpful posts -

~Jatin