
radiomoskau
Beginner

Cisco ACS 4.2 - Server Busy

Hi!

We authenticate our desktops and IP phones via 802.1X against two RADIUS servers running Cisco ACS 4.2 on Windows Server 2008.

From time to time, one of the servers gets 'too busy' and stops answering authentication requests. That results in many failed authentications on our VoIP phones (Siemens OpenStage).

What I don't understand is why the ACS acts that way...

TAC says that all 42 or so threads are in use when the server says it's too busy.

While the server is 'busy', the CPU runs at 1-2% and there is plenty of free RAM.

This is an extract from the CSRadius log file:
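A back-of-the-envelope sketch of how exhausted threads and an idle CPU can coexist. It assumes each in-flight request occupies one worker thread for its whole lifetime, including time spent waiting on a client or a backend. The figure of 42 threads is the one TAC quoted; the hold times, arrival rate, and function names are purely illustrative and are not ACS internals.

```python
# Illustrative capacity arithmetic (Little's law), not a model of ACS itself.
# Assumption: one worker thread is held per request until the request completes.

THREADS = 42  # approximate pool size quoted by TAC in this thread


def max_sustained_rate(threads: int, seconds_per_request: float) -> float:
    """Upper bound on requests per second the pool can complete."""
    return threads / seconds_per_request


def threads_needed(arrival_rate: float, seconds_per_request: float) -> float:
    """Average number of threads occupied at a given arrival rate."""
    return arrival_rate * seconds_per_request


if __name__ == "__main__":
    # Hypothetical hold times: fast local check, slow backend, stalled client.
    for hold in (0.05, 1.0, 5.0):
        rate = max_sustained_rate(THREADS, hold)
        print(f"hold {hold:>4}s -> at most {rate:.0f} requests/s")

    # Hypothetical burst: 20 requests/s that each wait 5 s on something external.
    print("threads needed:", threads_needed(20.0, 5.0), "available:", THREADS)
```

Under these assumptions, a thread that mostly waits costs almost no CPU, so the pool can run out long before the processor is busy.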

RDS 06/09/2011 07:51:13 E 1495 2072 0x0 Server too busy - request from 10.104.204.249 ignored

RDS 06/09/2011 07:51:13 E 1495 5124 0x0 Server too busy - request from 10.104.204.249 ignored

RDS 06/09/2011 07:51:13 E 1495 5124 0x0 Server too busy - request from 10.100.204.22 ignored

RDS 06/09/2011 07:51:13 E 0958 3712 0x0 Error processing accounting request - no response sent to NAS

RDS 06/09/2011 07:51:13 E 5947 4916 0x0 Failed to update logged on list for IPPhone (UDB_SERVER_BUSY)

RDS 06/09/2011 07:51:13 E 1495 5124 0x0 Server too busy - request from 10.100.204.22 ignored

RDS 06/09/2011 07:51:13 E 0958 1880 0x0 Error processing accounting request - no response sent to NAS

RDS 06/09/2011 07:51:13 E 6025 3560 0x0 Matching class attribute failed for user IPPhone, no further processing will be done assuming this is out-of-order packet due to UDP

RDS 06/09/2011 07:51:13 E 1825 1532 0x0 Error UDB_SERVER_BUSY authenticating host/hostname.xxx.yyy - no response sent to NAS

...

RDS 06/09/2011 07:51:20 E 3089 2704 0x0 Error AS_NO_FREE_CONNECTIONS authenticating IPPhone - no response sent to NAS
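To see which NAS devices are being dropped and when, lines like these can be tallied with a short script. This is a minimal sketch that assumes the column layout seen in the extract above. The meaning of the three columns after the severity letter is not stated in this thread, so they are captured under placeholder names (`col_a`, `col_b`, `col_c`), and the `summarize` helper is hypothetical.

```python
import re
from collections import Counter

# Layout assumed from the extract: service, date, time, severity,
# three columns of unknown meaning, then the free-text message.
LINE_RE = re.compile(
    r"^RDS (?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>\w) (?P<col_a>\d+) (?P<col_b>\d+) (?P<col_c>\S+) (?P<message>.*)$"
)
BUSY_RE = re.compile(r"Server too busy - request from (\S+) ignored")

# A few lines copied from the extract above.
SAMPLE = """
RDS 06/09/2011 07:51:13 E 1495 2072 0x0 Server too busy - request from 10.104.204.249 ignored
RDS 06/09/2011 07:51:13 E 1495 5124 0x0 Server too busy - request from 10.104.204.249 ignored
RDS 06/09/2011 07:51:13 E 1495 5124 0x0 Server too busy - request from 10.100.204.22 ignored
RDS 06/09/2011 07:51:13 E 0958 3712 0x0 Error processing accounting request - no response sent to NAS
RDS 06/09/2011 07:51:20 E 3089 2704 0x0 Error AS_NO_FREE_CONNECTIONS authenticating IPPhone - no response sent to NAS
"""


def summarize(lines):
    """Count 'too busy' drops per source IP and per second, plus other errors."""
    busy_by_source = Counter()
    busy_by_second = Counter()
    other_errors = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        busy = BUSY_RE.search(match["message"])
        if busy:
            busy_by_source[busy.group(1)] += 1
            busy_by_second[(match["date"], match["time"])] += 1
        elif match["level"] == "E":
            other_errors += 1
    return busy_by_source, busy_by_second, other_errors


if __name__ == "__main__":
    by_source, by_second, others = summarize(SAMPLE.splitlines())
    print(by_source.most_common())
    print(by_second.most_common())
    print("other error lines:", others)
```

Run against a full log, the per-second counts show whether the drops arrive in short bursts or build up gradually.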

Has anyone encountered the same problem and found a workaround or fix? Is there perhaps a way to increase the number of authentication threads?

Thanks a lot!

4 REPLIES
Calvin Ryver
Beginner

A lot of things can cause "server too busy" on the ACS. Bogus TACACS+ requests can also eat up the request processing, because RADIUS and TACACS+ both use the same authentication request process. If you are back-ending to AD, the cause may be a delay in getting replies from AD.

You may want to increase the logging level on the ACS and then look at the auth logs to see what is happening.

Go to "System Config", then "Service Control", and set the logging to full. Wait until you see the issue again. Then go to "System Config", then "Support", select the diagnostic logs and logs for 1 day, then run the support.

After that, open the auth log, find the time you saw the error, and see what is going on. You should see all of the requests that came in at that time.

Hi!

Thanks for your reply!

We've used detailed logging several times and sent the logs to Cisco TAC. It seems that the clients took too long to answer.

Do you have any suggestions for adjusting the authentication timers on the switches? Or is it possible to increase the number of ACS threads? The servers have plenty of capacity sitting idle while the ACS is "too busy".

The key is to gather all of the information needed. When the diagnosis is that the client takes too long to answer, the client is not always the actual cause.

You can see the same symptom if the ACS takes a long time to process the request and the switch or client has already timed out its requests.

The following information is needed. All of these items really need to be gathered at the same time:

Switch debugs, including:

debug radius

debug aaa authen

debug aaa accounting

A sniffer capture between the switch and the ACS.

Logs from the ACS with debugs enabled.

If you are going to AD on the backend, you may also want a sniffer capture between the ACS and AD.

Taken together, these should tell you where the delay or failure lies, and at that point some changes can be suggested.
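Once a capture is in hand, one way to locate the delay is to pair each RADIUS request with its reply and look at the elapsed time. The sketch below relies on the fact that a RADIUS reply carries the same Identifier as its request and that a retransmission reuses it. Everything else is an assumption for illustration: the `Packet` record, the `response_times` helper, and the sample values (a documentation-range IP and made-up timestamps) stand in for whatever your sniffer can export.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical record exported from a capture between switch and ACS."""
    ts: float       # capture timestamp in seconds
    kind: str       # "request" (NAS to ACS) or "reply" (ACS to NAS)
    nas_ip: str     # address of the switch
    nas_port: int   # UDP source port used by the switch
    ident: int      # RADIUS Identifier field (0-255)


Key = Tuple[str, int, int]


def response_times(packets: Iterable[Packet]) -> Tuple[List[Tuple[Key, float]], Dict[Key, float]]:
    """Return (answered requests with latency, requests that never got a reply)."""
    pending: Dict[Key, float] = {}
    answered: List[Tuple[Key, float]] = []
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        key = (p.nas_ip, p.nas_port, p.ident)
        if p.kind == "request":
            # Keep the first transmission time; retransmits reuse the same key.
            pending.setdefault(key, p.ts)
        elif p.kind == "reply" and key in pending:
            answered.append((key, p.ts - pending.pop(key)))
    return answered, pending


if __name__ == "__main__":
    sample = [
        Packet(0.0, "request", "192.0.2.1", 1645, 7),
        Packet(1.0, "request", "192.0.2.1", 1645, 8),  # never answered
        Packet(5.0, "request", "192.0.2.1", 1645, 7),  # retransmission
        Packet(6.5, "reply", "192.0.2.1", 1645, 7),
    ]
    answered, unanswered = response_times(sample)
    print("answered:", answered)
    print("unanswered:", unanswered)
```

Long latencies or unanswered requests on the switch side, lined up against the ACS and AD captures for the same moment, indicate which hop is slow.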

dpatzold1979
Beginner

We are running ACS 4.2 as well and are having the same problem. The RDS logs show the same thing.

CPU and memory look OK at the time, though they rise continually, as if there were a memory leak. Then again, there is plenty of RAM free.

Let me know what you figure out.
