
CSCvz55484 AAA server Down

We have had this problem since 16 March on a WLC 9800-40 running 17.12.4, and it persists after upgrading to 17.12.5.

Could this be a continuation of the same issue, meaning it was not really fixed?

Platform State from WNCD (0) : current DEAD, duration 35s, previous duration 70s

     Platform State from WNCD (1) : current DEAD, duration 6s, previous duration 146s

     Platform State from WNCD (2) : current UP, duration 36s, previous duration 120s

     Platform State from WNCD (3) : current DEAD, duration 93s, previous duration 83s

     Platform State from WNCD (4) : current UP, duration 205s, previous duration 120s

     Platform State from WNCD (5) : current UP, duration 317128s, previous duration 0s

     Platform State from WNCD (6) : current UP, duration 317128s, previous duration 0s

     Platform State from WNCD (7) : current UP, duration 317128s, previous duration 0s

     WNCD Platform Dead: total time 544734137s, count 4681UP


marce1000
Hall of Fame

 

 - Upgrade to one of the Known Fixed Releases listed in the bug report. If you have already done so, go one release higher. If that is not feasible, contact TAC.

 M.



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

Saikat Nandy
Cisco Employee

This is not straightforward to troubleshoot, but you can narrow it down to some extent. Check the following:
1. During the problem state, do all the WNCDs go 'DEAD'? From your logs it doesn't look like it; can you validate this behaviour a few more times?
2. Suppose you have two servers, ISE1 (for SSID 1) and ISE2 (for SSID 2). During the problem state, do the WNCDs go down for both ISE1 and ISE2, or just for one specific ISE? 'show aaa servers' will give you this data.
3. Do you have any dead-criteria and deadtime configured?
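For point 1, repeated captures are easier to compare if the per-WNCD states are tallied automatically. The following is only a minimal sketch, assuming the lines keep the exact "Platform State from WNCD (n) : current STATE, duration Ns" format shown in the original post; the function names `wncd_states` and `summarize` are made up for this illustration and are not part of any Cisco tooling.

```python
import re
from collections import Counter

# Assumed line format, copied from the output in the original post:
#   Platform State from WNCD (0) : current DEAD, duration 35s, previous duration 70s
STATE_LINE = re.compile(
    r"Platform State from WNCD \((\d+)\)\s*:\s*current (\w+), duration (\d+)s"
)


def wncd_states(output: str) -> dict[int, tuple[str, int]]:
    """Map each WNCD instance number to (current state, seconds in that state)."""
    states = {}
    for match in STATE_LINE.finditer(output):
        instance, state, duration = match.groups()
        states[int(instance)] = (state, int(duration))
    return states


def summarize(output: str) -> Counter:
    """Count how many WNCD instances are in each state (for example DEAD or UP)."""
    return Counter(state for state, _ in wncd_states(output).values())


# Sample taken from the original post.
sample = """
Platform State from WNCD (0) : current DEAD, duration 35s, previous duration 70s
Platform State from WNCD (1) : current DEAD, duration 6s, previous duration 146s
Platform State from WNCD (2) : current UP, duration 36s, previous duration 120s
Platform State from WNCD (3) : current DEAD, duration 93s, previous duration 83s
Platform State from WNCD (4) : current UP, duration 205s, previous duration 120s
Platform State from WNCD (5) : current UP, duration 317128s, previous duration 0s
Platform State from WNCD (6) : current UP, duration 317128s, previous duration 0s
Platform State from WNCD (7) : current UP, duration 317128s, previous duration 0s
"""

if __name__ == "__main__":
    print(summarize(sample))
```

On the output from the original post this reports three instances as DEAD and five as UP, which matches the observation that not all WNCDs are going dead at once.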

It is quite unlikely that you are hitting CSCvz55484, because pretty much the entire 17.12.x train has the fix for this defect. Also, if you look at the 'How to Identify:' section of the defect, number 4 might be your scenario. If you have accounting enabled, try disabling it and see whether that helps.

In practice, this issue occurs when your WLC is not getting any response back from the AAA server. The traffic could be getting dropped somewhere between the WLC and the AAA server. I have also seen this issue with ISE when it is under significant load. Packet captures taken at the same time on both the WLC and the AAA side will give you a picture of the communication between the two devices.
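Once both captures exist, comparing them comes down to asking, for each request the WLC sent, how far it got. The sketch below is purely illustrative: it assumes you have already extracted a key per RADIUS packet from each capture (for example the pair of source UDP port and RADIUS identifier, with no NAT in the path), and the function name `locate_drops` and its four inputs are hypothetical. It also ignores identifier reuse, so it only makes sense over a short capture window.

```python
from typing import Hashable, Iterable


def locate_drops(
    wlc_sent: Iterable[Hashable],
    aaa_received: Iterable[Hashable],
    aaa_replied: Iterable[Hashable],
    wlc_received: Iterable[Hashable],
) -> dict[Hashable, str]:
    """Classify each request seen leaving the WLC by where its exchange stopped.

    Every argument is a collection of request keys extracted from a capture:
    requests leaving the WLC, requests arriving at the AAA server, replies
    leaving the AAA server, and replies arriving back at the WLC.
    """
    aaa_received = set(aaa_received)
    aaa_replied = set(aaa_replied)
    wlc_received = set(wlc_received)

    verdicts = {}
    for key in wlc_sent:
        if key not in aaa_received:
            verdicts[key] = "request lost between WLC and AAA"
        elif key not in aaa_replied:
            verdicts[key] = "AAA server did not reply"
        elif key not in wlc_received:
            verdicts[key] = "reply lost between AAA and WLC"
        else:
            verdicts[key] = "answered"
    return verdicts


if __name__ == "__main__":
    # Hypothetical keys: (source UDP port, RADIUS identifier).
    result = locate_drops(
        wlc_sent=[(50001, 1), (50001, 2), (50001, 3), (50001, 4)],
        aaa_received=[(50001, 1), (50001, 2), (50001, 3)],
        aaa_replied=[(50001, 1), (50001, 2)],
        wlc_received=[(50001, 1)],
    )
    for key, verdict in result.items():
        print(key, verdict)
```

Many "AAA server did not reply" verdicts would point at load on the server, while losses in either direction would point at the network path in between.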

Do you mean disabling accounting-interim? Or accounting from AAA?

Accounting as a whole.

Rich R
VIP

Agreed with @Saikat Nandy: I don't think you're hitting CSCvz55484. I have not seen it at all on 17.12.

But there is a common problem with RADIUS timeouts, which is already covered in the Best Practice guidance. Have you read and followed that, @orlando-g-suarez?
https://www.cisco.com/c/en/us/td/docs/wireless/controller/9800/technical-reference/c9800-best-practices.html#RADIUSServerTimeout

The BU are planning to change the config defaults in a future release to address this.
