
AAA status - SMD Platform State: DEAD

Walker
Level 1

Good morning,

I am attempting to troubleshoot an odd issue I am seeing with a Cat9300 running v17.3.3 pointing to a v3.1 p4 ISE server. When interfaces are configured with 802.1X/MAB, the devices fail to authenticate. The switch configuration matches a known-good working config, and I have repeatedly deleted the device and recreated it within ISE to no avail. While troubleshooting, there are a few things I don't quite understand, so I am attempting to find some answers.

When I perform a "sh access-ses brief" I see the following:

Interface  MAC Address     AuthC      AuthZ        Fg  Uptime
-----------------------------------------------------------------------------
Gi2/0/4    0011.74a6.2cdf  m:AD d:TO  UZ: SA- FA-  X   3668245s

AD = AAA Failure, TO = Timeout

When I check the AAA server status, I see the following:

RADIUS: id 1, priority 1, host X.X.X.X, auth-port 1812, acct-port 1813, hostname ISE-RADIUS
State: current UP, duration 4883s, previous duration 60s
Dead: total time 1980s, count 1
Platform State from SMD: current DEAD, duration 3668794s, previous duration 60s
SMD Platform Dead: total time 60s, count 0
Platform State from WNCD (1) : current UP
Platform State from WNCD (2) : current UP
Platform State from WNCD (3) : current UP
Platform State from WNCD (4) : current UP
Platform State from WNCD (5) : current UP
Platform State from WNCD (6) : current UP
Platform State from WNCD (7) : current UP
Platform State from WNCD (8) : current UP, duration 0s, previous duration 0s
Platform Dead: total time 0s, count 0
Quarantined: No

When checking the RADIUS Live Logs, I cannot see any attempts from this NAD for any device. The next weird issue is that when I use the "test aaa" command from the troublesome NAD, the authentication request is seen by ISE and is correctly rejected.

There is no firewall or ACL in the path between the NAD and the PSN, so that can be ruled out.

My concern is that the Platform State from SMD is showing as DEAD and may be the cause, but I cannot find any answers within Cisco docs as to what the SMD platform is. Are there any experts on this board who can explain what I am seeing here and what exactly the SMD Platform is? I have exhausted all my troubleshooting steps and am unsure how to proceed further.
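
For anyone who wants to check many switches for the same combination (overall State UP while the SMD platform state is DEAD), here is a minimal Python sketch. It assumes the `show aaa servers` text has already been collected by some other means, and the function name `find_smd_mismatches` is hypothetical. It only flags the mismatch seen in the output above; it does not explain or fix it.

```python
import re

# Patterns follow the line layout of the `show aaa servers` output quoted above.
SERVER_RE = re.compile(r"^RADIUS: id \d+, priority \d+, host (\S+),")
STATE_RE = re.compile(r"^State: current (\w+)")
SMD_RE = re.compile(r"^Platform State from SMD: current (\w+)")


def find_smd_mismatches(output):
    """Return hosts whose overall state is UP but whose SMD platform state is DEAD."""
    mismatches = []
    host = state = None
    for raw in output.splitlines():
        line = raw.strip()
        server = SERVER_RE.match(line)
        overall = STATE_RE.match(line)
        smd = SMD_RE.match(line)
        if server:
            # A new server block starts; forget the previous block's state.
            host, state = server.group(1), None
        elif overall:
            state = overall.group(1)
        elif smd and host is not None:
            if state == "UP" and smd.group(1) == "DEAD":
                mismatches.append(host)
    return mismatches


# Sample lines copied from the output in this post (host is masked as X.X.X.X).
SAMPLE = """\
RADIUS: id 1, priority 1, host X.X.X.X, auth-port 1812, acct-port 1813, hostname ISE-RADIUS
State: current UP, duration 4883s, previous duration 60s
Dead: total time 1980s, count 1
Platform State from SMD: current DEAD, duration 3668794s, previous duration 60s
"""

print(find_smd_mismatches(SAMPLE))
```

A server that is DEAD in both fields is not reported, since that is an ordinary dead-server condition rather than the mismatch described here.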

 

 

Accepted Solutions

Walker
Level 1

Thank you all for the responses. On Friday I requested that the switch be reloaded to see if that would resolve the issue, but it was also upgraded to a new code version. Since then the issue seems to be resolved. Platform State from SMD is now showing as UP.

I still have not heard any responses as to what the SMD Platform state refers to, so please let me know if anyone finds this out in the future!


16 Replies

thomas
Cisco Employee

Please verify your switch RADIUS configuration against our ISE Wired Deployment Guide.

If there are no ISE Live Logs, it sounds like the RADIUS packets are never reaching ISE. The dead status on the switch would suggest the same. This sounds like a routing/firewall problem, even though you said there is no firewall. If so, you will want to check the ISE PSN IP address in your configuration very carefully for typos.

I have the exact same issue here with 17.9.3. I did a capture on the uplink, but as Walker said, no auth request was even being sent out. After a reload the SMD state changes to UP; however, the issue remains. Below is the periodic log message I get.

%MAB-5-FAIL: Switch 1 R0/0: sessmgrd: Authentication failed for client (c85b.76c1.37b0) with reason (AAA Server Down) on Interface Gi1/0/27 AuditSessionID 2AF149640000003EA4C78D0C
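
To count how many clients and ports are hitting this, a log line like the one above can be broken into fields. This is a minimal Python sketch; the regular expression is written against this single sample message only, and `parse_mab_failure` is a hypothetical helper name. The `sessmgrd` process named in the message is presumably the session manager daemon, which may be what "SMD" abbreviates in the `show aaa servers` output, though nothing in this thread confirms that.

```python
import re

# Written against the single sample message above; other releases may format it differently.
MAB_FAIL_RE = re.compile(
    r"%MAB-5-FAIL: .*?Authentication failed for client \((?P<mac>[0-9a-fA-F.]+)\) "
    r"with reason \((?P<reason>[^)]+)\) on Interface (?P<interface>\S+)"
)


def parse_mab_failure(line):
    """Return a dict with mac, reason and interface, or None if the line does not match."""
    match = MAB_FAIL_RE.search(line)
    return match.groupdict() if match else None


LOG = (
    "%MAB-5-FAIL: Switch 1 R0/0: sessmgrd: Authentication failed for client "
    "(c85b.76c1.37b0) with reason (AAA Server Down) on Interface Gi1/0/27 "
    "AuditSessionID 2AF149640000003EA4C78D0C"
)

print(parse_mab_failure(LOG))
```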

Walker
Level 1

@thomas ,

Thank you for your quick reply. I have verified my RADIUS configuration against the ISE Wired Deployment Guide and they are essentially the same, with a few tweaks requested by our organization. I also do not see any typos in the ISE PSN configuration.

Would you happen to have any idea what the SMD Platform is in regards to RADIUS? Out of hundreds of switches, this is the first time I'm seeing this platform state as DEAD.

Edit: Just to add, the ISE server is reachable from the NAD and vice versa.

thomas
Cisco Employee

ISE Wired Deployment Guide > Validating Basic Settings

Consider using debug radius on the switch too.

I always like to turn off RADIUS protocol suppression in ISE so I can see every authentication attempt when trying a new setup or scenario for the first time:

Managing Network Devices in ISE

10:44 Network Devices MUST be defined in ISE
16:56 Disable Suppression of repeated Failures and Success
17:36 Enable Repository and Packet Capture
19:10 RADIUS with an Undefined Network Device
21:08 Enable and Use the Default Network Device
24:43 Network Device with an IP Range
26:30 Network Device with a Specific IP Address
28:00 Packet Capture Review

But yours still sounds like an ACL/Firewall problem.

Walker
Level 1

@thomas When I run the test aaa group radius new-code command, it shows up in the Live Logs. I would imagine if it was blocked somewhere, I would not be seeing them in the logs.

@Walker Normally, when ISE cannot see authentication requests coming in, I'd check the NAD configuration on ISE to determine whether the correct IP address and shared secret are defined, though that doesn't explain why the test from the switch works. I'd double-check.

You can run tcpdump from ISE, filter on the switch in question, and test authentication. Provide the output for review if you wish.

thomas
Cisco Employee

Excellent! RADIUS from your switch is working and so is ISE!

The problem is in your 802.1X/MAB configuration on your switchport(s).

We use the same interface template config for all devices, so it would be strange for this particular NAD to have issues with it. I have scheduled this device to be rebooted tonight and hope to dive further into the config on Monday.

hslai
Cisco Employee

@Walker IOS-XE 17.3.3 is somewhat old. 17.6.4 and 17.3.5 are the suggested releases, as of today (2022-Nov-19). So, I would suggest upgrading. If still problematic, please engage Cisco TAC to troubleshoot the switch.

Same issue on:
Cisco IOS XE Software, Version 17.03.06
System Bootstrap, Version 17.9.1r

I tried removing and re-adding the RADIUS server without success. (I suspect the upstream switch was offline at the time of startup; however, the RADIUS deadtime, which should default to 0 and so bring the server back online, appears to be waiting indefinitely.)
#Before

RADIUS: id 1, priority 1, host 10.100.1.143, auth-port 1812, acct-port 1813, hostname SJ-ISE-PSN-01
State: current UP, duration 328959s, previous duration 60s
Dead: total time 60s, count 0
Platform State from SMD: current DEAD, duration 261946s, previous duration 67022s
SMD Platform Dead: total time 261995s, count 1
Platform State from WNCD (1) : current UP
Platform State from WNCD (2) : current UP
Platform State from WNCD (3) : current UP
Platform State from WNCD (4) : current UP
Platform State from WNCD (5) : current UP
Platform State from WNCD (6) : current UP
Platform State from WNCD (7) : current UP
Platform State from WNCD (8) : current UP, duration 0s, previous duration 0s
Platform Dead: total time 0s, count 0
Quarantined: No
Authen: request 13131, timeouts 13098, failover 0, retransmission 9824
Response: accept 24, reject 4, challenge 0
Response: unexpected 0, server error 0, incorrect 0, time 82269261ms
Transaction: success 32, failure 3274
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Author: request 1, timeouts 0, failover 0, retransmission 0
Response: accept 1, reject 0, challenge 0
Response: unexpected 0, server error 0, incorrect 0, time 26ms
Transaction: success 1, failure 0
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Account: request 18589, timeouts 17511, failover 3, retransmission 13133
Request: start 170, interim 0, stop 162
Response: start 47, interim 0, stop 40
Response: unexpected 0, server error 0, incorrect 0, time 45ms
Transaction: success 1078, failure 4378
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Elapsed time since counters last cleared: 3d19h23m
Estimated Outstanding Access Transactions: 1
Estimated Outstanding Accounting Transactions: 0
Estimated Throttled Access Transactions: 0
Estimated Throttled Accounting Transactions: 0
Maximum Throttled Transactions: access 0, accounting 0
Consecutive Response Failures: total 7651
SMD Platform : max 7651, current 7651 total 7651
WNCD Platform: max 0, current 0 total 0
IOSD Platform : max 0, current 0 total 0
Consecutive Timeouts: total 30605
SMD Platform : max 30605, current 30605 total 30605
WNCD Platform: max 0, current 0 total 0
IOSD Platform : max 0, current 0 total 0
Requests per minute past 24 hours:
high - 0 hours, 1 minutes ago: 4
low - 19 hours, 23 minutes ago: 0
average: 0
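
To put the counters above in proportion, here is a minimal Python sketch that computes the share of requests that timed out for each counter group. It assumes the `request` and `timeouts` counters on each line are directly comparable, which the thread does not confirm, and it handles one server block at a time. The helper name `timeout_ratios` is hypothetical.

```python
import re

# Matches the first line of each counter group in the `show aaa servers` output above.
COUNTER_RE = re.compile(r"^(Authen|Author|Account): request (\d+), timeouts (\d+)")


def timeout_ratios(output):
    """Map each counter group to timeouts / requests (0.0 when there were no requests)."""
    ratios = {}
    for raw in output.splitlines():
        match = COUNTER_RE.match(raw.strip())
        if match:
            kind = match.group(1)
            requests = int(match.group(2))
            timeouts = int(match.group(3))
            ratios[kind] = timeouts / requests if requests else 0.0
    return ratios


# Counter lines copied from the "#Before" output above.
SAMPLE = """\
Authen: request 13131, timeouts 13098, failover 0, retransmission 9824
Author: request 1, timeouts 0, failover 0, retransmission 0
Account: request 18589, timeouts 17511, failover 3, retransmission 13133
"""

for kind, ratio in timeout_ratios(SAMPLE).items():
    print(f"{kind}: {ratio:.1%} of requests timed out")
```

With the "#Before" numbers, nearly all authentication and accounting requests timed out, which fits the SMD platform state being DEAD for most of the measured period.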

 

#after (initially adding)
RADIUS: id 4, priority 3, host 10.100.1.143, auth-port 1812, acct-port 1813, hostname SJ-ISE-PSN-01
State: current DEAD, duration 10s, previous duration 71s
Dead: total time 141s, count 0
Platform State from SMD: current DEAD, duration 9s, previous duration 0s
SMD Platform Dead: total time 9s, count 0
Platform State from WNCD (1) : current UP
Platform State from WNCD (2) : current UP
Platform State from WNCD (3) : current UP
Platform State from WNCD (4) : current UP
Platform State from WNCD (5) : current UP
Platform State from WNCD (6) : current UP
Platform State from WNCD (7) : current UP
Platform State from WNCD (8) : current UP, duration 0s, previous duration 0s
Platform Dead: total time 0s, count 0
Quarantined: No
Authen: request 0, timeouts 0, failover 0, retransmission 0
Response: accept 0, reject 0, challenge 0
Response: unexpected 0, server error 0, incorrect 0, time 0ms
Transaction: success 0, failure 0
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Author: request 0, timeouts 0, failover 0, retransmission 0
Response: accept 0, reject 0, challenge 0
Response: unexpected 0, server error 0, incorrect 0, time 0ms
Transaction: success 0, failure 0
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Account: request 7, timeouts 7, failover 0, retransmission 7
Request: start 0, interim 0, stop 0
Response: start 0, interim 0, stop 0
Response: unexpected 0, server error 0, incorrect 0, time 0ms
Transaction: success 0, failure 0
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Elapsed time since counters last cleared: 0m
Estimated Outstanding Access Transactions: 0
Estimated Outstanding Accounting Transactions: 0
Estimated Throttled Access Transactions: 0
Estimated Throttled Accounting Transactions: 0
Maximum Throttled Transactions: access 0, accounting 0
Consecutive Response Failures: total 0
SMD Platform : max 0, current 0 total 0
WNCD Platform: max 0, current 0 total 0
IOSD Platform : max 0, current 0 total 0
Consecutive Timeouts: total 6
SMD Platform : max 6, current 6 total 6
WNCD Platform: max 0, current 0 total 0
IOSD Platform : max 0, current 0 total 0

 

#after a few minutes (marked UP, but "Platform State from SMD: current DEAD"; authentication requests are not being sent to RADIUS; a manual test is received by the RADIUS server)


RADIUS: id 4, priority 3, host 10.100.1.143, auth-port 1812, acct-port 1813, hostname SJ-ISE-PSN-01
State: current UP, duration 2008s, previous duration 60s
Dead: total time 191s, count 0
Platform State from SMD: current DEAD, duration 2068s, previous duration 0s
SMD Platform Dead: total time 2068s, count 0
Platform State from WNCD (1) : current UP
Platform State from WNCD (2) : current UP
Platform State from WNCD (3) : current UP
Platform State from WNCD (4) : current UP
Platform State from WNCD (5) : current UP
Platform State from WNCD (6) : current UP
Platform State from WNCD (7) : current UP
Platform State from WNCD (8) : current UP, duration 0s, previous duration 0s
Platform Dead: total time 0s, count 0
Quarantined: No
Authen: request 1, timeouts 1, failover 0, retransmission 0
Response: accept 0, reject 1, challenge 0
Response: unexpected 0, server error 0, incorrect 0, time 40ms
Transaction: success 1, failure 1
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Author: request 0, timeouts 0, failover 0, retransmission 0
Response: accept 0, reject 0, challenge 0
Response: unexpected 0, server error 0, incorrect 0, time 0ms
Transaction: success 0, failure 0
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Account: request 1817, timeouts 1754, failover 63, retransmission 1753
Request: start 3, interim 0, stop 3
Response: start 0, interim 0, stop 0
Response: unexpected 0, server error 0, incorrect 0, time 40ms
Transaction: success 1, failure 1
Throttled: transaction 0, timeout 0, failure 0
Malformed responses: 0
Bad authenticators: 0
Elapsed time since counters last cleared: 34m
Estimated Outstanding Access Transactions: 0
Estimated Outstanding Accounting Transactions: 62
Estimated Throttled Access Transactions: 0
Estimated Throttled Accounting Transactions: 0
Maximum Throttled Transactions: access 0, accounting 0
Consecutive Response Failures: total 1
SMD Platform : max 1, current 1 total 1
WNCD Platform: max 0, current 0 total 0
IOSD Platform : max 0, current 0 total 0
Consecutive Timeouts: total 1754
SMD Platform : max 1754, current 1754 total 1754
WNCD Platform: max 0, current 0 total 0
IOSD Platform : max 0, current 0 total 0
Requests per minute past 24 hours:
high - 0 hours, 34 minutes ago: 2
low - 0 hours, 35 minutes ago: 0
average: 0

 

*****Cisco_SW# sh access-ses brief
Interface  MAC Address     AuthC      AuthZ        Fg  Uptime
-----------------------------------------------------------------------------
Gi1/0/44   ****.9e04.****  m:OK d:NR  AZ: SA-D:V:  X   331950s
Gi1/0/37   ****.8315.****  m:OK d:NR  AZ: SA-D:V:  X   332409s
Gi1/0/9    ****.2840.****  m:AD d:NR  AZ: SA-      X   14921s
Gi1/0/4    ****.222d.****  m:OK d:NR  AZ: SA-D:    X   272693s
Gi1/0/43   ****.8b7d.****  m:OK d:NR  AZ: SA-D:V:  X   332312s
Gi1/0/7    ****.1953.****  m:AD d:NR  AZ: SA-      X   13886s
Gi1/0/3    ****.3a4c.****  m:AD       AZ: SA-      X   14559s
Gi1/0/5    ****.e5d4.****  m:AD d:NR  AZ: SA-      X   393s

Key to Authentication Attributes:

RN - Running
ST - Stopped
OK - Authentication Success
CF - Credential Failure
AD - AAA Server Failure
NR - No Response
TO - Timeout
AR - AAA Not Ready

According to the ISE Wired Deployment Guide, the automate-tester command isn't needed, as the switch by default doesn't have a deadtime (default 0, so always up). However, we have this configured as shown below. I am also adding "ignore-acct-port" as per the guide (hopefully it works better after the next power-off).
"
radius-server dead-criteria time 10 tries 12
radius-server deadtime 1
"

radius server HIVE-ISE-PAN-01
address ipv4 10.100.64.143 auth-port 1812 acct-port 1813
automate-tester username netman probe-on
key
