11-12-2020 11:44 PM - edited 11-12-2020 11:52 PM
Every 6 hours, on some devices, the logs show RADIUS_DEAD (not responding) for all three RADIUS servers. In other parts of the network it happens even more frequently. The debug shows that an accounting session initiates, but soon afterwards a TCP reset occurs due to a NAS error (auth failed). Between these six-hour events, communication with the RADIUS servers is fine, with no issues. We can normally access the device without a problem, but during RADIUS_DEAD we are not able to access the host.
I'm trying to determine whether the issue is on our local host, which may be sending a parameter that the RADIUS server rejects, or whether it is a misconfiguration on the RADIUS server and the local config is OK.
This particular device is a C9300, but I'm seeing the same problem with 3650 and Nexus 9k devices, all running their Cisco-suggested IOS releases.
Below is the full configuration, with the debug attached. I'd appreciate any assistance with this.
aaa group server radius X-RADIUS
 server name X-RADIUS-1
 server name X-RADIUS-2
 server name X-RADIUS-3
 ip radius source-interface VLANX
exit
aaa authentication login default group X-RADIUS local
aaa authentication login vty group X-RADIUS local
aaa authentication login con group X-RADIUS local
aaa authentication enable default line group X-RADIUS enable
aaa accounting send stop-record authentication failure
aaa accounting exec net-supp start-stop group X-RADIUS
aaa accounting connection net-supp start-stop group X-RADIUS
aaa authorization exec net-supp group X-RADIUS local
aaa authorization commands 1 net-supp group X-RADIUS local
aaa authorization commands 15 net-supp group X-RADIUS local
radius server X-RADIUS-1
 address ipv4 10.X.X.X auth-port 1812 acct-port 1813
 key 1234abcd
radius server X-RADIUS-2
 address ipv4 10.X.X.X auth-port 1812 acct-port 1813
 key 1234abcd
radius server X-RADIUS-3
 address ipv4 10.X.X.X auth-port 1812 acct-port 1813
 key 1234abcd
!
exit
line vty 0 15
 password JGHGGS1
 login authentication vty
 accounting connection net-supp
 accounting exec net-supp
 transport input ssh
 transport output none
 exec-timeout 9 0
11-13-2020 12:27 AM
>...
>This particular device is C9300 but I'm seeing the same problem with 3650 and Nexus 9k all running their Cisco suggested IOS.
- Since different device types are involved, I also suggest analyzing network behavior and traffic. Look for bursts, traffic spikes, or simply the traffic to and from those devices at the time of the failure, and compare it with normal operation. Also send syslog messages to a central syslog server, and SNMP traps to a trap receiver.
M.
11-13-2020 06:31 AM
11-13-2020 07:58 AM - edited 11-13-2020 07:59 AM
Does this test the connection to the RADIUS server only at set intervals, or does it send a probe once the RADIUS server is marked as dead?
11-13-2020 12:01 PM
11-14-2020 04:15 PM
At each interval, the client sends a test request to the RADIUS server and checks whether the server responds.
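As a rough illustration of the interval-based test described here, a sketch of how it might look on the IOS XE devices in this thread. The username probe-user and all timer values are placeholders chosen for the example, not recommendations; the probe-on variant is only available on releases that support it, and the Nexus (NX-OS) syntax differs.

```
radius server X-RADIUS-1
 address ipv4 10.X.X.X auth-port 1812 acct-port 1813
 ! Periodic test: send a test request after 5 minutes without traffic to this server
 ! (probe-user is a hypothetical account; it does not need to exist on the server)
 automate-tester username probe-user idle-time 5
 key 1234abcd
!
! Alternative, where supported: probe only while the server is marked dead
! automate-tester username probe-user probe-on
!
! Mark a server dead after 10 seconds without a response and 3 consecutive timeouts
radius-server dead-criteria time 10 tries 3
! Skip a dead server for 5 minutes before trying it again
radius-server deadtime 5
```

The same automate-tester line would need to be repeated under X-RADIUS-2 and X-RADIUS-3 for all three servers to be tested.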
11-14-2020 02:26 PM - edited 11-14-2020 02:26 PM
If such events are always associated with the accounting stop requests that follow authentication failures, then you might consider removing this line:
aaa accounting send stop-record authentication failure
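A minimal sketch of how that change could be tried and checked on one IOS XE device, assuming it is tested on a single switch first and that the RADIUS_DEAD events are then watched over at least one six-hour cycle:

```
configure terminal
 ! Stop sending accounting stop records for failed authentications
 no aaa accounting send stop-record authentication failure
end
! Confirm the line is gone from the running configuration
show running-config | include aaa accounting
! Check the current state (UP/DEAD) and counters for each RADIUS server
show aaa servers
```

If the servers are still marked dead on the same schedule after this change, the stop records are probably not the trigger and the line can be restored.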
Regarding the configuration command automate-tester, see