5436 RADIUS packet already in the process

alliasneo1 · ‎01-07-2025

Hi, has anyone seen this before or offer any advice?

Running ISE 3.4.0.608 with no patches installed.

Lots of devices are no longer matching against my policies. Occasionally they will match, as in the example below, but they then continue to show up in the logs as failures. When I click on the details icon I get the following message:

Event	5436 RADIUS packet already in the process
Failure Reason	5436 RADIUS packet already in the process
Resolution	Check whether the Average RADIUS Request Latency statistic is close to or exceeds the client's RADIUS request timeout. If so, determine whether the latency is caused by a slow external Identity Store or because this instance of ISE is being overloaded. To resolve this, increase the client's RADIUS request timeout, using a faster or additional, external Identity Stores, or reduce the load on this instance of ISE.
Root cause	Ignoring this request because it is a duplicate of another packet that is currently being processed

MHM Cisco World · ‎01-07-2025

....

MHM

alliasneo1 · ‎01-07-2025

We have 2 deployment nodes with Admin, Monitoring and Policy Service on both and pxGrid on one.

MHM Cisco World · ‎01-07-2025

...

MHM

alliasneo1 · ‎01-07-2025

"And what load balancing do you use?" - I'm not sure I've set load balancing up. I've registered both nodes, one as primary and one as secondary. What else would I need to do?

"How did you configure the NAD to send requests to both PSNs?" - The switch has both ISE nodes configured under the radius and aaa config.

MHM Cisco World · ‎01-07-2025

...

MHM

alliasneo1 · ‎01-08-2025

I've now upgraded to 3.4.0.608 Patch 1 but I'm still seeing the same errors.

Arne Bier · ‎01-08-2025

Is this happening on switches or wireless?

What does your IOS config look like?

show run | section radius
show derived-config interface x/y/z (example switch interface with NAC enabled)
show access-session int x/y/z detail

What endpoint is this (Windows PC, etc.?) - does the endpoint have a 802.1X supplicant configured?

Cisco best practice: RADIUS Accounting is normally set to send Interim-Updates only when the Device Sensor (if configured) reports a change and, at the latest, every 2880 minutes (48 hours) if there have been no updates.

alliasneo1 · ‎01-09-2025

#sh run | section radius
aaa group server radius XXX
 server name XXXX
 server name XXXX
 ip radius source-interface XXXX
 load-balance method least-outstanding
aaa authorization auth-proxy default group radius
aaa accounting system default start-stop group radius
aaa server radius dynamic-author
 client XXXX server-key XXXX
 client XXXX server-key XXXX
 auth-type any
radius-server attribute 44 include-in-access-req default-vrf
radius-server attribute 6 on-for-login-auth
radius-server attribute 6 support-multiple
radius-server attribute 8 include-in-access-req
radius-server attribute 25 access-request include
radius-server attribute 31 mac format ietf upper-case
radius-server attribute 31 send nas-port-detail
radius-server dead-criteria time 5 tries 3
radius-server deadtime 10
radius server XXXX
 address ipv4 XXXX auth-port 1812 acct-port 1813
 timeout 5
 retransmit 3
 key XXXX
radius server XXXX
 address ipv4 XXXX auth-port 1812 acct-port 1813
 timeout 5
 retransmit 3
 key XXXX
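
For anyone comparing these values against the ISE resolution text ("Average RADIUS Request Latency ... close to or exceeds the client's RADIUS request timeout"), here is a rough Python sketch of the arithmetic. It assumes one initial attempt plus `retransmit` retries per server, each waiting `timeout` seconds, and it ignores dead-criteria and load balancing. The function names and the 80% margin are made up for illustration and are not anything ISE or IOS-XE defines.

```python
def radius_retry_window(timeout_s: int, retransmit: int, servers: int = 1) -> int:
    """Approximate worst-case seconds the switch keeps retrying a request.

    Assumption: 1 initial attempt + `retransmit` retries per server, each
    waiting `timeout_s` seconds, with servers tried one after another.
    """
    attempts_per_server = 1 + retransmit
    return timeout_s * attempts_per_server * servers


def latency_is_risky(avg_latency_ms: float, timeout_s: int, margin: float = 0.8) -> bool:
    """True when the average latency reaches `margin` of the client timeout.

    The 0.8 margin is a hypothetical threshold chosen for this sketch.
    """
    return avg_latency_ms >= timeout_s * 1000 * margin


if __name__ == "__main__":
    # Values from the config above: timeout 5, retransmit 3, two servers.
    print(radius_retry_window(5, 3))      # per server
    print(radius_retry_window(5, 3, 2))   # across both servers
    print(latency_is_risky(4500, 5))      # 4.5 s average against a 5 s timeout
```

If the average latency reported by ISE gets anywhere near the 5 second timeout, the switch retransmits while ISE is still working on the first copy, which is the situation the 5436 root cause describes.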

These are all 9200 switches.

This is happening with devices that are not configured for dot1x. For example, a Windows laptop with no dot1x supplicant doesn't authenticate, but even after it is unplugged it still shows up under #sh authentication sessions, and the authentication repeats every 60 seconds on the port. If I manually go in and clear the sessions, then it clears out.

Arne Bier · ‎01-19-2025

I don't have an IBNS 1.0 switch to check the command syntax, but you should never have sessions showing AFTER a physical interface is disconnected. Unless ... is the disconnected device connected to the back of a desk phone (with the phone physically attached to the switch)? If the phone is not configured correctly, it won't send the disconnect on behalf of the connected device.

In IBNS 2.0 there is a very clear policy that ensures disconnected endpoints are cleared:

policy-map type control subscriber PORT-AUTH-POLICY-I
...
...
  event inactivity-timeout match-all
    10 class always do-until-failure
     10 clear-session
...
...

Arne Bier · ‎01-07-2025

Regarding RADIUS load balancing in IOS-XE devices, it's really simple and very effective. One command under the aaa group statement - e.g. in my example, the aaa group is called "dnac-client-radius-group" :

conf t
aaa group server radius dnac-client-radius-group
  load-balance method least-outstanding
  end
wr mem

You can check the effectiveness of the load balancing with the "show aaa servers" command:

CSR#show aaa servers | in request
     Authen: request 22, timeouts 20, failover 0, retransmission 15
     Author: request 3, timeouts 0, failover 0, retransmission 0
     Account: request 0, timeouts 0, failover 0, retransmission 0
     Authen: request 22, timeouts 20, failover 0, retransmission 15
     Author: request 0, timeouts 0, failover 0, retransmission 0
     Account: request 0, timeouts 0, failover 0, retransmission 0

The first three rows are RADIUS server 1, and the last three rows are RADIUS server 2.

You should also reset the counters after enabling load balancing to see an accurate result:

clear aaa counters servers radius all

alliasneo1 · ‎01-08-2025

Hi, thanks for this, it looks like it's working now:

Switch#sh aaa servers | in req
Authen: request 63, timeouts 0, failover 0, retransmission 0
Author: request 1, timeouts 0, failover 0, retransmission 0
Account: request 31, timeouts 0, failover 0, retransmission 0
Authen: request 61, timeouts 0, failover 0, retransmission 0
Author: request 0, timeouts 0, failover 0, retransmission 0
Account: request 19, timeouts 0, failover 0, retransmission 0

I didn't have this configured on any switches so I think that's good now
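
If you want to check the split across many switches rather than by eye, output like the above can be parsed with a few lines of Python. This is only a sketch: it assumes each server's block starts with an "Authen:" line (as described earlier in the thread, three rows per server), and the function names are hypothetical.

```python
import re

# Sample taken from the "sh aaa servers | in req" output above.
SAMPLE = """\
Authen: request 63, timeouts 0, failover 0, retransmission 0
Author: request 1, timeouts 0, failover 0, retransmission 0
Account: request 31, timeouts 0, failover 0, retransmission 0
Authen: request 61, timeouts 0, failover 0, retransmission 0
Author: request 0, timeouts 0, failover 0, retransmission 0
Account: request 19, timeouts 0, failover 0, retransmission 0
"""

LINE_RE = re.compile(
    r"^\s*(Authen|Author|Account):\s*request (\d+), timeouts (\d+), "
    r"failover (\d+), retransmission (\d+)\s*$"
)
KEYS = ("request", "timeouts", "failover", "retransmission")


def parse_aaa_counters(text):
    """Return one dict per server, keyed by Authen/Author/Account."""
    servers = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        kind = match.group(1)
        if kind == "Authen":
            # Assumption: an Authen row opens each server's block.
            servers.append({})
        if not servers:
            continue
        servers[-1][kind] = dict(zip(KEYS, map(int, match.groups()[1:])))
    return servers


def authen_share(servers):
    """Fraction of authentication requests handled by each server."""
    total = sum(s["Authen"]["request"] for s in servers)
    return [s["Authen"]["request"] / total if total else 0.0 for s in servers]


if __name__ == "__main__":
    parsed = parse_aaa_counters(SAMPLE)
    for index, share in enumerate(authen_share(parsed), start=1):
        print(f"server {index}: {share:.0%} of authen requests, "
              f"{parsed[index - 1]['Authen']['timeouts']} timeouts")
```

A roughly even share with zero timeouts and retransmissions, as in this sample, suggests the least-outstanding method is spreading requests across both PSNs.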

MHM Cisco World · ‎01-08-2025

...

MHM

MHM Cisco World · ‎01-10-2025

Sorry, I am busy now; maybe another VIP can help you.

Good luck

MHM

Rob Ingram · ‎01-07-2025

@alliasneo1 install ISE 3.4 patch 1, you could be hitting this bug https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwm38826

Symptom: RADIUS Packets are incorrectly being dropped with failure reason "RADIUS: RADIUS packet already in the process" after upgrading to or installing ISE 3.4.

Conditions: ISE is running 3.4.0.608 with the "Reject RADIUS requests from clients with repeated failures" RADIUS suppression feature enabled.

Workaround: Disable RADIUS suppression for "Reject RADIUS requests from clients with repeated failures" in Administration > System > Settings > Protocols > RADIUS and restart services in the PSN using "app stop ise" and "app start ise" to clear the sessions stuck in the duplicate manager.

This is resolved in ISE 3.4 patch 1, so install the latest patch.

https://www.cisco.com/c/en/us/td/docs/security/ise/3-4/release_notes/b_ise_34_RN.html#c-resolved_caveats_34p1