Does anyone know the exact communication flow when ISE sends a CoA because of an endpoint profiler change? Does the PSN that owns the endpoint send the CoA (UDP/1700) itself, or is the PAN/MNT always involved?
I have a fully distributed ISE 3.0 deployment (Dedicated PANs, dedicated MNTs, many PSNs) where all the nodes are in the DC (including the PSNs). Profiling has been working well and the sessions get a re-auth when ISE profiles an endpoint. No CoA errors.
A new PSN has since been added to the deployment outside of the data centre (over a WAN). With that PSN I get CoA failures after profiling (ISE doesn't get the CoA ACK). Yet, when I perform a manual session reauth for endpoints on that "remote PSN", I don't get any errors. The CoA config on the NAS in the remote location is correct (I have seen the CoA and the CoA ACK in Wireshark).
I then captured a profiling use case (I deleted the endpoint in ISE prior to starting). I ran tcpdump on both PANs, both MNTs and the remote PSN. I didn't see any UDP/1700 or UDP/3799 traffic relating to that endpoint when ISE learned the new endpoint and profiled it. The session is created correctly, but the Live Logs show multiple CoA errors.
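For anyone repeating this kind of capture, here is a minimal Python sketch for checking whether a captured UDP payload is a dynamic-authorization packet at all. It assumes you have already extracted the raw UDP payload and ports from the capture yourself. The function name `classify_radius` and the `COA_PORTS` set are hypothetical names for illustration, not part of ISE or any library. The packet codes are the ones defined in RFC 5176.

```python
import struct

# RADIUS dynamic-authorization packet codes (RFC 5176).
RADIUS_CODES = {
    40: "Disconnect-Request",
    41: "Disconnect-ACK",
    42: "Disconnect-NAK",
    43: "CoA-Request",
    44: "CoA-ACK",
    45: "CoA-NAK",
}

# Ports mentioned in this thread; adjust if your NAD listens elsewhere (assumption).
COA_PORTS = {1700, 3799}


def classify_radius(payload: bytes, src_port: int, dst_port: int):
    """Return (message_name, identifier) for a CoA/Disconnect packet, else None."""
    if not ({src_port, dst_port} & COA_PORTS):
        return None
    # RADIUS header: code (1), identifier (1), length (2), authenticator (16).
    if len(payload) < 20:
        return None
    code, identifier, length = struct.unpack("!BBH", payload[:4])
    if length < 20 or length > len(payload):
        return None
    name = RADIUS_CODES.get(code)
    if name is None:
        return None
    return name, identifier


# Hand-built 20-byte CoA-Request with identifier 7 and a zeroed authenticator.
sample = bytes([43, 7]) + struct.pack("!H", 20) + bytes(16)
result = classify_radius(sample, 50000, 1700)
```

Matching the identifier of a CoA-Request against a later CoA-ACK or CoA-NAK from the same NAD is one way to tell whether a request went unanswered.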
I suspect that either the PAN or MNT is initiating some communication to the remote PSN, and that a firewall is blocking it. If not, then what could possibly cause this, given that a manual CoA works just fine (meaning that the RADIUS shared secret and NAS config are correct)? I can see TCP communications (looks healthy) between the remote PSN and the PAN/MNTs in the data centre.
Hi @Arne Bier. I haven't done a packet capture on this for a few years, but last time I did one it confirmed that a triggered CoA (after profiling) is initiated directly by the owning PSN (UDP/1700). When a manual CoA is issued from the GUI, the PAN sends a CoA (UDP/3799) to the owning PSN which then sends the CoA (UDP/1700) to the NAD. Upon ACK from the NAD (UDP/1700), the PSN sends an ACK to the PAN (UDP/3799). AFAIK, this behaviour has not changed.
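The two flows described above can be written down as data, which makes it easier to decide where to run tcpdump. This is only a sketch of the behaviour as recalled in this post, not something verified against ISE 3.0. The names `Hop`, `FLOWS`, `coa_hops` and `ise_nodes_involved` are hypothetical, and the CoA-ACK hop in the profiling flow is an assumption based on normal RADIUS CoA behaviour.

```python
from typing import List, NamedTuple, Set


class Hop(NamedTuple):
    src: str
    dst: str
    udp_port: int
    message: str


# Flows as described in this thread (unverified on current ISE releases).
FLOWS = {
    # CoA triggered by a profiling change: the owning PSN talks to the NAD directly.
    "profiling": [
        Hop("PSN", "NAD", 1700, "CoA-Request"),
        Hop("NAD", "PSN", 1700, "CoA-ACK"),  # assumed reply from the NAD
    ],
    # Manual CoA issued from the GUI: the PAN relays through the owning PSN.
    "manual": [
        Hop("PAN", "PSN", 3799, "CoA-Request"),
        Hop("PSN", "NAD", 1700, "CoA-Request"),
        Hop("NAD", "PSN", 1700, "CoA-ACK"),
        Hop("PSN", "PAN", 3799, "CoA-ACK"),
    ],
}


def coa_hops(trigger: str) -> List[Hop]:
    """Return the expected hops for a 'profiling' or 'manual' CoA."""
    return FLOWS[trigger]


def ise_nodes_involved(trigger: str) -> Set[str]:
    """Return the ISE personas expected to send or receive CoA traffic."""
    return {
        node
        for hop in FLOWS[trigger]
        for node in (hop.src, hop.dst)
        if node != "NAD"
    }


for hop in coa_hops("manual"):
    print(f"{hop.src} -> {hop.dst} UDP/{hop.udp_port} {hop.message}")
```

If this model holds, a capture on the owning PSN alone should show every packet of a profiling-triggered CoA, while a manual CoA should also show UDP/3799 on the PAN.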
The CoA behaviour is configured in two places, globally (Admin > System > Settings > Profiling) and in the Profiling Policy itself. I know you mention that it works at other sites, but if you haven't already, I would suggest confirming that the Global setting is 'Reauth' and the Profiling Policy that's being hit in your remote site testing is not overriding that.
If the global and policy level CoA settings are correct, you'll likely have to set the Profiling logs to debug on the PSN using the Debug Wizard and see if the logs give you any indication as to what is happening.
Thanks @Greg Gibbs - my experience with ISE 3.0 showed that when I perform a manual CoA Reauth in Live Sessions, the UDP/1700 was initiated directly from the PSN that owns the endpoint. When ISE profiles that same endpoint for the first time, I don't see any UDP/1700 on any PAN, MNT or PSN. I think I may have an odd issue here (a WLC defined in ISE with two IP addresses, due to the way the network was set up). In Live Log Details I see both WLC IP addresses referenced, and that is probably causing the issue.
I am going to simplify the WLC setup and then only have one IP address for the WLC - that should hopefully fix it.