Posture 2.2-style

craig.beck
Level 1

Hi Everyone,

Has anyone deployed ISE posture using the new 2.2-style method without URL-redirection?

I'm keen to hear if anyone has, how well it has worked for them, any gotchas, workarounds to issues, etc.

I've deployed a distributed solution using 2.2 (currently patch 6, but I will be pulling that ASAP due to a defect) and wanted to make use of the functionality, but it hasn't been a good experience so far, so we have gone back to doing posture via URL-redirection.  We have seen that if the client contacts a PSN which isn't the current session owner, but a session still exists at that [other] PSN for that client for whatever reason, the process breaks: a CoA containing a session-id that is unknown to the switch is sent, leaving our client stuck with a restrictive ACL and in the wrong VLAN.  Accounting is configured, but in some instances the PSN doesn't receive the STOP or, particularly where the PC/laptop is behind a (non-Cisco) phone, a STOP is never sent.  I understand why this is a problem, but I would have thought it would have been considered, given that the method is supposed to enable non-Cisco NADs to be used.

My understanding is that the client can use the "call-home" list to find a PSN during posture.  That list can contain any or all of your PSNs.  The contacted PSN checks its session database to see whether a client with a matching MAC/IP is known.  If it is, that PSN will return the CoA itself; if not, it will check the MnT and then direct the client to the correct PSN.  When the MnT is checked, it works.
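
The lookup order described above can be sketched in a few lines of Python.  This is only an illustration of the behaviour as I understand it from this post, not ISE's actual implementation; the `Psn` class, the `mnt_sessions` dictionary and the `find_session_owner` function are hypothetical names made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Psn:
    """Hypothetical model of a PSN with its local session table."""
    name: str
    # Local session table: client MAC address -> session ID.
    sessions: dict = field(default_factory=dict)


def find_session_owner(contacted: Psn, mac: str, mnt_sessions: dict) -> str:
    """Return the name of the PSN that ends up handling posture.

    mnt_sessions stands in for the MnT session directory and maps a
    client MAC address to (owner PSN name, session ID).
    """
    # Step 1: the contacted PSN trusts its own session table first.
    if mac in contacted.sessions:
        return contacted.name
    # Step 2: the MnT is consulted only when no local session exists.
    owner_name, _session_id = mnt_sessions[mac]
    return owner_name


mac = "00:11:22:33:44:55"
psn1 = Psn("psn1", {mac: "old-session"})  # stale entry: no accounting STOP arrived
psn3 = Psn("psn3")                        # no local entry for this client
mnt = {mac: ("psn2", "new-session")}      # MnT knows the newer session on psn2

print(find_session_owner(psn3, mac, mnt))  # psn2: MnT is checked, correct owner
print(find_session_owner(psn1, mac, mnt))  # psn1: stale local session wins
```

The second call is the failure case: the PSN holding the stale session never consults the MnT, so any CoA it sends would carry a session-id the switch no longer knows.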

I've raised a TAC case for this and it seems that the logic in 2.2 posture isn't quite there yet and could be improved.  My feeling is that the MnT should be checked each time, to at least check the timestamp of the session.  That said, TAC told me to do it the old way because the 2.2 method could break easily if the accounting packet isn't received by the original PSN, but that doesn't sit well with me or my customer.  All of the literature around 2.2 posture says it should work, but it doesn't work 100% of the time.

Is this something that works better in 2.3?

Cheers,

Craig

imbashir
Cisco Employee

Hi Craig, correct, this is how it is designed to work.  Could you please forward the SR or TAC case ID?

Thanks

Imran

First understand that the ISE 2.2 Posture option was not intended to replace Posture redirection, but rather to augment it or serve as a fallback in the absence of redirection.  This is why posture discovery still starts by attempting redirected discovery first, i.e. letting the system automatically direct the agent to the current PSN as quickly as possible.

In the event that redirection fails or is not provided by the network, the agent proceeds to a phase 2 where it leverages different methods, including the call home list.  Yes, this is a list of PSNs (or LB targets) that will hit a PSN in the system.  The PSN does not send any CoA at this point.  If the client started a new RADIUS session, then that should be updated at the MnT even if the old session was not terminated.

If I interpret the issue correctly, the client returns to the original PSN for posture, which still holds an active session for that endpoint because it was never terminated, and that PSN does not check with the MnT whether another session has started elsewhere.  Note that I do not think this issue is necessarily limited to the ISE 2.2 alternative discovery flow, as it is possible for the agent to hit the original PSN via ConnectionData.XML, and that PSN will think the session is still active.

Certainly, failure to reliably send RADIUS Accounting Stops can complicate the situation.  Make sure idle timers are configured to terminate idle clients behind non-Cisco phones.  As you may know, Cisco phones can immediately signal the PC disconnect to the NAD to expedite session termination.

Please continue to work with TAC and raise a defect if one has not already been filed.  The reason to NOT query the MnT for every connection is to limit the load and latency incurred if all PSNs had to send to, and wait for an update from, the MnT for every posture session / reassessment.  That said, there is certainly opportunity to improve the flow to handle exception cases.  In the meantime, I would look at ways to improve the communication of session termination for the 3rd-party phones.

Hi Craig,

Thanks for the reply.

I do understand (now, after speaking with TAC) that 2.2 posture wasn't intended to replace URL-redirection, but that's not how the documentation makes it sound.  The documentation I read states:

"When you consider pre ISE 2.2 flow posture relies on NADs not only for user authentication and access restriction but as well for provisioning of information to the agent software about specific ISE node which has to be contacted. Information about ISE node is returned to the agent software as part of redirection process.

Historically, redirection support either on NAD or on ISE side was a key requirement for posture implementation. In ISE 2.2 requirement to support redirection is eliminated for both initial client provisioning and posture process."

There's no caveat mentioned there that this is not the preferred method.  It just says to me that the 2.2 method is the new way to do it.  Also, customers reading that statement simply assume that URL-redirection isn't required anymore, thus making deployment simpler.  It certainly sounds like it makes life easier.

We have configured idle timers in the absence of Cisco phones; we set them to 10 mins with a 12 hour reauthentication timer.  Do you think we should do it differently, maybe by reducing the timers?  The reason we used these timers was as you say: to enable the switch to send an accounting STOP to the PSN when the client disconnects, as the phones don't support second-port disconnect, without disrupting sessions for fixed PCs, etc.  The problem we have in some cases is that if a user takes a laptop to a meeting room, for example, the 10 minute timer hasn't expired, so they could hit a new PSN for auth.  This often breaks posture, as the AnyConnect ConnectionData.xml file appears to take preference and sends AnyConnect to the last-known PSN.  I have requested a feature enhancement for this via the TAC case, as I suggested in my opening post, and I appreciate that the reason this isn't done today is to reduce load on the MnT.  IMO though, the process would benefit from a lookup every time, especially if the solution is supposed to allow 3rd-party NADs to be used.

Cheers,

Craig

I would consider lowering the idle timer to, for example, 5 minutes, with the expectation that endpoints will generate traffic to keep the session active.  Of course, if clients go into sleep/hibernate, then sessions may be terminated.  IP Device Tracking may also be able to keep the connection active via its poller.
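
The effect of the shorter timer can be shown with a deliberately simplified sketch.  It assumes the NAD ends the old session, and sends the accounting STOP, only once the port has been idle for the full timeout; the function name and the 7-minute figure are made up for illustration.

```python
def old_session_still_active(idle_timeout_min: float, minutes_idle: float) -> bool:
    """Return True while the NAD still holds the old session.

    Simplified model: the NAD only ends the session (and sends the
    accounting STOP) once the port has been idle for the full timeout.
    """
    return minutes_idle < idle_timeout_min


# Hypothetical case: the laptop reconnects elsewhere 7 minutes after unplugging.
minutes_idle = 7
for timeout in (10, 5):
    print(timeout, old_session_still_active(timeout, minutes_idle))
```

With the 10 minute timer the old session is still present when the laptop reconnects; with 5 minutes it has already been cleared.  A shorter timer narrows the window but does not remove it.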

Although an MnT query on EVERY connection may seem to be a favorable option upon initial inspection, it would be very expensive from a PSN/MnT resource perspective, depending on the number of endpoints doing posture in the current framework.  The MnT query is currently used as a last resort for PSN discovery.  Here the suggestion is to potentially incur heavy load on the entire system to overcome limitations of 3rd-party phones.

Please escalate issue with TAC to help drive the prioritization to address these exception cases.

Thanks again, Craig.  I'll give the reduced idle-timer a try.

Thinking about querying the MnT on every connection... doesn't that happen already when the PSN authenticates the client (if posture-applicable), to check posture status?  If so, adding a check during posture PSN-lookup would be a less-frequent occurrence, wouldn't it, especially if we only check posture once per day, for example?  I don't think that would put a heavy load on the MnT, given that the traffic should be a few KB at most, unless the actual lookup process strains the CPU/RAM/Disk on the MnT to such a large extent?

Posture status is always tracked at session level by the "owning" PSN.  Posture Lease is actually replicated to all PSNs, so that is also a local lookup.  The MnT node is often a remote node and generally not queried directly for general AAA / service operations.  Considering that ISE supports 500k concurrent sessions, the potential load is huge.  Also consider that some clients may authenticate / reauthenticate many times per hour.  For MnT scale, we assume 10 auths / hr for mobile deployments and 1 / hr for wired, although a few misbehaving clients can greatly skew those estimates.
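
As a back-of-the-envelope illustration of those figures, the sketch below turns them into a query rate.  It assumes, purely for the sake of argument, that every authentication triggers exactly one MnT lookup and that all 500k sessions are of one type, so it is an upper bound rather than a measured load.

```python
CONCURRENT_SESSIONS = 500_000  # maximum concurrent sessions quoted above
SECONDS_PER_HOUR = 3600


def mnt_queries_per_second(sessions: int, auths_per_hour: float) -> float:
    """Query rate if every authentication caused one MnT lookup (assumption)."""
    return sessions * auths_per_hour / SECONDS_PER_HOUR


mobile = mnt_queries_per_second(CONCURRENT_SESSIONS, 10)  # 10 auths/hr assumed
wired = mnt_queries_per_second(CONCURRENT_SESSIONS, 1)    # 1 auth/hr assumed
print(f"mobile: {mobile:.0f}/s, wired: {wired:.0f}/s")    # mobile: 1389/s, wired: 139/s
```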

Cheers, Craig.

That's interesting then.

If posture lease is replicated to all PSNs, why not replicate the posture status, or is it that way already?  Is that why we do a MnT lookup when session-owner isn't the authenticating PSN?  I'm assuming that's because the session info isn't replicated across all PSNs? [EDIT] Or are you saying that the lease is the 24h timer, and that is replicated to the PSNs?  (That's how I understand it to work already)

That brings me back to why the AnyConnect client continually floats around PSNs during posture without URL-redirection.  If the client postures successfully in the morning, for argument's sake, then disconnects (with no acct STOP) and connects somewhere else, then tries to posture at its last-known PSN, why does it then fail?  If we set posture to check every time, I can understand why it breaks, but not when we set posture to check daily.  Surely the client would just hit the compliant rule straight away and AnyConnect would show that, as the PSN thinks the client is already posture-compliant?

Craig,

Replication of a posture lease flag which is retained for many hours or days is a very different situation than replication of active session state, which is essentially what you are proposing.  We do not replicate session state across PSNs, which is the reason the central session directory must be referenced as a last resort measure.  Other facets of the discovery process will try to prevent successive MnT queries via updates to a local table that lists previously connected PSNs.

Ideally clients should not continually float between PSNs which is why I recommend use of load balancers with persistence.

Posture Lease can help reduce the impact of what you are currently experiencing.   Also, disconnects without Accounting Stops should not be considered typical case.

Please continue to work issue with TAC and escalate as needed.  We won't solve current behavior or limitations in forum.  I can only provide best practice guidance to help reduce such exceptions.

Regards,

Craig

Thanks, Craig, but I think you misunderstand somewhat.

"Replication of a posture lease flag which is retained for many hours or days is a very different situation that replication of active session state, which is essentially what you are proposing"

No, that's not what I'm proposing at all.  What I am proposing is to check the MnT every time a posture check occurs, as it is possible for the client to talk to ANY PSN in posture 2.2 mode, and in the world of networking UDP traffic isn't guaranteed to be delivered.  This means we can't 100% rely on acct STOP to tell PSN when session is over.
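
As a rough illustration of the check being proposed here (a hypothetical sketch, not an existing ISE feature), the contacted PSN would compare its local session against the MnT record and only keep ownership if the MnT has nothing newer.  The `Session` type and `resolve_owner` function are invented names.

```python
from typing import NamedTuple, Optional


class Session(NamedTuple):
    owner: str      # PSN that authenticated the session
    started: float  # session start time, in seconds since an arbitrary epoch


def resolve_owner(contacted: str, local: Optional[Session], mnt: Session) -> str:
    """Pick the PSN for posture, always consulting the MnT record."""
    # Keep the local session only if the MnT has nothing newer.
    if local is not None and local.started >= mnt.started:
        return contacted
    return mnt.owner


stale = Session(owner="psn1", started=100.0)    # never closed by an accounting STOP
current = Session(owner="psn2", started=900.0)  # newer session recorded at the MnT

print(resolve_owner("psn1", stale, current))    # psn2: stale session is ignored
print(resolve_owner("psn2", current, current))  # psn2: owner keeps the session
print(resolve_owner("psn3", None, current))     # psn2: no local session at all
```

The cost of this variant is one MnT lookup per posture check, which is the load trade-off discussed earlier in the thread.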

We have loadbalancers with persistence, using your guide, but it didn't work properly.  We changed it, as per my input in a previous thread, and it worked better.  Saying that though, posture has nothing whatsoever to do with loadbalancers so I'm unsure why you've made that statement.  With posture 2.2, the client uses the call-home list and connectiondata.xml file to find a PSN.  This is the bit I'm saying floats around for clients.  There's not any real logic in what it does, especially in my testing.  Sometimes the client goes to the last-known PSN and others it just picks any of the others.

Also, I was hoping that other forum users would share their experiences, but that's unlikely now you've marked the question as answered.

My intent is to assist and not to debate.  The discussion keeps returning to an enhancement/change request in current behavior and this is not the proper forum for such requests.  That should be handled through Cisco account team and submitted to the Business Unit. You can also escalate issue with TAC and that may raise priority for addressing any identified defects or incorrect behavior.

I understood your request for an MnT query on every connect.  What I attempted to explain is the potentially negative impact of such a behavior and why we chose not to execute that option.  I also attempted to explain the difference between posture state and posture lease and why we do not replicate session state across PSNs.

Posture state can certainly be impacted by LB design.  I do not recall previous discussion on this topic but glad you were able to tune system to your requirements.  If you feel that there is an error in a guide, feel free to send reference and I can track for possible update to a future version of the guide.

There is definitely a logic to the current discovery mechanism but certainly the results can be impacted by your configuration and network connectivity at time of discovery.  I understand that you are hitting an exception case due to current logic which is aggravated by inconsistent or delayed transmittal of RADIUS Accounting Stops, so my advice here was not to assume absolute resolution for your exception case, but to reduce its occurrence and impact.

To address remaining exception cases due to current processing logic, I again recommend escalation of TAC case and working with account team for any necessary enhancements.

Regards,
Craig

Mr Hyps,

Although I am suggesting that the process needs enhancement, I am not asking for that here.  My initial question was a simple one:

"Has anyone deployed ISE posture using the new 2.2-style method without URL-redirection?

I'm keen to hear if anyone has, how well it has worked for them, any gotchas, workarounds to issues, etc."

I wanted other users to tell me what their experiences were and what, if anything, they have done to make the deployment work for them.  You responded and marked the question as answered, even though the answer was not to my satisfaction in that I haven't had any feedback from any other users.  This forum enables me to ask questions, does it not?

I completely appreciate that you are attempting to assist, however I feel that you are also removing focus somewhat from my question.  The documentation describes the 2.2 posture-style discovery method as the new way of doing things, and nowhere in any literature can I see anything to contradict that.  If this is not the case, I understand, however my customer is giving me a hard time, as a Cisco partner, because ISE doesn't do what it claims, or not very well at the very least.  Again, the documentation here (ISE posture style comparison for pre and post 2.2 - Cisco) says this:

"When you consider pre ISE 2.2 flow posture relies on NADs not only for user authentication and access restriction but as well for provisioning of information to the agent software about specific ISE node which has to be contacted. Information about ISE node is returned to the agent software as part of redirection process.

Historically, redirection support either on NAD or on ISE side was a key requirement for posture implementation. In ISE 2.2 requirement to support redirection is eliminated for both initial client provisioning and posture process."

The word "historically" there implies to me that redirection support was the old way.  The statement; "In ISE 2.2 requirement to support redirection is eliminated for both initial client provisioning and posture process.", further bolsters my assumption that 2.2 posture is the new way, not an alternative-but-less-preferred method.  Apologies if I misunderstood.

I am pursuing the issues via TAC already and continue to do so.  The TAC engineer has confirmed that the logic doesn't seem to be 100% and that, as you say, URL-redirection is the way to go.  As you can appreciate, this is disappointing to hear, given the statements in the documentation I've read, as this means we have to deploy ACLs to nearly 1000 switches, maintain them, etc, and also learn that the functionality which offers 3rd-party support for non-Cisco NADS won't actually really be a feasible option.

I understand that posture could be affected by LB design, however once the auth/acct transactions are dealt with, the client is free to go wherever it chooses using posture 2.2.  That was my point and seems less than reliable without some additional checks.  I appreciate why it works this way today, but it could be improved IMO.  The client talks direct to the PSN so LB is only a router for all intents and purposes and is not involved in the client's decision when choosing a PSN.  I'm sorry if making suggestions offends - that's not my intention, and I'm not asking for enhancements here as I already have via the TAC case.

I'll take this offline and speak with my account team and the BU, as you suggested.

Thanks for your time and valuable input.

Craig

I have Unmarked my response as Correct (Note that I did not mark as such).  This will hopefully allow others to respond to your query as requested to your satisfaction.

Yes, I have worked with accounts that have implemented the new ISE 2.2 Posture Discovery logic but will not bias the feedback that you seek from others.

The primary focus of this feature is to allow deployments that do not support URL redirection, not necessarily as a replacement for networks that support redirect. I will forward the thread to the ISE PM team for Posture so that they can review the wording in documentation and make changes as appropriate.

Yes, this is a forum for asking questions and hope you will continue to leverage the Community for this purpose. Suggestions are certainly welcome, but ultimately these need to be communicated to the proper teams noted above to make these suggestions actionable.

Best Regards.

hi @craig.beck 

Can you share the idle timer and reauthentication timer you set on the profile?  Are they still 10 min and 12 hours?

I have the same issue with the 3rd-party Cisco phone.

hruizman
Cisco Employee

Hello Craig, I will attempt the same as you in order to resolve an issue with a customer.  If redirection is not enabled, how does client provisioning work the first time for an endpoint that does not have AnyConnect installed?  Do we have to go to a given URL (and if so, which one)?

Thank you!