
ISE and Load Balancing

Cisco Employee

So, this is my first blog post on here.  Hope it goes well.

One of the most commonly asked questions of late is how to properly use a load-balancer with Cisco's Identity Services Engine.  Here are some basic guidelines to use when configuring a Load Balancer for the ISE Policy Services Nodes (PSNs).

Understanding terms:

PSN = Policy Services Node.  The PSN is the ISE persona that handles all of the RADIUS requests and makes the policy decisions.  If you are using profiling, the PSN also handles the profiling for you.

VIP = Virtual IP Address.  This is the IP address that the load balancer listens on.  Traffic destined to the VIP is redirected to the real IP addresses of the servers in the Server Farm.

Server Farm = The group of servers that are load balanced when traffic is destined to the VIP.

Endpoint = the actual device accessing the network.

NAD = Network Access Device.  The Access-Layer device (switch / wireless controller) that provides and enforces network access to the endpoint.

SNAT = Source Network Address Translation.  A load-balancer function that hides the source IP address of the NAD behind the load balancer's own address, which allows the load balancer to run "out of band".

General Guidelines


When using a load balancer (anyone's), you must ensure a few things.

  • Each PSN must be reachable by the PAN / MNT directly, without having to go through NAT (routed-mode LB, not NAT).  NO NAT.  This includes the Accounting messages, not just the Authentication ones.
  • Each PSN must also be reachable directly from the clients, for redirections / CWA / Posture, etc.
  • You may want to "hack" the certs to include the VIP FQDN in the SAN field.
  • Perform sticky (aka persistence) based on Calling-Station-ID and Framed-IP-Address.
  • The VIP gets listed as the RADIUS server on each NAD for all 802.1X-related AAA.
  • Each PSN gets listed individually in the Dynamic-Authorization (CoA) configuration.  Use the real IP address of the PSN, not the VIP.
  • The load balancer(s) get listed as NADs in ISE so their test authentications may be answered.
  • ISE uses the Layer-3 address to identify the NAD, not the NAS-IP-Address in the RADIUS packet.  This is another reason to avoid SNAT.
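To make the "sticky" guideline a bit more concrete, here is a minimal Python sketch of what persistence on Calling-Station-ID amounts to.  It is not how any particular load balancer implements it; the `StickyTable` class, the `normalize_mac` helper and the PSN names are all made up for illustration, and it keys only on Calling-Station-ID (a real configuration would typically also consider Framed-IP-Address).

```python
import hashlib


def normalize_mac(calling_station_id: str) -> str:
    """Reduce 00-11-22-AA-BB-CC, 00:11:22:aa:bb:cc or 0011.22aa.bbcc to 001122aabbcc."""
    return "".join(ch for ch in calling_station_id.lower() if ch in "0123456789abcdef")


class StickyTable:
    """Hypothetical persistence table: one endpoint always lands on the same PSN."""

    def __init__(self, psns):
        self.psns = list(psns)
        self.table = {}  # normalized MAC -> PSN currently serving that endpoint

    def pick(self, calling_station_id, healthy=None):
        key = normalize_mac(calling_station_id)
        candidates = list(healthy) if healthy is not None else self.psns
        if not candidates:
            raise RuntimeError("no healthy PSN available")
        current = self.table.get(key)
        if current in candidates:
            return current  # stick to the PSN that already knows this session
        # First request, or the previous PSN failed its probe: choose a new one.
        digest = hashlib.sha256(key.encode()).digest()
        chosen = candidates[int.from_bytes(digest[:4], "big") % len(candidates)]
        self.table[key] = chosen
        return chosen
```

The point of the sketch is only that every RADIUS packet for one endpoint (authentication and accounting alike) should reach the same PSN for as long as that PSN is healthy.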

Failure Scenarios:

  • The VIP is the RADIUS server, so if the entire VIP is down, the NAD should fail over to the secondary data center VIP (listed as the secondary RADIUS server on the NAD).
  • Probes on the load balancers should ensure that RADIUS is responding, as well as HTTPS (at minimum).  LB probes should periodically send test RADIUS messages to each server to confirm that RADIUS is actually responding, not just look for open UDP ports.  The same goes for HTTPS.
  • Use node groups with the L2-adjacent PSNs behind the VIP.  If a session was in process and one of the PSNs in a node group fails, another node-group member will issue a CoA-reauth, forcing the session to begin again.  At this point, the LB should have failed PSN1 because of the probes configured on the LB, so this new authentication request will hit the LB and be directed to a different PSN.
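As a rough illustration of what "send a test RADIUS message" means, the sketch below builds a bare-bones PAP Access-Request following RFC 2865 and treats either an Access-Accept or an Access-Reject as proof that RADIUS is answering.  The function names, the `lb-probe` credentials and the timeout are assumptions for the example, not anything taken from a load-balancer product.  It does not validate the Response Authenticator and omits the Message-Authenticator attribute, which some servers may require.

```python
import hashlib
import os
import socket
import struct

ACCESS_REQUEST, ACCESS_ACCEPT, ACCESS_REJECT = 1, 2, 3


def hide_password(password: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Hide a User-Password value as described in RFC 2865, section 5.2."""
    padded_len = max(16, -(-len(password) // 16) * 16)  # round up to 16-byte blocks
    padded = password.ljust(padded_len, b"\x00")
    hidden, previous = b"", authenticator
    for i in range(0, padded_len, 16):
        mask = hashlib.md5(secret + previous).digest()
        previous = bytes(p ^ m for p, m in zip(padded[i:i + 16], mask))
        hidden += previous
    return hidden


def attribute(attr_type: int, value: bytes) -> bytes:
    """Encode one RADIUS attribute as type, length, value."""
    return struct.pack("!BB", attr_type, len(value) + 2) + value


def build_access_request(username, password, secret, identifier=0, authenticator=None):
    """Build an Access-Request carrying User-Name (1) and User-Password (2)."""
    authenticator = authenticator or os.urandom(16)
    attrs = attribute(1, username.encode())
    attrs += attribute(2, hide_password(password.encode(), secret, authenticator))
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, 20 + len(attrs))
    return header + authenticator + attrs


def radius_is_alive(host, secret, username="lb-probe", password="lb-probe",
                    port=1812, timeout=2.0):
    """Return True if the server answers the probe with an Accept or a Reject."""
    identifier = os.urandom(1)[0]
    packet = build_access_request(username, password, secret, identifier)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (host, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # timeout, ICMP port unreachable, name resolution failure
            return False
    return (len(reply) >= 20 and reply[1] == identifier
            and reply[0] in (ACCESS_ACCEPT, ACCESS_REJECT))
```

A call such as `radius_is_alive("192.0.2.10", b"shared-secret")` would be run against each real PSN address, not the VIP.  Remember that the probing device has to be defined as a NAD in ISE, or the PSN will not answer at all.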

Why can't we use Source NAT (SNAT)?


One of the most common questions when load balancing is: "Why can't we use SNAT?"  Source NAT is a fantastic thing for general load balancing, but not with ISE.  The reasons listed below pertain to ISE version 1.1.x and may change with ISE 1.2+.

Reason #1:  Network Access Device (NAD) will be wrong:
With SNAT, the source Network Access Device will show up in ISE as being the Load-Balancer, NOT the Network Access Device.


Source_is-ACE.png

ISE uses sessionized network authentication.  This means ISE tracks the session along with the NAD, so the NAD and ISE stay in sync about the state and location of the endpoint.  This session also gives ISE the NAD address to send Change of Authorization (CoA) messages to, as well as the location of the endpoint.  We use the source NAD in many different ISE policies, and if all requests always appear to come from the load balancer instead of the NAD, how can we know the location of the endpoint?

Location is not nearly as big a deal as Change of Authorization.  ISE records the Layer-3 address of the NAD from the Layer-3 headers.  There is a RADIUS field known as NAS-IP-Address, which embeds the IP address of the network device in the RADIUS packet.  However, ISE does not currently use that field, and therefore the L3 IP address of the NAD must be correct for the Change of Authorization to be sent to the correct device.  If the NAD appears with the IP address of the load balancer, then ISE will send the Change of Authorization to the load balancer, not the switch.
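Here is a small Python sketch of that behavior.  It pulls the NAS-IP-Address attribute (type 4) out of a raw RADIUS packet and compares it with the Layer-3 source that the packet arrived from.  The function names and the returned dictionary are invented for the example, and the "CoA goes to the Layer-3 source" rule simply mirrors the ISE 1.1.x behavior described above.

```python
import ipaddress
import struct

NAS_IP_ADDRESS = 4  # RADIUS attribute type, RFC 2865


def nas_ip_address(packet: bytes):
    """Return the NAS-IP-Address carried inside a RADIUS packet, or None."""
    if len(packet) < 20:
        return None
    end = min(len(packet), struct.unpack("!H", packet[2:4])[0])
    pos = 20  # attributes start after the 20-byte header
    while pos + 2 <= end:
        attr_type, attr_len = packet[pos], packet[pos + 1]
        if attr_len < 2 or pos + attr_len > end:
            break  # malformed attribute, stop parsing
        if attr_type == NAS_IP_ADDRESS and attr_len == 6:
            return str(ipaddress.IPv4Address(packet[pos + 2:pos + 6]))
        pos += attr_len
    return None


def coa_destination(l3_source: str, packet: bytes) -> dict:
    """Model of the behavior described above: CoA follows the Layer-3 source."""
    inner = nas_ip_address(packet)
    return {
        "coa_sent_to": l3_source,   # taken from the IP header, not the payload
        "nas_ip_address": inner,    # what the NAD said about itself
        "mismatch": inner is not None and inner != l3_source,
    }
```

With SNAT in the path, a request from a switch at 10.1.1.10 would arrive from the load balancer's address (say 10.9.9.5), so `coa_sent_to` would point at the load balancer even though the packet still names the switch in NAS-IP-Address.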

Reason #2:  URL Redirection and Web Portals:
Next, ISE 1.1.x only has one interface that can be used for all functions.  Yes, we can run RADIUS on any of ISE's four interfaces, but the Gigabit 0/0 interface is the ONLY interface for management traffic.  Also, the FQDN of the Policy Services Node is embedded into the certificate for ISE 1.1.x, and that is what gets used for URL redirection for WebAuth, Device Registration, Supplicant Provisioning, etc.

LB - Cert_FQDN.png

So, when the URL Redirection occurs, the endpoints will need to talk to ISE Directly (not the VIP) - and reach the web portals.  The Portals can ONLY exist on the Gigabit 0/0 Interface in 1.1.x.  (This may change in a future version of ISE).

Reason #3:  Routing Tables:
Unless you add a static route to ISE for every NAD subnet, ISE 1.1.x cannot return traffic for a different subnet through a different gateway, only through its default gateway.  Therefore, the load balancer MUST be the default gateway for the ISE PSNs.

Since the load balancer must be the default gateway, all management traffic also flows through the load balancer, unless you physically locate the Policy Administration Node (PAN) and Monitoring & Troubleshooting Node (MNT) behind the load balancer as well (just don't include those in the server farm).
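The routing argument in Reason #3 boils down to an ordinary longest-prefix lookup.  The Python sketch below is only a model of that decision, with made-up addresses and a hypothetical `next_hop` function.  It shows that replies to a NAD leave through the default gateway unless a static route covers the NAD's subnet.

```python
import ipaddress


def next_hop(destination: str, static_routes: dict, default_gateway: str) -> str:
    """Pick the gateway for a reply: most specific static route, else the default."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), gateway)
        for prefix, gateway in static_routes.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if matches:
        # Longest prefix wins, as in any routing table.
        return max(matches, key=lambda match: match[0].prefixlen)[1]
    return default_gateway


# Without static routes, the reply to a NAD at 10.1.1.10 leaves via the default
# gateway, which is why that gateway has to be the load balancer.
print(next_hop("10.1.1.10", {}, "10.0.0.1"))
# With one static route per NAD subnet, the reply could take another path.
print(next_hop("10.1.1.10", {"10.1.1.0/24": "10.0.0.254"}, "10.0.0.1"))
```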

I hope that helps. 

Aaron

38 Comments

Very useful blog. thanks... Doesn't look like i'll be using Cisco ACE modules for doing this though...

Cisco Employee

Thanks.  ACE is not a requirement.  I have customers using non-Cisco Load Balancers successfully.

Beginner

Hi Aaron, great post; you mentioned

"You may want to "hack" the certs to include the VIP fqdn in the SAN field."

Is it supported to use a single 'multiple-SAN' certificate, for EAP, on multiple load-balanced PSN group members?

(assuming you just list the FQDNs of the load-balanced PSnodes in the SAN field)

Therefore (on some user supplicants) users only need to accept the certificate once during PEAP auth, no matter which PSN node member they are balanced to

Thanks

Cisco Employee

So this may need to be my next Blog Post...  Ultimately, there is often a need to include other FQDN's in the SAN field of the ISE Certificate.  It is roadmapped to eventually allow the admin to customize the CSR's and include fields like the SAN Fields - but that is a future item...

Things like the friendly names of the MyDevices Portal or the Sponsor Portal can prompt certificate mismatch errors.  So using OpenSSL to create a CSR from the private/public keys on ISE is a nice trick!

Hopefully, the PSN's are using certificates that are signed by a trusted root of all clients.  That is ultimately the only way to ensure the client will accept the PSN's certificate - without prompting or error. For example it is VERY difficult to get the native Windows supplicant to accept a certificate from a server that it doesn't already trust - and how can you trust it if it never prompts?

Beginner

Hi,

Nice article. Where is the VIP configured?  On Load balancer or ISE?  Does ISE support VIP configurations?

Cisco Employee

The VIP is configured on the Load-Balancer.  It is the Virtual IP Address created by load balancer - that you send the RADIUS traffic to.  The LB then sends the traffic to each server in the Server Farm based on load.

Aaron

Beginner

Aaron,

Thanks for the response.  Is there a way I can avoid Load Balancer (new variable in the setup) and just use ISE with a VIP configured on it directly?  I couldn't find documentation information on setting up VIPs on ISE.  Appreciate the info.

Beginner

Aaron, this is great, we'll be doing a deployment of ISE with ACE next week and I was wondering if any of this has been revised or update since then. Maybe I'll call you if that's ok.

Beginner

Hi,

I am deploying ACE and integrating it with AD for machine authentication. Since the AD domain is domain.local, I will not be able to get public certs for it, as that will freeze after Nov 2015. My question is: can I host the Sponsor and Guest Portals behind ACE, so that I can put a public cert for domain.com etc. on ACE to front-end the Guest Portal? A public CA is needed because the customer does not want its guests to be prompted for a security exception.

We had planned to use LDAP integration instead of AD and use EAP-TLS, but the problem is that my customer wants to allow its users to use guest internet on their personal devices via AD credentials, and LDAP authentication is not working with the Logon Name, only with the Username.

Beginner

Hi,

I would like to load balance PSN nodes. Instead of using a load balancer I would like to use jGroups, which I have seen in release 1.2 presentations. However, I am unable to find any information in the configuration guides detailing how to design and configure this feature. Has anyone seen more detail? Is it even possible to use this feature to load balance without a load balancer such as an F5, or have I misinterpreted the purpose of jGroups?

Thanks in advance.

Cisco Employee

Hi all,

To aijazbeigh:  You will need to host the certificates on ISE itself.  You can use separate certificates for HTTPS than for EAP, which allows different certificates to be used for the Guest/Sponsor portals than for the 802.1X authentications, etc.  However, in 1.2 we added more flexible certificate options, such as using the SAN fields to ensure that the certificates have all the right DNS.Name entries for both .com and .local, etc.  You can even use wildcards if you would like.  Take a look at my Network World blog on wildcard certificates: http://www.networkworld.com/community/blog/26151  - also at my Cisco Live Advanced ISE session on CiscoLive365 (free to access), where you can even watch the VoD:

https://www.ciscolive365.com/connect/sessionDetail.ww?SESSION_ID=8241&backBtn=true

@danhosking:  jGroups are not a load balancing mechanism; they are a communications bus that ISE now uses for the synchronization of its database(s).  There is no configuration for those.  Simply place your Policy Nodes into a node group in the Deployment tab of ISE, and they will start using multicast for local updates and unicast to the Admin Node for global updates.  There are lots of options to load balance without having to spend $$ on an F5 BIG-IP or another load balancer like the ACE.  With Cisco IOS we have built-in RADIUS load balancing.  See my Cisco Live presentation for details; it's on CiscoLive365 (free to access), and you can even watch the VoD:

https://www.ciscolive365.com/connect/sessionDetail.ww?SESSION_ID=8241&backBtn=true

Aaron

Beginner

Thanks for the clarification, Aaron. I misunderstood a slide on 365 live. All good now. Is there any chance of a similar load balancing function on non-IOS WLCs in the future, as I have seen the new 5760 WLC does?

Cisco Employee

Not that I'm ever allowed to comment about roadmap on public forums anyway, but that is actually a question I cannot answer.

The reason it exists in the 5760 is that the 5760/3850s use the same authentication sub-system as any other IOS-based switch.  That provides a LOT of added value, but they are still missing a number of features that are in the traditional AireOS-based controllers.  So, right now, it's certainly a trade-off.  It won't always be that way.

-Aaron

Beginner

Hi Aaron

I have a question regarding profiling load balancing using F5. I configured the NAC agent to point to the VIP, and on the F5 monitor I see sessions well balanced between two PSNs, but in ISE monitoring I see just one node doing all of the profiling. Any idea?  BTW, your Cisco Live 365 session rocks.

Beginner

Aaron,

 

Any updates to this or possible configuration examples for ISE 1.2 with Node Groups and F5?

 

Specific to Wireless Users where there is a single SSID being used, I have suggested to the client to use a F5 given they will have 50k+ Endpoints concurrently in their deployment.  We did not find another method for load balancing the PSN's (Virtual) without an F5 for Wireless Deployments over 10K Endpoints.