How to Determine the Scale of an ISE Deployment

thomas · 03-03-2023

The Cisco Identity Services Engine (ISE) platform was built to be highly distributed (from 2 to 50 PSNs) and scalable (up to 2M active endpoints). There are several major factors in determining the scale and distribution of an ISE deployment, many of which were also covered in our ISE webinar ▷ ISE Deployment Architectures: Nodes, Services & Scale if you prefer to watch and listen.

ISE Nodes and Personas

An ISE node - or server - may be deployed as an appliance, virtual machine (VM), or cloud instance. The ISE software does not care about the underlying physical or virtual platforms of the individual nodes - you may mix and match the server types and sizes for your needs based on the supported ISE Performance and Scale requirements.

The ISE nodes may be configured to handle one or more sets of services - or personas - to delegate the configuration, processing, logging, and sharing of authentication, authorization, and accounting (AAA) requests against your defined network access security policy. These ISE node personas are:

Policy Administration Node (PAN) : Administrative GUI for configuration, policy replication, centralized Guest and BYOD databases, and configuration REST APIs
Monitoring and Troubleshooting Node (MNT) : receives logs from all nodes, replicates to remote logging targets, generates summary Dashboard Views, runs scheduled reports, handles reporting API queries
Policy Service Node (PSN) : processes RADIUS & TACACS requests, runs profiling probes, invokes identity store queries, hosts the Guest/BYOD portals, handles MDM/Posture/Compliance queries, and manages TC-NAC & SXP services
Platform Exchange Grid Node (PXG) : runs the pxGrid controller, authorizes pxGrid Pubs/Subs, publishes pxGrid topics to subscribers, handles ANC/EPS requests, supports additional REST APIs

ISE nodes may run all of the personas in a single, Standalone node deployment in your lab, or you may fully distribute the personas to individual ISE nodes in a Large deployment with up to 54 nodes (2 PANs, 2 MNTs, and up to 50 PSNs and PXGs combined, including at most 4 PXGs) for the greatest scale.
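As a rough illustration, the node limits quoted in this article can be checked with a few lines of Python. This is only a sketch: the `Deployment` class and its `problems` method are hypothetical helpers, not part of any ISE API, and it counts dedicated nodes only (one persona per node). Confirm the limits against the ISE Performance and Scale guide for your release.

```python
from dataclasses import dataclass

# Limits as quoted in this article; verify them for your ISE release.
MAX_PAN, MAX_MNT, MAX_PSN, MAX_PXG, MAX_NODES = 2, 2, 50, 4, 54


@dataclass
class Deployment:
    """Hypothetical helper: counts of dedicated nodes per persona."""
    pan: int
    mnt: int
    psn: int
    pxg: int

    def problems(self) -> list:
        """Return a list of human-readable limit violations (empty if none)."""
        issues = []
        per_persona = [
            ("PAN", self.pan, MAX_PAN),
            ("MNT", self.mnt, MAX_MNT),
            ("PSN", self.psn, MAX_PSN),
            ("PXG", self.pxg, MAX_PXG),
        ]
        for name, count, limit in per_persona:
            if count > limit:
                issues.append(f"{name}: {count} exceeds limit of {limit}")
        total = self.pan + self.mnt + self.psn + self.pxg
        if total > MAX_NODES:
            issues.append(f"total nodes: {total} exceeds limit of {MAX_NODES}")
        return issues


print(Deployment(pan=2, mnt=2, psn=46, pxg=4).problems())
print(Deployment(pan=3, mnt=2, psn=10, pxg=0).problems())
```

The first plan fits within every limit; the second is flagged for having too many PANs.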

Multiple ISE Deployments

An ISE deployment - also affectionately known as an ISE Cube - is a distributed set of ISE nodes (appliances or VMs) configured and synchronized for handling authentication, authorization, and accounting (AAA) requests. ISE deployments may range from 2 to 54 nodes to accommodate a wide range of performance, availability, and distribution. However, even with this flexibility, ISE deployments may need to be broken up into multiple, independent deployments. This happens for the following reasons:

Scale : if a very large customer has more than 2 million active endpoints (not total endpoints), they will need to consider using 2 or more separate deployments
Latency : all ISE nodes must have a network latency <=300ms to the PAN node in order to maintain synchronization of configuration and endpoint changes. The most extreme example of this is on naval or cruise ships that cannot meet the latency requirements over satellite links and run each ship as an independent deployment. If the latency requirement cannot be maintained, you must consider doing one of the following:
- moving all PSNs to more centralized locations to meet the latency requirement
- moving your PAN nodes to a more central location between all of your PSN nodes
- splitting your ISE deployment so the PAN nodes in each deployment can meet the latency requirement
Services : a customer may want to have multiple, separate ISE deployments for increased security or reliability:
- separated by ISE policy services to manage Guest network separately
- separated by protocol handling (RADIUS vs TACACS) to minimize the impact of one AAA service on the other or for separation of responsibilities
- reliability dictates the need for a separate disaster recovery (DR) deployment
Organizations : some organizations may have autonomous groups that manage their own, separate ISE services due to different financial, political, historical, governmental, or policy needs. That's OK, too!
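To make the latency criterion concrete, here is a minimal sketch that flags nodes whose measured latency to the PAN exceeds the 300ms figure quoted above. The function name, node names, and measurements are all hypothetical, and how you collect the latency values (ping, monitoring tool, etc.) is up to you.

```python
MAX_PAN_LATENCY_MS = 300  # limit quoted in this article


def nodes_over_latency_limit(latency_ms_by_node, limit_ms=MAX_PAN_LATENCY_MS):
    """Return the names of nodes whose latency to the PAN exceeds the limit,
    worst offender first."""
    over = {node: ms for node, ms in latency_ms_by_node.items() if ms > limit_ms}
    return sorted(over, key=over.get, reverse=True)


# Hypothetical measurements in milliseconds from each node to the PAN.
measurements = {
    "psn-hq-1": 12.0,
    "psn-emea-1": 140.0,
    "psn-apac-1": 310.0,
    "psn-ship-7": 680.0,
}

print(nodes_over_latency_limit(measurements))
```

Any node that shows up in the result is a candidate for relocation, or a sign that a separate deployment may be needed for that site.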

Active Endpoints

Active endpoints are the total number of concurrently active endpoints on the network. This is a primary factor in Licensing counts and in individual PSN performance and scale for every deployment. The RADIUS session for an active endpoint begins when ISE receives a RADIUS Accounting Start event from a network device (RADIUS client) about an endpoint, and it ends with a Stop event, which is triggered by a disconnect (link down or disassociation), idle timeout, or session expiration. Coincidentally, the RADIUS session also determines the consumption of an ISE Base (2.x) or Essentials (3.x) License. If ISE never receives the RADIUS Accounting Stop message due to a network or power outage or configuration error, ISE will terminate the session - and decrement the license count(s) - after approximately 4 days.

Locations

The total number of network branches, sites, or regions is the next major factor in an ISE deployment's scale. It is generally recommended to centralize ISE nodes in regional data centers with load balancing to optimize for both availability and efficiency. This also minimizes the latency for ISE PSN synchronization over the WAN, which has a maximum of 300ms between an ISE PSN and the PAN nodes. However, some customers prefer to distribute more ISE nodes into local branches or sites without a backup WAN, putting local availability over general efficiency for their deployment.

Network Access Method

Not all endpoints and the users behind them (if any) behave the same on the network. You may get a much better understanding of the potential authentication load by counting the endpoints by their Access Method and usage scenario in your environments:

wired : workstations, phones, IOT, etc. are static with longer session times (4-24 hours) and therefore generate less load
wireless : laptops, tablets, mobile phones are implicitly mobile and generate more authentications throughout the day due to roaming and battery power-saving methods. IOT endpoints may be wireless and static (cameras, point of sale, etc.) or highly mobile (scanners, robots, etc.). Daily authentication counts and session durations may vary greatly depending on the device type and usage scenarios. The most extreme example of this is probably a school or university network at the top of the hour between classes!
VPN : remote and mobile workers with virtual private networks (VPN) generally have longer sessions similar to wired endpoints but may disconnect and reconnect multiple times per day or connect only to perform a quick task.
Private 5G : this is a new access method supported in ISE 3.2 and is generally used for mobile IOT devices that have longer sessions

The reality is that you will most likely have a mix of access methods and use cases within those methods. The more accurately you can estimate the endpoint populations for each of your scenarios, the better you can anticipate your ISE usage and scale. Some other considerations for scaling your ISE deployment are:

misconfigured endpoints (untrusted certificate, old password, unsupported protocol, etc.) may generate an authentication load 100-1000X larger than average as they continually retry and fail throughout the day
mobile endpoints that hibernate & roam may cause a 3-10X or larger load for authentications over a static endpoint
usage patterns from certain scenarios or populations that will cause ebbs, flows, and even spikes in usage throughout your day
hours of operation or changes in regional activity that “follows the sun” around the world
authentication spikes at the top of the hour for changing meetings and classrooms
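A back-of-the-envelope estimate can show how much these factors matter. The sketch below is illustrative only: the endpoint counts, per-endpoint daily authentication rates, and the 1% misconfigured fraction are made-up assumptions, and the 100X multiplier is simply the low end of the range mentioned above. Substitute numbers from your own environment.

```python
# Hypothetical populations: access method -> (endpoints, authentications per endpoint per day)
populations = {
    "wired": (20_000, 2),
    "wireless": (15_000, 10),
    "vpn": (5_000, 4),
}

MISCONFIGURED_FRACTION = 0.01   # assumption: 1% of endpoints are misconfigured
MISCONFIGURED_MULTIPLIER = 100  # low end of the 100-1000X range above


def daily_authentications(populations, bad_fraction, bad_multiplier):
    """Estimate daily authentications per access method.

    Misconfigured endpoints are modeled as generating `bad_multiplier`
    times the normal per-endpoint load for their access method.
    """
    estimate = {}
    for method, (endpoints, auths_per_day) in populations.items():
        healthy = endpoints * (1 - bad_fraction)
        misconfigured = endpoints * bad_fraction
        estimate[method] = auths_per_day * (healthy + misconfigured * bad_multiplier)
    return estimate


estimate = daily_authentications(
    populations, MISCONFIGURED_FRACTION, MISCONFIGURED_MULTIPLIER
)
for method, auths in estimate.items():
    print(f"{method}: about {round(auths):,} authentications per day")
print(f"total: about {round(sum(estimate.values())):,} authentications per day")
```

With these made-up inputs, the 1% of misconfigured endpoints generate roughly as many authentications as the other 99% combined, which is why finding and fixing them is often the cheapest way to reclaim capacity.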

Authentication Protocols & Methods

While access methods have different impacts on performance, so do your choices of authentication protocols. Your protocol choice is typically driven by cryptographic capability (if any), the subject's credential type (pre-shared key, password, certificate, or token) and identity store (if any). The most popular network authentication protocols and methods used with IEEE 802.1X are usually based on the Extensible Authentication Protocol (EAP) and have names like PAP, CHAP, PEAP, TEAP, EAP-GTC, EAP-TLS, EAP-TTLS, and EAP-FAST.

The number of authentications per second that a single ISE PSN (Policy Service Node) may handle can vary from single digits to more than 1000! This is related to the latency of the network, number of roundtrips for tunnel and method negotiation, the cryptographic complexity, and finally the network and processing latency of any external identity store that ISE depends on to authenticate the subject. Given the wide range of authentication rates for different protocols, you can see why you need to estimate the number of endpoints using a given authentication protocol at a given time of day or region of your network to ensure you have enough capacity to handle it.
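One way to turn this into a sizing estimate is to express each protocol's peak load as a fraction of a PSN and add the fractions. The sketch below is a simplification under stated assumptions: the per-PSN rates are hypothetical placeholders (not Cisco figures; use the published ISE Performance and Scale numbers for your platform), PSN capacity is assumed to be shared linearly across protocols, and redundancy is not included.

```python
import math


def psns_needed(peak_auths_per_sec, psn_rate_per_sec, target_utilization=0.5):
    """Estimate the PSN count needed to carry a mixed-protocol peak load.

    peak_auths_per_sec: protocol -> peak authentications per second
    psn_rate_per_sec:   protocol -> authentications per second one PSN can handle
    target_utilization: fraction of PSN capacity you are willing to use at peak
    """
    # Each protocol consumes (peak / rate) of one PSN's capacity.
    load_in_psns = sum(
        peak / psn_rate_per_sec[protocol]
        for protocol, peak in peak_auths_per_sec.items()
    )
    return math.ceil(load_in_psns / target_utilization)


# Hypothetical inputs for illustration only.
peaks = {"PAP": 200, "EAP-TLS": 150}
rates = {"PAP": 1000, "EAP-TLS": 300}

print(psns_needed(peaks, rates))
```

Lowering `target_utilization` leaves more headroom for spikes such as the top-of-the-hour surge, at the cost of more nodes. Add PSNs on top of this estimate for redundancy and for regional distribution.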