04-10-2021 07:04 PM
Hi All,
Can someone advise on the design strategy for a large-scale deployment?
We are planning a 28-30 node deployment with dedicated nodes in the DC and DR, plus some dedicated local PSNs (as VMs) at critical sites so that local user authentication is not impacted if there are any issues in the DC or DR.
1) Put the Primary Admin and Secondary MNT in the main DC, with 3 PSNs behind an F5 (does it make sense to have an F5?)
2) Put the Secondary Admin and Primary MNT in DR, with 1 to 2 PSNs in DR in case the DC goes down
3) There are 18 critical sites, so put a local PSN (VM-based) at each one so that they are always up and running
Based on the deployment guides, is there any reason why we cannot have the Primary Admin and Primary MNT in the main DC and the Secondary Admin and Secondary MNT in DR, unlike in point 2)?
Also, is it advisable to put the 18 PSNs at the critical sites behind F5s to have redundancy at all times?
In terms of sizing, does the 3655 suffice, or should we go for the 3695 for 100-200k endpoints? Please advise. The critical sites are connected to the DC via SGTs or DMVPN tunnels as part of SDA.
Rgds,
Meh
04-10-2021 07:38 PM
On 1, put the primary PAN and MNT in the same DC; you'll get better GUI performance. The MNT is active/active, so if you lose an MNT node, it doesn't really matter.
On 2, it doesn't matter where your PSNs are either; just make sure the network devices are configured to use them so that they are reachable in failure scenarios. Load-balanced VIPs are helpful in this scenario.
On 3, there is no issue running 18 PSNs deployed to critical sites, but it's expensive, and you also have to look at which services are unavailable when the PAN is not reachable. This is covered in the high availability section of the ISE admin guide. If you are considering putting these behind F5s, then you need at least two PSNs per site; I wouldn't bother with this. Just focus on putting F5s in the DCs with the correct number of PSNs behind them to make patching and upgrading easy.
Speaking to scale, you're likely overbuilding the deployment infrastructure by a wide margin. I have a customer running 200k active endpoints on 14x 3595 PSN VMs with N+1 DC failure. This has been running fine for a few years, since 2.1 was released. Keep in mind that a single 3655 will handle up to 50k active endpoints on its own, and a 3695 up to 100k. You could likely deploy 3615 VMs to most critical remote sites.
For reference, I have a customer moving towards a million active endpoints on 18x 3695 PSNs.
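As a rough illustration of the sizing arithmetic above, here is a minimal Python sketch. It assumes the per-node figures quoted in this reply (50k active endpoints for a 3655, 100k for a 3695) and adds one spare node for N+1. The function name and the `spare` parameter are made up for the example, and the result is only a starting point; it ignores TPS and per-site placement.

```python
import math

# Per-PSN active-endpoint figures as quoted in the reply above.
# Treat them as rough planning numbers, not as official limits.
PSN_CAPACITY = {"3655": 50_000, "3695": 100_000}


def psns_needed(active_endpoints: int, model: str, spare: int = 1) -> int:
    """Return the PSN count that carries the load, plus `spare` extra nodes."""
    base = math.ceil(active_endpoints / PSN_CAPACITY[model])
    return base + spare


if __name__ == "__main__":
    for endpoints in (100_000, 200_000):
        for model in PSN_CAPACITY:
            print(endpoints, model, psns_needed(endpoints, model))
```

For 200k active endpoints this gives 5 nodes on 3655s or 3 nodes on 3695s, well below a 28-30 node design.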
04-11-2021 02:38 AM - edited 04-11-2021 02:39 AM
Hi @Damien Miller ,
once again thanks for sharing ...
Since you have a customer moving to a million active endpoints on 18x 3695 PSNs, I would like to add a question to this post: how do you handle the Maximum Concurrent Sessions GUI limit of 20?
Note: I opened an enhancement request in the past (CSCvr13484, Increase ISE Max Concurrent Sessions GUI limit beyond 20), but I still have no idea whether Cisco has put it on its roadmap. The problem is that having only 20 admins logged in at the same time is a challenge in huge deployments with PSNs all around the country (imagine at least one admin per state).
04-11-2021 01:19 PM
We don't provide admin access to the network engineers; they access logs through Splunk. PAN GUI admin access is limited to about six people.
04-11-2021 12:15 PM
Thanks Damien,
If both ISE PANs, in the DC and DR, go down, what is the best way to recover? I thought that if the PAN goes down, the PSNs still authenticate with the existing configuration, so the only problem would be managing the nodes.
For the critical sites, if the Primary PAN is not reachable, what is the impact on authentications? Should we deploy a separate cluster for the critical sites? That would need a technical justification for why having the PAN in the DC can be a problem (latency should not be a problem, as it is via DNAC, which is 10 ms).
On scaling, what would be the recommendation?
Individual PANs: 3695
Individual MNTs: 3695
PSNs at critical sites, which are expected to have low traffic: VM-based 3615 or higher
PSNs in the DC or DR: should we go for the 3655 or the 3695, assuming the worst case?
My understanding is that if the PSNs at the critical sites go down, we can still point those sites towards the main DC with some changes in DNAC.
Regards,
Meh
04-11-2021 01:28 PM
The following page goes over the services available/unavailable when the primary admin node is down or unreachable.
You won't see a very noticeable performance increase going from 3655 to 3695 PAN/MNT nodes. I've found that the improvement most people really notice comes from hosting the PAN/MNT nodes on flash storage. There is a very noticeable difference when running the admin and monitoring services on fast storage.
The node sizes you select for PSNs should really be determined by the TPS and the expected active sessions. For the number of active sessions you're looking at, I would lean towards the 3655, since 3695s behind F5s would likely be overkill unless you expect a scenario where all load would fall to just two PSNs.
If you have the budget, then 3695s wouldn't hurt; the deployment just might be overbuilt.
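The "all load falls to just two PSNs" case can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, the per-node figures (50k for a 3655, 100k for a 3695) are the ones quoted earlier in this thread, and it looks only at active sessions, not TPS.

```python
def can_absorb(active_sessions: int, psn_count: int,
               per_psn_capacity: int, failed: int) -> bool:
    """True if the surviving PSNs can carry all sessions, spread evenly."""
    remaining = psn_count - failed
    if remaining <= 0:
        return False
    return active_sessions / remaining <= per_psn_capacity


if __name__ == "__main__":
    # 200k sessions on 3 PSNs behind a load balancer, one PSN lost.
    print("3655:", can_absorb(200_000, 3, 50_000, failed=1))
    print("3695:", can_absorb(200_000, 3, 100_000, failed=1))
```

With 200k sessions left on two nodes, each node has to carry 100k, which fits the quoted 3695 figure but not the 3655 one. If the load is normally spread over more PSNs, the 3655 is usually enough.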
04-11-2021 07:44 PM
Thanks Damien
That really helps. I will review the expected active sessions to get the sizing right; otherwise the customer might end up overbuying PSNs and choosing a higher model range than is needed.
Regards,
Meh