
Cisco ISE Deployment in Azure - Nightmare experience!

InfraISE2020
Level 1

Hi,

Has anyone been able to successfully deploy ISE in Azure using ExpressRoute from on-premises to the cloud?

We have had ISE running in Azure for about 3-4 months now and have noticed a large amount of fragmentation using EAP-TLS.
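For anyone wanting to confirm the same symptom in their own packet captures, here is a minimal sketch of how the fragmentation fields of an IPv4 header can be decoded. The field layout is the standard IPv4 one; the `make_header` helper and the sample addresses are hypothetical and exist only to produce test data, so treat this as an illustration rather than a capture tool.

```python
import struct

def make_header(ident: int, flags_frag: int, total_len: int = 1500) -> bytes:
    """Build a minimal 20-byte IPv4 header carrying UDP (illustrative, checksum left at 0)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, ident, flags_frag,
                       64, 17, 0, bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))

def fragment_info(ipv4_header: bytes) -> dict:
    """Decode the fragmentation-related fields of a raw IPv4 header."""
    if len(ipv4_header) < 20:
        raise ValueError("need at least 20 bytes of IPv4 header")
    # Bytes 4-5: Identification. Bytes 6-7: 3 flag bits + 13-bit fragment offset.
    ident, flags_frag = struct.unpack("!HH", ipv4_header[4:8])
    more_fragments = bool(flags_frag & 0x2000)   # MF flag
    dont_fragment = bool(flags_frag & 0x4000)    # DF flag
    offset_bytes = (flags_frag & 0x1FFF) * 8     # offset is counted in 8-byte units
    return {
        "id": ident,
        "dont_fragment": dont_fragment,
        "more_fragments": more_fragments,
        "offset_bytes": offset_bytes,
        # A packet belongs to a fragmented datagram if MF is set or the offset is non-zero.
        "is_fragment": more_fragments or offset_bytes > 0,
    }

if __name__ == "__main__":
    first = make_header(0x1C46, 0x2000)           # first fragment: MF set, offset 0
    last = make_header(0x1C46, 1480 // 8, 400)    # last fragment: MF clear, offset 1480
    whole = make_header(0x1C47, 0x4000, 300)      # unfragmented packet with DF set
    for name, hdr in (("first", first), ("last", last), ("whole", whole)):
        print(name, fragment_info(hdr))
```

Fragments of the same datagram share the same Identification value, so grouping by `id` and comparing arrival order against `offset_bytes` shows whether the fragments reached the capture point in order.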

 

The Cisco guide suggests a fix has been applied in East Asia and West Central US; however, it has not been applied to UK South, where our VMs are located. We have also raised this with Microsoft support, but they cannot tell us what the fix is or when it will be rolled out to our region.

We enquired about the "enable allow out-of-order-fragments" option; however, they said it can only be applied if the traffic comes from the internet, not via ExpressRoute or VPN. That is obviously not going to work for us, as we wouldn't send RADIUS traffic straight over the internet! Other requirements include deploying the VMs in a brand-new, empty subscription and on a Dv4 VM SKU; again, this is not possible, as our VMs are already in use within an existing subscription.

It's incredibly frustrating as Cisco can't seem to provide much info on the workaround and Microsoft are just fobbing us off by saying that the information is from Cisco and not from them! 

I'd be grateful if other members on this forum have successfully deployed ISE in Azure with connectivity via ER or VPN and not seen the fragmentation issues when using EAP-TLS. 

TIA. 

18 Replies

Cristian Matei
VIP Alumni

Hi,

   Welcome to the jungle; this is a well-known challenge. Have a look at the session below, then work with Microsoft to move your VMs onto Gen7 hardware:

https://www.ciscolive.com/on-demand/on-demand-library.html?zid=pp&search.event=1716482947962001yag9&search=BRKSEC-2039#/session/1717269125663001tXab

Best,

Cristian.

Thanks @Cristian Matei - the errors on that webinar are exactly what we are seeing in Azure. Microsoft just keep fobbing us off with the following:

 

  • Customer is receiving out-of-order fragments via an instance level public IP from the internet. ExpressRoute and VPN first party gateways are not supported. IP fragments do not work in load balancing scenarios, so ensure this is a public IP attached directly to a VM.
  • Customer requires an empty subscription.
  • Customer wishes to deploy a VM SKU that is compatible with hardware that supports out-of-order fragments. Typically, this is Dv4 and earlier (the newest SKUs such as Dv5 do NOT support this).

Are you aware of anyone who has successfully got Microsoft to fix this issue? It would be good to know who in the Cisco engineering team has worked with Microsoft on this, so maybe they could shed some light on what's required.

 

Hi,

  Mate, as an engineer I feel your pain; my scenarios were done via the Internet, as I was aware of the challenge. I suggest opening a TAC case to get the info you're looking for from Cisco; someone has to know. Otherwise, if you're stuck and can't move the VMs to a region with the fix, try a deployment over the Internet and use IPsec tunnels for the RADIUS packets, or RADIUS over DTLS.

Good luck,

Cristian.

thomas
Cisco Employee

InfraISE2020
Level 1

Thanks @thomas, I have also logged a support ticket with Cisco to see if they can provide any information. My account manager at Microsoft has asked me to find out exactly what fix has been applied in East Asia and West Central US, as the deployment guide specifically highlights it. Do you know what this fix is?

 

Due to this known issue, do one of the following:

  1. Select regions where Azure Cloud has already implemented the fixes: East Asia (eastasia) and West Central US (westcentralus).

This is 100% a Microsoft Azure issue. You will need to ask them. This problem does not exist in other cloud providers.

@thomas I absolutely agree; however, the documentation from Cisco says it has been resolved in two regions, but nobody can tell me what the resolution was so that I can ask Microsoft to make the same fix in UK South. I have asked senior engineers at Microsoft, but they cannot seem to find out what this supposed fix is, so I am hoping someone at Cisco can point me in the right direction!

From Known Limitations of Cisco ISE in Microsoft Azure Cloud Services :

 

  • In Azure, a networking virtual network stack drops out-of-order fragments without forwarding them to the end virtual machine host. This design aims to address the network security vulnerability FragmentSmack, as documented in Azure and fragmentation.

    Cisco ISE deployments on Azure typically leverage VPN solutions like Dynamic Multipoint Virtual Private Networks (DMVPN) and Software-Defined Wide Area Networks (SD-WAN), where the IPsec tunnel overheads can cause MTU and fragmentation issues. In such scenarios, Cisco ISE may not receive complete RADIUS packets and an authentication failure occurs without triggering a failure error log.

    Due to this known issue, do one of the following:

    1. Select regions where Azure Cloud has already implemented the fixes: East Asia (eastasia) and West Central US (westcentralus).

    2. Cisco ISE customers should raise an Azure support ticket. Microsoft has agreed to take the following actions:

      1. Pin the subscription to ensure all instances within that subscription are deployed on hardware generation 7.

      2. Enable the "allow out-of-order fragments" option, which allows fragments to pass through to the destination instead of being dropped.
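To make the limitation quoted above concrete, here is a rough sketch of how tunnel overhead can push a RADIUS/UDP datagram into fragmentation, together with a toy model of a network stack that only delivers fragments arriving in order. The 100-byte tunnel overhead, the 1450-byte payload, and the `strict_in_order_filter` function are assumptions chosen for illustration; they are not taken from Azure or Cisco documentation, and the real Azure behaviour may differ in detail.

```python
from dataclasses import dataclass

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes

@dataclass(frozen=True)
class Fragment:
    offset: int   # byte offset of this piece within the original IP payload
    length: int   # bytes of IP payload carried by this fragment
    more: bool    # More Fragments (MF) flag

def fragment(udp_payload_len: int, mtu: int) -> list[Fragment]:
    """Split one UDP datagram into IPv4 fragments for the given MTU."""
    # Non-final fragments must carry a multiple of 8 bytes of IP payload.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    total = UDP_HEADER + udp_payload_len
    frags, offset = [], 0
    while offset < total:
        length = min(per_fragment, total - offset)
        frags.append(Fragment(offset, length, offset + length < total))
        offset += length
    return frags

def strict_in_order_filter(arrivals: list[Fragment]) -> bool:
    """Toy model: deliver the datagram only if fragments arrive in ascending order, gap-free."""
    expected = 0
    for frag in arrivals:
        if frag.offset != expected:
            return False          # out-of-order or missing fragment: datagram is dropped
        expected += frag.length
    return bool(arrivals) and not arrivals[-1].more

if __name__ == "__main__":
    payload = 1450                 # hypothetical RADIUS packet carrying an EAP-TLS fragment
    tunnel_overhead = 100          # hypothetical IPsec/tunnel overhead in bytes
    plain = fragment(payload, 1500)
    tunnelled = fragment(payload, 1500 - tunnel_overhead)
    print("fragments without tunnel:", len(plain))
    print("fragments with tunnel:   ", len(tunnelled))
    print("in-order delivered:      ", strict_in_order_filter(tunnelled))
    print("reversed delivered:      ", strict_in_order_filter(tunnelled[::-1]))
```

In this model the same datagram fits in one packet on a 1500-byte path but needs two fragments once the tunnel overhead is subtracted, and it is only delivered if those two fragments arrive in ascending order.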

 

I understand that and have seen that article numerous times, but Microsoft are saying they can only enable out-of-order fragments for VMs with a public IP attached to the NIC. That isn't applicable to us, because our traffic is internal and doesn't go over the internet.

 

Damon Kalajzich
Level 1

I found that most Cisco devices fragment incorrectly, but I managed to work around the fragmentation issue in Azure by implementing the following:
https://www.cisco.com/c/en/us/support/docs/security/identity-services-engine-33/220568-configure-ise-3-3-native-ipsec-to-secure.html
Essentially, this hides the out-of-order fragments inside IPsec, so Azure is none the wiser.

Hi @Damon Kalajzich , thanks for your feedback, how would you achieve this if you're using Cisco Meraki for Wireless 802.1x? 

Sorry, I don't have any experience with Meraki. For this to work, the device (NAD) sending the RADIUS request would need the ability to configure an IPsec tunnel to ISE.

CitizenGenet
Level 1

Is anyone else dealing with this problem still? We're currently in the testing phase of an ISE deployment in Azure and I believe we're encountering this exact situation.

However, the Microsoft agreement no longer seems to be in place because our consistent entreaties to their support engineers seem to be in vain. Something may have changed on their side, but they're saying the 'enable udp out of order fragments' solution is no longer supported (this is the response even after account rep involvement). 

I don't have to remind anyone here, but it throws a major wrench in your plans when core networking behaviour, such as reassembly of fragmented UDP packets, is no longer something you can reliably count on when implementing a solution.

A lot of the threads on this topic seem to just end or they're older so I'm wondering if anyone has recently experienced this and has successfully developed a creative workaround?

  • I think the IPsec tunnel is an interesting idea, but it doesn't seem to scale well if you have a lot of NADs, and the maintenance, difficulty in troubleshooting, etc. seem to be major considerations.
  • We've also been down the RADIUS over DTLS path, which seemed promising. However, some of our equipment, like Meraki, only implements RadSec, so that's out the window (at least for our specific situation).
  • Also, I'm wondering if anyone has approached this from the MTU perspective? For example, restricting the MTU at the OS level of client PCs to something like 1200 to avoid fragmentation from the start?
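On the MTU idea, a back-of-the-envelope sketch of the size budget may help when picking a number. It estimates how large an EAP fragment can be before the RADIUS packet that carries it no longer fits in a single IP packet. The RADIUS header, EAP-Message attribute, and Message-Authenticator sizes are the standard ones; the 200 bytes of "other attributes" and the 1400-byte path MTU are assumptions for illustration. Whether a client OS MTU setting actually changes the EAP fragment size depends on the supplicant and the NAD, which would need testing.

```python
import math

IPV4_HEADER = 20            # bytes, assuming no IP options
UDP_HEADER = 8
RADIUS_HEADER = 20          # code, identifier, length, 16-byte authenticator
MESSAGE_AUTHENTICATOR = 18  # 2-byte attribute header + 16-byte value
ATTR_HEADER = 2             # type + length octets of every RADIUS attribute
MAX_ATTR_VALUE = 253        # an attribute is at most 255 bytes including its header

def radius_ip_packet_size(eap_len: int, other_attrs_len: int) -> int:
    """Estimate the IP packet size of a RADIUS packet carrying eap_len bytes of EAP."""
    # EAP data longer than 253 bytes is split across several EAP-Message attributes.
    eap_attr_count = math.ceil(eap_len / MAX_ATTR_VALUE)
    return (IPV4_HEADER + UDP_HEADER + RADIUS_HEADER + MESSAGE_AUTHENTICATOR
            + eap_len + ATTR_HEADER * eap_attr_count + other_attrs_len)

def max_eap_len(path_mtu: int, other_attrs_len: int) -> int:
    """Largest EAP length whose RADIUS packet still fits in one unfragmented IP packet."""
    for eap_len in range(path_mtu, -1, -1):
        if radius_ip_packet_size(eap_len, other_attrs_len) <= path_mtu:
            return eap_len
    raise ValueError("even an empty EAP payload does not fit in this path MTU")

if __name__ == "__main__":
    other = 200      # hypothetical: User-Name, NAS-IP-Address, Calling-Station-Id, State, ...
    path_mtu = 1400  # hypothetical effective MTU after tunnel overhead
    print("packet size for a 1000-byte EAP fragment:", radius_ip_packet_size(1000, other))
    print("largest EAP fragment that avoids fragmentation:", max_eap_len(path_mtu, other))
```

The point of the sketch is that the number that matters is the effective path MTU between the NAD and ISE minus the RADIUS overhead, so any client-side limit would have to leave that much headroom.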

Just trying to come up with some ideas and explore all the contingency plans. I'm interested to see if anyone else is still fighting this and if you're having any success.

Thanks,

I have been dealing with it for the last few months, and yes, it is difficult to get MS to acknowledge this. I got to the point where I was advised that MS could enable this feature for me, but...
It can only be enabled on an empty subscription. They pin any new resources added to the subscription to a specific hardware cluster with the feature enabled. There are other caveats that would mean a complete re-architecture of our Azure deployment: for example, you can't use an Azure VPN gateway or Azure Firewall in the path of the traffic to ISE, as these can't be pinned to the specific hardware with the feature enabled.

This is why I have gone down the path of using IPsec from the NAD to ISE, to hide the out-of-order fragments that Cisco devices send from the Azure network stack.