Azure Packet Fragmentation

InfraISE2020 · ‎10-08-2024

Hi all,

We deployed ISE in Azure back in March (version 3.3.0.430) with the following setup:

- 4 x ISE servers (PAN/PSNs)

- ExpressRoute from on-premise to Azure

- Meraki APs

We are seeing hundreds of "client stopped responding" errors every day, and when we look further the main error is "12935 Supplicant stopped responding to ISE during EAP-TLS certificate exchange". Quite a few online posts suggest this is an MTU sizing issue where Azure drops fragmented packets.

The deployment guide suggests that this is a known issue with DMVPN and SD-WAN connections, and that the fix is to contact Microsoft support and ask them to enable the "allow out-of-order fragments" option.

We logged this with Microsoft and apparently it isn't just a case of enabling a setting: the fix is to create a brand new subscription, use gen 7 VMs and route traffic via the internet!!! Obviously this isn't viable as our connection to Azure has to go via our ExpressRoute circuit!

The guide suggests this has now been fixed in East Asia and West Central US; however, nothing has changed in UK South.

Has anyone else come across a similar issue and managed to get the issue resolved without the things they suggested to us?

Also is there anywhere in ISE where we can prove that Azure is dropping fragmented packets so we can go back to our account manager with evidence?

TIA.

Greg Gibbs · ‎10-08-2024

See EAP Fragmentation Implementations and Behavior

There are multiple levels of fragmentation involved and one of the problems is that the Windows native supplicant uses large EAP messages (1470 bytes), which forces the IP fragmentation. This is a hardcoded setting which cannot be changed.
The result of the fragmentation is that the last fragment is smaller, so it is transmitted faster and can therefore arrive out of sequence.
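To make the fragment sizes concrete, here is a minimal Python sketch of how an IPv4 datagram is split at a 1500-byte MTU. The 1470-byte EAP message size comes from the post above; the 130 bytes of RADIUS header and other attributes is purely an illustrative assumption (the real figure depends on the NAD and the attributes it sends), and `fragment_ipv4` is a hypothetical helper, not part of any ISE or Azure tooling.

```python
def fragment_ipv4(payload_len, mtu=1500, ip_header=20):
    """Return (offset, length, more_fragments) for each IPv4 fragment.

    payload_len is the IP payload in bytes (UDP header + RADIUS packet).
    Every fragment except the last must carry a multiple of 8 bytes.
    """
    max_data = (mtu - ip_header) // 8 * 8  # 1480 bytes for a 1500-byte MTU
    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        more = offset + length < payload_len
        fragments.append((offset, length, more))
        offset += length
    return fragments


# ASSUMPTION: 130 bytes of RADIUS header/attributes is an illustrative guess.
eap_message = 1470
radius_overhead = 130
udp_header = 8
frags = fragment_ipv4(eap_message + radius_overhead + udp_header)
print(frags)  # [(0, 1480, True), (1480, 128, False)]
```

Under these assumptions the second fragment carries only 128 bytes, which is the small trailing fragment described above.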

I'm not sure I understand why MS is stating that the traffic has to be routed via the internet, but the only way to verify that the issue is due to dropped packets is to take a packet capture on each side of the connection (client and ISE) and compare them.
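As a rough sketch of what comparing the two captures could look like, the Python below takes fragment records already extracted from a sender-side and a receiver-side capture and reports which fragmented datagrams left complete but did not arrive complete. Extracting the records from the pcap files is not shown. The `Frag` record, the function names, and keying on the IP ID alone are simplifying assumptions for illustration (real reassembly also keys on source, destination and protocol).

```python
from collections import defaultdict
from typing import NamedTuple


class Frag(NamedTuple):
    ip_id: int    # IPv4 Identification field
    offset: int   # fragment offset in bytes
    length: int   # bytes of IP payload carried by this fragment
    more: bool    # More Fragments (MF) flag


def group(frags):
    """Group fragment records by IP ID, preserving capture order."""
    by_id = defaultdict(list)
    for f in frags:
        by_id[f.ip_id].append(f)
    return by_id


def is_complete(frags):
    """True if the fragments cover the datagram from offset 0 to the last fragment."""
    ordered = sorted(set(frags), key=lambda f: f.offset)
    expected = 0
    for f in ordered:
        if f.offset != expected:
            return False
        expected += f.length
    return bool(ordered) and not ordered[-1].more


def out_of_order(frags):
    """True if the fragments appear in the capture in non-ascending offset order."""
    offsets = [f.offset for f in frags]
    return offsets != sorted(offsets)


def compare(sent, received):
    """Map each complete sender-side datagram to (status, sent_out_of_order)."""
    sent_by_id, recv_by_id = group(sent), group(received)
    report = {}
    for ip_id, s in sent_by_id.items():
        if not is_complete(s):
            continue  # ignore datagrams the sender-side capture did not fully see
        r = recv_by_id.get(ip_id, [])
        if is_complete(r):
            status = "delivered"
        elif r:
            status = "partial"
        else:
            status = "missing"
        report[ip_id] = (status, out_of_order(s))
    return report


# Made-up records for illustration only, not real capture data.
sent = [Frag(1, 1480, 128, False), Frag(1, 0, 1480, True),
        Frag(2, 0, 1480, True), Frag(2, 1480, 128, False)]
received = [Frag(2, 0, 1480, True), Frag(2, 1480, 128, False)]
print(compare(sent, received))
# {1: ('missing', True), 2: ('delivered', False)}
```

If the datagrams that go missing on the ISE side are consistently the ones whose fragments left out of order, that would be the kind of evidence to take back to Microsoft.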

InfraISE2020 · ‎10-11-2024

Hi @Greg Gibbs thanks for the reply.

Are you aware of any other customers who are experiencing the same issue? It sounds like it's common when deploying ISE in Azure. I noticed that the recent deployment guide refers to a fix by Microsoft in certain regions; do you know what the actual fix is? We've escalated this with our account manager at Microsoft, but it would be good to understand if others are having the same issue, as it's a big problem for us and we cannot resolve it at the moment.

Cloud Deployment Guide

Due to this known issue, do one of the following:

- Select regions where Azure Cloud has already implemented the fixes: East Asia (eastasia) and West Central US (westcentralus).
- Cisco ISE customers should raise an Azure support ticket. Microsoft has agreed to take the following actions:
  1. Pin the subscription to ensure all instances within that subscription are deployed on hardware generation 7.
  2. Enable the "allow out-of-order fragments" option, which allows fragments to pass through to the destination instead of being dropped.

Greg Gibbs · ‎10-13-2024

My understanding is that pre-Gen 7 hardware is unable to reassemble the out-of-sequence fragments properly, but Microsoft would have to confirm that is the case.

Any customer running ISE nodes in Azure with EAP-TLS flows would have this issue. I've had customers with multi-cloud environments deploy ISE in AWS instead of Azure as AWS does not have this issue.

InfraISE2020 · ‎10-24-2024

Hi @Greg Gibbs ,

We aren't having much joy with Microsoft. Their suggested fixes are not applicable and we are unable to migrate to AWS as all our infrastructure is in Azure.

I've seen articles suggesting setting the MTU on the PSN interface to 1300, and other sites suggest adding the framed-mtu to the authz profiles, but I'd like to get more information before we start making random changes.

I can see packet fragmentation on the Azure interface of our Fortinet firewall, but our networking team is telling us it's normal to see fragmentation on an L3 network and that there is nothing we can do to re-order the packets before sending them to Azure so that they arrive in order and do not get dropped.

It's incredibly frustrating, as Microsoft are saying it's nothing to do with them and the Cisco guide implies some fixes have been applied in certain regions, but it would be good to understand exactly what has been changed.

We migrated ISE to Azure based on the Cisco deployment guide, which suggested that it's a simple fix from Microsoft, but it doesn't appear to be that way.

Greg Gibbs · ‎10-24-2024

You could certainly try those options, but I'm not confident they will make any difference. None of them will change the fact that the Windows native supplicant uses large EAP messages (1470 bytes), which forces the fragmentation at the IP layer.

The combination of these two factors (expected fragmentation and dropping out of sequence fragments) results in the problem.

FWIW, the OSX supplicant appears to use 1270 byte EAP messages, so Apple appears to have a better grasp on basic networking than MS. I know this doesn't help the situation you're having though.
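The difference between the two supplicants can be checked with some rough arithmetic. The sketch below computes a lower bound on the IPv4 packet size needed to carry an EAP message inside RADIUS, counting only the fixed 20-byte RADIUS header, the 2-byte header on each EAP-Message attribute (each attribute holds at most 253 bytes of EAP data), and the UDP and IPv4 headers. It assumes a 1500-byte path MTU and no IP options, and it ignores every other RADIUS attribute in the request, so real packets will be larger. The function names are hypothetical.

```python
import math

RADIUS_HEADER = 20     # code, identifier, length, authenticator
UDP_HEADER = 8
IPV4_HEADER = 20       # ASSUMPTION: no IP options
ATTR_HEADER = 2        # type + length byte on each RADIUS attribute
ATTR_MAX_VALUE = 253   # maximum value bytes per RADIUS attribute


def min_ip_packet_size(eap_len):
    """Lower bound on the IPv4 packet size carrying eap_len bytes of EAP."""
    eap_attrs = math.ceil(eap_len / ATTR_MAX_VALUE)
    return (eap_len + eap_attrs * ATTR_HEADER
            + RADIUS_HEADER + UDP_HEADER + IPV4_HEADER)


def headroom(eap_len, mtu=1500):
    """Bytes left for other RADIUS attributes before IP fragmentation starts."""
    return mtu - min_ip_packet_size(eap_len)


print(min_ip_packet_size(1470), headroom(1470))  # 1530 -30
print(min_ip_packet_size(1270), headroom(1270))  # 1330 170
```

On this lower bound, a 1470-byte EAP message already exceeds a 1500-byte MTU before any other attribute is added, while a 1270-byte message leaves about 170 bytes of headroom.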

pritamCTC · ‎03-03-2025

@Greg Gibbs

https://learn.microsoft.com/en-us/azure/virtual-network/virtual-network-tcpip-performance-tuning#azure-and-fragmentation

Cisco bug ID updated on 3rd Feb, 2025

https://bst.cisco.com/quickview/bug/CSCwe82033

I am not sure why this bug only mentions the Network Access Device as an access point or switch. When we are doing EAP-TLS with central auth, the NAD should be the WLC, shouldn't it? Is this stated incorrectly in the bug ID?

pritamCTC · ‎07-17-2025

Choose the regions where Microsoft Azure Cloud has already implemented the fixes: Central Canada (CanadaCentral), Central France (FranceCentral), Central India (CentralIndia), Central Poland (PolandCentral), Central Sweden (SwedenCentral), Central UAE (UAECentral), East Asia (eastasia), East Australia (australiaeast), East Canada (CanadaEast), East Japan (japaneast), East Norway (NorwayEast), East US (eastus), North Central US (northcentralus), North Germany (GermanyNorth), North EU (EUNorth), North Switzerland (SwitzerlandNorth), North UAE (UAENorth), South Africa North (SouthAfricaNorth), South Brazil (brazilsouth), South East Asia (southeastasia), South East Australia (australiasoutheast), South India (SouthIndia), South UK (uksouth), West Central US (westcentralus), West Central Germany (GermanyWestCentral), West UK (ukwest), and West US (westus).

Cisco Document updated on 15th July, 2025

https://www.cisco.com/c/en/us/td/docs/security/ise/ISE_on_Cloud/b_ISEonCloud/m_ISEonAzureServices.html

stubush · ‎09-23-2025

I recently deployed an ISE PSN in Azure for the first time (the rest of the deployment is on-prem across multiple sites), as there was to be no server infrastructure at our new India site. Ahead of the install I was bracing myself re the fragmentation issue for EAP-TLS traffic (we are using it for both wired and wireless auth), but to my surprise we didn't run into the issue. The PSN is located in the Azure Central India Region, one of the regions listed in your post, so it does appear they are implementing fixes.

InfraISE2020 · ‎09-23-2025

Hi @stubush ,

That's interesting that you're not seeing the fragmentation issues in Central India. I will go back to our contact at Microsoft, but it would be good to provide some additional information/context if possible.

What size VM did you use?
How are you connecting from the site where your NADs are located to Azure?
Are you using ExpressRoute or S2S VPN or third party NVA?
What method did you use to check for fragmentation? TCP dump on the PSNs?
What NADs are you using? i.e. what switches and APs?

Apologies for all the questions, but it will help me go back to Microsoft and also the product team at Cisco who I've been working with on this...

stubush · ‎09-23-2025

No worries. In answer to your questions

The VM is a Standard F16s v2 running the PSN persona only
Connectivity is currently an S2S VPN over the internet from our on-prem FortiGate 600Fs, terminated on our Central India vWAN Hub. We are likely to be moving to ExpressRoute for this site in the near future
NADs are C9300-48UN switches and Meraki MR57 APs

Since we had successful EAP-TLS auths from the get go, we didn't get as far as taking any PCAPs etc.

InfraISE2020 · ‎09-23-2025

Thanks @stubush - just sent you a PM with a few more questions. Appreciate your support.

InfraISE2020 · ‎11-07-2024

Hi @Greg Gibbs ,

We have passed this information onto Microsoft and have also spoken to someone else at Cisco and they have said something similar regarding the hardware.

Are you aware of anyone currently using ISE in the East Asia and West Central US regions who don't experience fragmentation issues in Azure?

I don't suppose you have any more information on the supposed hardware "fix" do you?

Greg Gibbs · ‎11-07-2024

@InfraISE2020, I have only personally worked with customers to deploy ISE in AWS. I don't have visibility of any specific customers that have deployed ISE in Azure for these regions.

I'm not aware of any MS documentation that specifically states what is changed in the newer hardware that resolves this issue. It stands to reason that it would be something in the fragmentation reassembly code and/or hardware (like the ASIC).

pritamCTC · ‎02-26-2025

@InfraISE2020, @Greg Gibbs can you please share the latest update on this? Is it already resolved on the MS side, or is it still ongoing for many customers? Do you think customers using EAP-TTLS also face the same issue?