
Azure Packet Fragmentation

InfraISE2020
Level 1

Hi all,

We deployed ISE in Azure back in March (version 3.3.0.430) with the following setup:

- 4 x ISE servers (PAN/PSNs)

- ExpressRoute from on-premise to Azure

- Meraki APs

We see hundreds of "client stopped responding" errors every day, and when we look further the main error is "12935 Supplicant stopped responding to ISE during EAP-TLS certificate exchange". Quite a few online posts suggest this is an MTU sizing issue where Azure drops fragmented packets.

The deployment guide suggests that this is a known issue with DMVPN and SD-WAN connections and that the fix is to contact Microsoft support and ask them to enable the "out-of-order fragments" option.

We logged this with Microsoft and apparently this isn't just a case of enabling a setting: the fix is to create a brand new subscription, use gen 7 VMs and route traffic via the internet!!! Obviously this isn't viable as our connection to Azure has to go via our ExpressRoute circuit!

The guide suggests this has now been fixed in East Asia and West Central US; however, nothing has changed in UK South.

 

Has anyone else come across a similar issue and managed to get the issue resolved without the things they suggested to us? 

Also is there anywhere in ISE where we can prove that Azure is dropping fragmented packets so we can go back to our account manager with evidence?

TIA. 

 

19 Replies

I cannot speak to what MS is doing to resolve this. You would need to speak with them for current info.

The fragmentation and out-of-sequence packets are due to the large certificate payload. EAP-TTLS(PAP) does not use client side certificates, so it should not have the same issues. It is password-based, however, so there may be various other implications.
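To make the size argument concrete, here is a rough sketch of how a single large RADIUS/UDP datagram splits into IPv4 fragments. The function name and the example payload sizes are my own illustration, not something taken from ISE or Azure; it assumes a plain 1500-byte MTU, a 20-byte IPv4 header with no options, and no tunnel overhead.

```python
def ipv4_fragment_sizes(udp_payload_len, mtu=1500, ip_header_len=20, udp_header_len=8):
    """Return the IP payload size (bytes) of each fragment for one UDP datagram.

    Hypothetical helper for illustration only. Assumes IPv4 without options.
    """
    total = udp_payload_len + udp_header_len      # bytes the IP layer has to carry
    max_ip_payload = mtu - ip_header_len          # most that fits in one packet
    max_frag = max_ip_payload // 8 * 8            # non-final fragments must be a multiple of 8
    sizes = []
    while total > max_ip_payload:
        sizes.append(max_frag)
        total -= max_frag
    sizes.append(total)                           # final (or only) piece
    return sizes


# An Access-Request carrying 1800 bytes of RADIUS data (illustrative size)
print(ipv4_fragment_sizes(1800))  # [1480, 328]
```

Anything above 1472 bytes of UDP payload on a 1500-byte MTU path ends up as two or more fragments, and every one of them has to arrive for the PSN to see the Access-Request.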

This is still going on and I think it is not going to change. I'm facing the same issue and reached out to Microsoft about this; they basically said the following:

1. If out-of-order fragment reordering is needed, Azure can only enable this with the following limitations and requirements:

- The VM needs to be fully maintained and receive all applicable security patches in a timely manner

- The out-of-order fragments must originate from the internet to a public IP address attached directly to a VM.

- The out-of-order fragment reordering flag only supports specific VM SKUs, generally Dsv4, Ev4, Bv1 and earlier. The compute optimized Fsv2, which Cisco recommends for a PSN, wasn't supported according to the Azure engineer

- Allowing out-of-order fragments reordering exclusively applies to public IPs attached to the VMs. It is not supported for load balancing, ExpressRoutes or VPN gateways

- All the VMs MUST be deployed into a new empty subscription, which is pinned to compatible hardware clusters

- If VNets need to communicate across subscriptions you can use VNET peering, although VNet peering does NOT inherit the UDP fragment flag. 

Also, Cisco is stating that two Azure regions have a 'fix' applied already and that these regions will allow out-of-order fragmented UDP. I've asked Microsoft about this and they claim that this is false. The Azure engineer told me that Azure East Asia and Azure West Central US also need this flag and will drop out-of-order fragments by default.

Note that it seems that ONLY Microsoft Azure is doing this. AWS, OCI and the physical DCs that I've tried do not seem to drop the fragmented traffic, and my 802.1X EAP-TLS supplicants could authenticate fine when the PSN was hosted there.

elbertdue
Level 1

We deployed ISE in Azure back in March (version 3.3.0.430) with 4 ISE servers (PAN/PSNs), using ExpressRoute from on-premise to Azure and Meraki APs. However, we're seeing hundreds of "Supplicant stopped responding to ISE during EAP-TLS certificate exchange" errors daily. From what I've read, this could be an MTU sizing issue since Azure might be dropping fragmented packets.

The deployment guide mentions this is a known problem with DMVPN and SD-WAN connections, and the solution is to ask Microsoft to enable the "out-of-order fragments" option. But when we reached out, they told us the fix would require setting up a new subscription, using Gen 7 VMs, and routing traffic over the internet – which isn’t feasible for us since we rely on ExpressRoute.

I’ve also read that this issue is supposedly fixed in East Asia and West Central US, but nothing’s changed in UK South. Has anyone else faced this and found a workaround that doesn't involve completely redoing the setup? Also, is there a way in ISE to confirm that Azure is dropping fragmented packets so we can provide evidence to our account manager?

What you can do is point a test supplicant at a specific PSN node and then run packet captures along your network path. You can track packets by the Identification field in the IP header (e.g. Wireshark: ip.id == 0x66c4). Check whether there is a lot of fragmented traffic and try to follow the fragmented Access-Requests. Look for sessions that are never received by the ISE node and check whether they egress your ExpressRoute interface on-prem. If there's an intermediate device such as a firewall in between, check whether that hop receives all the fragments. If the fragments egress correctly at your on-prem ExpressRoute interface but are not received on the PSN, you at least know that they are getting lost somewhere between the ExpressRoute and Azure.
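If you want to do the same comparison offline, the check above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and it assumes you have already pulled the raw IPv4 packets (starting at the IP header) out of each capture with whatever pcap tooling you use. It groups fragments by source, destination and IP ID, then reports whether each datagram arrived in order and whether all of its fragments are present.

```python
import ipaddress
import struct
from collections import defaultdict


def parse_ipv4_fragment_info(packet: bytes):
    """Pull the fragmentation-related fields out of one raw IPv4 packet."""
    version_ihl, _tos, total_len, ip_id, flags_frag = struct.unpack("!BBHHH", packet[:8])
    header_len = (version_ihl & 0x0F) * 4
    more_fragments = bool(flags_frag & 0x2000)   # MF flag
    offset = (flags_frag & 0x1FFF) * 8           # fragment offset in bytes
    src = str(ipaddress.IPv4Address(packet[12:16]))
    dst = str(ipaddress.IPv4Address(packet[16:20]))
    return src, dst, ip_id, offset, more_fragments, total_len - header_len


def summarize_fragments(packets):
    """Report, per (src, dst, ip_id), fragment count, arrival order and completeness.

    `packets` is an iterable of raw IPv4 packets in capture order (an assumption
    of this sketch). Unfragmented packets are ignored.
    """
    groups = defaultdict(list)
    for pkt in packets:
        src, dst, ip_id, offset, mf, length = parse_ipv4_fragment_info(pkt)
        if offset == 0 and not mf:
            continue  # not a fragment
        groups[(src, dst, ip_id)].append((offset, mf, length))

    report = {}
    for key, frags in groups.items():
        offsets = [f[0] for f in frags]
        expected, complete = 0, False
        for offset, mf, length in sorted(frags):
            if offset != expected:
                break              # gap (or duplicate): a fragment is missing here
            expected += length
            if not mf:
                complete = True    # reached the last fragment with no gaps
        report[key] = {
            "fragments": len(frags),
            "in_order": offsets == sorted(offsets),
            "complete": complete,
        }
    return report
```

Run it once against the capture taken at the on-prem ExpressRoute egress and once against the TCP dump from the PSN. An IP ID that shows up as complete on-prem but incomplete (or absent) on the PSN is the kind of evidence you can hand to your account manager.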

To create a tcp dump on the PSN:

Go to Operations -> Diagnostic tools -> TCP dump and select the PSN node to create a pcap on the PSN VM itself. 

 

Damon Kalajzich
Level 1

I found most Cisco devices fragment incorrectly, but I managed to work around the fragmentation issue in Azure by implementing the following:
https://www.cisco.com/c/en/us/support/docs/security/identity-services-engine-33/220568-configure-ise-3-3-native-ipsec-to-secure.html
Essentially this hides the out-of-order fragments within IPsec so Azure is none the wiser.