
Extremely slow WiFi Performance over some L3 routed links

Jan Gilhooley
Level 1

Hi,

This is a long one - so bear with me....

 

This issue has been bugging me for a while now. We have a pair of Cisco 5520 WiFi controllers in an SSO configuration (both just upgraded to 8.5.151.0) and approx. 300 APs (mainly a mixture of 3702s and 3802s) across a number of sites, all running centrally switched SSIDs (i.e. we tunnel everything back to our main site). Our main campus network is L2 switched, and we have some remote sites over L3 routed links. WiFi works normally over all these links. The problem is a site belonging to another organisation that is reached over an internal L3 routed network - at this site the APs get around 0.2 Mb/s download and 6 Mb/s upload! Far too slow to be usable. Wired connections are absolutely fine, as is using FlexConnect to break out traffic locally (there are reasons why this isn't the "proper" solution). It's just the WiFi clients that experience this slow performance.

 

For troubleshooting I've set up a test switch connected straight to our Cisco 6800 distribution switch with an L3 routed link over Cat5 cable and a similar config to the "issue" site. When I connect an AP to the test switch I get the same poor performance, so it's definitely something in our network that is doing this.

 

So far I have considered:

1) WiFi Radio Interference - (everyone says this first!) unlikely, given that FlexConnect works normally on the APs, and my test AP works normally on its usual switch but is slow when connected to the test switch located in the same room.

2) Routing - I can't see any evidence of asymmetrical routing or other routing "issues". Traceroutes from APs/PCs/switches all go where I would expect them to go.

3) QoS - we do need to sort out QoS across the Edge/Distribution/Core, but I can't see any packet drops anywhere that would account for the poor performance. Network bandwidth is fine, so QoS normally wouldn't get involved. The rates we get do look rather like a rate limit of some description is being applied - but where? And why would it only affect L3 routed traffic?

4) Packet Size - the MTU is set to the standard 1500 bytes on the switches, and the Path-MTU on the APs is 1485, but from packet captures it looks like the "Do not Fragment" bit has been set. From packet captures of working APs I can see that there is some fragmentation going on - one packet has the full payload, and the fragmented one is around 62 bytes (which is mostly header as far as I can see). I'm aware that there is the CAPWAP tunnel overhead, but the AP should be able to negotiate the maximum MTU with the WLC. It's almost like "something" is adding a few bytes to the packets somewhere between AP and WLC, which messes up the negotiated maximum MTU. I personally still think this is the most likely cause somehow.

5) Fast Packet Switching not happening - the Cisco 6800 (and the Cisco 3850s the APs are connected to) all support Fast Packet Switching, so if the packets from the APs aren't being processed at port level but are sent to the switch's CPU instead, we might see slow performance there. However, I can't see that this is happening, and if it is, then why just routed packets?
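
The arithmetic behind point 4 can be sketched roughly as below. This is only an illustration of the reasoning, not a statement of the real CAPWAP overhead: the 1485-byte Path-MTU comes from the post, while the overhead value and all function names are hypothetical placeholders chosen for the example.

```python
def encapsulated_length(inner_len: int, tunnel_overhead: int) -> int:
    """Size of the outer packet once the tunnel headers are added."""
    return inner_len + tunnel_overhead


def needs_fragmentation(inner_len: int, tunnel_overhead: int, path_mtu: int) -> bool:
    """True if the encapsulated packet is larger than the path MTU."""
    return encapsulated_length(inner_len, tunnel_overhead) > path_mtu


def largest_unfragmented_inner(path_mtu: int, tunnel_overhead: int) -> int:
    """Largest client packet that still fits into a single outer packet."""
    return path_mtu - tunnel_overhead


PATH_MTU = 1485        # Path-MTU reported by the APs in the post
ASSUMED_OVERHEAD = 60  # HYPOTHETICAL tunnel overhead in bytes, illustration only

if __name__ == "__main__":
    for size in (1400, 1500):
        print(size, needs_fragmentation(size, ASSUMED_OVERHEAD, PATH_MTU))
```

If an oversized outer packet also carries the "Do not Fragment" bit, routers along the path cannot split it, which is why the DF observation matters to this theory.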

 

I've been going slowly crazy looking into this one - and what is frustrating is nothing I have tried has had any effect, good or bad, on the Wifi performance rates. So I'm still none the wiser as to what is going on, and what the root cause might actually be. 

 

Any clues for things to go and look at would be very gratefully received! I have the sense I might have overlooked something really obvious here

 

Thanks

 

Jan

 

17 Replies

Additional info, just to make it clear. In the post above, when I describe the problematic scenario, I am referring throughout to L3 interfaces (SVIs, i.e. interface VlanX with an IP address configured). You are right, ICMP redirects are a pure L3 feature, so here we need to consider the path from an L3 perspective.

 

"3. When WLC sends return traffic to AP, it's when issue happens. WLC sends packet to default gateway, which is 6800 switch, IP 10.180.10.254 VLAN200. 6800 needs to send it further to AP 192.168.2.10. 6800 VSS doesn't have interface in this subnet, instead, this route most likely is available via sw-man, 10.180.10.10. It's also VLAN200. So, 6800 switch receives packet from WLC in VLAN200, and sends it via VLAN200. That's when ICMP redirect is generated and software switching is happening." 

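The condition described in the quoted step can be modelled with a small sketch. It is deliberately simplified: it only checks whether the routed packet would leave through the interface it arrived on, which is the trigger discussed here, and ignores the other checks a real router makes. The addresses and VLAN200 are from the quote; the /24 masks, the `Vlan100` name in the usage note and all function and class names are assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    prefix: ipaddress.IPv4Network
    out_interface: str
    next_hop: Optional[str] = None  # None means directly connected


def lookup(routes: List[Route], dst: str) -> Optional[Route]:
    """Longest-prefix match of dst against the route list."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in r.prefix]
    return max(matches, key=lambda r: r.prefix.prefixlen, default=None)


def would_send_redirect(routes: List[Route], in_interface: str, dst: str,
                        redirects_enabled: bool = True) -> bool:
    """True if the packet would be routed back out of its ingress interface
    while redirects are enabled on it (simplified model)."""
    route = lookup(routes, dst)
    if route is None or not redirects_enabled:
        return False
    return route.out_interface == in_interface


# Simplified view of the 6800's routing table; the /24 masks are ASSUMED.
CORE_ROUTES = [
    Route(ipaddress.ip_network("10.180.10.0/24"), "Vlan200"),
    Route(ipaddress.ip_network("192.168.2.0/24"), "Vlan200", "10.180.10.10"),
]

if __name__ == "__main__":
    # WLC -> AP return traffic arrives on Vlan200 and is routed via sw-man.
    print(would_send_redirect(CORE_ROUTES, "Vlan200", "192.168.2.10"))
```

In this model, passing `redirects_enabled=False` or a different ingress interface such as the hypothetical `Vlan100` makes the check return `False`, which mirrors the two ways out of the situation: disable redirects on the SVI, or route the return traffic over a different L3 interface.
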
Hi,

Sorry for the delay, as today is my first day back after the Christmas break. I've looked at the solution posted by vb10 and it really does look like that is the root cause. I can see that the same VLAN interface is being used for both inbound and outbound traffic, and I can also see that the packets that get punted to the CPU do appear to be subject to a Control Plane policy. So, subject to management approval, I'm planning on adding the "no ip redirects" statement to the VLAN interface on the core switch on Thursday 2/1/20 (late afternoon UK time). I shall know more then.....

Right - success! 

 

Simply adding "no ip redirects" to the VLAN IP interface on the core-vss 6800 switch cured the problem (I also added it to the sw-man switch VLAN interface for good measure whilst I was at it). So far all the tests I have run from sw-test have shown normal WiFi performance to internal hosts and the internet, plus one of the users on the actual site in question has confirmed the internet is better (i.e. usable). Sadly the main application that all this is meant to support is "still slow", but I feel we have removed a major configuration error. Yes, there might be other problems we have to address, but something major has been resolved.

 

So thank you to everyone who contributed to this topic, and a special "thank you" to vb10 (whoever and wherever you are) for your insight on this one.

 

Jan
