WLC 9800 connection interruption

Hi all,

We have 2x WLC 9800 (physical/virtual) in HA and occasionally notice a switchover from WLC1 to WLC2 or back, which causes a short connection interruption while all APs move to the other WLC.

We upgraded from 17.3.2a to 17.3.4c, with the same issue.

The IOS version is the same on both WLCs.

When we check show logging we cannot find the reason for the reboot, so we cannot explain this behavior.

 

Regards

Boris

6 Replies

marce1000
VIP

  Note : https://www.cisco.com/c/dam/en/us/td/docs/wireless/controller/9800/17-2/deployment-guide/c9800-ha-sso-deployment-guide-rel-17-2.pdf 

    >....

  ■ HA Pair can only be formed between two wireless controllers of the same form factor

            So your particular combination of controllers may not be supported

 - Besides that, check these requirements: https://www.cisco.com/c/en/us/support/docs/wireless/catalyst-9800-series-wireless-controllers/213915-configure-catalyst-9800-wireless-control.html#anc2 , then check the logs on both controllers when this happens, preferably using a syslog server as a log collector. Run a sanity check of the 9800 configuration with (CLI) show tech wireless and have the output processed by: https://cway.cisco.com/tools/WirelessAnalyzer/

                  Why was upgrading not possible ?

 M.



-- ' 'Good body every evening' ' this sentence was once spotted on a logo at the entrance of a Weight Watchers Club !

Scott Fella
Hall of Fame

I’m hoping that you are talking about N+1 and not SSO. As @marce1000 mentioned, you will not get different models to work.
Anyway, not knowing whether the APs are in local mode or FlexConnect, or whether this is a new install or an older install that started having issues, makes it hard to troubleshoot. Have you properly defined access point high availability so that the APs have the primary controller defined? Are the APs and controllers local to each other? Is there any possibility that the APs have lost connectivity to the controllers?

-Scott
*** Please rate helpful posts ***

We tested WLC stability and we do have access to the GUI.

Only on Friday did we notice 30 input errors/30 CRC on the switchport. That is not many, but it was with only 50% of the APs configured on the physical WLC.

3 hours ago we noticed 88 input errors/88 CRC on the same switchport.

 

3 hours ago:

#sh int gi1/0/44
GigabitEthernet1/0/44 is up, line protocol is up (connected)
  Description: UPLINK
  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 535000 bits/sec, 125 packets/sec
  5 minute output rate 510000 bits/sec, 145 packets/sec
     16678998 packets input, 5219555547 bytes, 0 no buffer
     Received 44024 broadcasts (34589 multicasts)
     0 runts, 0 giants, 0 throttles
     88 input errors, 88 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 34589 multicast, 0 pause input
     0 input packets with dribble condition detected
     29637163 packets output, 7826371322 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     14709 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out

 

Just an hour ago I moved all APs to the physical/primary WLC, and the input errors are slowly increasing.

To me it looks like the WLC Ethernet port is sending packets with errors.

 

2 hours ago:

#sh int gi1/0/44
GigabitEthernet1/0/44 is up, line protocol is up (connected)
  Description: UPLINK
  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:27, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 734000 bits/sec, 155 packets/sec
  5 minute output rate 621000 bits/sec, 166 packets/sec
     16886253 packets input, 5330557833 bytes, 0 no buffer
     Received 44724 broadcasts (35019 multicasts)
     0 runts, 0 giants, 0 throttles
     96 input errors, 96 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 35019 multicast, 0 pause input
     0 input packets with dribble condition detected
     29856436 packets output, 7926390862 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     14750 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out

 

 

When data transfer is higher, we get errors regularly.

Should we replace the WLC?
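
For anyone who wants to put a number on the two gi1/0/44 snapshots in this post, here is a minimal Python sketch that extracts the input-packet and CRC counters from pasted `show interface` text and computes the change between them. The function names (`parse_counters`, `crc_delta`) are made up for illustration, and the regular expressions assume the counter lines look exactly like the ones pasted in this thread. Other platforms or software versions may format them differently.

```python
import re


def parse_counters(show_int_text):
    """Extract a few counters from pasted `show interface` output.

    Assumption: the counter lines are formatted as in this thread,
    e.g. "16678998 packets input, ..." and "88 input errors, 88 CRC, ...".
    """
    patterns = {
        "packets_input": r"(\d+) packets input",
        "input_errors": r"(\d+) input errors",
        "crc": r"(\d+) CRC",
    }
    counters = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, show_int_text)
        if match is None:
            raise ValueError(f"counter not found: {name}")
        counters[name] = int(match.group(1))
    return counters


def crc_delta(earlier, later):
    """Return the growth in input packets and CRC errors between two snapshots."""
    packets = later["packets_input"] - earlier["packets_input"]
    crc = later["crc"] - earlier["crc"]
    ratio = crc / packets if packets > 0 else float("nan")
    return {"packets": packets, "crc": crc, "crc_per_packet": ratio}


# Counter lines copied from the two snapshots in this post.
earlier_text = """
     16678998 packets input, 5219555547 bytes, 0 no buffer
     88 input errors, 88 CRC, 0 frame, 0 overrun, 0 ignored
"""
later_text = """
     16886253 packets input, 5330557833 bytes, 0 no buffer
     96 input errors, 96 CRC, 0 frame, 0 overrun, 0 ignored
"""

delta = crc_delta(parse_counters(earlier_text), parse_counters(later_text))
print(delta)
```

With the values above, this gives 8 new CRC errors over 207255 received packets between the two snapshots. Since the counters have never been cleared ("Last clearing of "show interface" counters never"), comparing two snapshots says more than the absolute totals. The switch counts CRC errors on frames it receives, so the number alone cannot distinguish between the cable, the switchport and the controller port.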

 

What model controllers do you have? You mentioned a physical appliance and a virtual appliance? I'm assuming this is N+1 and not SSO, since SSO requires the same model and firmware. Have you replaced the cable from the controller to the switch, and have you tried a different switchport? There are other ports on the controller that you can test with.

-Scott

Haydn Andrews
VIP Alumni

First, I take it we are talking N+1.

Do all APs go at the same time?

You have this line: "can not find the reason for reboot". What is rebooting, the WLC or the AP? If the AP Join Profile is configured correctly as per the HA N+1 config guide, the AP won't reboot; there may be a CAPWAP tunnel drop, but the AP should not reboot.

 

Are the WLCs physically located on the same LAN as the APs or is there a WAN between them?

99% of the time when I have seen this happen, it has been due to the APs losing heartbeat to the WLC and then failing over.

 

The show log output from the AP may indicate why it decided to fail over.

*****Help out others by using the rating system and marking answered questions as "Answered"*****

alirafaleiro
Level 1

A wireless LAN controller (WLC) is a network component that manages wireless network access points and allows wireless devices to connect to the network. It offers central control over network elements, increases network visibility, and greatly simplifies individual component monitoring.

https://www.cisco.com/c/dam/en/us/td/docs/wireless/controller/9800/17-2/deployment-guide/c9800-ha-sso-deployment-guide-rel-17-2.pdf
