WLC in HA Keep Rebooting

janne.kuosmanen
Level 1

I have a pair of 5508 WLCs set up in HA SSO mode. The primary controller keeps rebooting, while the secondary is up and running.

We had to power off the primary, so we first performed a switchover by issuing redundancy force-switchover. We then powered the primary off; after power was restored, it boots up and then reboots.

Found the Peer. Starting Role Determination...
Restarting system ..

 

Does anyone have any idea how to resolve the situation?

Here is the whole story from the CLI.

WLCNG Boot Loader Version 1.0.16 (Built on Feb 28 2011 at 13:14:54 by cisco)
Board Revision 1.3 (SN: FCW1650L0H6, Type: AIR-CT5508-K9) (P)

Verifying boot loader integrity... OK.

OCTEON CN5645-NSP pass 2.1, Core clock: 600 MHz, DDR clock: 330 MHz (660 Mhz data rate)
FPGA Revision 1.7
Env FW Revision 1.8
USB Console Revision 2.2
CPU Cores:  10
DRAM:  1024 MB
Flash: 32 MB
Clearing DRAM........ done
Network: octeth0', octeth1
  ' - Active interface
  E - Environment MAC address override
CF Bus 0 (IDE): OK
IDE device 0:
 - Model: STEC M2T CF 1.0.0 Firm: K1367MIX Ser#: STIM2MA212285221848
 - Type: Hard Disk
 - Capacity: 977.4 MB = 0.9 GB (2001888 x 512)


Press <ESC> now to access the Boot Menu...

Loading primary image (7.4.100.0)
100%

34603060 bytes read
Launching...
init started: BusyBox v1.6.0 (2010-05-13 17:50:10 EDT) multi-call binary
starting pid 821, tty '': '/etc/init.d/rcS'

Set PLX switch MPS settings .............!!!!!!!
Detecting Hardware ...
set smp_affinity for irq 48
003f
DP from CGE5.0 ...
starting pid 1067, tty '/dev/ttyS0': '/usr/bin/gettyOrMwar'
Setting up ZVM
Exporting LD_LIBRARY_PATH

Cryptographic library self-test....passed!
XML config selected
Validating XML configuration
Read HA Config before validation
octeon_device_init: found 1 DPs
readCPUConfigData: cardid 0x6070001
Cisco is a trademark of Cisco Systems, Inc.
Software Copyright Cisco Systems, Inc. All rights reserved.

Cisco AireOS Version 7.4.100.0
Firmware Version FPGA 1.7, Env 1.8, USB console 2.2
Initializing OS Services: ok
Initializing Serial Services: ok
Initializing Network Services: ok
Initializing Licensing Services: ok

License daemon start initialization.....

License daemon running.....
Starting Statistics Service: ok
Starting ARP Services: ok
Starting Trap Manager: ok
Starting Network Interface Management Services: ok
Starting System Services: ok
Starting FIPS Features: ok : Not enabled
Starting Fastpath Hardware Acceleration: ok
Starting Fastpath Console redirect : ok
Starting Fastpath DP Heartbeat : ok
Fastpath CPU0.00: Starting Fastpath Application. SDK-1.8.0, build 269. Flags-[DUTY CYCLE] : ok
Fastpath CPU0.00: Initializing last packet received queue. Num of cores(10)
Fastpath CPU0.00: Init MBUF size: 1856, Subsequent MBUF size: 2040
Fastpath CPU0.00: Core 0 Initialization and FIPS self-test: ok
Fastpath CPU0.00: Initializing Timer...
Fastpath CPU0.00: Initializing Timer...done.
Fastpath CPU0.00: Initializing Timer...
Fastpath CPU0.00: Initializing NBAR AGING Timer...done.
Fastpath CPU0.01: Core 1 Initialization and FIPS self-test: ok
Fastpath CPU0.02: Core 2 Initialization and FIPS self-test: ok
Fastpath CPU0.03: Core 3 Initialization and FIPS self-test: ok
Fastpath CPU0.02: Received instruction to get link status
Fastpath CPU0.04: Core 4 Initialization and FIPS self-test: ok
Fastpath CPU0.05: Core 5 Initialization and FIPS self-test: ok
Fastpath CPU0.06: Core 6 Initialization and FIPS self-test: ok
Fastpath CPU0.07: Core 7 Initialization and FIPS self-test: ok
Fastpath CPU0.08: Core 8 Initialization and FIPS self-test: ok
Fastpath CPU0.09: Core 9 Initialization and FIPS self-test: ok
Starting Switching Services: ok
Starting QoS Services: ok
Starting Policy Manager: ok
Starting Data Transport Link Layer: ok
Starting Access Control List Services: ok
Starting System Interfaces: ok
Starting Client Troubleshooting Service: ok
Starting Management Frame Protection: ok
Starting Certificate Database: ok
Starting VPN Services: ok
Starting Licensing Services: ok
Starting Redundancy: Starting Peer Search Timer of 120 seconds

Found the Peer. Starting Role Determination...
Restarting system ..

Updating license storage ...  Done.
Restarting system.


WLCNG Boot Loader Version 1.0.16 (Built on Feb 28 2011 at 13:14:54 by cisco)
Board Revision 1.3 (SN: FCW1650L0H6, Type: AIR-CT5508-K9) (G)

Verifying boot loader integrity...
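
To confirm from a saved console capture that every reboot is triggered right after HA role determination, something like the following could be used. This is only a minimal sketch: the function name `find_ha_reboots` is hypothetical, and the marker strings are simply the ones that appear in the console output above.

```python
def find_ha_reboots(console_lines):
    """Return indices of 'Restarting system' lines that directly follow
    HA role determination within the same boot cycle."""
    hits = []
    saw_peer = False
    for i, line in enumerate(console_lines):
        if "WLCNG Boot Loader" in line:
            # A boot loader banner marks the start of a new boot cycle.
            saw_peer = False
        elif "Found the Peer. Starting Role Determination" in line:
            saw_peer = True
        elif saw_peer and line.startswith("Restarting system"):
            hits.append(i)
            # Count only the first restart message per role determination.
            saw_peer = False
    return hits


# Excerpt of the capture above, reduced to the relevant lines.
capture = [
    "WLCNG Boot Loader Version 1.0.16 (Built on Feb 28 2011 at 13:14:54 by cisco)",
    "Starting Redundancy: Starting Peer Search Timer of 120 seconds",
    "Found the Peer. Starting Role Determination...",
    "Restarting system ..",
    "Updating license storage ...  Done.",
    "Restarting system.",
    "WLCNG Boot Loader Version 1.0.16 (Built on Feb 28 2011 at 13:14:54 by cisco)",
]

reboots = find_ha_reboots(capture)
print(f"{len(reboots)} HA-triggered restart(s) at line index(es) {reboots}")
```

If the count matches the number of boot loader banners in a long capture, the unit never gets past role determination.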

4 Replies

Are you running version 7.4.100.0? If so, upgrade to 7.4.121.0 and see whether the problem persists. 7.4.100.0 is not a recommended version at all.

Loading primary image (7.4.100.0)

HTH

Rasika


*** Pls rate all useful responses ***

What would be the correct order for the upgrade in this situation, with minimum outage?

This is a 24/7 production environment, so learning by doing is not a good method.

The APs are now on the secondary WLC, and I cannot modify the controller configuration while the primary unit keeps rebooting. Do I have to disable SSO? What happens to the licenses, etc.?

Because I don't have a testing environment, any advice is welcome.

Hi Janne,

Please check these guidelines and the procedure before upgrading in an HA setup.

http://www.cisco.com/c/en/us/td/docs/wireless/controller/technotes/7-5/High_Availability_DG.html#pgfId-46645

 

Regards

Don't forget to rate helpful posts

janne.kuosmanen
Level 1

From the SNMP log:

Mar 26 09:33:02 ip-of-secondary ha-wlc1: *rmgrTrasport: Mar 26 09:33:02.408: #RMGR-3-INVALID_PING_RESPONSE: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.
Mar 26 09:33:02 ip-of-secondary ha-wlc1: *osapiBsnTimer: Mar 26 09:33:02.518: #LOG-3-Q_IND: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.
Mar 26 09:33:37 ip-of-secondary ha-wlc1: *rmgrTrasport: Mar 26 09:33:37.409: #RMGR-3-INVALID_PING_RESPONSE: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.
Mar 26 09:33:37 ip-of-secondary ha-wlc1: *sisfSwitcherTask: Mar 26 09:33:38.010: #LOG-3-Q_IND: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.
Mar 26 09:33:46 ip-of-secondary ha-wlc1: *rmgrTrasport: Mar 26 09:33:46.487: #RMGR-3-INVALID_PING_RESPONSE: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.
Mar 26 09:33:46 ip-of-secondary ha-wlc1: *sisfSwitcherTask: Mar 26 09:33:46.632: #LOG-3-Q_IND: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.
Mar 26 09:33:46 ip-of-primary ha-wlc1: *rmgrMain: Mar 26 09:33:47.075: #RMGR-0-RED_HA_RELOAD: rmgr_utils.c:177 System reboot: reason: category Peer reload req object Peer
Mar 26 09:33:49 ip-of-primary ha-wlc1: *rmgrMain: Mar 26 09:33:50.082: #OSAPI-3-INVALID_FILE_HANDLE: osapi_support.c:1011 The File/Socket handle is Invalid. Handle = 0.
Mar 26 09:33:49 ip-of-primary ha-wlc1: -Traceback:  0x10af3218 0x10aedea4 0x10dab618 0x10b546ac 0x10b4a834 0x10b462a4 0x10aff308 0x120124c0 0x12072e6c
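
For anyone going through a longer log of this kind, a short script can tally the message IDs so it is easier to see how many invalid ping responses precede each peer-requested reload. This is a rough sketch: the `#FACILITY-SEVERITY-MNEMONIC:` pattern is inferred only from the lines above, and the function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Matches message IDs such as "#RMGR-3-INVALID_PING_RESPONSE:".
# The pattern is an assumption based on the log lines in this post.
MSG_ID = re.compile(r"#([A-Z]+)-(\d)-([A-Z_]+):")


def summarize(lines):
    """Count log messages per (facility, severity, mnemonic)."""
    counts = Counter()
    for line in lines:
        m = MSG_ID.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)), m.group(3))] += 1
    return counts


# A few of the lines above, copied as-is.
sample = [
    "Mar 26 09:33:02 ip-of-secondary ha-wlc1: *rmgrTrasport: Mar 26 09:33:02.408: #RMGR-3-INVALID_PING_RESPONSE: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.",
    "Mar 26 09:33:02 ip-of-secondary ha-wlc1: *osapiBsnTimer: Mar 26 09:33:02.518: #LOG-3-Q_IND: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.",
    "Mar 26 09:33:37 ip-of-secondary ha-wlc1: *rmgrTrasport: Mar 26 09:33:37.409: #RMGR-3-INVALID_PING_RESPONSE: rmgr_utils.c:234 Ping response from ip-of-primary is invalid. Incorrect checksum.",
    "Mar 26 09:33:46 ip-of-primary ha-wlc1: *rmgrMain: Mar 26 09:33:47.075: #RMGR-0-RED_HA_RELOAD: rmgr_utils.c:177 System reboot: reason: category Peer reload req object Peer",
    "Mar 26 09:33:49 ip-of-primary ha-wlc1: -Traceback:  0x10af3218 0x10aedea4 0x10dab618",
]

for (facility, severity, mnemonic), n in sorted(summarize(sample).items()):
    print(f"{facility}-{severity}-{mnemonic}: {n}")
```

Lines without a message ID, such as the traceback, are skipped rather than counted.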

 
