N+1 5508 WLC failover test

edwardzeng
Level 1

Good day all,

I have a question about the N+1 5508 failover test:

Should I shut down one of the primary WLCs to test failover?

I just set up the N+1 backup WLC (5508), based on: http://www.cisco.com/en/US/docs/wireless/technology/hi_avail/N1_High_Availability_Deployment_Guide.pdf

We have two production WLCs, both 5508s, and one 4405.

We just purchased an additional 5508 WLC with the HA SKU.

All four of our WLCs have been set up in one mobility group and run version 7.4.100.6.

Their neighbors are all up.

But our test AP could not register to the backup N+1 WLC. (We use option 43 on our DHCP server for controller discovery when the APs boot.)

Here is the log output:

================ From test Access Point============

*Mar  1 00:00:53.099: %CDP_PD-4-POWER_OK: Full power - INJECTOR_CONFIGURED_ON_SOURCE inline power source

*Mar  1 00:00:53.842: %DHCP-6-ADDRESS_ASSIGN: Interface BVI1 assigned DHCP address 10.255.1.3, mask 255.255.255.0, hostname wo11-test-ap1

*Mar  1 00:00:54.188: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to up

*Mar  1 00:00:55.188: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to up

*Mar  1 00:00:55.279: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state to up

*Mar  1 00:00:56.280: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio1, changed state to up

*Mar  1 00:01:03.820: %CAPWAP-5-DHCP_OPTION_43: Controller address 10.254.240.5 obtained through DHCP

*Mar  1 00:01:03.820: %CAPWAP-3-ERRORLOG: Did not get log server settings from DHCP.

*Mar  1 00:01:13.823: %CAPWAP-3-ERRORLOG: Go join a capwap controller

*Aug  2 02:30:55.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 10.254.240.5 peer_port: 5246

*Aug  2 02:31:25.003: DTLS_CLIENT_ERROR: ../capwap/base_capwap/dtls/base_capwap_dtls_connection_db.c:2051 Max retransmission count reached!

*Aug  2 02:31:55.001: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 10.254.240.5:5246

*Aug  2 02:31:55.001: %CAPWAP-3-ERRORLOG: Go join a capwap controller

*Aug  2 02:30:55.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 10.254.240.23 peer_port: 5246

*Aug  2 02:30:55.490: %CAPWAP-5-DTLSREQSUCC: DTLS connection created sucessfully peer_ip: 10.254.240.23 peer_port: 5246

*Aug  2 02:30:55.493: %CAPWAP-5-SENDJOIN: sending Join Request to 10.254.240.23

*Aug  2 02:30:55.493: %CAPWAP-3-ERRORLOG: Invalid event 10 & state 5 combination.

*Aug  2 02:30:55.493: %CAPWAP-3-ERRORLOG: CAPWAP SM handler: Failed to process message type 10 state 5.

*Aug  2 02:30:55.493: %CAPWAP-3-ERRORLOG: Failed to handle capwap control message from controller

*Aug  2 02:30:55.493: %CAPWAP-3-ERRORLOG: Failed to process encrypted capwap packet from 10.254.240.23

*Aug  2 02:30:55.874: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to down

*Aug  2 02:30:55.931: %LINK-5-CHANGED: Interface Dot11Radio0, changed state to reset

*Aug  2 02:30:55.987: %CAPWAP-5-JOINEDCONTROLLER: AP has joined controller WG-WLC1

*Aug  2 02:30:56.041: ac_first_hop_mac - IP:10.255.1.1 Hop IP:10.255.1.1 IDB:BVI1

*Aug  2 02:30:56.041: Setting AC first hop MAC: ccef.481f.14bf

-test-ap1#sh int bvI 1

BVI1 is up, line protocol is up

  Hardware is BVI, address is e8b7.489e.4645 (bia e8b7.489e.4645)

  Internet address is 10.255.1.3/24

===================From backup N+1 WLC===

*spamApTask4: Aug 02 11:41:09.842: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7305 64:a0:e7:40:eb:42: Failed to create DTLS connection for AP  10:255:1:3 (58470).

*spamApTask4: Aug 02 11:41:01.889: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7305 64:a0:e7:40:eb:42: Failed to create DTLS connection for AP  10:255:1:3 (58470).

*spamApTask4: Aug 02 11:40:57.912: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7305 64:a0:e7:40:eb:42: Failed to create DTLS connection for AP  10:255:1:3 (58470).

*spamApTask4: Aug 02 11:40:55.924: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7305 64:a0:e7:40:eb:42: Failed to create DTLS connection for AP  10:255:1:3 (58470).

*spamApTask4: Aug 02 11:18:50.553: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7305 64:a0:e7:40:eb:42: Failed to create DTLS connection for AP  10:255:1:3 (58469).

*spamApTask4: Aug 02 11:18:42.600: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7305 64:a0:e7:40:eb:42: Failed to create DTLS connection for AP  10:255:1:3 (58469).

*spamApTask4: Aug 02 11:18:38.623: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7305 64:a0:e7:40:eb:42: Failed to create DTLS connection for AP  10:255:1:3 (58469).

*spamApTask4: Aug 02 11:18:36.636: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7305 64:a0:e7:40:eb:42: Failed to create DTLS connection for AP  10:255:1:3 (58469).

*mmListen: Aug 02 10:43:38.637: #LOG-3-Q_IND: spam_lrad.c:1676 Ignoring discovery request from AP e8:b7:48:9e:46:45 - maximum number of downloads (0) exceeded

*spamApTask0: Aug 02 10:43:38.500: #LWAPP-3-DISC_MAX_DOWNLOAD: spam_lrad.c:1676 Ignoring discovery request from AP e8:b7:48:9e:46:45 - maximum number of downloads (0) exceeded

==================== From one of our Primary WLC=====================

(WLC-5500) >show advanced backup-controller

AP primary Backup Controller .................... ODC-WLC1 10.254.240.5

AP secondary Backup Controller ..................  0.0.0.0

(WLC-5500) >show redundancy summary

Redundancy Mode = SSO DISABLED

     Local State = ACTIVE

      Peer State = N/A

            Unit = Primary

         Unit ID = 54:75:D0:DE:DE:40

Redundancy State = N/A

    Mobility MAC = 54:75:D0:DE:DE:40

Redundancy Management IP Address................. 0.0.0.0

Peer Redundancy Management IP Address............ 0.0.0.0  

Redundancy Port IP Address....................... 0.0.0.0

Peer Redundancy Port IP Address.................. 169.254.0.0

(WLC-5500) >show license capacity

Licensed Feature    Max Count         Current Count     Remaining Count

-----------------------------------------------------------------------

AP Count            250               203               47

==============From the Backup N+1 WLC in DR =====================

(Cisco Controller) >show redundancy summary

Redundancy Mode = SSO DISABLED

     Local State = ACTIVE

      Peer State = N/A

            Unit = Secondary - HA SKU

         Unit ID = 6C:41:6A:5F:4C:80

Redundancy State = N/A

    Mobility MAC = 6C:41:6A:5F:4C:80

Redundancy Management IP Address................. 10.254.240.3

Peer Redundancy Management IP Address............ 0.0.0.0

Redundancy Port IP Address....................... 169.254.240.3

Peer Redundancy Port IP Address.................. 169.254.0.0

(Cisco Controller) >show license capacity

Licensed Feature    Max Count         Current Count     Remaining Count

-----------------------------------------------------------------------

AP Count            500               0                 500
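Editor's sketch: the two "show license capacity" outputs above can be compared mechanically. The Python below is only an illustration, not a Cisco tool. The function names (parse_license_capacity, backup_can_absorb) are made up for this example, and it assumes the output has exactly the column layout pasted in this post. Note that a healthy-looking 500 / 0 / 500 row does not by itself prove the backup WLC will accept APs, as the "maximum number of downloads (0) exceeded" log lines above show.

```python
import re

# Sample text copied from the "show license capacity" output in this post.
PRIMARY_OUTPUT = """\
Licensed Feature    Max Count         Current Count     Remaining Count
-----------------------------------------------------------------------
AP Count            250               203               47
"""

BACKUP_OUTPUT = """\
Licensed Feature    Max Count         Current Count     Remaining Count
-----------------------------------------------------------------------
AP Count            500               0                 500
"""


def parse_license_capacity(cli_output: str) -> dict:
    """Extract the 'AP Count' row (hypothetical helper, layout assumed as above)."""
    match = re.search(
        r"^\s*AP Count\s+(\d+)\s+(\d+)\s+(\d+)\s*$", cli_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'AP Count' row found in the output")
    max_count, current, remaining = (int(value) for value in match.groups())
    return {"max": max_count, "current": current, "remaining": remaining}


def backup_can_absorb(primary: dict, backup: dict) -> bool:
    """True if the backup has enough remaining AP count for the primary's APs."""
    return primary["current"] <= backup["remaining"]


if __name__ == "__main__":
    primary = parse_license_capacity(PRIMARY_OUTPUT)
    backup = parse_license_capacity(BACKUP_OUTPUT)
    print("primary:", primary)
    print("backup:", backup)
    print("backup can absorb primary's APs:", backup_can_absorb(primary, backup))
```

This only checks the arithmetic (203 joined APs against 500 remaining); whether the license is actually usable is a separate question that the replies below discuss.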

Accepted Solutions

You don't, but make sure it's showing active... 500 AP count.

-Scott

There is a difference between AP SSO and N+1. For the N+1 setup, you do need to make sure you activate the license, which really just means accepting the RTU (right-to-use) license.

Thanks,

Scott


54 Replies

I have just set up 4 x 5508 and 1 x 5508 HA, trying to follow the N+1 High Availability Overview "guide".

http://www.cisco.com/en/US/docs/wireless/technology/hi_avail/N1_HA_Overview.html#wp1054644

When I shut down the Ethernet connections for one of the primary WLCs, no APs join the HA controller.

In the HA controller's message log I see exactly the same kind of errors as described above.

For example:

*spamApTask2: Aug 08 18:55:38.011: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7321 00:1f:6d:d6:7c:00: Failed to create DTLS connection for AP  10:255:205:22 (3540).

and

*spamApTask0: Aug 08 18:52:55.429: #LWAPP-3-DISC_MAX_DOWNLOAD: spam_lrad.c:1676 Ignoring discovery request from AP 00:27:0d:55:fc:00 - maximum number of downloads (0) exceeded

These controllers are running 7.4.110.

I just read the N+1 Deployment Guide.

http://www.cisco.com/en/US/docs/wireless/technology/hi_avail/N1_High_Availability_Deployment_Guide.pdf

I suspect the problem might be that the HA SKU does not have any permanent license counts when you receive it from Cisco.

At the end of the guide, where licensing is explained, it seems that when you enable redundancy (on a WLC with 50 base licenses, as shown), "show license capacity" should read 500 - 0 - 500.

On the HA SKU this is not the case.

I think this is a "no-base license bug" :-/

PS:

Just for the "heck" of it, I tried to enable the 500 eval license.

Now the APs can join the controller, but they do so even when the primary controllers are available.
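Editor's sketch: the two error signatures quoted in this thread (rejected discovery requests and failed DTLS connections) can be counted per AP from a saved copy of the controller message log. This is a rough Python illustration under the assumption that the log lines look exactly like the ones pasted above; the regular expressions and the summarize function are invented for this example and are not part of any Cisco tooling.

```python
import re
from collections import Counter

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"

# "Ignoring discovery request ... maximum number of downloads (N) exceeded"
DISC_REJECT = re.compile(
    r"Ignoring discovery request from AP (?P<mac>" + MAC + r")"
    r" - maximum number of downloads \((?P<limit>\d+)\) exceeded",
    re.IGNORECASE,
)

# "#CAPWAP-3-DTLS_DB_ERR: <file>:<line> <mac>: Failed to create DTLS connection"
DTLS_FAIL = re.compile(
    r"#CAPWAP-3-DTLS_DB_ERR: .*?(?P<mac>" + MAC + r"): Failed to create DTLS connection",
    re.IGNORECASE,
)

# Two lines copied from the post above.
SAMPLE_LOG = """\
*spamApTask2: Aug 08 18:55:38.011: #CAPWAP-3-DTLS_DB_ERR: capwap_ac_sm.c:7321 00:1f:6d:d6:7c:00: Failed to create DTLS connection for AP  10:255:205:22 (3540).
*spamApTask0: Aug 08 18:52:55.429: #LWAPP-3-DISC_MAX_DOWNLOAD: spam_lrad.c:1676 Ignoring discovery request from AP 00:27:0d:55:fc:00 - maximum number of downloads (0) exceeded
"""


def summarize(log_text: str):
    """Return (rejected_discoveries, dtls_failures) as Counters keyed by MAC."""
    rejected = Counter()
    dtls_failures = Counter()
    for line in log_text.splitlines():
        match = DISC_REJECT.search(line)
        if match:
            rejected[match.group("mac").lower()] += 1
            continue
        match = DTLS_FAIL.search(line)
        if match:
            dtls_failures[match.group("mac").lower()] += 1
    return rejected, dtls_failures


if __name__ == "__main__":
    rejected, dtls_failures = summarize(SAMPLE_LOG)
    print("rejected discovery requests:", dict(rejected))
    print("DTLS connection failures:", dict(dtls_failures))
```

A download limit of 0 in the first signature is what the posters here associate with the HA SKU not having a usable AP count.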

Hi Thomas,

My case was "fixed" by upgrading the N+1 WLC to 7.4.110.0, reloading it, and adding the backup WLC to the High Availability settings on my test AP.

I will upgrade our production WLCs to 7.4.110.0 tonight so I can fully test the failover.

I will update you once I have confirmation.

I attached the debug log files.

Cheers,


Edward

Hi Edward

Thank you for the update.

Unfortunately, I had already upgraded "my" 5508 to 7.4.110 when this error occurred.

Just a few questions if I may:

1: Did you put the +1 HA controller into the mobility group of the other controllers?

2: Did you configure anything under the Redundancy settings, other than primary / secondary, on the different controllers?

3: Was your HA controller bought as an HA SKU, or was it a +50 AP controller you "upgraded"?

/Thomas

Hi Thomas,

1: Did you put the +1 HA controller into the mobility group of the other controllers?

Yes. The HA and production WLCs are all in the same mobility group.

2: Did you configure anything under the Redundancy settings, other than primary / secondary, on the different controllers?

I just followed the Cisco document mentioned above; nothing beyond what the document describes.

3: Was your HA controller bought as an HA SKU, or was it a +50 AP controller you "upgraded"?

We purchased the HA SKU WLC from Cisco; we did not use a standard WLC with a +50 AP license.

This Friday night I will test them under full traffic load.

I will update you later.

Full traffic load test passed!

All good now.

Hi Thomas,

One thing that came to mind: have you configured the "High Availability" settings on your APs?

On each AP you can configure up to three WLCs in its "High Availability" settings.

This is very important. In fact, I can force our APs to fail over to the HA WLC without shutting down our production WLC, which helps me a lot when doing maintenance on our production WLCs.

Hope that will help.

Cheers

Edward

Hi Edward / Thomas, I am facing the same issue. I am using 7.5 on the WLC 5508. According to the guide, there is not much to configure, so I do not know why it is failing. I will post screenshots of the steps I followed so you can see if I forgot something.

Edward, did you configure the primary WLC and the HA WLC using the GUI instead of the CLI? I am confused because the guide mentions a Redundancy Management IP Address and a Peer Redundancy Management IP Address, and I do not know whether I must configure these parameters on both WLCs.

Thanks

AJ

Hi Abraham,

I just followed the guide, nothing special.

Here is the screen log from our HA WLC; our WLCs are on 7.4.110.0.

There are two licenses: base and base-ap-count.

On my HA N+1 WLC:

The base license state is: Active, Not in Use

The base-ap-count license state is: Active, Not in Use, EULA not accepted

Maybe check the same on your N+1 WLC.

=======================================

(Cisco Controller) >show license summary

License Store: Primary License Storage

StoreIndex:  0  Feature: base                              Version: 1.0

        License Type: Permanent

        License State: Active, Not in Use

        License Count: Non-Counted

        License Priority: Medium

License Store: Evaluation License Storage

StoreIndex:  0  Feature: base-ap-count                     Version: 1.0

        License Type: Evaluation

        License State: Active, Not in Use, EULA not accepted

            Evaluation total period:  8 weeks  4 days

            Evaluation period left:  8 weeks  4 days

       License Count: 500 / 0 (Active/In-use)

        License Priority: None
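Editor's sketch: since other replies in this thread point at the EULA state of the base-ap-count license, here is a small Python illustration that reads "show license summary" text and lists features whose state still says "EULA not accepted". It assumes the output format shown in this post; parse_license_summary and eula_pending are hypothetical helper names, not part of any Cisco software.

```python
import re

# Sample text copied from the "show license summary" output in this post.
SAMPLE_SUMMARY = """\
License Store: Primary License Storage
StoreIndex:  0  Feature: base                              Version: 1.0
        License Type: Permanent
        License State: Active, Not in Use
        License Count: Non-Counted
        License Priority: Medium
License Store: Evaluation License Storage
StoreIndex:  0  Feature: base-ap-count                     Version: 1.0
        License Type: Evaluation
        License State: Active, Not in Use, EULA not accepted
            Evaluation total period:  8 weeks  4 days
            Evaluation period left:  8 weeks  4 days
       License Count: 500 / 0 (Active/In-use)
        License Priority: None
"""

STORE_INDEX = re.compile(r"StoreIndex:\s*\d+\s+Feature:\s*(\S+)\s+Version:\s*(\S+)")


def parse_license_summary(cli_output: str) -> list:
    """Return one dict per license with its feature, version, type, and state."""
    licenses = []
    current = None
    for raw_line in cli_output.splitlines():
        line = raw_line.strip()
        match = STORE_INDEX.match(line)
        if match:
            # A "StoreIndex" line starts a new license entry.
            current = {"feature": match.group(1), "version": match.group(2)}
            licenses.append(current)
        elif current is not None and line.startswith("License Type:"):
            current["type"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("License State:"):
            current["state"] = line.split(":", 1)[1].strip()
    return licenses


def eula_pending(licenses: list) -> list:
    """Names of features whose state still reports 'EULA not accepted'."""
    return [
        entry["feature"]
        for entry in licenses
        if "EULA not accepted" in entry.get("state", "")
    ]


if __name__ == "__main__":
    parsed = parse_license_summary(SAMPLE_SUMMARY)
    print("licenses:", parsed)
    print("EULA still pending for:", eula_pending(parsed))
```

Whether a pending EULA actually blocks AP joins seems to depend on the software version, judging by the different experiences reported in this thread, so treat the result as a hint of what to check rather than a verdict.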

I will give what you suggest a try.

Unfortunately, I am using URL redirect for web authentication with an external Cisco ISE as the AAA server, and that is not working on version 7.5.102.0 (the version I used for the HA SKU WLC testing). This is another issue I am facing now.

Abraham,

In the middle of this thread you mentioned the Redundancy Management IP and Peer Redundancy Management IP, and below that Scott mentions these are only for AP SSO mode. I am trying to do only N+1 (no AP SSO), and the controller is forcing me to put an IP address in these fields. I am not sure what to enter: I have more than one controller for the "N" portion, so a single peer address cannot be right.

Am I missing something?

This is the guide you need to follow for N+1

http://www.cisco.com/en/US/docs/wireless/technology/hi_avail/N1_High_Availability_Deployment_Guide.pdf

-Scott

Hi Scott,

Is there any bug with AP Fallback on version 7.5?

I have AP Fallback enabled on both WLCs, and both are in the same mobility group, which is up between them. When I deactivated the primary WLC, the AP reconnected automatically to the HA WLC with no issues, and I could associate, authenticate, and browse. But once I re-established connectivity to the primary WLC, the AP never went back to it.

Is it also necessary to configure High Availability on the AP?

Thanks in advance for your guidance.

Hi Wesley,

First, I am assuming that you accepted the evaluation license on the HA N+1 controller; otherwise it will not work on version 7.5 (check this post; I put some screenshots about this part). This is the "EULA accepted" or "EULA not accepted" state of the license.

I will post some additional screenshots tomorrow with the specific configuration on the primary and HA N+1 WLCs, so you will find it very easy. WLC version 7.5 apparently has a bug: the evaluation license on the HA SKU WLC keeps counting down even though the AP is no longer connected to that backup WLC (it went back to the primary WLC once I restored that WLC; screenshots of this will be added as well). This bug is apparently fixed in version 7.6.

But upgrading from 7.5 to 7.6 apparently has some issues as well. I am going to test it and let you know.

One more detail: take a look at the release notes for 7.4 and 7.5. In another post that I created, I described an issue with DNS traffic between the end user and the DNS server: if you are using a pre-auth ACL with URL redirect for web auth, it will not work on 7.4 and above without changes. In versions before 7.4, DNS traffic was allowed by default. Based on the release notes mentioned above, 7.4 and above require an additional rule in the pre-auth ACL that allows UDP traffic for DNS (well-known port 53) between the end user and the DNS server so that URL redirect works.

One more thing that I checked with the TAC engineer: with HA N+1, when the primary WLC goes down, all APs automatically switch to the HA SKU WLC, so existing client connections are dropped and users need to reconnect and reauthenticate. This is because HA N+1 does not support SSO. I tested this in the lab and it works perfectly. In addition, you do not need to configure High Availability on the AP: the AP switches to the HA SKU WLC when the primary WLC fails and goes back when the primary WLC is restored. The AP did this automatically in the tests I ran in the lab environment.

http://www.cisco.com/en/US/docs/wireless/controller/release/notes/crn74mr02.html#wp784178

Hope this helps, regards
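Editor's sketch: the failover and fallback behaviour discussed in the last few replies can be pictured with a toy model. The Python below is a deliberately simplified illustration of priority-ordered controller selection with an optional fallback flag. It is not Cisco's actual AP join algorithm, and the class, method, and controller names (AccessPoint, reevaluate, "WLC-PRIMARY", "WLC-HA") are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class AccessPoint:
    """Toy AP: 'preferred' lists controllers in priority order, primary first."""
    preferred: List[str]
    fallback_enabled: bool = True
    joined: Optional[str] = None

    def reevaluate(self, reachable: Set[str]) -> Optional[str]:
        """Pick a controller given the set of currently reachable ones."""
        # Without fallback, the AP stays where it is while that controller is up.
        if self.joined in reachable and not self.fallback_enabled:
            return self.joined
        # Otherwise take the highest-priority reachable controller.
        for name in self.preferred:
            if name in reachable:
                self.joined = name
                return name
        self.joined = None
        return None


if __name__ == "__main__":
    both_up = {"WLC-PRIMARY", "WLC-HA"}
    only_backup = {"WLC-HA"}

    for fallback in (True, False):
        ap = AccessPoint(preferred=["WLC-PRIMARY", "WLC-HA"], fallback_enabled=fallback)
        history = [
            ap.reevaluate(both_up),      # normal operation
            ap.reevaluate(only_backup),  # primary fails, AP moves to the HA WLC
            ap.reevaluate(both_up),      # primary restored
        ]
        print("fallback enabled:", fallback, "->", history)
```

In this model the AP returns to the primary after it is restored only when the fallback flag is set, which is the behaviour asked about above. The model ignores licensing, client reauthentication, and discovery entirely.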
