
Switchover Why?

Moudar
VIP Alumni

Hi,

I have a WLC 5520 running 8.10.190.0 with SSO enabled.

When I started my work today I got a message from Prime:

Redundancy notification trap triggered by Controller '10.0.0.148' having Redundancy-Management IP '10.0.0.145', Local state is 'Active' and Peer Redundancy-Management IP '10.0.0.146' and peer state 'StandbyHot'

I don't know why that happened, and no command shows me the switchover details or the reason:

(Cisco Controller) >show redundancy   summary
            Redundancy Mode = SSO ENABLED
                Local State = ACTIVE
                 Peer State = STANDBY HOT
                       Unit = Primary
                    Unit ID = CC:46:D6:59:7B:4B
           Redundancy State = SSO
               Mobility MAC = CC:46:D6:59:7B:4B
               Redundancy Port  = UP
            BulkSync Status = Complete
               Link Encryption = ENABLED
Average Redundancy Peer Reachability Latency = 313 Micro Seconds
Average Management Gateway Reachability Latency = 495 Micro Seconds
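
For anyone who wants to check this output from a script rather than by eye, here is a minimal Python sketch that parses the `show redundancy summary` text above into a dictionary and applies a simple health rule. The function names `parse_redundancy_summary` and `ha_pair_healthy` are made up for illustration, the health rule is an assumption, and how the text is collected (SSH session, saved file) is left out. Note that this only reports the current state; it does not say when or why a switchover happened.

```python
def parse_redundancy_summary(text):
    """Parse 'Key = Value' lines from 'show redundancy summary' into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first '=' only; lines without '=' (prompt, blanks) are skipped.
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def ha_pair_healthy(fields):
    """Assumed health rule: local ACTIVE, peer STANDBY HOT, bulk sync complete."""
    return (
        fields.get("Local State") == "ACTIVE"
        and fields.get("Peer State") == "STANDBY HOT"
        and fields.get("BulkSync Status") == "Complete"
    )


# Excerpt of the output posted above.
SAMPLE = """\
(Cisco Controller) >show redundancy   summary
            Redundancy Mode = SSO ENABLED
                Local State = ACTIVE
                 Peer State = STANDBY HOT
                       Unit = Primary
            BulkSync Status = Complete
Average Redundancy Peer Reachability Latency = 313 Micro Seconds
"""

if __name__ == "__main__":
    summary = parse_redundancy_summary(SAMPLE)
    print(summary["Unit"], ha_pair_healthy(summary))
```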

At the same time I could see that all my APs had cleared their "Usage traffic"! I checked the uptime of many APs, and it showed no sign of any restart.

Any ideas?

6 Replies

Mark Elsen
Hall of Fame

 

 - Perhaps due to some external network problem (not related to the controller). Configure a syslog server on the HA pair and follow up on the logs sent around the switchover, and also in general.

 M.



-- Let everything happen to you  
       Beauty and terror
      Just keep going    
       No feeling is final
Rainer Maria Rilke (1899)

Mark Elsen
Hall of Fame

 

 - Adding: also configure the same syslog server on the switches the controllers are connected to.

 M.




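If the WLC HA pair is connected to an HSRP gateway, the gateway becoming unreachable makes the WLC fail over.

The WLC HA pair detects gateway reachability through the RMI (Redundancy Management Interface).

APs on a WLC HA SSO pair are not affected by a WLC failover.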

MHM

The cause was an HSRP failover. But the problem is that all APs restarted:

*Dec  3 23:00:49.847: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 10.0.0.148:5246
*Dec  3 23:00:49.851: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio0 due to the reason code 27
*Dec  3 23:00:49.851: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio1 due to the reason code 27
*Dec  3 23:00:49.851: %WIDS-6-DISABLED: IDS Signature is removed and disabled.
*Dec  3 23:00:49.975: %CAPWAP-5-AP_EASYADMIN_INFO: AP Easy Admin information - EASY_ADMIN is not set, turn off easy admin service!

*Dec  3 23:00:49.975: %CAPWAP-5-AP_EASYADMIN_INFO: AP Easy Admin information - Easy Admin is not enabled, turn it off!

*Dec  3 23:00:49.995: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio0 due to the reason code 39
*Dec  3 23:00:49.995: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio1 due to the reason code 39
*Dec  3 23:00:50.011: %CLEANAIR-6-STATE: Slot 0 down
*Dec  3 23:00:50.011: %CLEANAIR-6-STATE: Slot 1 down
*Dec  3 23:00:50.015: %LINK-5-CHANGED: Interface Dot11Radio0, changed state to administratively down
*Dec  3 23:00:50.015: %LINK-5-CHANGED: Interface Dot11Radio1, changed state to administratively down
*Dec  3 23:00:50.015: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio0 due to the reason code 10
*Dec  3 23:00:50.019: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to up
*Dec  3 23:00:50.623: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio1 due to the reason code 10
*Dec  3 23:00:50.651: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state to up
*Dec  3 23:00:51.015: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to down
*Dec  3 23:00:51.043: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state to down
*Dec  3 23:00:51.051: %LINK-5-CHANGED: Interface Dot11Radio1, changed state to reset
*Dec  3 23:00:52.035: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to up
*Dec  3 23:00:52.043: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio1, changed state to down
*Dec  3 23:00:52.079: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state to up
*Dec  3 23:00:52.087: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to down
*Dec  3 23:00:52.095: %LINK-5-CHANGED: Interface Dot11Radio0, changed state to reset
*Dec  3 23:00:53.079: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio1, changed state to up
*Dec  3 23:00:53.087: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to down
*Dec  3 23:00:53.115: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to up
*Dec  3 23:00:54.115: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to up
*Dec  3 23:01:10.651: AP has SHA2 MIC certificate - Using SHA2 MIC certificate for DTLS.

*Dec  3 23:01:11.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 10.0.0.148 peer_port: 5246
*Dec  3 23:01:11.227: %CAPWAP-5-DTLSREQSUCC: DTLS connection created sucessfully peer_ip: 10.0.0.148 peer_port: 5246
*Dec  3 23:01:11.227: %CAPWAP-5-SENDJOIN: sending Join Request to 10.0.0.148
*Dec  3 23:01:11.291: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio0 due to the reason code 56
*Dec  3 23:01:11.299: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to down
*Dec  3 23:01:11.307: %LINK-5-CHANGED: Interface Dot11Radio0, changed state to reset
*Dec  3 23:01:11.927: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio1 due to the reason code 56
*Dec  3 23:01:11.931: %DOT11-5-EXPECTED_RADIO_RESET: Restarting Radio interface Dot11Radio0 due to the reason code 10
*Dec  3 23:01:11.935: %CAPWAP-5-JOINEDCONTROLLER: AP has joined controller 5520New
*Dec  3 23:01:11.939: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to up
*Dec  3 23:01:12.147: %WIDS-6-ENABLED: IDS Signature is loaded and enabled
%CRYPTO_PKI: Cert not yet valid or is expired -
    start date: 21:36:50 UTC Nov 5 2013
    end   date: 21:36:50 UTC May 20 2022

*Dec  3 23:01:12.179: %DOT11-3-NA_SENSOR_CERT_ERROR: Certificate installation error: Error in saving WSA certificate.
*Dec  3 23:01:12.299: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to down
*Dec  3 23:01:12.339: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state to down
*Dec  3 23:01:12.347: %LINK-5-CHANGED: Interface Dot11Radio1, changed state to reset
*Dec  3 23:01:12.939: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio1, changed state to down
*Dec  3 23:01:13.331: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to up
*Dec  3 23:01:13.383: %LINK-6-UPDOWN: Interface Dot11Radio1, changed state to up
*Dec  3 23:01:13.391: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to down
*Dec  3 23:01:13.399: %LINK-5-CHANGED: Interface Dot11Radio0, changed state to reset
*Dec  3 23:01:14.383: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio1, changed state to up
*Dec  3 23:01:14.391: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to down
*Dec  3 23:01:14.423: %LINK-6-UPDOWN: Interface Dot11Radio0, changed state to up
*Dec  3 23:01:15.423: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to up
*Dec  3 23:01:36.359: %CLEANAIR-6-STATE: Slot 0 enabled
*Dec  3 23:01:38.411: %CLEANAIR-6-STATE: Slot 1 enabled

As you can see, the process took about 2 minutes. I don't understand why the APs needed to go through this when the WLC is in SSO mode!
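
To put a number on the gap between events in an AP log like the one above, the timestamps can be compared directly. The following is a rough Python sketch, not a Cisco tool. It assumes the `*Mon DD HH:MM:SS.mmm:` prefix seen in this log, uses a placeholder year because the log carries none (so it will not handle a log that crosses New Year), and the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches the '*Dec  3 23:00:49.847:' prefix used in the AP log above.
STAMP = re.compile(r"^\*(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2}\.\d{3}):")


def parse_stamp(line, year=2000):
    """Return a datetime for a log line, or None if it has no timestamp.

    The log has no year, so a placeholder year is used (assumption).
    """
    match = STAMP.match(line)
    if not match:
        return None
    month, day, clock = match.groups()
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S.%f")


def seconds_between(lines, start_marker, end_marker):
    """Seconds from the first line containing start_marker to the first
    later line containing end_marker; None if either is missing."""
    start = None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # continuation lines without a timestamp are ignored
        if start is None and start_marker in line:
            start = stamp
        elif start is not None and end_marker in line:
            return (stamp - start).total_seconds()
    return None


# A few lines copied from the log above.
LOG = [
    "*Dec  3 23:00:49.847: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 10.0.0.148:5246",
    "*Dec  3 23:01:11.227: %CAPWAP-5-SENDJOIN: sending Join Request to 10.0.0.148",
    "*Dec  3 23:01:11.935: %CAPWAP-5-JOINEDCONTROLLER: AP has joined controller 5520New",
    "*Dec  3 23:01:38.411: %CLEANAIR-6-STATE: Slot 1 enabled",
]

if __name__ == "__main__":
    print(seconds_between(LOG, "DTLS-5-SEND_ALERT", "CAPWAP-5-JOINEDCONTROLLER"))
```

For the excerpt shown here, the gap from the DTLS close alert to the JOINEDCONTROLLER message comes out at about 22 seconds, and about 49 seconds up to the last "Slot 1 enabled" line. Anything that happened before or after the excerpt is not covered.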

They must have lost connectivity to the WLC.  It's no good having HA WLC if the infrastructure it's connected to can't deliver reliable connectivity for the WLC.  We do see a few APs go to their N+1 WLC during a switchover on 8.10 (with up to 3000 APs per HA pair) but for most it is a completely seamless switchover as expected.
So concentrate on why the WLCs possibly both lost connectivity.

------------------------------
Please click Helpful if this post helped you and Accept as Solution if this answered your query.
------------------------------
TAC recommended codes for AireOS WLC's   and   TAC recommended codes for 9800 WLC's
Best Practices for AireOS WLC's,   Best Practices for 9800 WLC's   and   Cisco Wireless compatibility matrix
Check your 9800 WLC config with Wireless Config Analyzer using "show tech wireless" output or "config paging disable" then "show run-config" output on AireOS and use Wireless Debug Analyzer to analyze your WLC client debugs
Field Notice: FN63942 APs and WLCs Fail to Create CAPWAP Connections Due to Certificate Expiration
Field Notice: FN72424 Later Versions of WiFi 6 APs Fail to Join WLC - Software Upgrade Required
Field Notice: FN72524 IOS APs stuck in downloading state after 4 Dec 2022 due to Certificate Expired
- Fixed in 8.10.196.0, latest 9800 releases, 8.5.182.12 (8.5.182.13 for 3504) and 8.5.182.109 (IRCM, 8.5.182.111 for 3504)
Field Notice: FN70479 AP Fails to Join or Joins with 1 Radio due to Country Mismatch, RMA needed
Field Notice: FN74383 APs Running 17.12.4/5/6/6a May Run Out of Flash Space Preventing Upgrades
How to avoid boot loop due to corrupted image on Wave 2 and Catalyst 11ax Access Points (CSCvx32806)
Field Notice: FN74035 - Wave2 APs DFS May Not Detect Radar After Channel Availability Check Time
Leo's list of bugs affecting 2800/3800/4800/1560 APs
Default AP console baud rate from 17.12.x is 115200 - introduced by CSCwe88390

Rich R
VIP

Use "show redundancy detail" and the other "show redundancy" commands to get more info on the reason.
