12-03-2022 12:47 AM - edited 12-03-2022 12:48 AM
Hello,
I have two C9800-CL WLCs running version 17.3.5b on Hyper-V; they form a working HA SSO pair.
Last week we had to patch the host where both controllers were located, so we migrated the two VMs to other hosts, and the cluster broke.
Does that mean the two WLC VMs need to be on the same Hyper-V host? If so, we cannot migrate the VMs, which makes no sense: being able to migrate VMs is the whole purpose of our Hyper-V cluster.
Many thanks for your help.
12-03-2022 01:39 AM
>...., and the cluster has been broken.
How did you observe this? Do you have logs or any command output showing errors?
M.
12-03-2022 02:23 AM
Here are some logs:
Dec 2 17:49:37.010: %STACKMGR-6-CHASSIS_REMOVED: Chassis 2 R0/0: stack_mgr: Chassis 1 has been removed from the stack.
Dec 2 17:49:37.102: %PLATFORM-6-HASTATUS: RP switchover, received chassis event became active
Dec 2 17:49:37.094: %PLATFORM-6-HASTATUS: RP switchover, sent message became active. IOS is ready to switch to primary after chassis confirmation
Dec 2 17:49:37.076: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_REDUNDANCY_STATE_CHANGE)
Dec 2 17:49:37.076: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_DOWN)
Dec 2 17:49:37.076: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_NOT_PRESENT)
Dec 2 17:49:37.026: %PLATFORM-6-HASTATUS: RP switchover, received chassis event to become active
Basically, the VMs were migrated one at a time, so while one was moving to another host, the other was still on the initial host.
After a while, the HA pair recovered on its own.
Dec 3 07:36:03.021: %WEBSERVER-5-LOGIN_PASSED: Chassis 1 R0/0: : Login Successful from host 10.105.12.51 by user 'admin' using crypto cipher 'ECDHE-RSA-AES128-GCM-SHA256'
Dec 3 07:36:03.020: %SEC_LOGIN-5-WEBLOGIN_SUCCESS: Login Success [user: admin] [Source: ] at 00:36:03 MST Sat Dec 3 2022
Dec 3 05:14:23.489: %SYS-6-LOGOUT: User solarwinds has exited tty session 1(x.x.x.101)
Dec 3 05:14:19.980: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: solarwinds] [Source: x.x.x.101] [localport: 22] at 22:14:19 MST Fri Dec 2 2022
Dec 3 05:10:15.494: %SYS-6-LOGOUT: User solarwinds has exited tty session 1(x.x.x.101)
Dec 3 05:10:10.866: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: solarwinds] [Source: x.x.x.101] [localport: 22] at 22:10:10 MST Fri Dec 2 2022
Dec 2 19:54:34.551: %SMART_LIC-3-COMM_FAILED: Communications failure with the Cisco Smart License Utility (CSLU) : Unable to resolve server hostname/domain name
Dec 2 19:41:45.592: %SMART_LIC-3-COMM_FAILED: Communications failure with the Cisco Smart License Utility (CSLU) : Unable to resolve server hostname/domain name
Dec 2 19:40:32.696: %RF-5-RF_TERMINAL_STATE: Terminal state reached for (SSO)
Dec 2 19:40:31.696: %VOICE_HA-7-STATUS: VOICE HA bulk sync done.
Dec 2 19:40:31.688: %HA_CONFIG_SYNC-6-BULK_CFGSYNC_SUCCEED: Bulk Sync succeeded
Dec 2 19:40:29.982: %RIF_MGR_FSM-6-RMI_LINK_UP: Chassis 1 R0/0: rif_mgr: The RMI link is UP.
Dec 2 19:40:25.077: %RIF_MGR_FSM-6-RMI_LINK_UP: Chassis 2 R0/0: rif_mgr: The RMI link is UP.
Dec 2 19:40:25.077: %STACKMGR-1-DUAL_ACTIVE_CFG_MSG: Chassis 2 R0/0: stack_mgr: Dual Active Detection link is available now
Dec 2 19:40:21.394: %CFMGR_LOG-4-COUNTRY_CFG_DEPRECATED_CLI: Chassis 2 R0/0: wncd: Deprecated CLI used: 'ap country <coutry-code>' is deprecated, instead use 'wireless country <country-code>'
Dec 2 19:40:21.252: %CFMGR_LOG-4-COUNTRY_CFG_DEPRECATED_CLI: Chassis 2 R0/0: wncd: Deprecated CLI used: 'ap country <coutry-code>' is deprecated, instead use 'wireless country <country-code>'
Dec 2 19:40:21.122: %CFMGR_LOG-4-COUNTRY_CFG_DEPRECATED_CLI: Chassis 2 R0/0: wncd: Deprecated CLI used: 'ap country <coutry-code>' is deprecated, instead use 'wireless country <country-code>'
Dec 2 19:40:17.549: %EWLC_HA_LIB_MESSAGE-6-BULK_SYNC_STATE_INFO: Chassis 2 R0/0: wncmgrd: INFO: Bulk sync status : HOT
Dec 2 19:40:17.111: %EWLC_HA_LIB_MESSAGE-6-BULK_SYNC_STATE_INFO: Chassis 1 R0/0: wncmgrd: INFO: Bulk sync status : CONFIG_DONE
Dec 2 19:39:25.389: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger
Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:103): avc: denied { relabelto } for pid=32045 comm="linux_iosd-imag" scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:system_r:polaris_iosd_t:s0 tclass=tun_socket permissive=1
Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:103): avc: denied { relabelfrom } for pid=32045 comm="linux_iosd-imag" scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:system_r:ifconfig_t:s0 tclass=tun_socket permissive=1
Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:103): avc: denied { ioctl } for pid=32045 comm="linux_iosd-imag" path="/dev/net/tun" dev="devtmpfs" ino=19432 ioctlcmd=0x54ca scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:object_r:tun_tap_device_t:s0 tclass=chr_file permissive=1
Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:102): avc: denied { open } for pid=32045 comm="linux_iosd-imag" path="/dev/net/tun" dev="devtmpfs" ino=19432 scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:object_r:tun_tap_device_t:s0 tclass=chr_file permissive=1
Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:102): avc: denied { read write } for pid=32045 comm="linux_iosd-imag" name="tun" dev="devtmpfs" ino=19432 scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:object_r:tun_tap_device_t:s0 tclass=chr_file permissive=1
Dec 2 19:39:08.016: Vlan Database sync done from bootflash:vlan.dat to stby-bootflash:vlan.dat (556 bytes)
Dec 2 19:39:07.982: Syncing vlan database
Dec 2 19:39:04.435: %REDUNDANCY-5-PEER_MONITOR_EVENT: Active detected a standby insertion (raw-event=PEER_REDUNDANCY_STATE_CHANGE(5))
Dec 2 19:39:04.435: %REDUNDANCY-5-PEER_MONITOR_EVENT: Active detected a standby insertion (raw-event=PEER_FOUND(4))
Dec 2 19:38:24.255: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger
Dec 2 19:38:20.290: %EWLC_HA_LIB_MESSAGE-6-BULK_SYNC_STATE_INFO: Chassis 2 R0/0: wncmgrd: INFO: Bulk sync status : COLD
Dec 2 19:38:14.362: %STACKMGR-6-STANDBY_ELECTED: Chassis 1 R0/0: stack_mgr: Chassis 2 has been elected STANDBY.
Dec 2 19:38:14.373: %IOSXE_REDUNDANCY-6-PEER: Active detected chassis 2 as standby.
Dec 2 19:38:11.907: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger
Dec 2 19:38:02.027: %RIF_MGR_FSM-6-RP_LINK_UP: Chassis 1 R0/0: rif_mgr: The RP link is UP.
Dec 2 19:38:02.027: %STACKMGR-1-DUAL_ACTIVE_CFG_MSG: Chassis 1 R0/0: stack_mgr: Dual Active Detection link is available now
Dec 2 19:37:57.073: %RIF_MGR_FSM-6-RMI_LINK_DOWN: Chassis 2 R0/0: rif_mgr: The RMI link is DOWN.
Dec 2 19:37:54.946: %EWLC_HA_LIB_MESSAGE-6-BULK_SYNC_STATE_INFO: Chassis 1 R0/0: wncmgrd: INFO: Bulk sync status : COLD
Dec 2 19:37:52.440: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger
Dec 2 19:37:52.117: %DPP_SYSLOG-4-EVENT_WARNING: Chassis 2 R0/0: dpman: Pipeline event - DPMAN has started!, (null)
Dec 2 19:37:52.055: %DPP_SYSLOG-6-EVENT_INFO: Chassis 2 R0/0: dpman: Pipeline event - dpe_init done, (null)
Dec 2 19:37:43.905: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger
Dec 2 19:37:42.105: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger
Dec 2 19:37:40.367: %STACKMGR-6-CHASSIS_ADDED: Chassis 2 R0/0: stack_mgr: Chassis 2 has been added to the stack.
Dec 2 19:37:39.383: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger
Dec 2 19:37:38.104: %STACKMGR-6-CHASSIS_ADDED: Chassis 2 R0/0: stack_mgr: Chassis 2 has been added to the stack.
Dec 2 19:37:38.085: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 2 R0/0: stack_mgr: Stack port 2 on Chassis 2 is up
Dec 2 19:37:38.078: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 2 R0/0: stack_mgr: Stack port 1 on Chassis 2 is up
Dec 2 19:37:38.078: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 2 R0/0: stack_mgr: Stack port 2 on Chassis 2 is down
Dec 2 19:37:38.078: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 2 R0/0: stack_mgr: Stack port 1 on Chassis 2 is down
Dec 2 19:37:38.064: %STACKMGR-1-DUAL_ACTIVE_CFG_MSG: Chassis 2 R0/0: stack_mgr: Dual Active Detection link is available now
Dec 2 19:37:37.023: %RIF_MGR_FSM-6-RP_LINK_UP: Chassis 2 R0/0: rif_mgr: The RP link is UP.
Dec 2 19:37:40.368: %STACKMGR-6-CHASSIS_ADDED: Chassis 1 R0/0: stack_mgr: Chassis 2 has been added to the stack.
Dec 2 19:37:38.113: %STACKMGR-6-CHASSIS_ADDED: Chassis 1 R0/0: stack_mgr: Chassis 2 has been added to the stack.
Dec 2 19:37:38.085: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 1 R0/0: stack_mgr: Stack port 1 on Chassis 1 is up
Dec 2 19:37:38.083: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 1 R0/0: stack_mgr: Stack port 2 on Chassis 1 is up
Dec 2 19:36:43.599: %STACKMGR-1-DUAL_ACTIVE_CFG_MSG: Chassis 1 R0/0: stack_mgr: Dual Active Detection links are not available anymore
Dec 2 19:36:43.599: %RIF_MGR_FSM-6-RP_LINK_DOWN: Chassis 1 R0/0: rif_mgr: Setting RP link status to DOWN
Dec 2 19:36:01.694: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 1 R0/0: stack_mgr: Stack port 1 on Chassis 1 is down
Dec 2 19:36:01.694: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 1 R0/0: stack_mgr: Stack port 2 on Chassis 1 is down
Dec 2 19:35:43.767: %SMART_LIC-3-COMM_FAILED: Communications failure with the Cisco Smart License Utility (CSLU) : Unable to resolve server hostname/domain name
Dec 2 19:35:32.171: %RIF_MGR_FSM-6-RMI_LINK_DOWN: Chassis 1 R0/0: rif_mgr: The RMI link is DOWN.
Dec 2 19:35:07.162: %RIF_MGR_FSM-6-GW_REACHABLE_ACTIVE: Chassis 1 R0/0: rif_mgr: Gateway reachable from Active
Dec 2 19:35:05.891: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan3854, changed state to up
Dec 2 19:35:05.884: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1, changed state to up
Dec 2 19:35:05.884: %LINEPROTO-5-UPDOWN: Line protocol on Interface Null0, changed state to up
Dec 2 19:35:04.913: %LINK-5-CHANGED: Interface Vlan1, changed state to administratively down
Dec 2 19:35:04.890: %LINK-3-UPDOWN: Interface Vlan3854, changed state to up
Dec 2 19:35:04.885: %LINK-3-UPDOWN: Interface GigabitEthernet1, changed state to up
Dec 2 19:35:04.883: %LINK-3-UPDOWN: Interface Null0, changed state to up
Dec 2 19:35:03.622: %CALL_HOME-6-CALL_HOME_ENABLED: Call-home is enabled by Smart Agent for Licensing.
Dec 2 19:35:02.978: %PKI-6-CS_ENABLED: Certificate server now enabled.
Dec 2 19:35:02.889: %VOICE_HA-2-SWITCHOVER_IND: SWITCHOVER, from STANDBY_HOT to ACTIVE state.
Dec 2 19:35:02.846: EWLC-HAINFRA-INFO: Configured only secondary IP x.x.x.122/255.255.255.224 on active
Dec 2 19:35:02.825: EWLC-HAINFRA-INFO: Configured secondary IP x.x.x.122/255.255.255.224 on active(mgmt)
Dec 2 19:35:02.822: EWLC-HAINFRA-INFO: Configured primary IP x.x.x.120/255.255.255.224 on active(mgmt)
Dec 2 19:35:02.408: %HA-6-SWITCHOVER: Route Processor switched from standby to being active
Dec 2 19:35:02.402: %PLATFORM-6-HASTATUS_DETAIL: RP switchover, received chassis event became active. Switch to primary (count 1)
Dec 2 19:35:01.997: %APMGR_TRACE_MESSAGE-4-WLC_APMGR_WARNING_MSG: Chassis 1 R0/0: wncd: Warning, AP Cisco-3702E is associated with the policy tag default-policy-tag, which has no wlan or rlan configured. Please configure wlan or rlan under the policy tag or associate the AP with valid policy tag
Dec 2 19:35:02.335: %PLATFORM-6-HASTATUS: RP switchover, received chassis event became active
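For anyone retracing the timeline in output like the above, here is a minimal Python sketch that pulls the RP and RMI link DOWN/UP events out of the syslog lines and measures how long each link stayed down per chassis. The function names (`link_events`, `outages`) are hypothetical, the year is assumed from the thread date because the log lines carry none, and the regular expression only covers the `RIF_MGR_FSM` messages seen in this excerpt.

```python
import re
from datetime import datetime

# Matches only the RIF_MGR_FSM link messages that appear in the pasted excerpt.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"%RIF_MGR_FSM-6-(?P<link>RP|RMI)_LINK_(?P<state>UP|DOWN): "
    r"Chassis (?P<chassis>\d+) "
)

def link_events(log_text, year=2022):
    """Return (time, chassis, link, state) tuples in chronological order.

    The year is an assumption: the syslog timestamps do not include one.
    """
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S.%f")
        events.append((when, int(m["chassis"]), m["link"], m["state"]))
    return sorted(events)  # the pasted log is newest-first

def outages(events, chassis, link):
    """Pair each DOWN with the next UP and return the durations in seconds."""
    down_at = None
    durations = []
    for when, ch, ln, state in events:
        if (ch, ln) != (chassis, link):
            continue
        if state == "DOWN":
            down_at = when
        elif down_at is not None:
            durations.append((when - down_at).total_seconds())
            down_at = None
    return durations

# Lines copied from the log excerpt above.
SAMPLE = """\
Dec 2 19:40:29.982: %RIF_MGR_FSM-6-RMI_LINK_UP: Chassis 1 R0/0: rif_mgr: The RMI link is UP.
Dec 2 19:38:02.027: %RIF_MGR_FSM-6-RP_LINK_UP: Chassis 1 R0/0: rif_mgr: The RP link is UP.
Dec 2 19:36:43.599: %RIF_MGR_FSM-6-RP_LINK_DOWN: Chassis 1 R0/0: rif_mgr: Setting RP link status to DOWN
Dec 2 19:35:32.171: %RIF_MGR_FSM-6-RMI_LINK_DOWN: Chassis 1 R0/0: rif_mgr: The RMI link is DOWN.
"""

if __name__ == "__main__":
    ev = link_events(SAMPLE)
    for link in ("RP", "RMI"):
        for seconds in outages(ev, chassis=1, link=link):
            print(f"Chassis 1 {link} link down for {seconds:.3f} s")
```

On the four sample lines this reports the RP link on chassis 1 as down for 78.428 s and the RMI link for 297.811 s. It only covers the recovery window; the excerpt contains no RP link messages around the earlier 17:49 switchover.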
12-03-2022 04:07 AM
>...Basically the VMs have been migrated one by one
- Was this a 'hot move'? I am not sure that is supported. In any case, check the (active) C9800-CL configuration with the CLI command show tech wireless and have the output analyzed at https://cway.cisco.com/
M.
12-03-2022 04:29 AM
Thanks Marce, my colleague simply right-clicked the VM and migrated it to another host.
The show tech wireless analysis shows the same events as the logs I pasted above.
BTW, I did not find any article stating that the VMs need to be on the same host; they are of course in the same Hyper-V cluster.
12-03-2022 05:08 AM
- You may also look into: https://www.cisco.com/c/en/us/support/docs/wireless/catalyst-9800-series-wireless-controllers/214749-tac-recommended-ios-xe-builds-for-wirele.html. Note that going beyond 17.3.x is not possible if you have older access points.
M.
01-25-2023 08:14 AM
Hello,
We finally found the solution after creating a Cisco TAC case.
It turned out that the RP traffic was not being forwarded when the two VMs were on different hosts.
The RP L2 VLAN was tagged on the Hyper-V vswitch shared between the two hosts, but it was not tagged on the physical switch whose trunks connect to the hosts. As a result, the traffic was not forwarded across this "external" switch when the VMs were on different hosts.
Tagging the RP L2 VLAN on the external switch solved the issue. Now, when a VM migrates to a different host, HA stays in place without any problem.
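The failure mode above can be written down as a small reachability check, which may help when auditing a similar setup. This is only an illustrative Python model, not anything Cisco or Hyper-V provides: the function name `rp_reachable`, the host names, and VLAN ID 100 are all placeholders, since the thread does not give the real RP VLAN ID.

```python
def rp_reachable(host_a, host_b, rp_vlan, vswitch_vlans, trunk_vlans):
    """Return True if frames on rp_vlan can travel between the two WLC VMs.

    vswitch_vlans: host name -> set of VLANs tagged on that host's vswitch
    trunk_vlans:   host name -> set of VLANs allowed on the physical switch
                   trunk facing that host
    """
    # The vswitch on each host must carry the VLAN in every case.
    if rp_vlan not in vswitch_vlans[host_a] or rp_vlan not in vswitch_vlans[host_b]:
        return False
    # Both VMs on one host: traffic never leaves the vswitch.
    if host_a == host_b:
        return True
    # VMs on different hosts: traffic must cross the physical switch trunks.
    return rp_vlan in trunk_vlans[host_a] and rp_vlan in trunk_vlans[host_b]

RP_VLAN = 100  # placeholder VLAN ID, not taken from the thread
vswitch = {"hv1": {RP_VLAN}, "hv2": {RP_VLAN}}

# Before the fix: RP VLAN missing on the physical trunks.
trunks_before = {"hv1": set(), "hv2": set()}
# After the fix: RP VLAN tagged on the physical trunks.
trunks_after = {"hv1": {RP_VLAN}, "hv2": {RP_VLAN}}

if __name__ == "__main__":
    print(rp_reachable("hv1", "hv1", RP_VLAN, vswitch, trunks_before))  # same host
    print(rp_reachable("hv1", "hv2", RP_VLAN, vswitch, trunks_before))  # split, before fix
    print(rp_reachable("hv1", "hv2", RP_VLAN, vswitch, trunks_after))   # split, after fix
```

The model shows why the pair worked while both VMs shared a host and broke as soon as they were split: only the second case depends on the physical trunk configuration.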