C9800-CL HyperV VMs Migration

Clem58
Level 3

Hello,

I have two C9800-CL WLCs running version 17.3.5b on Hyper-V, forming a working HA SSO pair.
Last week we had to patch the host where both controllers were located, so we migrated the two VMs to other hosts, and the HA pair broke.

Does that mean the two WLC VMs need to be on the same Hyper-V host? If so, we cannot migrate the VMs, which makes no sense: migration is the whole purpose of our Hyper-V cluster.

Many thanks for your help.


marce1000
VIP

 

> ... and the cluster has been broken.

How was this observed? Do you have logs or any command output with errors?

M.



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

Here are some logs:

Dec  2 17:49:37.010: %STACKMGR-6-CHASSIS_REMOVED: Chassis 2 R0/0: stack_mgr: Chassis 1 has been removed from the stack.

Dec  2 17:49:37.102: %PLATFORM-6-HASTATUS: RP switchover, received chassis event became active

Dec  2 17:49:37.094: %PLATFORM-6-HASTATUS: RP switchover, sent message became active. IOS is ready to switch to primary after chassis confirmation

Dec  2 17:49:37.076: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_REDUNDANCY_STATE_CHANGE)

Dec  2 17:49:37.076: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_DOWN)

Dec  2 17:49:37.076: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_NOT_PRESENT)

Dec  2 17:49:37.026: %PLATFORM-6-HASTATUS: RP switchover, received chassis event to become active

Basically, the VMs were migrated one by one, so at one point one VM had moved to another host while the other was still on the original host.

After a while, the HA pair recovered by itself.

Dec 3 07:36:03.021: %WEBSERVER-5-LOGIN_PASSED: Chassis 1 R0/0: : Login Successful from host 10.105.12.51 by user 'admin' using crypto cipher 'ECDHE-RSA-AES128-GCM-SHA256'

Dec 3 07:36:03.020: %SEC_LOGIN-5-WEBLOGIN_SUCCESS: Login Success [user: admin] [Source: ] at 00:36:03 MST Sat Dec 3 2022

Dec 3 05:14:23.489: %SYS-6-LOGOUT: User solarwinds has exited tty session 1(x.x.x.101)

Dec 3 05:14:19.980: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: solarwinds] [Source: x.x.x.101] [localport: 22] at 22:14:19 MST Fri Dec 2 2022

Dec 3 05:10:15.494: %SYS-6-LOGOUT: User solarwinds has exited tty session 1(x.x.x.101)

Dec 3 05:10:10.866: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: solarwinds] [Source: x.x.x.101] [localport: 22] at 22:10:10 MST Fri Dec 2 2022

Dec 2 19:54:34.551: %SMART_LIC-3-COMM_FAILED: Communications failure with the Cisco Smart License Utility (CSLU) : Unable to resolve server hostname/domain name

Dec 2 19:41:45.592: %SMART_LIC-3-COMM_FAILED: Communications failure with the Cisco Smart License Utility (CSLU) : Unable to resolve server hostname/domain name

Dec 2 19:40:32.696: %RF-5-RF_TERMINAL_STATE: Terminal state reached for (SSO)

Dec 2 19:40:31.696: %VOICE_HA-7-STATUS: VOICE HA bulk sync done.

Dec 2 19:40:31.688: %HA_CONFIG_SYNC-6-BULK_CFGSYNC_SUCCEED: Bulk Sync succeeded

Dec 2 19:40:29.982: %RIF_MGR_FSM-6-RMI_LINK_UP: Chassis 1 R0/0: rif_mgr: The RMI link is UP.

Dec 2 19:40:25.077: %RIF_MGR_FSM-6-RMI_LINK_UP: Chassis 2 R0/0: rif_mgr: The RMI link is UP.

Dec 2 19:40:25.077: %STACKMGR-1-DUAL_ACTIVE_CFG_MSG: Chassis 2 R0/0: stack_mgr: Dual Active Detection link is available now

Dec 2 19:40:21.394: %CFMGR_LOG-4-COUNTRY_CFG_DEPRECATED_CLI: Chassis 2 R0/0: wncd: Deprecated CLI used: 'ap country <coutry-code>' is deprecated, instead use 'wireless country <country-code>'

Dec 2 19:40:21.252: %CFMGR_LOG-4-COUNTRY_CFG_DEPRECATED_CLI: Chassis 2 R0/0: wncd: Deprecated CLI used: 'ap country <coutry-code>' is deprecated, instead use 'wireless country <country-code>'

Dec 2 19:40:21.122: %CFMGR_LOG-4-COUNTRY_CFG_DEPRECATED_CLI: Chassis 2 R0/0: wncd: Deprecated CLI used: 'ap country <coutry-code>' is deprecated, instead use 'wireless country <country-code>'

Dec 2 19:40:17.549: %EWLC_HA_LIB_MESSAGE-6-BULK_SYNC_STATE_INFO: Chassis 2 R0/0: wncmgrd: INFO: Bulk sync status : HOT

Dec 2 19:40:17.111: %EWLC_HA_LIB_MESSAGE-6-BULK_SYNC_STATE_INFO: Chassis 1 R0/0: wncmgrd: INFO: Bulk sync status : CONFIG_DONE

Dec 2 19:39:25.389: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger

Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:103): avc: denied { relabelto } for pid=32045 comm="linux_iosd-imag" scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:system_r:polaris_iosd_t:s0 tclass=tun_socket permissive=1

Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:103): avc: denied { relabelfrom } for pid=32045 comm="linux_iosd-imag" scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:system_r:ifconfig_t:s0 tclass=tun_socket permissive=1

Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:103): avc: denied { ioctl } for pid=32045 comm="linux_iosd-imag" path="/dev/net/tun" dev="devtmpfs" ino=19432 ioctlcmd=0x54ca scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:object_r:tun_tap_device_t:s0 tclass=chr_file permissive=1

Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:102): avc: denied { open } for pid=32045 comm="linux_iosd-imag" path="/dev/net/tun" dev="devtmpfs" ino=19432 scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:object_r:tun_tap_device_t:s0 tclass=chr_file permissive=1

Dec 2 19:39:11.938: %SELINUX-3-MISMATCH: Chassis 2 R0/0: audispd: type=AVC msg=audit(1670009951.931:102): avc: denied { read write } for pid=32045 comm="linux_iosd-imag" name="tun" dev="devtmpfs" ino=19432 scontext=system_u:system_r:polaris_iosd_t:s0 tcontext=system_u:object_r:tun_tap_device_t:s0 tclass=chr_file permissive=1

Dec 2 19:39:08.016: Vlan Database sync done from bootflash:vlan.dat to stby-bootflash:vlan.dat (556 bytes)

Dec 2 19:39:07.982: Syncing vlan database

Dec 2 19:39:04.435: %REDUNDANCY-5-PEER_MONITOR_EVENT: Active detected a standby insertion (raw-event=PEER_REDUNDANCY_STATE_CHANGE(5))

Dec 2 19:39:04.435: %REDUNDANCY-5-PEER_MONITOR_EVENT: Active detected a standby insertion (raw-event=PEER_FOUND(4))

Dec 2 19:38:24.255: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger

Dec 2 19:38:20.290: %EWLC_HA_LIB_MESSAGE-6-BULK_SYNC_STATE_INFO: Chassis 2 R0/0: wncmgrd: INFO: Bulk sync status : COLD

Dec 2 19:38:14.362: %STACKMGR-6-STANDBY_ELECTED: Chassis 1 R0/0: stack_mgr: Chassis 2 has been elected STANDBY.

Dec 2 19:38:14.373: %IOSXE_REDUNDANCY-6-PEER: Active detected chassis 2 as standby.

Dec 2 19:38:11.907: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger

Dec 2 19:38:02.027: %RIF_MGR_FSM-6-RP_LINK_UP: Chassis 1 R0/0: rif_mgr: The RP link is UP.

Dec 2 19:38:02.027: %STACKMGR-1-DUAL_ACTIVE_CFG_MSG: Chassis 1 R0/0: stack_mgr: Dual Active Detection link is available now

Dec 2 19:37:57.073: %RIF_MGR_FSM-6-RMI_LINK_DOWN: Chassis 2 R0/0: rif_mgr: The RMI link is DOWN.

Dec 2 19:37:54.946: %EWLC_HA_LIB_MESSAGE-6-BULK_SYNC_STATE_INFO: Chassis 1 R0/0: wncmgrd: INFO: Bulk sync status : COLD

Dec 2 19:37:52.440: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger

Dec 2 19:37:52.117: %DPP_SYSLOG-4-EVENT_WARNING: Chassis 2 R0/0: dpman: Pipeline event - DPMAN has started!, (null)

Dec 2 19:37:52.055: %DPP_SYSLOG-6-EVENT_INFO: Chassis 2 R0/0: dpman: Pipeline event - dpe_init done, (null)

Dec 2 19:37:43.905: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger

Dec 2 19:37:42.105: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger

Dec 2 19:37:40.367: %STACKMGR-6-CHASSIS_ADDED: Chassis 2 R0/0: stack_mgr: Chassis 2 has been added to the stack.

Dec 2 19:37:39.383: %PMAN-3-PROC_EMPTY_EXEC_FILE: Chassis 2 R0/0: pvp: Empty executable used for process bt_logger

Dec 2 19:37:38.104: %STACKMGR-6-CHASSIS_ADDED: Chassis 2 R0/0: stack_mgr: Chassis 2 has been added to the stack.

Dec 2 19:37:38.085: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 2 R0/0: stack_mgr: Stack port 2 on Chassis 2 is up

Dec 2 19:37:38.078: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 2 R0/0: stack_mgr: Stack port 1 on Chassis 2 is up

Dec 2 19:37:38.078: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 2 R0/0: stack_mgr: Stack port 2 on Chassis 2 is down

Dec 2 19:37:38.078: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 2 R0/0: stack_mgr: Stack port 1 on Chassis 2 is down

Dec 2 19:37:38.064: %STACKMGR-1-DUAL_ACTIVE_CFG_MSG: Chassis 2 R0/0: stack_mgr: Dual Active Detection link is available now

Dec 2 19:37:37.023: %RIF_MGR_FSM-6-RP_LINK_UP: Chassis 2 R0/0: rif_mgr: The RP link is UP.

Dec 2 19:37:40.368: %STACKMGR-6-CHASSIS_ADDED: Chassis 1 R0/0: stack_mgr: Chassis 2 has been added to the stack.

Dec 2 19:37:38.113: %STACKMGR-6-CHASSIS_ADDED: Chassis 1 R0/0: stack_mgr: Chassis 2 has been added to the stack.

Dec 2 19:37:38.085: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 1 R0/0: stack_mgr: Stack port 1 on Chassis 1 is up

Dec 2 19:37:38.083: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 1 R0/0: stack_mgr: Stack port 2 on Chassis 1 is up

Dec 2 19:36:43.599: %STACKMGR-1-DUAL_ACTIVE_CFG_MSG: Chassis 1 R0/0: stack_mgr: Dual Active Detection links are not available anymore

Dec 2 19:36:43.599: %RIF_MGR_FSM-6-RP_LINK_DOWN: Chassis 1 R0/0: rif_mgr: Setting RP link status to DOWN

Dec 2 19:36:01.694: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 1 R0/0: stack_mgr: Stack port 1 on Chassis 1 is down

Dec 2 19:36:01.694: %STACKMGR-6-STACK_LINK_CHANGE: Chassis 1 R0/0: stack_mgr: Stack port 2 on Chassis 1 is down

Dec 2 19:35:43.767: %SMART_LIC-3-COMM_FAILED: Communications failure with the Cisco Smart License Utility (CSLU) : Unable to resolve server hostname/domain name

Dec 2 19:35:32.171: %RIF_MGR_FSM-6-RMI_LINK_DOWN: Chassis 1 R0/0: rif_mgr: The RMI link is DOWN.

Dec 2 19:35:07.162: %RIF_MGR_FSM-6-GW_REACHABLE_ACTIVE: Chassis 1 R0/0: rif_mgr: Gateway reachable from Active

Dec 2 19:35:05.891: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan3854, changed state to up

Dec 2 19:35:05.884: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1, changed state to up

Dec 2 19:35:05.884: %LINEPROTO-5-UPDOWN: Line protocol on Interface Null0, changed state to up

Dec 2 19:35:04.913: %LINK-5-CHANGED: Interface Vlan1, changed state to administratively down

Dec 2 19:35:04.890: %LINK-3-UPDOWN: Interface Vlan3854, changed state to up

Dec 2 19:35:04.885: %LINK-3-UPDOWN: Interface GigabitEthernet1, changed state to up

Dec 2 19:35:04.883: %LINK-3-UPDOWN: Interface Null0, changed state to up

Dec 2 19:35:03.622: %CALL_HOME-6-CALL_HOME_ENABLED: Call-home is enabled by Smart Agent for Licensing.

Dec 2 19:35:02.978: %PKI-6-CS_ENABLED: Certificate server now enabled.

Dec 2 19:35:02.889: %VOICE_HA-2-SWITCHOVER_IND: SWITCHOVER, from STANDBY_HOT to ACTIVE state.

Dec 2 19:35:02.846: EWLC-HAINFRA-INFO: Configured only secondary IP x.x.x.122/255.255.255.224 on active

Dec 2 19:35:02.825: EWLC-HAINFRA-INFO: Configured secondary IP x.x.x.122/255.255.255.224 on active(mgmt)

Dec 2 19:35:02.822: EWLC-HAINFRA-INFO: Configured primary IP x.x.x.120/255.255.255.224 on active(mgmt)

Dec 2 19:35:02.408: %HA-6-SWITCHOVER: Route Processor switched from standby to being active

Dec 2 19:35:02.402: %PLATFORM-6-HASTATUS_DETAIL: RP switchover, received chassis event became active. Switch to primary (count 1)

Dec 2 19:35:01.997: %APMGR_TRACE_MESSAGE-4-WLC_APMGR_WARNING_MSG: Chassis 1 R0/0: wncd: Warning, AP Cisco-3702E is associated with the policy tag default-policy-tag, which has no wlan or rlan configured. Please configure wlan or rlan under the policy tag or associate the AP with valid policy tag

Dec 2 19:35:02.335: %PLATFORM-6-HASTATUS: RP switchover, received chassis event became active
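
Since the excerpt above is pasted newest-first and mixes HA events with unrelated messages, a small script can help read it. The following is only a rough sketch in Python, not a Cisco tool: the regular expression, the function names, the facility list (picked from the messages above) and the assumed year 2022 are all illustrative assumptions. It keeps the redundancy-related lines and puts them in chronological order.

```python
import re
from datetime import datetime

# Facilities picked from the excerpt above (an assumption; extend as needed).
HA_FACILITIES = {"STACKMGR", "REDUNDANCY", "IOSXE_REDUNDANCY",
                 "RIF_MGR_FSM", "HA_CONFIG_SYNC", "RF"}

MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches lines such as:
#   Dec 2 19:38:02.027: %RIF_MGR_FSM-6-RP_LINK_UP: Chassis 1 R0/0: ...
LINE_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\.(?P<ms>\d{3}):\s+"
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):"
    r"\s*(?P<text>.*)$"
)


def parse_line(line, year=2022):
    """Return (timestamp, facility, severity, mnemonic, text) or None.

    The log lines carry no year, so one has to be supplied.
    """
    match = LINE_RE.match(line.strip())
    if match is None or match["mon"] not in MONTHS:
        return None
    timestamp = datetime(year, MONTHS[match["mon"]], int(match["day"]),
                         int(match["h"]), int(match["m"]), int(match["s"]),
                         int(match["ms"]) * 1000)
    return (timestamp, match["facility"], int(match["severity"]),
            match["mnemonic"], match["text"])


def ha_timeline(lines, year=2022):
    """Keep only HA-related events and sort them oldest-first."""
    events = [parse_line(line, year) for line in lines]
    events = [e for e in events if e is not None and e[1] in HA_FACILITIES]
    return sorted(events, key=lambda e: e[0])


sample = [
    "Dec 2 19:40:32.696: %RF-5-RF_TERMINAL_STATE: Terminal state reached for (SSO)",
    "Dec 2 19:38:02.027: %RIF_MGR_FSM-6-RP_LINK_UP: Chassis 1 R0/0: rif_mgr: The RP link is UP.",
    "Dec 2 19:36:43.599: %RIF_MGR_FSM-6-RP_LINK_DOWN: Chassis 1 R0/0: rif_mgr: Setting RP link status to DOWN",
    "Dec 2 19:35:43.767: %SMART_LIC-3-COMM_FAILED: Communications failure with the Cisco Smart License Utility (CSLU) : Unable to resolve server hostname/domain name",
]

for timestamp, facility, severity, mnemonic, text in ha_timeline(sample):
    print(timestamp.time(), f"{facility}-{severity}-{mnemonic}", text)
```

Read oldest-first, the sample shows the RP link going down, coming back up, and the pair reaching SSO again, which matches the "recovered by itself" observation.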

 

> ... Basically the VMs have been migrated one by one

- Was this a 'hot move'? I am not sure that is supported. In any case, check the configuration of the (active) C9800-CL: run the CLI command show tech wireless and have the output analyzed by https://cway.cisco.com/tools/WirelessAnalyzer/ . Please note: do not use the classic show tech-support (short version); use the command denoted in green for Wireless Analyzer. Check all advisories, in particular any concerning the HA/SSO setup. All red-flagged advisories should also be corrected.

 M.




Thanks Marce, my colleague just right-clicked on the VM and migrated it to another host.

The show tech wireless analysis shows the same events we see in the logs I pasted above.


BTW, I did not find any article stating that the VMs need to be on the same host; they are certainly in the same Hyper-V cluster.

 

- You may also look into: https://www.cisco.com/c/en/us/support/docs/wireless/catalyst-9800-series-wireless-controllers/214749-tac-recommended-ios-xe-builds-for-wirele.html . Note that going beyond 17.3.x is not possible if you have older access points.

 M.




Hello,

We finally found the solution after opening a Cisco TAC case.
The RP traffic was not being forwarded when the two VMs were on different hosts.

That's because the RP L2 VLAN was tagged on the Hyper-V virtual switch shared between the two hosts, but not on the physical switch whose trunks connect to the hosts. As a result, traffic on that VLAN was not forwarded through this "external" switch when the VMs were on different hosts.

Tagging the RP L2 VLAN on the external switch solved the issue. Now, when a VM migrates to a different host, HA stays in place without any problem.
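
For anyone who wants to reason about the same failure mode, here is a toy model in Python of the forwarding logic described above. It is only a sketch: the VLAN ID, host names, and the `l2_reachable` function are made up for illustration and do not come from Hyper-V or any switch API. The idea is that two VMs on the same host only need the VLAN on the virtual switch, while VMs on different hosts also need it allowed on the physical trunk towards each host.

```python
# Hypothetical VLAN ID and host names, for illustration only.
RP_VLAN = 100


def l2_reachable(vlan, host_a, host_b, vswitch_vlans, trunk_vlans):
    """Return True if VMs on host_a and host_b can exchange frames on vlan."""
    # The VLAN must be tagged on the virtual switch of both hosts.
    if vlan not in vswitch_vlans.get(host_a, set()):
        return False
    if vlan not in vswitch_vlans.get(host_b, set()):
        return False
    # Same host: frames never leave the virtual switch.
    if host_a == host_b:
        return True
    # Different hosts: frames cross the physical switch, so the VLAN must
    # also be allowed on the trunk towards each host.
    return (vlan in trunk_vlans.get(host_a, set())
            and vlan in trunk_vlans.get(host_b, set()))


vswitch_vlans = {"host1": {10, RP_VLAN}, "host2": {10, RP_VLAN}}
trunks_before = {"host1": {10}, "host2": {10}}                    # RP VLAN missing
trunks_after = {"host1": {10, RP_VLAN}, "host2": {10, RP_VLAN}}   # RP VLAN tagged

print(l2_reachable(RP_VLAN, "host1", "host1", vswitch_vlans, trunks_before))  # True
print(l2_reachable(RP_VLAN, "host1", "host2", vswitch_vlans, trunks_before))  # False
print(l2_reachable(RP_VLAN, "host1", "host2", vswitch_vlans, trunks_after))   # True
```

The second case is the situation during the migration: both VMs are healthy, but the RP VLAN has no path between the hosts, so each peer sees the other as gone.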
