Cisco 9800-CL HA setup issue

Trying to set up redundancy on a 9800-CL in an ESXi environment. I have three NICs: Gi1 (OOB), Gi2 (trunk), and Gi3 (HA). After issuing the following commands and reloading the controllers, one of the controllers goes into recovery mode and HA is not formed. Commands used:
chassis redundancy ha-interface gig3

redun-management interface Vlan40 chassis 1 address 192.168.31.4 chassis 2 address 192.168.31.5

IOS-XE version in use is 17.3.6. The controller that goes into recovery mode shows the following message: %RIF_MGR_FSM-6-RMI_LINK_DOWN. RMI link is down.
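A quick sanity check before chasing the hypervisor side: the two RMI addresses should be distinct hosts in the same subnet. The sketch below is only an illustration. The function name `rmi_pair_ok` is made up for this example, and the /24 prefix length is an assumption, since the post does not state the mask used on Vlan40.

```python
import ipaddress

def rmi_pair_ok(chassis1: str, chassis2: str, prefix_len: int) -> bool:
    """Return True if both RMI addresses are distinct and share one subnet."""
    a = ipaddress.ip_interface(f"{chassis1}/{prefix_len}")
    b = ipaddress.ip_interface(f"{chassis2}/{prefix_len}")
    return a.ip != b.ip and a.network == b.network

# Addresses are from the post; the /24 mask is an ASSUMPTION, not from the post.
print(rmi_pair_ok("192.168.31.4", "192.168.31.5", 24))
```

If this prints False for your real addresses and mask, fix the addressing first. If it prints True, as it does here, the problem is more likely in how the links are carried between the VMs.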

Looking for help and recommendations to resolve the issue.

Thanks

 

1 Accepted Solution

balaji.bandi
Hall of Fame

Since this is virtual, first check how Gi3 is extended in the vSwitch, and check VLAN 40.

https://www.cisco.com/c/en/us/support/docs/wireless/catalyst-9800-series-wireless-controllers/213915-configure-catalyst-9800-wireless-control.html

BB

***** Rate All Helpful Responses *****


10 Replies

Sandeep Choudhary
VIP Alumni

Hi, 

Please follow these posts and try to set up HA between the two 9800-CL WLCs.

https://rowelldionicio.com/cisco-catalyst-9800-cl-high-availability/

https://wifininjas.net/2019/08/21/wn-blog-011-cisco-c9800-cl-wlc-redundancy-ha-sso/

Regards

Don't forget to rate helpful posts.

marce1000
VIP

 

  - Check the current controller configuration (before the HA attempt) with the CLI command show tech wireless, and have the output analyzed by https://cway.cisco.com/tools/WirelessAnalyzer/ . Please do not use the classic show tech-support (short version); use the command that the Wireless Analyzer page shows in green.

 M.   



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

marce1000
VIP

 

 - Also note that you can get additional state information on the RP (Gi3) with the command show platform hardware chassis active qfp datapath pmd ifdev (it also reports the state of the other interfaces). Compare the output before and after attempting to connect to the redundancy controller, and correlate what you see with the intended network settings in the virtual environment (hypervisor).
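One way to do the before/after comparison is to save each capture as text and diff them. This is a minimal sketch using Python's standard difflib. The helper name `diff_captures` is made up for this example, and the sample strings are placeholders, not real 9800-CL output.

```python
import difflib

def diff_captures(before: str, after: str) -> list:
    """Return unified-diff lines between two saved CLI captures."""
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm=""))

# Placeholder text only; paste your own saved command output here.
before = "interface-A state=down\ninterface-B state=up"
after = "interface-A state=up\ninterface-B state=up"

for line in diff_captures(before, after):
    print(line)
```

Lines starting with - exist only in the first capture and lines starting with + only in the second, which makes a state change on one interface easy to spot.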

 M.




Thanks all. Both 9800-CL controllers needed to be on the same ESXi host for HA (Gi3) to work. I had the 9800-CL controllers on two separate ESXi hosts. Once I moved them to a single ESXi host, the issue was fixed.

aeccles
Level 1

Does anyone know if this can work when the controllers are on different host servers?  Having both controllers on the same host greatly reduces the benefit of HA.  None of the configuration guides address this and so far our time working with TAC hasn't produced a resolution.

Maybe explain your scenario. If you are looking at different data centers, then try an N+1 deployment.

 

BB


Sure thing. Our situation is very similar to the original post. We are trying to set up redundancy on 9800-CL in a VMware ESXi environment. We have followed the deployment guide and have a functioning controller with APs joined and active clients.

The issue is that we are unable to establish HA between the WLCs.  We have multiple ESXi Host servers in a cluster, and we would like to be able to have WLC1 and WLC2 on different host servers.  All of our attempts to get this working fail.  The HA link will not establish.

All of the documentation we've found, such as "Configure Catalyst 9800 Wireless Controllers in High Availability (HA) Client Stateful Switch Over (SSO) in IOS-XE 16.12" (Cisco), only covers instances where WLC1 and WLC2 are on the same host.

What we need are the configuration steps for setup where the WLCs are on different hosts.  We worked with TAC multiple times and have not found a solution.  We don't even know if it is a supported scenario.

If anyone has this running or can provide info on it, I'd greatly appreciate it.

 

 

 > ...Does anyone know if this can work when the controllers are on different host servers?  Having both controllers on the same host greatly reduces the benefit of HA.  None of the configuration guides address this and so far our time working with TAC hasn't produced a resolution.
  - In that case, you need to make sure that the external VLANs bridging between the two controllers use the same VLAN tagging as defined for the inner HA link on the 9800-CL pair.

 M.




Ok, so for anyone like OP who runs into this problem, here is how I solved it:
We too have a VMware vSphere 8 Update 3 cluster (ESXi 8u3 Dell Custom A00). I found the same behaviour: when the 9800-CL VMs are on the same host, the HA interface (Gi3) works fine, but as soon as they are moved to separate hosts the 9800-CL keeps resetting and rebooting and won't form HA.

We are using one Distributed Virtual Switch (DVS) for data, so each host has a DVS switch and changes are easy to make.
The layout for the 9800 VM is as follows:
vnic 1 = OOB mgmt port group, which maps to Gi1. The 9800-CL needs it connected, but you can shut it down in IOS-XE, as OOB on a VM makes no sense and you would need custom static routing as well.
vnic 2 = the wireless management interface. This is either an access-mode port group or a trunk, depending on whether you need to trunk VLANs for central switching. We are using pure FlexConnect, so this is just set to our mgmt VLAN. On the 9800-CL this is Gi2, and it carries the WMI and RMI.
vnic 3 = the HA interface. Set the VLAN on the DVS port group to None. We had it set to a VLAN and it wouldn't work when the 9800-CLs were on different hosts. In IOS-XE this is Gi3 (the HA interface).

I set both the vnic 2 and vnic 3 port groups on the DVS to promiscuous mode = true and forged transmits = true.
I think the problem occurs because the 9800-CL detects a switching loop when both Gi2 and Gi3 are connected to the same vSwitch or DVS and access mode is used. When you set the Gi3 port group to None, it is untagged only from the VMware side, so the 9800-CL does not see a switching loop.

As a side note, when you use vMotion, only move one 9800-CL VM at a time. Never move both at the same time or it will bork and reset. And OP is right: no guide on cisco.com or anywhere else will tell you this. I noticed the screenshots in some of the guides use VLAN 0 / None for the HA interface, but they always describe a single ESXi host scenario.
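The port-group settings from this post can be written down as a small checklist. The sketch below is only an illustration of that checklist, not a VMware API call. The `PortGroup` class, the `check_layout` function, the port-group names, and the mgmt VLAN ID are all hypothetical; fill in the values you read from your own DVS.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PortGroup:
    name: str
    vlan: Optional[int]      # None means VLAN type "None" (untagged)
    promiscuous: bool
    forged_transmits: bool

def check_layout(wmi: PortGroup, ha: PortGroup) -> List[str]:
    """Return a list of deviations from the layout described in the post."""
    problems = []
    if ha.vlan is not None:
        problems.append(f"{ha.name}: HA port group VLAN should be None")
    for pg in (wmi, ha):
        if not pg.promiscuous:
            problems.append(f"{pg.name}: promiscuous mode should be true")
        if not pg.forged_transmits:
            problems.append(f"{pg.name}: forged transmits should be true")
    return problems

# Hypothetical names and VLAN ID, for illustration only.
wmi = PortGroup(name="pg-wlc-mgmt", vlan=40, promiscuous=True, forged_transmits=True)
ha = PortGroup(name="pg-wlc-ha", vlan=None, promiscuous=True, forged_transmits=True)
print(check_layout(wmi, ha))
```

An empty list means the settings match what worked in this post across separate ESXi hosts. Any entry in the list is a setting to recheck on the DVS.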

(Screenshot: 9800-CL ESXi DVS and VM settings)

vwlc-a#show chassis detail
Chassis/Stack Mac Address : 0050.5697.3840 - Local Mac Address
Mac persistency wait time: Indefinite
                                            H/W      Current
Chassis#  Role     Mac Address     Priority Version  State   IP
-------------------------------------------------------------------------------------
*1        Active   0050.5697.3840  2        V02      Ready   169.254.135.112
 2        Standby  0050.5697.535f  1        V02      Ready   169.254.135.113

 

          Stack Port Status    Neighbors
Chassis#  Port 1   Port 2      Port 1   Port 2
--------------------------------------------------------
 1        OK       OK          2        2
 2        OK       OK          1        1
#
#
#

vwlc-a#show chassis ha-status active

My state = ACTIVE
Peer state = STANDBY HOT
Last switchover reason = active unit removed
Last switchover time = 00:16:34 AWST Thu Sep 5 2024
Image Version = 17.12.2

Chassis-HA  Local-IP         Remote-IP        MASK           HA-Interface
-----------------------------------------------------------------------------
This Boot:  169.254.135.112  169.254.135.113  255.255.255.0  GigabitEthernet3

Next Boot:  169.254.135.112  169.254.135.113  255.255.255.0  GigabitEthernet3


Chassis-HA  Chassis#  Priority  IFMac Address      Peer-timeout(ms)*Max-retry
-----------------------------------------------------------------------------------------
This Boot:  1         2         00:50:56:97:38:40  800*8

Next Boot:  1         2         00:50:56:97:38:40  800*8

vwlc-a#
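If you want to check the pair from a script, the "key = value" lines of show chassis ha-status active are easy to parse. This is a rough sketch: the function names are made up for this example, the sample text is copied from the output above, and how you collect the output (SSH, console log, etc.) is left out.

```python
import re

# Lines copied from the output in this post.
SAMPLE = """\
My state = ACTIVE
Peer state = STANDBY HOT
Last switchover reason = active unit removed
Image Version = 17.12.2
"""

def parse_ha_status(text: str) -> dict:
    """Collect 'key = value' lines into a dict; other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^\s*([^=]+?)\s*=\s*(.+?)\s*$", line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields

def ha_is_healthy(text: str) -> bool:
    """True when this unit is ACTIVE and the peer is STANDBY HOT."""
    f = parse_ha_status(text)
    return f.get("My state") == "ACTIVE" and f.get("Peer state") == "STANDBY HOT"

print(ha_is_healthy(SAMPLE))
```

Run against the active unit, this prints True for the output shown above. Any other peer state means SSO has not fully formed yet.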
