05-11-2021 02:48 PM - edited 07-05-2021 01:18 PM
My 9800-40 HA stack on 16.12.4a, which is not yet in production, lost its SSO state.
I did a hard reboot of the stack, and while I have SSO back, the active is not hearing/responding to discovery requests. I have six 9117s and an 1852, which is my lab test unit that I console into. It dutifully finds all the production 5520s and sends discovery requests to them and the 9800, but only the 5520s respond.
Troubleshooting: radioactive traces to the target IPs always come up blank.
Both ends can ping each other.
Where is the hidden WLAN on/off button? Even without WLANs the APs used to attach.
05-13-2021 09:23 AM
The problem has been found and fixed.
It seems the command
wireless management interface Vlan314
public-ip x.x.x.x
was missing, well, blank.
The VLAN was NOT identified/tagged as the WMI!
How that got lost during the reboot is unknown.
In the GUI that would be Config/Interface/Wireless.
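For anyone who wants to catch this after a reboot, here is a minimal sketch of an offline check, assuming you have saved the controller's running config to a text string. The helper name `find_wmi` and the sample config (including the 192.0.2.10 address) are made up for illustration; only the `wireless management interface Vlan314` line comes from the fix above.

```python
import re


def find_wmi(config_text):
    """Return the interface named on a 'wireless management interface'
    line, or None if the line is missing or has no interface on it."""
    for line in config_text.splitlines():
        match = re.match(r"^\s*wireless management interface\s+(\S+)", line)
        if match:
            return match.group(1)
    return None


# Hypothetical saved configs: one without the WMI line, one with it.
broken = """interface Vlan314
 ip address 192.0.2.10 255.255.255.0
"""
fixed = broken + "wireless management interface Vlan314\n"

print(find_wmi(broken))  # None
print(find_wmi(fixed))   # Vlan314
```

A result of `None` means no VLAN is tagged as the WMI in that saved config, which matches the condition described above.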
05-11-2021 11:14 PM
WLAN enable/disable doesn't matter. Was it working before, and has it only now stopped?
I have deployed many 9800-40s in HA, and the APs join smoothly.
First, upgrade the WLC to 17.3.3 or 17.4.
Set the current time.
Set the wireless management interface and a default route to the wireless management gateway.
Configure the AP with the primary WLC's name/IP.
05-12-2021 08:28 AM
lol
This one has been chugging along for 6 or 7 months in gestation status: no official WLAN interfaces yet, and I just recently changed the AP management IP to its permanent IP. It is still on a single 1G pipe, awaiting a pile of 10Gs. Eventually it will replace three HA 5500 stacks, but I don't have the interface structure for the combined WLAN load yet, which is why it is still a toy. I noticed yesterday that the box was not in HA anymore, and during the course of repair I rebooted the package... with the on/off switch! Well, it is paired up again, without clients. So NO, it is not registered!! I guess it is time to push that button. It might have been tolerating that condition up to the reboot.
At this time we are staying on 16.12, and NTP is alive and well. The APs are dutifully discovering the 9800 but getting no response, so they land on the production stack. The real bummer is that the broken HA was actually caused by a loose fiber patch!!
05-12-2021 09:06 AM
Later code versions, starting with 17.1 I think, support RP-RMI. I would probably go with 17.3.x, let that sit and run, and do your testing.
05-12-2021 02:33 AM
Can you post a debug of the capwap process?
I suspect some licensing, time or certificate issue.
05-12-2021 08:33 AM
I tend toward the license issue.
CAPWAP debug on the AP? At this point all I have is the console output from my test AP.
It sends discovery requests to all WLCs but only hears the production system, not its mwar-assigned target.
Since we have not pushed the smart-license button yet, it is unregistered... for over a year!!
I think you are on to something.
David
05-12-2021 08:48 AM
CAPWAP debug on the WLC. That should show whether the WLC discards the discovery packets or actually answers them.
05-12-2021 08:17 AM
Well... something broke when it lost the SSO state. Have you tried powering down one controller at a time to see whether APs can join either one? If APs can join one and not the other, then you most likely will need to rebuild the broken one. If neither works as a single unit, then you can either rebuild from scratch or continue troubleshooting the issue. I personally would rebuild, since to me it's easier to get it back up and running. I used to run all my 9800s in SSO, ran into issues, and now all mine are N+1. Maybe one day I will test it in my lab again, who knows.
05-12-2021 01:27 PM
Scott,
Well, the stack is now on 17.03.03 and there is no change.
How do we do a CAPWAP debug for an AP MAC? 2c:0b:e9:c4:0c:50 is mine, and I can bounce it around. Its primary mwar is the 9800.
The active controller is the same as usual. How do I power off or power cycle one (or the other) remotely? I have found that when these controllers are married up, the standby unit is mute on the service port and only sort of responds on console, with severe limitations.
05-12-2021 02:34 PM
Oh well, I'm paid for. I opened a TAC case. I'll let you know what happens.
05-12-2021 08:00 PM
05-13-2021 07:28 AM
darn the bad luck..
myWLC#sh wireless stats ap join summary
Number of APs: 0
Base MAC Ethernet MAC AP Name IP Address Status Last Failure Phase Last Disconnect Reason
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
myWLC#sh tech ?
It's blank, suggesting that the traffic is not even reaching the 9800 any more. So I don't know why bouncing the 9800s caused the 9500 to stop feeding traffic. Anyway, SR691409353 has been opened.
One of these days I will give it a proper name and put it into production.
05-13-2021 10:13 AM