04-13-2015 07:38 AM - edited 07-05-2021 02:54 AM
Hi
We've installed the 8500 WLC in HA, and everything is working fine. We're running 8.0.115.
We have 2 distributions on one site, with 1 WLC in each room.
So we have 1 WLC connected to the compA distribution and the HA connected to the compB distribution.
Last week we were hit by a power outage. The whole compA room went down: distribution, wireless, everything in the room.
Everything failed over to the secondary distribution except the HA for wireless.
My question is: what's the point of the HA controller if it doesn't stay up?
The WLCs are installed to Cisco best practices, except that we have the HA in another room connected by fibre.
If this is in HA, why didn't it fail over? It seems to have gone into split brain.
The HA didn't start working until the power was restored to the room.
When we installed, we performed our own tests by shutting down the port to the distribution and removing the HA cable to replicate a power outage, and this worked.
So we are confused why this should happen.
Cheers in advance
Craig
04-13-2015 07:52 AM
Craig,
If everything tested fine, I don't know why SSO wouldn't have worked. I have some installs with SSO where the controllers are in different buildings, and powering off the primary did trigger failover to the HA and vice versa. I didn't have the controllers connected via fiber, but copper, and I don't think that should matter. Most of my SSO installs are with the controllers in the same rack.
When the compA room went down, was everything being routed back to compB? Did the APs ever join the HA controller?
-Scott
04-13-2015 07:59 AM
Hi Scott
No APs on the WLC at this time, as we are just installing them.
The HA controller seemed to have gone into maintenance mode, and only recovered once the other WLC came up.
We are going to try to replicate this issue again. As I wasn't on that site at the time, I couldn't grab any logs from the WLC.
cheers
04-13-2015 09:28 AM
Maintenance mode is not a good thing with HA. It could have been latency or an issue with the heartbeat, or maybe it was already in maintenance mode, to be honest.
-Scott
04-13-2015 10:49 PM
Can you please post the output of the command "sh redundancy summary"?
04-14-2015 12:44 AM
Hi Leo
Attached is the layout. We currently only have 1Gb line cards until the core upgrades, so this is temporary.
(Cisco Controller) >show redundancy summary
Redundancy Mode = SSO ENABLED
Local State = ACTIVE
Peer State = STANDBY HOT
Unit = Secondary (Inherited AP License Count = 500)
Unit ID = 64:9E:F3:65:E7:A0
Redundancy State = SSO
Mobility MAC = 64:9E:F3:65:E7:E0
BulkSync Status = Complete
Average Redundancy Peer Reachability Latency = 89 Micro Seconds
Average Management Gateway Reachability Latency = 384 Micro Seconds
(Cisco Controller-Standby) >show redundancy summary
Redundancy Mode = SSO ENABLED
Local State = STANDBY HOT
Peer State = ACTIVE
Unit = Primary
Unit ID = 64:9E:F3:65:E7:E0
Redundancy State = SSO
Mobility MAC = 64:9E:F3:65:E7:E0
Average Redundancy Peer Reachability Latency = 91 Micro Seconds
Average Management Gateway Reachability Latency = 393 Micro Seconds
There were no APs on this WLC at the time, as we had only just commissioned it. I checked sysinfo, and both WLCs had rebooted after the power was restored.
Cheers
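For anyone who wants to sanity-check a pair from saved output like the above, here is a minimal sketch in Python. It assumes the "key = value" layout seen in the two summaries posted here; the function names are made up for illustration, and the checks (mirrored Local/Peer states, matching Mobility MAC and Redundancy Mode) are only a rough consistency test, not an official health check.

```python
# Excerpts of the two summaries posted above (trimmed to the compared fields).
ACTIVE = """Redundancy Mode = SSO ENABLED
Local State = ACTIVE
Peer State = STANDBY HOT
Unit = Secondary (Inherited AP License Count = 500)
Mobility MAC = 64:9E:F3:65:E7:E0"""

STANDBY = """Redundancy Mode = SSO ENABLED
Local State = STANDBY HOT
Peer State = ACTIVE
Unit = Primary
Mobility MAC = 64:9E:F3:65:E7:E0"""


def parse_summary(text):
    """Turn 'key = value' lines into a dict, splitting on the first '=' only."""
    fields = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            fields[key.strip()] = value.strip()
    return fields


def pair_is_consistent(a, b):
    """Hypothetical check: each unit's view of its peer mirrors the other's."""
    return (
        a["Local State"] == b["Peer State"]
        and a["Peer State"] == b["Local State"]
        and a["Mobility MAC"] == b["Mobility MAC"]
        and a["Redundancy Mode"] == b["Redundancy Mode"]
    )


if __name__ == "__main__":
    print(pair_is_consistent(parse_summary(ACTIVE), parse_summary(STANDBY)))
```

A pair that passes this only tells you the two units agree with each other at the moment the output was taken; it says nothing about what happened during the outage.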
04-14-2015 01:01 AM
The HA didn't start working until the power was restored to the room.
There is only one logical explanation left that fits this scenario: the redundancy to the wired network didn't work. This is why, when power was restored to the room, the APs didn't go anywhere but to the primary WLC.
Check the allowed VLANs on the hot-standby and compare the values.
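One way to do that comparison, sketched in Python: it assumes you have copied the allowed-VLAN list from each trunk port (the comma and range form used by "switchport trunk allowed vlan") into a string. The VLAN numbers below are made up for illustration, and the function names are hypothetical.

```python
def expand_vlans(spec):
    """Expand a list such as '10,20,30-32' into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def vlan_mismatch(primary_spec, standby_spec):
    """Return (VLANs only on the primary trunk, VLANs only on the standby trunk)."""
    primary = expand_vlans(primary_spec)
    standby = expand_vlans(standby_spec)
    return sorted(primary - standby), sorted(standby - primary)


if __name__ == "__main__":
    # Hypothetical values, not taken from the poster's switches.
    only_primary, only_standby = vlan_mismatch("10,20,30-32", "10,30-31")
    print("Missing on standby trunk:", only_primary)
    print("Missing on primary trunk:", only_standby)
```

If either list comes back non-empty, the two uplinks are not carrying the same VLANs, which would be worth ruling out before retesting failover.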