11-01-2021 09:34 AM
I have a couple of 9800s (one 9800-L-C and one 9800-L-F) that I'm trying to set up for RP+RMI.
The documentation says this should be all right (or at least that's how I read it: "HA Pair can only be form between two wireless controllers of the same form factor". It says form factor, not model).
But everything goes really bad after I set up RP+RMI and reboot.
Chassis 1 comes up just fine and APs join, so the wireless is working.
Chassis 2 boots, detects the other one on the RP, and then never progresses past "All chassis in the stack have been discovered. Accelerating discovery"
It does not even enable any of its Ethernet ports (not even the bloody SP port, so there is no way of reaching it again, and that's a really bad thing).
Both use the same ports for the uplink (the Te ports; on the F model there are just a couple of GLC-Ts inserted there).
Both are running 17.3.3
I seem to be missing some "tool" to troubleshoot this problem, because I don't really see anything on the secondary when I finally have a console connected to it.
What should I do or try?
One note:
Both WLCs are managed by their MGMT port or SP port (could this be the problem?).
Of course they have a VLAN/SVI for AP communication, just as normal.
But the customer sends RADIUS and manages the boxes using the MGMT port.
11-01-2021 10:10 AM
AH HA.. been there with a pair of 9800-40s.
Contrary to the 55xx's, a pair of 9800s is really only one. You may console into the 2nd unit and it will tell you to go away. The standby unit seems to have all ports disabled. Well, the RP and RMI are alive, but ICMP is off. The SP is definitely shut down because its IP is the same as the Active's.
I suppose your CLI does not say something like this:
myWLC#sh redundancy states
my state = 13 -ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Unit = Primary
Unit ID = 1
Redundancy Mode (Operational) = sso
Redundancy Mode (Configured) = sso
Redundancy State = sso
Maintenance Mode = Disabled
Manual Swact = enabled
Communications = Up
client count = 147.. >>>>> interesting Dashboard reports that I have zero clients.. which is correct as this stack is not yet in prod.
client_notification_TMR = 30000 milliseconds
RF debug mask = 0x0
Gateway Monitoring = Disabled
Gateway monitoring interval = 8 secs
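For anyone who wants to script this check rather than eyeball it, here is a minimal Python sketch. It assumes the output keeps the "key = value" layout pasted above. The function names are made up for illustration, and what counts as "healthy" here (peer in STANDBY HOT, Communications Up, operational mode sso) is my own reading of that sample, not an official definition.

```python
# Sample text copied from the 'sh redundancy states' output above (shortened).
SAMPLE = """\
my state = 13 -ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Redundancy Mode (Operational) = sso
Redundancy Mode (Configured) = sso
Communications = Up
"""


def parse_redundancy_states(text):
    """Turn 'key = value' lines into a dict; lines without '=' are skipped."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip().lower()] = value.strip()
    return fields


def pair_is_healthy(fields):
    """Assumed health criteria: standby hot, comms up, running in SSO."""
    peer = fields.get("peer state", "")
    return (
        "STANDBY HOT" in peer.upper()
        and fields.get("communications", "").lower() == "up"
        and fields.get("redundancy mode (operational)", "").lower() == "sso"
    )


if __name__ == "__main__":
    print(pair_is_healthy(parse_redundancy_states(SAMPLE)))
```

Feed it the text captured from the active unit; an empty or truncated capture simply evaluates as not healthy.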
11-01-2021 01:11 PM
Nope, the secondary does nothing. I know that it will tell you that you can't access the standby on the console (unless you configure it on the active primary). But the WLC never gets that far. It's just stuck in the boot after the above-mentioned line.
And I'm sure that the physical interfaces should be up when it's running as secondary.
But the SP port never coming up is a travesty, and clearly not "as designed", right?
I mean, if that's the case, then this platform is even more immature than I previously thought.
I must confess, this is the first customer I have that exclusively uses the SP ports for out-of-band and not a console server for their equipment. So it might be as designed, but if it is, it's clearly a bad design.
11-01-2021 01:20 PM
Thomas,
How about splitting the pair, resetting the stand-alone IPs, and verifying the health of the 'secondary chassis'?
11-01-2021 01:29 PM
I could try. But I'm sure it will fail, and the customer now thinks the solution is bad because you cannot reach or monitor "chassis 2" via the SP port.
The solution here must be to split them, and do normal N+1 redundancy.
11-01-2021 01:38 PM
11-02-2021 09:33 AM
9800 SSO behaves more like a stack. You can only monitor the standby chassis via the active chassis.
Also note there are a number of improvements to HA serviceability in the later releases.
11-06-2021 04:22 AM
So .. wiped chassis 2 ... and started over with the config.
And removed the SP port config from chassis 1.
Made sure the config was good and running on chassis 1, then configured chassis 2 with just the management IP and then the RP+RMI config.
Plugged in all cables (the two for the port channel and the RP port), then reloaded.
This is where it stops on chassis 2 ... and there is nothing you can do .... it's just stuck.
It never comes up as SSO, and it never turns on its port channel or "front" interfaces.
Very frustrated now......
Waiting for remote chassis to join
#Nov 6 11:05:14.908: %PMAN-3-PROC_EMPTY_EXEC_FILE: R0/0: pvp: Empty executable used for process bt_logger
#Nov 6 11:05:16.373: %PMAN-3-PROC_EMPTY_EXEC_FILE: R0/0: pvp: Empty executable used for process bt_logger
#################
Chassis number is 2
All chassis in the stack have been discovered. Accelerating discovery
Nov 6 11:05:34.324: %PMAN-3-PROC_EMPTY_EXEC_FILE: R0/0: pvp: Empty executable used for process bt_logger
Nov 6 11:05:36.331: %PMAN-3-PROC_EMPTY_EXEC_FILE: R0/0: pvp: Empty executable used for process bt_logger
Nov 6 11:05:37.081: %PMAN-3-PROC_EMPTY_EXEC_FILE: R0/0: pvp: Empty executable used for process bt_logger
11-06-2021 06:32 AM
So according to TAC the documentation is wrong .... the 9800's have to be the same PID, and not just the same form factor as stated in the redundancy documentation I read.
This document states:
■ HA Pair can only be form between two wireless controllers of the same form factor
Since the “form factor” of an L-F and an L-C is the same, it should of course work.
It does not.
This one says PID, and according to TAC it is correct.
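To make the difference between the two readings concrete, here is a tiny Python sketch of the rule as TAC described it: the PIDs must match exactly, not just the form factor. The helper name is hypothetical, and the PID strings are example values I am assuming for the L-C and L-F models; check your own units before relying on them.

```python
def can_form_sso_pair(pid_a, pid_b):
    """Return True only when both PID strings are identical.

    Comparison ignores case and surrounding whitespace; a shared
    form factor alone is not treated as sufficient.
    """
    return pid_a.strip().upper() == pid_b.strip().upper()


if __name__ == "__main__":
    # Assumed example PIDs for a 9800-L-C and a 9800-L-F.
    print(can_form_sso_pair("C9800-L-C-K9", "C9800-L-F-K9"))  # False
```

Running a check like this against an inventory list before cabling up RP+RMI would have flagged the mismatch early.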
11-07-2021 03:17 AM
Thanks, good to know.
Did you ask TAC to file a documentation bug to get that deployment guide corrected?
10-29-2024 05:25 AM
It is my understanding that RMI+RP (or RP-only) HA requires the model numbers of the controllers to be identical. You state that you are using a 9800-L-C and a 9800-L-F. I suspect, though I am not certain, that you cannot mix the two devices like that, as they don't have equivalent connections.
10-29-2024 05:31 AM
@mdcrapo - @Thomas Obbekaer Thomsen has already explained that above ^^^
"the 9800's have to be the same PID"