
9800LF - AP Join issues - MGT Interface is LAG

perrymcgrew
Level 1

IOS-XE 17.09.02. VTP mode is Client, and I can see all our VLANs on the switch and the 9800LF. I defined VLAN 2 with IP 10.0.3.254/22 on the Core, and it's routed. DHCP is served from the Core and has Option 43 pointing to the 9800LF IP assigned to its VLAN 2 interface.
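For anyone checking the Option 43 value: Cisco APs expect it as a TLV with type 0xf1, a length of 4 bytes per controller, then each controller IP in hex. Below is a minimal Python sketch of that encoding. It assumes the address being advertised is 10.0.3.253, the primary IP on the 9800LF Vlan2 interface; the function name `option43_hex` is only for illustration.

```python
import ipaddress

def option43_hex(wlc_ips):
    """Build the Cisco AP Option 43 value: type f1, length 4*N, then N IPv4 addresses."""
    # Pack each controller address into its 4-byte form and concatenate them.
    packed = b"".join(ipaddress.IPv4Address(ip).packed for ip in wlc_ips)
    # Type byte (f1) + one-byte length in hex + the addresses in hex.
    return "f1" + format(len(packed), "02x") + packed.hex()

# Assumed controller address: the primary IP on the 9800LF Vlan2 interface.
print(option43_hex(["10.0.3.253"]))  # f1040a0003fd
```

If the DHCP pool on the Core carries a different hex string for Option 43, the APs will be sent to the wrong address, so it is worth comparing the two.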

I have a test lab set up in my office: a 3560X switch connected to our network, with 2 APs connected to the 3560X.

3560X AP Port config

interface GigabitEthernet0/1
description ** TEST AP PORTS **
switchport trunk encapsulation dot1q
switchport trunk native vlan 2
switchport mode trunk

The 9800LF is connected to the 3560X switch using the port channel below:

interface Port-channel10
description ** EtherChan to CUN-WLC-9800LF **
switchport trunk encapsulation dot1q
switchport trunk native vlan 2
switchport mode trunk

interface GigabitEthernet0/47
description CUN-WLC-9800LF LAG
switchport trunk encapsulation dot1q
switchport trunk native vlan 2
switchport mode trunk
channel-group 10 mode on

interface GigabitEthernet0/48
description CUN-WLC-9800LF LAG
switchport trunk encapsulation dot1q
switchport trunk native vlan 2
switchport mode trunk
channel-group 10 mode on

The 9800LF ports are config'd as such:

interface Vlan2
description CUN-WLC-9800LF Mgt
ip address 10.0.3.252 255.255.252.0 secondary
ip address 10.0.3.253 255.255.252.0
mdns-sd gateway

interface Port-channel10
description 9800L MGT LAG
switchport trunk native vlan 2
switchport mode trunk

interface TenGigabitEthernet0/1/0
description PortChan 10
switchport trunk native vlan 2
switchport mode trunk
negotiation auto
channel-group 10 mode on
service-policy output AutoQos-4.0-wlan-Port-Output-Policy
!
interface TenGigabitEthernet0/1/1
description PortChan 10
switchport trunk native vlan 2
switchport mode trunk
negotiation auto
channel-group 10 mode on
service-policy output AutoQos-4.0-wlan-Port-Output-Policy

However, I don't see where the Mgt Interface is set in the config:

WLC-9800LF#show management-interface
No management interfaces configured

I can ping any IP address from the 9800LF, yet the AP consoles report that no valid controller was found. The Red Alarm LED is lit on the primary 9800LF. TAC has been looking at this for a week and stated my config looks OK. I am beginning to wonder if I need to move the IP off VLAN 2 and onto the 9800LF PortChannel as an L3 EtherChannel. If so, how does that affect my RMI+RP redundancy? Or do I somehow need to set the Mgt Interface to VLAN 2 on the 9800?

Thx

 

33 Replies

Rebooted the primary 9800 and the APs won't join. The APs still have correct VLAN 2 IPs. Calling it a day.

This may seem obvious but had you saved the config before rebooting?
And how exactly did you reboot? (reload reloads both chassis, clean switchover is done using redundancy force-switchover)

Use "show facility-alarm status" to check the alarm details.

Yes, I chose the Save Config and Reboot option at 3:25pm. Both the APs were joined and using IPs in VLAN 2.   I left the APs plugged in overnight.   The 9115AXi eventually joined at 7:19pm.   The 2802i still has not joined the 9800L.  

The only error WCAE reports is: 

WLAN is using mDNS gateway functionality, but not corresponding SVI Interface detected. WLANs/Policies....

WLC-9800LF#show facility-alarm status
System Totals Critical: 4 Major: 0 Minor: 0

Source Time Severity Description [Index]
------ ------ -------- -------------------

TwoGigabitEthernet0/0/0 Mar 13 2023 15:26:16 CRITICAL Physical Port Link Down [1]

TwoGigabitEthernet0/0/1 Mar 13 2023 15:26:16 CRITICAL Physical Port Link Down [1]

TwoGigabitEthernet0/0/2 Mar 13 2023 15:26:16 CRITICAL Physical Port Link Down [1]

TwoGigabitEthernet0/0/3 Mar 13 2023 15:26:16 CRITICAL Physical Port Link Down [1]

Really don't know where to turn next.  Going to open a new TAC case.  

 

 >...The 2802i still has not joined the 9800L
 - As stated before, use these tools too: https://logadvisor.cisco.com/logadvisor/wireless/9800/9800APJoin

 M.



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

We took a radioactive trace a few days ago.  TAC has not responded back.  Will likely try them again as it should not take 4 hours for the 9115AXi to re-join.   

 

 >...as it should not take 4 hours for the 9115AXi to re-join.
 - Take a look at the interface counters for this AP connection on the switch and check for any unusual stats.

 M.




perrymcgrew
Level 1

Thanks to all for the replies.  TAC was not able to pinpoint the issue.  I completely reset the 9800L's to Factory defaults on IOS-XE 17.10.1.  On the Console connection, I only set up the Service Port so I can access the WebUI.  The Day_0 wizard starts, and I only defined the very least needed to get through it.  After saving and reloading the WLC I went in to define the MGMT, Port-Channel etc.   The APs joined and have been solid since.  Previously, I set up the WLC network interfaces using CLI.  We could not see any difference between the CLI run config and the one generated from the WebUI. 

I have defined the WLANs / Policies that I want to carry over from our 5508 (8.5.182.105). Testing is going well except for the Guest network. That will be my next post.

P.

perrymcgrew
Level 1

The problem still exists...

  • Swapped lab switch out - same model.  Connected 9800 and the APs to the switch as before.
  • disabled 9800L HA.  Both 9800L's are standalone
  • Brought what was the backup 9800L up to office - factory reset it.
  • Performed the basic setup so we could access WebUI
  • "rolled back" the 9800L from 17.11.1 to 17.9.3
  • config'd Wireless Management (VLAN 2), ethernet etc on the 9800L
  • Factory reset 9115AXi AP.
  • AP pulls WLC Mgt IP from DHCP Option and gets valid IP
    • I can ping AP IP from PC.  From AP's console, I can ping everything *except* the WLC Mgt IP.

9115 / 2802 Access points still won't join -- same error message.   Enabled all dtls & trustpoint debugs on AP but can't see the root cause of the issue.  Partial log attached. Called TAC to raise case priority to S2.  

 

I suspect the AP is still running 17.11 code but you've filtered the log so we can't even see that.
Please downgrade the APs to 17.9.3 code and then factory-default reset the AP again.
And include the full AP log from power on so we can see everything that might be relevant.

https://software.cisco.com/download/home/286304510/type/286288051/release/15.3.3-JPN2
https://software.cisco.com/download/home/286322352/type/286288051/release/15.3.3-JPN2

 

 - Put a laptop on the same subnet as an AP and run:
       % nmap -sU -p5246-5247 WLChostname
   to verify full CAPWAP reachability from the APs to the controller. Also run an iperf test from the laptop to a server in the same subnet as the WLC and check whether the intranet network performance is nominal.

 M.




I ran nmap earlier and posted results showing the ports are open.  I am attaching a picture of the 9800L / Switch / AP setup in my office.  The APs and the 9800L Mgmt are in the same VLAN and have Gig connections.  The switch is a 3560X, which I know is "EoL" in Cisco's eyes, but we still have them deployed in our network.   This switch was in production up to February and had APs connected to it as well that were running off a 5508.  I have plugged a laptop into the switch and had no issues accessing apps or the Internet -- but have not run iperf.   The entire IT Dept uses the same upstream switch, and they would certainly let me know if there was any network performance issue.

We took the plunge and reset to Factory defaults and did the bare minimum config up to where the APs should join. We rolled back to 17.9.3.  Had a Sr level engineer on a WebEx when the 9800L was reconfigured and he is just as baffled as I am.  

The only thing I have not changed is the uplink port from this switch to the upstream switch on the floor.   I am going to do that in a few minutes.  Other than that, only the 9800L has not been swapped.  Not really looking forward to setting up a 9800CL to test.  There is a growing thought that there is some corruption in the SUDI Trustpoint -- just not sure how to verify / fix.

 

 >....I ran nmap earlier and posted results showing the ports are open...
 - When I talked earlier about rommon versions and hardware programmable devices (updating), you only gave feedback on the rommon version, not the second item. Are you also on the latest version according to: https://software.cisco.com/download/home/286321399/type/283425232/release/17.11.1 ?

 >....I have not changed is the uplink port from this switch to the upstream switch on the floor....
 - Could you check the port counters on every hop along the path from the 9800 Wireless Management Interface to the access point? Start with the switch port the access point is connected to: check CRC errors, overruns and the like, and confirm the port is running 1G full duplex with no discrepancies from what is expected. Then move up the path and check the port counters on each next hop until the controller is reached (verify all involved hops). If possible, also check the port counters on the 9800LF itself for the wireless management interface.

 The commands below are useful for analyzing AP join issues:
       show wireless stats ap join summary
       show wireless dtls connections
       show platform hardware chassis active qfp feature wireless capwap datapath statistics drop all
       show platform hardware chassis active qfp feature wireless capwap datapath mac-address <APradio-mac> details
       show platform hardware chassis active qfp feature wireless capwap datapath mac-address <APradio-mac> statistics
       show platform hardware chassis active qfp feature wireless dtls datapath statistics all (view all DTLS drops)
       show platform hardware chassis active qfp statistics drop all | inc Global | Wls (Data Plane Statistics – Global Wireless Drops)

 Other useful commands are mentioned in: https://logadvisor.cisco.com/logadvisor/wireless/9800/9800APJoin

 

 M.




But have you downgraded the software on the AP yet (with default after that)?  See my reply above.

Raised the TAC case to Sev 2.   Got a new engineer (5th one so far).   He found a case where the issue was similar.   The customer changed the 9800's Management VLAN and the APs joined.   

So I moved the 9800L's Management VLAN from VLAN 2 to VLAN 11, the current Mgmt VLAN of our 5508.  All the APs in my Lab setup joined rapidly.  I moved the 9800L's Management VLAN back to VLAN 2 and the APs once again failed to join.

The only substantial difference between the 9800L VLAN 2 and the 5508 VLAN 11 is that the 5508's VLAN 11 is a /24 subnet mask.  The 9800L's Management VLAN 2 is a /22 subnet mask.

I changed the 9800L's VLAN 2 subnet mask to a /24.   All the APs joined rapidly.  

TAC engineer is going to bring this forward to see if it is a Bug or an "undocumented limitation" on the 9800L's Management VLAN.  This issue presented itself in 17.9.2, 17.9.3, 17.10.1 and 17.11.1.

Has anyone here used a /22 on the Management VLAN?  

We don't have any APs sitting directly on the same VLAN as a 9800 - ours are all routed - but we have APs sitting on /21 and /22 subnets without any problems at all.  The WLCs are on /26 subnets.

Are you sure the DHCP is correctly configured with the same /22 mask?  And same/correct throughout?
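To illustrate why a mask mismatch matters, here is a rough Python sketch using the standard `ipaddress` module. The 10.0.3.253 address and the /22 and /24 masks come from this thread; the AP address 10.0.1.50 and the helper name `same_subnet` are made up for the example.

```python
import ipaddress

# WLC management address from the thread, under the two masks being compared.
wlc_22 = ipaddress.ip_interface("10.0.3.253/22")
wlc_24 = ipaddress.ip_interface("10.0.3.253/24")

def same_subnet(host, iface):
    """Return True if host falls inside the network the interface belongs to."""
    return ipaddress.ip_address(host) in iface.network

print(wlc_22.network)  # 10.0.0.0/22
print(wlc_24.network)  # 10.0.3.0/24

# Hypothetical AP address: inside the /22 but outside the /24.
ap = "10.0.1.50"
print(same_subnet(ap, wlc_22), same_subnet(ap, wlc_24))  # True False
```

If any device on the path (DHCP scope, SVI, AP lease) carries a different mask from the others, one side treats the other as off-subnet and sends traffic to the gateway instead of directly, so it is worth confirming the mask is identical everywhere.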

If this does turn out to be some weird bug then routing is also a workaround - don't put the APs and WLC on the same VLAN - route them and use option 43 for WLC discovery.
