2909 Views, 0 Helpful, 4 Replies

DHCP issues when using a LAG between two SG350 switches and roaming between APs hosted on separate switches

BCinBC
Level 1

I have a LAG that runs between two SG350 switches. Each switch has a single Ubiquiti Unifi AC PRO Access Point (f/w 4.0.54.10625) attached to it. The AP ports are trunked and carry four VLANs. For the most part, they work just fine.

 

The issue I'm having is that when I roam between the two access points, the DHCP negotiation fails. I've watched the negotiation by running tcpdump on the router: only the last step of the DHCP exchange fails to happen, so the device isn't issued an IP address. Eventually, after about 5 to 15 minutes, the device is allowed to join the network again.
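When comparing captures from a working and a failing roam, it can help to state exactly which message of the exchange is absent. The sketch below is a minimal helper for that, not something from the original capture: it takes the DHCP message type values (option 53, as defined in RFC 2132) in the order they were seen and reports the first step of the usual DISCOVER, OFFER, REQUEST, ACK sequence that is missing. The function name and the sample input are hypothetical, and it assumes a full four-message exchange; a roaming client that re-requests its previous lease may legitimately skip DISCOVER and OFFER.

```python
# DHCP message type values carried in option 53 (RFC 2132).
DHCP_MESSAGE_TYPES = {
    1: "DISCOVER",
    2: "OFFER",
    3: "REQUEST",
    4: "DECLINE",
    5: "ACK",
    6: "NAK",
    7: "RELEASE",
    8: "INFORM",
}

# The usual four-message exchange for a client obtaining a new lease.
HANDSHAKE = ["DISCOVER", "OFFER", "REQUEST", "ACK"]


def first_missing_step(observed_types):
    """Return the first handshake step not seen in order, or None if complete.

    observed_types is a list of option 53 values read from a capture,
    in the order the packets appeared. Retransmissions are tolerated.
    """
    names = [DHCP_MESSAGE_TYPES.get(t, "UNKNOWN") for t in observed_types]
    position = 0
    for step in HANDSHAKE:
        try:
            # Look for this step at or after the previous step's position.
            position = names.index(step, position) + 1
        except ValueError:
            return step
    return None


if __name__ == "__main__":
    # Hypothetical capture: the client retries REQUEST but never gets an ACK.
    print(first_missing_step([1, 2, 3, 3, 3]))
```

Running the same check against a capture taken on the client side of the LAG would show whether the final message is never sent or is sent and lost on the way.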

 

Configuration:

  • Main Switch: SG350-28
  • Secondary Switch: SG350X-24
  • Current firmware on both switches: 2.5.0.83 
  • Router: Ubiquiti Edgerouter 4 with f/w 2.0.6
  • 2 * Ubiquiti Unifi AC PRO Access Point f/w 4.0.54.10625
  • Unifi controller running under Docker, older version 5.10.26 (been waiting for a container update for ages)
  • LAG is dual Gigabit Ethernet
  • 7 VLANs, registered in the switches, these work fine
  • Each VLAN has a DHCP relay back to the DHCP server, DHCP relay is set up in both switches
  • DHCP is on Router for each VLAN
  • VLANs are Isolated

Tests performed:

  • When I directly connect the two switches via a single trunk, DHCP works fine
  • Connecting the AP's to the same (Main) switch works fine
  • With the LAG in place, enabling or disabling STP makes no difference (it is set to the default, Rapid STP)
  • No issues when reconnecting to the same AP repeatedly on either switch/AP

I'm not sure what to try next other than dropping the LAG, which I'd rather not do.

Edit: Added AP f/w and updated Unifi info.

4 Replies

LikeMyFloydPink
Level 1

You didn't mention the Unifi Controller, so I don't know whether you've got Option 43 set in the DHCP leases for each VLAN.

That's the first thing I'm wondering about.

@LikeMyFloydPink I do have a Unifi controller. The control interface for Unifi and the APs are all on the same VLAN, and I've never had a problem connecting to them, provisioning them, etc. If I'm reading it correctly, Option 43 tells the APs where to find their controller? I guess that would be good to know if I were provisioning dozens of APs across switches everywhere, thanks! However, in my case the main VLAN is 1U (VLAN 1, untagged). All the exotic ones are tagged.

 

Why would it work on a regular VLAN trunk but not a LAG? The issue here is that the DHCP fails when roaming across a LAG, otherwise it works everywhere else.

 

If there's a reason I need to add Option 43, I can, but I think the APs already know where to find the controller. And it's not Unifi that's handling the DHCP negotiation, it's the router.
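For reference, Option 43 is a vendor-specific field, and for Unifi APs it is commonly set to sub-option 1 followed by the controller's IPv4 address, which only affects how an AP locates its controller. The sketch below builds that byte string under the assumption that this sub-option layout applies; the helper name and the controller address are hypothetical, and how the value is entered on a particular DHCP server is not shown.

```python
import ipaddress


def unifi_option43(controller_ip: str) -> str:
    """Build an Option 43 value as colon-separated hex bytes.

    Assumed layout: sub-option code 0x01, length 0x04, then the four
    bytes of the controller's IPv4 address.
    """
    packed = ipaddress.IPv4Address(controller_ip).packed
    payload = bytes([0x01, len(packed)]) + packed
    return ":".join(f"{b:02X}" for b in payload)


if __name__ == "__main__":
    # Hypothetical controller address, for illustration only.
    print(unifi_option43("192.168.3.10"))  # 01:04:C0:A8:03:0A
```

Since the option only carries a controller address for the APs, it would not be expected to change how a wireless client's own lease is negotiated.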


I've even seen others not bother with the Unifi controller at all and just provision the APs with a phone app. (I prefer the dashboard myself.) :)

I've been running these APs for two, maybe three years. The problem started when I switched away (pun intended) from SG300s to SG350s.

 

BCinBC
Level 1

I've also seen that the SG350X has a known SNMP problem, "switch crash with fatal error when SNMP polling is done regularly", and I've definitely been bitten by this, as I poll UPSs and other devices frequently.

 

The network continues to run fine as long as I don't try roaming between APs, with the resulting DHCP negotiation, over a LAG.

ohornig
Level 1

I am sad to say that I struggled with the same issue on a CBS350 earlier today.

At our customer's premises we are using stacked CBS350 switches and a number of Unifi APs.

When LACP is in operation between the switches, client roaming between APs on different switches does not work properly. First I checked the whole Spanning Tree configuration and switched the LAG load-balancing algorithm between MAC and IP/MAC. Nothing helped.

Finally, the LAG was removed and the two switches were connected with two separate links (one of them blocked by spanning tree). This setup seems to have helped: roaming is much more seamless, with nearly no downtime.
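As background on the MAC vs. IP/MAC load-balancing setting mentioned above: a LAG typically picks one member link per frame by hashing address fields, so every frame between the same pair of addresses uses the same link. The sketch below illustrates the idea only; the XOR-and-modulo hash is an assumption made for the example and is not the documented algorithm of the SG350 or CBS350, and the MAC addresses are made up.

```python
def mac_to_int(mac: str) -> int:
    """Convert a MAC address such as 'aa:bb:cc:00:00:01' to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def pick_member(src_mac: str, dst_mac: str, member_count: int) -> int:
    """Pick a LAG member index from the source and destination MACs.

    Illustrative hash only: XOR the two addresses and take the result
    modulo the number of member links.
    """
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % member_count


if __name__ == "__main__":
    client = "aa:bb:cc:00:00:01"   # hypothetical wireless client
    router = "aa:bb:cc:00:00:02"   # hypothetical router / DHCP server
    other = "aa:bb:cc:00:00:03"    # hypothetical second destination

    # The same address pair always maps to the same member link.
    print(pick_member(client, router, 2))
    # A different destination may land on the other link.
    print(pick_member(client, other, 2))
```

With a hash like this, the chosen link depends only on the addresses in the frame, not on which AP or switch the client is currently behind.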