I have a very frustrating but also weird issue in my network.
When a mobile user roams between access points that are connected to different switches, the user can no longer communicate over the network.
2x access points with roaming support
2x SG350-28P switches, named sw5 and sw7
Switches are interconnected using 2x 1 Gbit optical fiber in a port-channel
LACP enabled on the port-channel with short timeout
Both APs share the same VLAN 8.
Static MAC addresses are not configured
Dynamic MAC address aging time is set to default of 300 sec
1. Client is connected to AP1 attached to sw5. He can access all resources.
2. Client moves into the area of AP2 attached to sw7. He cannot access any resource on the network.
3. After issuing a clear mac-address-table dynamic, the client can immediately access network resources.
Additional testing showed that while the client cannot access the network, the broadcasts this client sends are still coming through (proven by DHCP packets received at the DHCP server end). However, the client does not get the final returned DHCPACK.
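The symptom above (broadcasts from the client arrive, unicast replies do not, and clearing the table fixes it) is what you would expect from a stale MAC table entry. A minimal Python sketch of that reasoning follows. It is a toy model of a learning switch, not the SG350 firmware; the port names, MAC addresses and the `relearn=False` switch are hypothetical and only there to model the assumption that sw5 keeps pointing at the old AP1 port after the roam.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy MAC-learning switch; not a model of real SG350 behaviour."""

    def __init__(self):
        self.mac_table = {}  # MAC address -> port it was last learned on

    def forward(self, src, dst, in_port, relearn=True):
        # Normal switches (re)learn the source MAC on every frame.
        # relearn=False models the ASSUMED fault: an existing entry is
        # not moved to the new port after the client roams.
        if relearn or src not in self.mac_table:
            self.mac_table[src] = in_port
        # Broadcasts and unknown destinations are flooded to all ports.
        if dst == BROADCAST or dst not in self.mac_table:
            return "flood"
        return self.mac_table[dst]

    def clear_dynamic(self):
        # Rough equivalent of "clear mac-address-table dynamic".
        self.mac_table.clear()


CLIENT = "aa:aa:aa:aa:aa:aa"  # hypothetical addresses
SERVER = "bb:bb:bb:bb:bb:bb"

sw5 = LearningSwitch()
sw5.forward(SERVER, BROADCAST, "server_port")  # DHCP server is learned
sw5.forward(CLIENT, BROADCAST, "ap1_port")     # client sits on AP1

# Client roams to AP2 on sw7; its DHCP broadcast now enters sw5 via the LAG,
# but (by assumption) the old entry for the client is not updated.
discover = sw5.forward(CLIENT, BROADCAST, "lag_port", relearn=False)

# The server's unicast reply follows the stale entry towards AP1.
reply_stale = sw5.forward(SERVER, CLIENT, "server_port")

# After clearing the table the destination is unknown, so the reply is flooded
# and reaches the client through the LAG.
sw5.clear_dynamic()
reply_after_clear = sw5.forward(SERVER, CLIENT, "server_port")

print(discover, reply_stale, reply_after_clear)
```

In this model the broadcast is flooded and reaches the server, the reply is sent out of the AP1 port where the client no longer is, and after the clear the reply is flooded again. That matches the observations, but it does not say why the entry would fail to move in the first place.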
The log of sw5 is attached.
Thanks for any ideas which can solve this issue. Clients are complaining a lot :-(
Does nobody have a clue or suggestion?
I reduced the MAC aging time on all switches to 10 seconds. That restores L2 connectivity within 10 seconds of roaming, but it still does not help for a lot of clients.
For example, recent OSX versions fall back to a self-assigned 169.254 IP when there is no DHCP response, and do not auto-renew, leaving the client disconnected...
And lowering the MAC aging causes lots of additional flooding on the network, so it is not a real solution...
So I really need to get the issue fixed without lowering MAC aging.
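To make the aging trade-off concrete, here is a rough Python sketch. It assumes a simplified model that is not taken from the switch documentation: a quiet host transmits once every `host_talk_interval_s` seconds, one frame per second is addressed to it, and an entry expires once it has not been refreshed for `aging_s` seconds. The function name and all numbers are hypothetical.

```python
def count_floods(aging_s, host_talk_interval_s, duration_s):
    """Count frames to a host that must be flooded because its entry aged out."""
    last_seen = None  # time the host's MAC was last learned/refreshed
    floods = 0
    for t in range(duration_s):
        # The host transmits, so the switch refreshes its table entry.
        if t % host_talk_interval_s == 0:
            last_seen = t
        # One frame per second is addressed to the host; if the entry
        # has expired, the frame is flooded as unknown unicast.
        if last_seen is None or t - last_seen >= aging_s:
            floods += 1
    return floods


# Host talks every 30 s, observed for 5 minutes.
default_aging = count_floods(aging_s=300, host_talk_interval_s=30, duration_s=300)
short_aging = count_floods(aging_s=10, host_talk_interval_s=30, duration_s=300)
print(default_aging, short_aging)
```

With these made-up numbers, the 300 second default never floods, while a 10 second aging time floods 200 of the 300 frames, because the entry is missing for 20 seconds out of every 30.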
Thanks for any tips!
I believe we have the same problem.
As an experiment, try dropping the LAG between the devices and just use a single connection. That fixed my roaming problem. I wish I had my LAG back, but I now have happy users. The problem also seemed to stop when the APs were connected to the same switch.
I fully agree. Since reducing the LAG to only a single member, the problem does not happen very often anymore. However, I still have reports from users who claim that they lose connectivity for a minute from time to time. (I set the MAC aging time to 60 seconds to make the impact smaller.)
This is really a bug somewhere within the SG350 OS. Any chance that we will get a fix for this bug?
Interesting read. I don't have an answer, but I would like a little more info. Are you using both switches in L2 mode with a router? What router? Or is one of the switches in L3 mode with DHCP on the switch? I run all my local routing across my SG300-28 switch in L3 mode, so no router is involved other than for internet traffic. When I get my SG350 switches I plan to do the same thing, and I was thinking of using a LAG. I have 2 switches now but no LAG. I use a Cisco RV340 router with a 30-bit mask connected to my L3 switch.
I moved from two SG300-10s to the SG350-24X and the SG350-28 because the SG300s are going end of support as of pretty much now (Oct 2019). Also, the 10GbE was a nice bonus in the 24X.
I had the SG300s in L3 mode. The SG350s, from everything I've read, *ONLY* do L3 from the factory. There are no settings to change it.
Can you share the configuration of both switches?
1) Would you like my configurations too?
2) How would you like them shared? As anonymized (passwords removed) configuration files?
It's been a while (maybe 2 or 3 months) since I had the LAG active. I left the LAG configuration alone, set up another single-port trunk, moved one of the connectors over to it and disconnected both LAG links.
I just updated to the latest firmware yesterday. I'm hoping this stops the switches from crashing every 4-6 weeks.
The crashing did not stop after I stopped using the LAG. I know it's failing because my monitoring device, Domotz, starts getting errors communicating via SNMP with my APC UPSes.
Attached are the configs for the two switches.
All VLANs are isolated; they do not see each other. The exception is that VLAN 1 could, if it wanted, initiate connections to other VLANs.
You have a port-channel configured, but I didn't find any port assigned to it (same on both switches):
    interface Port-Channel1
     flowcontrol auto
     ip dhcp relay enable
     spanning-tree link-type point-to-point
     switchport mode trunk
     macro description "switch "
     !next command is internal.
     macro auto smartport dynamic_type switch
Can you show me which ports are added to this EtherChannel on both switches?
Should I remove this port-channel, and if so, how?
I'm not exactly sure how or why it's there, but in my frustration in trying to solve this issue, I may have tried something and created it.