Hi all,
Our client replaced the aging core switches at one location with 10 Cisco SG350X-48P switches organized in 3 stacks: stack1 and stack2 have 4 switches each, and stack3 has 2. Stacks 2 and 3 each connect to stack1 through two 10G links bundled with LACP. Stack1 was configured as the core "router" (layer 3 forwarder), and it did not do the job properly.

We verified that the VLANs were configured correctly at layer 2 and saw no issues across the 3 stacks, but layer 3 forwarding on stack1 was spotty at best: connections dropped at random or could not be established at all. There are 115 VLANs, of which 26 have layer 3 interfaces on stack1. As far as we can tell, the firmware is the latest version: 2.5.5.47.

We tried increasing the Max IP Entries TCAM resource by reducing the IPv4 and IPv6 policy-based routing entries to 0 (they are not used in this context). We call this a trick because we did not find a more direct way to manage TCAM resources.

To prove to ourselves that the issue is layer 3 forwarding, we "borrowed" a stack of 2 Cisco C3650s from another project, which we call stack0. We moved layer 3 forwarding to stack0 and converted stack1 to layer 2 only. Layer 3 forwarding now behaves as expected on stack0 (the C3650s).
Hope all this makes sense. Is this incorrect behavior a firmware bug, a bad configuration, or a misuse of the SG350X-48P switches? In other words, are we asking too much of this model?