07-27-2025 11:25 PM
Hello!
There is a problem in the network. A C9500-40X core serves only the user segment of the LAN (desktop PCs, IP phones, printers, air conditioners, UPSs, video cameras, etc.). Eleven floors are connected to it over MM fiber with two Cisco SFP-10G-SR-S modules each, and every floor has a stack of two or three C9300-48P switches. Several neighboring buildings are also connected over SM fiber with two Cisco SFP+ 10G LRM modules each. The interfaces are aggregated everywhere.
Everything was always fine, but for a month now the connection to the fifth floor has been lost about once a week, sometimes during the day, sometimes at night. I replaced the SFP modules on both the floor and the core, replaced the optical patch cord, and rebooted the fifth-floor stack. When the connection is lost, the channel status on the stack shows the aggregated interfaces as suspended. For now I have removed the aggregation between the floor and the core. Last night the connection was lost again, but it restored itself after four minutes.
The network core is the VTPv3 primary server for the domain; all other switches are clients. In general, there are no network-wide problems. I cannot tell whether there is a loop somewhere or whether the problem is in the optical line between the floor and the server room. The building is new; it was put into operation less than three years ago.
On July 22, I first removed the aggregation on the fifth-floor stack and then began removing it on the core, without physically disconnecting the fiber. The user segment of the network went down and came back up immediately after I physically disconnected the fiber to the fifth floor. What broadcast/multicast level is considered abnormal if there is a loop in the network?
C9300 port settings:
interface GigabitEthernet1/0/1-48
 description --Client access--
 switchport access vlan 30
 switchport mode access
 switchport nonegotiate
 switchport voice vlan 111
 load-interval 60
 udld port aggressive
 storm-control broadcast level 20.00
 storm-control multicast level 87.00 65.00
 storm-control unicast level 87.00 65.00
 spanning-tree portfast
 spanning-tree bpduguard enable
!
interface TenGigabitEthernet3/1/8
 description -to C9500-Core-
 switchport trunk native vlan 100
 switchport mode trunk
 switchport nonegotiate
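On the question of what counts as abnormal: no single figure applies to every network, so a reasonable first step is to compare live counters with the storm-control thresholds already configured above. A minimal sketch using standard IOS XE show commands, run on the fifth-floor stack; the interface name is taken from the configuration above, and which ports to look at first is an assumption:

```
! Current traffic level versus the configured storm-control thresholds
show storm-control broadcast
show storm-control multicast
! Unicast, multicast and broadcast packet counters on the uplink
show interfaces TenGigabitEthernet3/1/8 counters
! Averaged input/output rates and the received-broadcast count
show interfaces TenGigabitEthernet3/1/8 | include rate|broadcast
```

Sampling these a few times during normal operation gives a baseline, so a sudden jump on several ports at once is easier to recognize.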
07-28-2025 02:30 AM
@eugeneworon Start by examining the logs on all Cisco switches from the time of the problems. To do that productively, configure all Cisco devices to send their logs to a central syslog server, because problem indicators often show up before something goes down.
M.
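A minimal sketch of the central logging suggested above, assuming a syslog collector at 192.0.2.10 (a documentation-range placeholder, not an address from this thread). The buffer size is likewise an assumed value, chosen only so that local logs survive longer than with the default:

```
! Millisecond timestamps make it easier to correlate events across switches
service timestamps log datetime msec localtime show-timezone
! Hypothetical collector address; replace with the real syslog server
logging host 192.0.2.10
logging trap informational
! Larger local buffer (size in bytes, assumed value)
logging buffered 1000000 informational
```

Applying the same lines on the core and on every floor stack keeps the timeline comparable when an outage has to be reconstructed afterwards.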
07-28-2025 05:44 AM
Can you draw the topology?
MHM
07-28-2025 11:15 PM
A brief network diagram
07-29-2025 01:24 AM
From your original post: when this link goes down, all traffic is affected.
That is an STP issue.
So let's start. It appears that this floor switch is the root for your whole domain. That is wrong: the root must be the core switch, not an access switch.
Check this point.
MHM
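One way to pin the root on the core, sketched under the assumption that the switches run Rapid PVST+ (the "rstp" mentioned in the thread) and that every other switch keeps the default priority of 32768. The VLAN range is illustrative:

```
! On the C9500 core only (global configuration)
spanning-tree vlan 1-4094 priority 4096
!
! From any switch (exec mode): confirm which bridge is root for each VLAN
show spanning-tree root
! Show when the last topology change occurred and which port it came from
show spanning-tree detail | include occurr|from|is exec
```

If the root ID shown on the fifth-floor stack is not the core's bridge ID for a VLAN, that VLAN is still rooted elsewhere.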
07-29-2025 01:35 AM
Now, after the core is set as root, check each port-channel (Po) from each floor toward the core.
Check with:
show spanning-tree interface <port-channel> detail | include BPDU
See which Po is sending BPDUs toward the core; that Po and its switch have the loop.
MHM
07-28-2025 01:31 PM - edited 07-28-2025 01:36 PM
Hello
By the sounds of it, you do not have aggregation (port-channel) between the core and the access stacks, just single trunk interconnects. Is this correct?
Most importantly, you need to make sure the core switch stack is the spanning-tree root for the estate and that the core and access stacks are running the same STP mode.
Can you confirm which STP mode (MST, RSTP, PVST+) you are running?
Also check whether errdisable recovery is enabled: make sure features such as UDLD, BPDU guard, link flap, and port-security violations are not set to auto-recover.
Make sure trunking runs only where it should, meaning all active access ports are in administrative mode access, so DTP is turned off and PortFast is enabled.
I would possibly suggest applying port security on the access ports as well.
Is there any log buffer from the 5th-floor stack or the core switch stack, covering the time of your outages, that you can share?
Edited: I forgot to ask, were any upgrades applied to the switches in the past month?
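The errdisable and trunking checks above can be made with standard show commands. A short sketch; nothing here changes configuration:

```
! Which errdisable causes are set to auto-recover, and the recovery timer
show errdisable recovery
! Which causes are being detected at all
show errdisable detect
! Ports currently shut down by one of these features
show interfaces status err-disabled
! Ports that are actually trunking
show interfaces trunk
```

A port that is auto-recovering from errdisable can look like a link that "came back by itself after a few minutes", so the recovery timer is worth comparing with the observed four-minute outage.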
07-28-2025 11:26 PM
@paul driver wrote: By the sounds of it, you do not have aggregation (port-channel) between the core and the access stacks, just single trunk interconnects. Is this correct?
Yes, but a week ago it was aggregated. As I already mentioned, I removed the aggregation.
@paul driver wrote: Most importantly, you need to make sure the core switch stack is the spanning-tree root for the estate and that the core and access stacks are running the same STP mode.
Can you confirm which STP mode (MST, RSTP, PVST+) you are running?
All switches in the network run RSTP.
Yes, I have been thinking for a long time about enabling it on access ports everywhere; it is already configured on some interfaces in the network.
@paul driver wrote: Is there any log buffer from the 5th-floor stack or the core switch stack, covering the time of your outages, that you can share?
The log buffer had already been overwritten by the time I went to look. It is the default size; can it be increased?
@paul driver wrote: Edited: I forgot to ask, were any upgrades applied to the switches in the past month?
A small number of the switches were updated at the end of December 2024. All the rest (at least 50 units) were updated on May 1, 2025, to ver. 12.17.04.
07-29-2025 12:11 AM - edited 07-29-2025 12:17 AM
Hello
Apart from what @Leo Laohoo has requested: your core switch is NOT the STP root, and I would say it needs to be.
Also, save the running config of the core stack to a file and attach it to your post.
07-28-2025 06:25 PM - edited 07-28-2025 06:27 PM
@eugeneworon wrote:
the connection with the fifth floor is lost, maybe during the day, maybe at night.
Please elaborate on the statement "the connection with the fifth floor is lost". When this happens, what state are the uplink LEDs in (on the switch, toward the core switch): no LED, amber, or green?
Are the uplinks going to two switches of the stack or just one?
What firmware is the 9500 and the 9300 on?
Please provide the complete output of the following commands (taken from the 9500):
show controll e Ten 2/1/8
show controll e Ten 3/1/8
07-28-2025 11:32 PM
@Leo Laohoo wrote:
@eugeneworon wrote:
the connection with the fifth floor is lost, maybe during the day, maybe at night.
Please elaborate on the statement "the connection with the fifth floor is lost". When this happens, what state are the uplink LEDs in (on the switch, toward the core switch): no LED, amber, or green?
Amber, blinking every 1-2 seconds. Remote access to the stack was lost and the network on the fifth floor stopped working. The devices could not be pinged.
@Leo Laohoo wrote:
Are the uplinks going to two switches of the stack or just one?
They connect to two different switches in the core stack. A brief network diagram is shown in the screenshot.
@Leo Laohoo wrote:
What firmware is the 9500 and the 9300 on?
Please provide the complete output of the following commands (taken from the 9500):
show controll e Ten 2/1/8
show controll e Ten 3/1/8
07-29-2025 12:15 AM
@eugeneworon wrote:
Amber, blinking every 1-2 seconds.
Post the interface configuration from the 9500 and from the 9300.
Uplinks going amber without any sign of CRC or duplex-mismatch errors sounds like EtherChannel is doing something dodgy.
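When the uplinks are bundled, the EtherChannel state can be read directly on either end. A minimal sketch; the thread does not say whether the bundle used LACP, PAgP, or static mode, so the LACP command is an assumption:

```
! Per-member bundle state: (P) bundled in port-channel, (s) suspended
show etherchannel summary
! LACP partner details (only meaningful if the bundle negotiates with LACP)
show lacp neighbor
```

Comparing the output on the 9500 with that on the 9300 shows whether both ends agree on the members and the protocol.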
07-29-2025 01:28 AM
07-29-2025 01:45 AM
There ya go! None of those uplinks are in an EtherChannel.
Is this intentional?
07-29-2025 02:15 AM
@
Yes. It worked for three years without problems. This month it started failing, so I had to remove the port aggregation to check, because without physical intervention there was no telling when the link would come back up. In this form (without aggregation) it worked for a week without failures. Last night the link on both ports went down, but after four minutes it came back up on its own.