01-07-2013 01:23 AM
Hi,
Lately I've noticed some strange behavior on some of the switchports.
When I go through the logs of my SGE2000/2010 stack, I see that some of the ports randomly lose their connection:
2147482703 | 05-Jan-2013 04:11:43 | Warning | %LINK-W-Down: 2/g14 |
2147482704 | 05-Jan-2013 03:35:20 | Warning | %STP-W-PORTSTATUS: 2/g33: STP status Forwarding |
2147482705 | 05-Jan-2013 03:34:50 | Informational | %LINK-I-Up: 2/g33 |
2147482706 | 05-Jan-2013 03:34:47 | Warning | %LINK-W-Down: 2/g33 |
2147482707 | 05-Jan-2013 03:34:19 | Informational | %LINK-I-Up: 2/g33 |
2147482708 | 05-Jan-2013 03:34:17 | Warning | %LINK-W-Down: 2/g33 |
2147482709 | 05-Jan-2013 03:34:15 | Informational | %LINK-I-Up: 2/g33 |
2147482710 | 05-Jan-2013 03:34:14 | Warning | %LINK-W-Down: 2/g33 |
2147482711 | 05-Jan-2013 03:34:12 | Warning | %STP-W-PORTSTATUS: 1/g15: STP status Forwarding |
2147482712 | 05-Jan-2013 03:33:42 | Informational | %LINK-I-Up: 1/g15 |
2147482713 | 05-Jan-2013 03:33:40 | Warning | %LINK-W-Down: 1/g15 |
2147482714 | 05-Jan-2013 03:33:20 | Warning | %STP-W-PORTSTATUS: 1/g15: STP status Forwarding |
2147482715 | 05-Jan-2013 03:32:50 | Informational | %LINK-I-Up: 1/g15 |
2147482716 | 05-Jan-2013 03:32:47 | Warning | %LINK-W-Down: 1/g15 |
2147482717 | 05-Jan-2013 03:31:48 | Warning | %STP-W-PORTSTATUS: 2/g5: STP status Forwarding |
2147482718 | 05-Jan-2013 03:31:18 | Informational | %LINK-I-Up: 2/g5 |
I'm having trouble locating the source of the problem. The devices connected to the port are servers and desktops.
This happens frequently throughout the day, but not always on the same ports.
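To see which ports drop most often, the exported log can be tallied per port. Below is a minimal Python sketch, assuming the pipe-separated log format shown above; the function name `count_link_downs` and the `sample` excerpt are only illustrative.

```python
import re
from collections import Counter

# Matches the port name in lines such as:
# 2147482708 | 05-Jan-2013 03:34:17 | Warning | %LINK-W-Down: 2/g33 |
LINK_DOWN = re.compile(r"%LINK-W-Down:\s*(\S+)")


def count_link_downs(log_text):
    """Return a Counter mapping port name -> number of LINK-W-Down events."""
    return Counter(match.group(1) for match in LINK_DOWN.finditer(log_text))


# A few lines copied from the log above (assumed to be saved as plain text).
sample = """\
2147482703 | 05-Jan-2013 04:11:43 | Warning | %LINK-W-Down: 2/g14 |
2147482706 | 05-Jan-2013 03:34:47 | Warning | %LINK-W-Down: 2/g33 |
2147482707 | 05-Jan-2013 03:34:19 | Informational | %LINK-I-Up: 2/g33 |
2147482708 | 05-Jan-2013 03:34:17 | Warning | %LINK-W-Down: 2/g33 |
2147482710 | 05-Jan-2013 03:34:14 | Warning | %LINK-W-Down: 2/g33 |
2147482713 | 05-Jan-2013 03:33:40 | Warning | %LINK-W-Down: 1/g15 |
"""

# Print the ports ordered by number of link-down events, worst first.
for port, downs in count_link_downs(sample).most_common():
    print(port, downs)
```

Running this over a full day of logs should show whether the drops cluster on a few ports or are spread evenly.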
What could cause the random drops?
Thanks in advance!
01-07-2013 05:30 AM
Hi Davy, looks like you've got a stack. The stack implementation of the older SFE/SGE switches wasn't very good and does have some stability issues.
The common causes for ports to go up/down may include
If it is at all possible, I'd break the stack and run the switches standalone. I would attribute 90% of the problems to the stack. Most of the time it's just that, unfortunately.
If you'd like to troubleshoot based on the 5 points listed above, you can make sure your root bridges are set correctly, to avoid max age timer updates causing drops in the CAM tables.
You may also manually set port speeds/negotiation to see if that stabilizes the connection. Discovery protocols like Bonjour can cause unexpected errors, so you may want to disable them.
If the switches have a really heavy load or high CPU/memory use, you may try removing a few connections. If the switches are operating in layer 3 mode, you may be experiencing SFFT overflow errors since the software can't route fast enough.
Of course, it could always be a firmware issue. Make sure you're on the latest!
-Tom
Please mark answered for helpful posts
01-07-2013 07:13 AM
Hi Tom,
First of all, thanks for the reply!
I will try your suggestions and will give feedback on it asap.
Our firmware is indeed outdated, so I'll give that a shot first.
01-16-2013 01:37 AM
Hi,
I've tried the answers you suggested, but so far I've been out of luck.
We do have some stand-alone SGE2000 switches in our network as well.
They've been showing the same behavior as the stacks:
2147483044 | 15-Jan-2013 13:26:43 | Warning | %STP-W-PORTSTATUS: g4: STP status Forwarding |
2147483045 | 15-Jan-2013 13:26:41 | Warning | %LINK-W-Down: g4 |
2147483046 | 15-Jan-2013 08:30:07 | Informational | %LINK-I-Up: g4 |
2147483047 | 15-Jan-2013 08:30:07 | Warning | %STP-W-PORTSTATUS: g4: STP status Forwarding |
2147483048 | 15-Jan-2013 08:30:04 | Warning | %LINK-W-Down: g4 |
2147483049 | 15-Jan-2013 08:30:04 | Informational | %LINK-I-Up: g4 |
2147483050 | 15-Jan-2013 08:30:04 | Warning | %STP-W-PORTSTATUS: g4: STP status Forwarding |
2147483051 | 15-Jan-2013 08:30:02 | Warning | %LINK-W-Down: g4 |
We do have a lot of STP topology changes when I check it in the properties screen.
Might this be the cause of it?
And if so, how can I troubleshoot this?
Root bridge elections are all in order and the max age timer is set to 20 seconds.
Also, our last topology change was 3 days ago, but we get these random port drops every day.
01-16-2013 05:50 AM
Hi Davy, each switch has a default bridge priority of 32768. What you want to do is give the head-most switch a bridge priority of 4096, then the next in line 8192, the next in line 12288, etc., incrementing by 4096. Additionally, you may try to globally filter BPDUs.
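The numbering scheme can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not switch configuration; the helper names and switch labels are hypothetical, and the 0 to 61440 range in steps of 4096 is the standard set of valid STP bridge priorities.

```python
STEP = 4096          # valid bridge priorities are multiples of 4096
MAX_PRIORITY = 61440  # highest valid bridge priority


def is_valid_priority(priority):
    """Return True if the value is a valid STP bridge priority."""
    return 0 <= priority <= MAX_PRIORITY and priority % STEP == 0


def assign_priorities(switches, start=STEP):
    """Map switches (head-most first) to priorities start, start+4096, ..."""
    priorities = {}
    for index, name in enumerate(switches):
        priority = start + index * STEP
        if not is_valid_priority(priority):
            raise ValueError(f"no valid priority left for {name}")
        priorities[name] = priority
    return priorities


# Hypothetical switch labels, ordered from the intended root downwards.
print(assign_priorities(["stack-1", "stack-2", "standalone-1"]))
```

The lowest value wins the root election, so the head-most switch gets the smallest number.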
-Tom
Please mark answered for helpful posts
01-16-2013 06:04 AM
Tom,
This has already been configured.
Our first stack is the root bridge with 24576
Our backup root bridge has 28672
Our other switches (stand-alone SGE2000s) are configured as 32768.
I have configured BPDU filtering instead of flooding on all our switches as well.
I've added a picture to give you a better view of the topology:
01-21-2013 02:10 AM
Hi,
An update on the situation so far.
Setting the ports to a static speed instead of auto-negotiation seems to have helped on our stand-alone switches!
The problem still persists on some of the ports on the stacks though.
This raised a few questions:
Thanks in advance!
01-21-2013 05:16 AM
Hey Davy,
Thanks for the couple questions back. I'm not sure I'll give you the greatest answer but I will try.
Auto-negotiation can be affected by a myriad of things (and some may seem silly...). One is the switch being gigE and the NIC being 100: if the NIC is not advertising its capabilities, it is up to the switch to figure out what the NIC is doing, which can lead to a duplex mismatch. This is rarely seen on gigE links between node and switch, since half duplex is practically never used at gigabit speeds. It can also be the media used. Cat5 is rated for 100 Mbit, while Cat5e and Cat6 are both rated for gigE, so it may be a marginal or damaged cable run in between that is giving the fits. I'd recommend testing the suspect runs and replacing them, preferably with Cat6.
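The duplex mismatch case can be modelled roughly as follows. This is a simplified Python sketch of 10/100 behavior, assuming that an auto-negotiating port facing a forced port falls back to half duplex and that two auto-negotiating ports both advertise full duplex; the function names are hypothetical.

```python
VALID_SETTINGS = {"auto", "full", "half"}


def resolve_duplex(side_a, side_b):
    """Return the duplex each side ends up with, as (duplex_a, duplex_b).

    Each argument is the configured setting: 'auto', 'full' or 'half'.
    """
    if side_a not in VALID_SETTINGS or side_b not in VALID_SETTINGS:
        raise ValueError("settings must be 'auto', 'full' or 'half'")
    if side_a == "auto" and side_b == "auto":
        # Assumption: both ends advertise full duplex and agree on it.
        return ("full", "full")
    if side_a == "auto":
        # The forced side sends no negotiation, so the auto side uses half.
        return ("half", side_b)
    if side_b == "auto":
        return (side_a, "half")
    # Both sides forced: each simply uses its own setting.
    return (side_a, side_b)


def has_mismatch(side_a, side_b):
    """Return True if the two ends would run with different duplex."""
    duplex_a, duplex_b = resolve_duplex(side_a, side_b)
    return duplex_a != duplex_b


# Switch left on auto, NIC forced to full duplex: the classic mismatch.
print(has_mismatch("auto", "full"))
```

The practical takeaway is to configure both ends the same way: either both on auto or both forced to the same speed and duplex.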
Second question: if you break the stack, the topology doesn't have to change. I do recommend a couple of redundant links somewhere, just in case there is a layer 1 break, and let spanning-tree be spanning-tree. You never want a switch isolated due to wiring issues.
Last one, L3 mode: there is no performance benefit from the switch's point of view. If you don't need the switches to route, don't use it. If your router is overloaded, making the switch L3 will alleviate the router load, so that only traffic that needs a router resource (such as internet) is sent to the router.
-Tom
Please mark answered for helpful posts
01-21-2013 07:18 AM
Hi Tom,
We use Cat5e cabling throughout the building, so it could indeed be the wiring.
Anyways, thanks a lot for your time and help!
I've marked this question as answered! :-)