
Random and Brief Network Outages

mohamed louhab
Level 1

I'm having an issue with my network: we are experiencing random, brief network outages. They happen a couple of times a day and last 5-10 seconds. When I check my two backbone switches (4506; Supervisor: WS-X4516-10GE; IOS: cat4500-ipbase-mz.122-31.SGA8.bin), STP remains normal and no topology change occurs.

I installed a network sniffer (Wireshark), and the captures show that broadcast and multicast traffic remain normal.
The only thing I noticed is that two servers exchange a lot of unicast traffic (18 gigabytes per server in 50 s).
Both servers are on the same 4506 switch.
Any help would be appreciated.
Thanks

6 Replies

Nathan Spitzer
Level 1

Couple of questions:

First, how many packets/sec are those two servers sending? The Sup V-10GE is limited to 136 Gbps and an actual forwarding rate of 102 million packets per second. Many people focus on the first number, but the real number is the second. As a possible example, let's say that the 18 GB of data is being sent in 1K chunks instead of 1.5K (the Ethernet MTU) chunks: 18 GB per minute = 2.4 Gbit/s = about 300,000 packets per second per server.
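To sanity-check that arithmetic, here is a small Python sketch. The 18 GB volume, the 50 s and 60 s windows, and the 102 Mpps figure come from this thread; the 1,000-byte and 1,500-byte packet sizes are the illustrative values from the example, and the function names are made up for this sketch. Under these assumptions, 1,000-byte packets over one minute work out to roughly 300,000 packets per second per server, a small fraction of the supervisor's forwarding rate.

```python
# Forwarding rate quoted in this thread for the Sup V-10GE.
SUP_FORWARDING_PPS = 102_000_000


def bits_per_second(total_bytes, seconds):
    """Average bit rate needed to move total_bytes in the given time."""
    return total_bytes * 8 / seconds


def packets_per_second(total_bytes, seconds, packet_bytes):
    """Average packet rate, assuming every packet carries packet_bytes."""
    return total_bytes / seconds / packet_bytes


if __name__ == "__main__":
    volume = 18 * 10**9  # 18 GB per server, as reported in the thread
    for seconds in (50, 60):
        for packet_bytes in (1000, 1500):
            pps = packets_per_second(volume, seconds, packet_bytes)
            share = pps / SUP_FORWARDING_PPS
            print(
                f"{seconds}s window, {packet_bytes}B packets: "
                f"{bits_per_second(volume, seconds) / 1e9:.2f} Gbit/s, "
                f"{pps:,.0f} pps ({share:.2%} of forwarding rate)"
            )
```

Real traffic has a mix of packet sizes, so the per-port packet counters on the switch are a better source than this average.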

Second, what kind of line cards are you using? Some line cards are oversubscribed. You could have an issue where one of those servers shares a port group with one or more uplinks or other critical ports. While the server is transmitting, that group essentially dies, taking other connections down.
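As a rough illustration of the port-group idea, the Python sketch below works out which ports would share a group with a given server port. The grouping used here (8 consecutive ports per group on a 48-port card with 6 Gbps to the backplane) is an assumption for illustration only; check the data sheet for your actual line card before relying on it. The function names are hypothetical.

```python
# ASSUMPTION: 8 consecutive front-panel ports share one backplane link.
# Verify the real grouping in the line card data sheet.
PORTS_PER_GROUP = 8


def port_group(port, ports_per_group=PORTS_PER_GROUP):
    """Zero-based index of the group a 1-based port number falls in."""
    if port < 1:
        raise ValueError("port numbers start at 1")
    return (port - 1) // ports_per_group


def shared_ports(port, total_ports=48, ports_per_group=PORTS_PER_GROUP):
    """Other ports that would compete with `port` for the same backplane link."""
    first = port_group(port, ports_per_group) * ports_per_group + 1
    last = min(first + ports_per_group - 1, total_ports)
    return [p for p in range(first, last + 1) if p != port]


def oversubscription_ratio(ports, port_gbps, backplane_gbps):
    """Total front-panel bandwidth divided by backplane bandwidth."""
    return ports * port_gbps / backplane_gbps


if __name__ == "__main__":
    print("Ports sharing a group with port 10:", shared_ports(10))
    print("Oversubscription:", oversubscription_ratio(48, 1, 6), ": 1")
```

If a busy server and an uplink land in the same group under the real mapping, moving one of them to a different group is a cheap test.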

Third, do any interfaces show errors? In particular, do any interfaces show output drops (indicating buffer or oversubscription issues), TX or RX errors, or collisions (duplex mismatch)?

Hi nspitzer5


Firstly, thank you for your help.
The problem occurred again today, and the traffic analysis showed nothing unusual for these two servers, yet the problem persisted. The slowness lasted 20 to 30 seconds. As before, the STP topology was stable and there was no unusual broadcast or multicast traffic.


Secondly, the line cards are:

Mod Ports Card Type                              Model
---+-----+--------------------------------------+------------------
 1     6  Sup V-10GE 10GE (X2), 1000BaseX (SFP)  WS-X4516-10GE
 2     6  1000BaseX (GBIC)                       WS-X4306-GB
 3     6  1000BaseX (GBIC)                       WS-X4306-GB
 4    48  10/100/1000BaseT (RJ45)                WS-X4548-GB-RJ45
 6    48  10/100/1000BaseT (RJ45)                WS-X4548-GB-RJ45


This just gives in detail the information that you already know.


Thirdly, some server interfaces do show errors, but the uplink interfaces are fine.



Thanks

Have a few questions:

Who is slow/down (users, servers,etc)?

Are all these devices in 1 VLAN/IP subnet, or are there multiple VLANs that are affected?

How many switches are part of the network that is slow?

Is this 4506 a core switch, or just access?

What are the ports on the 4506 that are affected with the slowness?

Is the CPU on the 4506 spiking during the slowness?

What is the interface utilization for each of the affected ports (make sure you set the load interval to 30 to get the best picture) when the slowness occurs?

Can you quantify "slow or outage"?  IE: loss of pings for x number of seconds, etc?

Do you have any QoS configured on the affected devices?

Can you post a "show int" for the server ports during the outage?

You can always PM me to discuss more in detail if you wish.
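For the interface utilization question, one way to get a number is to take two readings of an interface byte counter and convert the difference into a percentage of link speed. The Python sketch below is only an illustration: the function name and sample values are made up, and how you collect the counters (CLI, SNMP, or otherwise) is left open. The 30-second interval matches the load interval suggested above.

```python
def utilization_percent(bytes_start, bytes_end, seconds, link_bps, counter_bits=64):
    """Average link utilization between two byte-counter readings.

    The modulo handles a single counter wrap, which matters for
    32-bit counters on busy gigabit links.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta_bytes = (bytes_end - bytes_start) % (2 ** counter_bits)
    return delta_bytes * 8 / seconds / link_bps * 100


if __name__ == "__main__":
    # Hypothetical readings taken 30 seconds apart on a 1 Gbit/s port.
    pct = utilization_percent(0, 1_875_000_000, 30, 1_000_000_000)
    print(f"Average utilization: {pct:.1f}%")
```

Keep in mind that a 30-second average can hide a burst that only lasts a few seconds, so output drops on the same port are worth checking alongside it.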

Hi dbass,

Who is slow/down (users, servers,etc)?

Users

Are all these devices in 1 VLAN/IP subnet, or are there multiple VLANs that are affected?

There are multiple VLANs, but most users are in VLAN 1 (with the servers).

We are currently moving the users in VLAN 1 to the appropriate VLANs.

How many switches are part of the network that is slow?

All switches.

Is this 4506 a core switch, or just access?

There are two 4506s in the core and eight 4503s in the access layer.

What are the ports on the 4506 that are affected with the slowness?

I do not know

Is the CPU on the 4506 spiking during the slowness?

The CPU is normal during the slowness.

Can you quantify "slow or outage"?  IE: loss of pings for x number of seconds, etc?

Yes, there is a lot of ping loss when the slowness occurs, and sometimes the outage also affects the IP phones.

Do you have any QoS configured on the affected devices?

No, but I will configure QoS in the LAN for VoIP.

Thank you for your help.
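To put numbers on the ping loss, a timestamped ping log can be reduced to outage windows. The Python sketch below assumes you already have a list of (timestamp in seconds, reply received) pairs in time order; the function name, the input format, and the two-probe threshold are all assumptions made for this example.

```python
def outage_windows(samples, min_lost=2):
    """Find runs of consecutive lost probes.

    samples: iterable of (timestamp_seconds, ok) pairs in time order.
    Returns a list of (first_lost_ts, last_lost_ts, lost_count) for every
    run of at least min_lost consecutive failures.
    """
    windows = []
    run_start = None
    run_last = None
    lost = 0
    for ts, ok in samples:
        if not ok:
            if run_start is None:
                run_start = ts
            run_last = ts
            lost += 1
        else:
            if run_start is not None and lost >= min_lost:
                windows.append((run_start, run_last, lost))
            run_start = None
            lost = 0
    # Close a run that is still open at the end of the log.
    if run_start is not None and lost >= min_lost:
        windows.append((run_start, run_last, lost))
    return windows


if __name__ == "__main__":
    # Hypothetical one-probe-per-second log.
    log = [(0, True), (1, False), (2, False), (3, False), (4, True)]
    for start, end, count in outage_windows(log):
        print(f"outage from t={start}s to t={end}s ({count} probes lost)")
```

Running probes toward several targets (a server, the HSRP gateway, a host on another access switch) and comparing the windows can help narrow down where the loss occurs.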

Do you run HSRP between your two 4506 core switches?  If you do, does HSRP fail over during your outage?  Do you have spanning tree configured so that your core switches are root and secondary root, and that all of the access switches are a much higher priority?

As a general rule, never have servers in the same VLAN as your users, and you shouldn't be using VLAN 1 either.  I would change the VLAN number from 1 to something else ASAP, as well as moving the users into their own VLANs.

In the switch logs do you see any MAC addresses flapping between ports?

Also, do your access switches have redundant connections to your core switches? Are they connected directly to the core switches or are they daisy chained off of one another?

Are the access switches all in the same building as the core switch?

Hi ,

Do you HSRP between your 2 4506 core switches?  If you do, does HSRP fail over during your outage?

Yes, I have HSRP between the two core switches. I don't know whether HSRP fails over, because during the outage I don't have Telnet access to the switch.

Do you have spanning tree configured so that your core switches are root and secondary root, and that all of the access switches are a much higher priority?

Yes, STP is configured properly, and it does not fail during the outage. All the access switches have a higher priority.

In the switch logs do you see any MAC addresses flapping between ports?

No, and I saw that STP is stable.

Also, do your access switches have redundant connections to your core switches? Are they connected directly to the core switches or are they daisy chained off of one another?

The access switches have redundant connections to my core switches through the fabric links (GBIC).

Are the access switches all in the same building as the core switch?

Yes.

Thank you for your help.
