06-23-2016 01:02 AM - edited 03-08-2019 06:20 AM
We have a problem between four Catalyst 3750 switches: we have suddenly detected heavy packet loss on several point-to-point connections between them.
I have attached a JPG with the network diagram.
Red lines are the connections with problems. Green lines have no problems.
For example:
P28TAMERO05#ping 10.25.110.6 repeat 100 size 500 source 10.25.110.5
Type escape sequence to abort.
Sending 100, 500-byte ICMP Echos to 10.25.110.6, timeout is 2 seconds:
Packet sent with a source address of 10.25.110.5
...!!!!...!!.!!!.!!.!.!!!!.!!!!.!!!.!.!!.!!!.!!!.!.!.!!!!.!!!!.!!!.!!!
!.!!.!!.!!!!..!!!!..!!!..!!!.!
Success rate is 68 percent (68/100), round-trip min/avg/max = 8/11/42 ms
P28TAMERO05#
P28TAMERO05#ping 10.25.110.10 repeat 100 size 500 source 10.25.110.9
Type escape sequence to abort.
Sending 300, 500-byte ICMP Echos to 10.25.110.10, timeout is 2 seconds:
...........................!!..........!!!.!!!!...!!.......!!!.!!!.!..
.!...!.
Success rate is 25 percent (20/77), round-trip min/avg/max = 8/12/25 ms
P28TAMERO05#
P28TAMERO05#ping 10.25.110.58 repeat 100 size 500 source 10.25.110.57
Type escape sequence to abort.
Sending 100, 500-byte ICMP Echos to 10.25.110.58, timeout is 2 seconds:
Packet sent with a source address of 10.25.110.57
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (100/100), round-trip min/avg/max = 8/13/26 ms
P28TAMERO05#
We have similar outputs in the rest of the switches.
The circuits are of different types: two are PDH circuits and the others are Carrier Ethernet E-Lines. We have reviewed the circuits and they do not show any problems.
The user VLANs behind the switches don't have connectivity problems. For example:
P28TAMERO05#ping 10.25.111.111 size 500 source 10.25.115.1 repeat 300
Type escape sequence to abort.
Sending 300, 500-byte ICMP Echos to 10.25.111.111, timeout is 2 seconds:
Packet sent with a source address of 10.25.115.1
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (300/300), round-trip min/avg/max = 8/13/34 ms
P28TAMERO05#
We think there may be a problem with the control plane or with how the switch handles this traffic (or it may be unrelated and we are wrong), but we don't know how to verify this or what the best solution would be. The loss also makes managing the devices very difficult, because management access is constantly interrupted.
09-26-2016 07:59 AM
Hi all,
In the end, the problem turned out to be bug CSCub04965 - TCP Session hung causing Packet Loss
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCub04965/?reffering_site=dumpcr
Regards
06-23-2016 04:39 AM
Do you have any messages in your log about MAC address flapping?
Have you enabled spanning tree, and consistently? Don't intermix rapid-stp and mstp.
Have client ports got bpduguard enabled?
Are any of the WAN circuits (or internal links) running into capacity issues?
Have you checked the interfaces between switches for errors (crc, framing, drops)?
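A rough sketch of how these checks might be run from the switch CLI. The interface name GigabitEthernet1/0/1 is a placeholder (it does not come from this thread), and the exact output, and possibly the availability of some commands, can vary by IOS release.

```
! MAC flapping is normally logged with a MACFLAP notification
show logging | include MACFLAP
! Spanning-tree mode in use and global PortFast/BPDU guard defaults
show spanning-tree summary
! Recent topology changes and the port they were received from
show spanning-tree detail | include ieee|occurr|from|is exec
! Per-port configuration, to confirm BPDU guard on a client port
show running-config interface GigabitEthernet1/0/1
! Error counters for all ports, then load and drops on one link
show interfaces counters errors
show interfaces GigabitEthernet1/0/1 | include rate|drops|CRC
```

Running the same commands on both ends of a red link and on one end of a green link should make any difference easier to spot.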
06-23-2016 04:54 AM
Hello, here are the answers to your questions:
Do you have any messages in your log about MAC address flapping? No, there are no such messages in the log.
Have you enabled spanning tree, and consistently? Don't intermix rapid-stp and mstp. Yes, it is consistent. We only run RSTP.
Have client ports got bpduguard enabled? Yes.
Are any of the WAN circuits (or internal links) running into capacity issues? No.
Have you checked the interfaces between switches for errors (crc, framing, drops)? Yes, and there are no errors on the interfaces between the switches.
06-23-2016 06:36 AM
Do you see this behavior all the time, most of the time, or occasionally?
What is the circuit type, and the subscribed bandwidth, for 10.25.110.10? Can you post "show interface" for that end and the .9 end of the link? Is there any possibility that high-volume replication tasks are running when you run into issues? Remember that ping traffic is typically low priority; however, that much loss indicates congestion or overload of some kind.
"sh mem" and "sh proc" on the two ends might also shed some light on processor and memory use. If you haven't reset the switches for a while, you might have a memory leak that leaves insufficient memory for buffers.
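As a sketch, the data asked for above could be collected on both ends with something like the following. GigabitEthernet1/0/1 is again a placeholder for the routed link towards 10.25.110.10, and some of these commands may behave differently depending on the IOS release.

```
! Load, queue drops and errors on the link itself
show interfaces GigabitEthernet1/0/1
! CPU load per process, and the trend over the last minutes and hours
show processes cpu sorted
show processes cpu history
! Free memory per pool and the largest memory consumers
show memory statistics
show processes memory sorted
! Buffer misses and failures, which can point to buffer starvation
show buffers
! Packets punted to the CPU and any drops per CPU queue (Catalyst 3750)
show controllers cpu-interface
! Uptime and running software version
show version
```

Pings sourced from or destined to the switch's own addresses are handled by the CPU, so loss that affects only those addresses, while user VLAN traffic is clean, would be consistent with a CPU or control-plane problem rather than a circuit problem.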