Reason for packet drops
12-18-2013 01:28 PM - edited 03-07-2019 05:09 PM
Guys,
I have a Cisco 6500 with a gigabit link to a Foundry switch. Both ports are 1 Gig full duplex, flow control disabled, MTU 1500.
The Cisco is showing large amounts of packet drops on its interface; the Foundry is not seeing any. However, this is impacting performance. The amount of traffic is minimal, yet in 12 hours I had 20,000 packets dropped. I am running pings to the IP on the Foundry, and if I clear the counters on the Cisco port I can almost tie the ping failures up with the packets being dropped.
I want to know a way I can figure out why the packets are dropped - is it the rate of the packets perhaps? I could create a 2x 1 Gig LACP bundle, but doing so between Cisco and Foundry is a real pig. There is no config on the Cisco port, just switchport mode trunk and a list of VLANs.
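For reference, if I did bundle two ports, the Cisco side would be something along these lines (the port numbers and VLAN list here are just placeholders, not my actual config):
! LACP (802.3ad) bundle of two trunk ports - illustrative only
interface range GigabitEthernet1/1 - 2
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10,20,30
 switchport mode trunk
 channel-group 1 mode active
!
interface Port-channel1
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10,20,30
 switchport mode trunk
The Foundry end would need a matching LACP/802.3ad config, which is where it gets painful.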
CPU utilization for five seconds: 45%/37%; one minute: 35%; five minutes: 34%
Cisco6500#sh int g1/1
GigabitEthernet1/1 is up, line protocol is up (connected)
Hardware is C6k 1000Mb 802.3, address is xx
Description: link to foundry
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 8/255, rxload 5/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, media type is 10/100/1000BaseT
input flow-control is off, output flow-control is off
Clock mode is auto
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:23, output never, output hang never
Last clearing of "show interface" counters 00:05:04
Input queue: 0/2000/12/0 (size/max/drops/flushes); Total output drops: 4794
Queueing strategy: fifo
Output queue: 0/40 (size/max)
30 second input rate 21775000 bits/sec, 4566 packets/sec
30 second output rate 31511000 bits/sec, 4880 packets/sec
1384266 packets input, 771010734 bytes, 0 no buffer
Received 10087 broadcasts (3330 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 12 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
0 input packets with dribble condition detected
1404769 packets output, 1070775755 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
CPU usage 5% mem usage 30%
GigabitEthernet2 is up, line protocol is up
Hardware is GigabitEthernet, address is xxx
Configured speed 1Gbit, actual 1Gbit, configured duplex fdx, actual fdx
Member of 8 L2 VLANs, port is tagged, port state is FORWARDING
STP configured to ON, priority is level0, flow control disabled
mirror disabled, monitor disabled
Not member of any active trunks
Not member of any configured trunks
Port name is link to switch
MTU 1518 bytes, encapsulation ethernet
300 second input rate: 24985704 bits/sec, 4258 packets/sec, 2.55% utilization
300 second output rate: 18324784 bits/sec, 4252 packets/sec, 1.89% utilization
1447384920 packets input, 3900985418665 bytes, 0 no buffer
Received 40051164 broadcasts, 38039767 multicasts, 1369293989 unicasts
0 input errors, 0 CRC, 0 frame, 0 ignored
0 runts, 0 giants, DMA received 5742345498 packets
1278568924 packets output, 2734490474507 bytes, 0 underruns
Transmitted 46275661 broadcasts, 22822039 multicasts, 1209471224 unicasts
0 output errors, 0 collisions, DMA transmitted 5573536219 packets
12-18-2013 05:00 PM
I have plenty of free ports on the 6748 card (only 13 in use). Do you think that would be best?
12-18-2013 05:14 PM
I didn't see Jon's post, but I would also concur with his recommendation: move your link to the higher-performance 6748. That line card is better suited to higher-speed links, such as low-scale servers and switch uplinks.
12-19-2013 10:08 AM
No luck - I changed the port over. The stats are now looking clean; however, I'm still seeing 10% loss pinging over the link (link utilisation is less than 1%).
sh int g4/33
GigabitEthernet4/33 is up, line protocol is up (connected)
Hardware is C6k 1000Mb 802.3, address is
Description: uplink
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, media type is 10/100/1000BaseT
input flow-control is off, output flow-control is off
Clock mode is auto
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:48, output never, output hang never
Last clearing of "show interface" counters never
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
30 second input rate 3810000 bits/sec, 1310 packets/sec
30 second output rate 5565000 bits/sec, 1254 packets/sec
7210039 packets input, 2991599375 bytes, 0 no buffer
Received 132925 broadcasts (43769 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
0 input packets with dribble condition detected
7117699 packets output, 4482380996 bytes, 0 underruns
0 output errors, 0 collisions, 1 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
show count int g4/33
64 bit counters:
0. rxHCTotalPkts = 7294904
1. txHCTotalPkts = 7197135
2. rxHCUnicastPkts = 7159920
3. txHCUnicastPkts = 7038262
4. rxHCMulticastPkts = 44469
5. txHCMulticastPkts = 77175
6. rxHCBroadcastPkts = 90506
7. txHCBroadcastPkts = 81707
8. rxHCOctets = 3023920550
9. txHCOctets = 4525667280
10. rxTxHCPkts64Octets = 350
11. rxTxHCPkts65to127Octets = 6452090
12. rxTxHCPkts128to255Octets = 2067714
13. rxTxHCPkts256to511Octets = 1246597
14. rxTxHCpkts512to1023Octets = 802175
15. rxTxHCpkts1024to1518Octets = 2554189
16. txHCTrunkFrames = 7197006
17. rxHCTrunkFrames = 7294904
18. rxHCDropEvents = 0
32 bit counters:
0. rxCRCAlignErrors = 0
1. rxUndersizedPkts = 0
2. rxOversizedPkts = 0
3. rxFragmentPkts = 0
4. rxJabbers = 0
5. txCollisions = 0
6. ifInErrors = 0
7. ifOutErrors = 0
8. ifInDiscards = 0
9. ifInUnknownProtos = 0
10. ifOutDiscards = 0
11. txDelayExceededDiscards = 0
12. txCRC = 0
13. linkChange = 1
14. wrongEncapFrames = 0
All Port Counters
1. InPackets = 7284256
2. InOctets = 3019194572
3. InUcastPkts = 7149592
4. InMcastPkts = 44376
5. InBcastPkts = 90279
6. OutPackets = 7186667
7. OutOctets = 4518243338
8. OutUcastPkts = 7028077
9. OutMcastPkts = 77033
10. OutBcastPkts = 81566
11. AlignErr = 0
12. FCSErr = 0
13. XmitErr = 0
14. RcvErr = 0
15. UnderSize = 0
16. SingleCol = 0
17. MultiCol = 0
18. LateCol = 0
19. ExcessiveCol = 0
20. CarrierSense = 0
21. Runts = 0
22. Giants = 0
23. InDiscards = 0
24. OutDiscards = 0
25. InErrors = 0
26. OutErrors = 0
27. InUnknownProtos = 0
28. txCRC = 0
29. TrunkFramesTx = 7186538
30. TrunkFramesRx = 7284256
31. WrongEncap = 0
32. Broadcast_suppression_discards = 0
33. Multicast_suppression_discards = 0
34. Unicast_suppression_discards = 0
35. rxTxHCPkts64Octets = 350
36. rxTxHCPkts65to127Octets = 6441840
37. rxTxHCPkts128to255Octets = 2066039
38. rxTxHCPkts256to511Octets = 1245186
39. rxTxHCpkts512to1023Octets = 801178
40. rxTxHCpkts1024to1518Octets = 2549261
41. DropEvents = 0
42. CRCAlignErrors = 0
43. UndersizedPkts = 0
44. OversizedPkts = 0
45. FragmentPkts = 0
46. Jabbers = 0
47. Collisions = 0
48. DelayExceededDiscards = 0
49. bpduOutlost = 0
50. qos0Outlost = 0
51. qos1Outlost = 0
52. qos2Outlost = 0
53. qos3Outlost = 0
54. qos4Outlost = 0
55. qos5Outlost = 0
56. qos6Outlost = 0
57. qos7Outlost = 0
58. qos8Outlost = 0
59. qos9Outlost = 0
60. qos10Outlost = 0
61. qos11Outlost = 0
62. qos12Outlost = 0
63. qos13Outlost = 0
64. qos14Outlost = 0
65. qos15Outlost = 0
66. qos16Outlost = 0
67. qos17Outlost = 0
68. qos18Outlost = 0
69. qos19Outlost = 0
70. qos20Outlost = 0
71. qos21Outlost = 0
72. qos22Outlost = 0
73. qos23Outlost = 0
74. qos24Outlost = 0
75. qos25Outlost = 0
76. qos26Outlost = 0
77. qos27Outlost = 0
78. qos28Outlost = 0
79. qos29Outlost = 0
80. qos30Outlost = 0
81. qos31Outlost = 0
82. bpduCbicOutlost = 0
83. qos0CbicOutlost = 0
84. qos1CbicOutlost = 0
85. qos2CbicOutlost = 0
86. qos3CbicOutlost = 0
87. bpduInlost = 0
88. qos0Inlost = 0
89. qos1Inlost = 0
90. qos2Inlost = 0
91. qos3Inlost = 0
92. qos4Inlost = 0
93. qos5Inlost = 0
94. qos6Inlost = 0
95. qos7Inlost = 0
96. qos8Inlost = 0
97. qos9Inlost = 0
98. qos10Inlost = 0
99. qos11Inlost = 0
100. qos12Inlost = 0
101. qos13Inlost = 0
102. qos14Inlost = 0
103. qos15Inlost = 0
104. qos16Inlost = 0
105. qos17Inlost = 0
106. qos18Inlost = 0
107. qos19Inlost = 0
108. qos20Inlost = 0
109. qos21Inlost = 0
110. qos22Inlost = 0
111. qos23Inlost = 0
112. qos24Inlost = 0
113. qos25Inlost = 0
114. qos26Inlost = 0
115. qos27Inlost = 0
116. qos28Inlost = 0
117. qos29Inlost = 0
118. qos30Inlost = 0
119. qos31Inlost = 0
120. pqueInlost = 0
121. Overruns = 0
122. maxIndex = 0
12-19-2013 10:14 AM
Ryan
Can you try pinging from the Foundry switch to the 6500?
What is the CPU doing on the Foundry?
Also - can you try pinging from one host connected to the 6500 to one host connected to the Foundry and see what the results are?
Edit - sorry, are you still seeing an impact on overall performance?
Jon
12-19-2013 10:42 AM
show cpu-utilization
16 percent busy, from 1 sec ago
1 sec avg: 16 percent busy
5 sec avg: 14 percent busy
60 sec avg: 23 percent busy
300 sec avg: 24 percent busy
WSB1/1 peak: 47.8% in 24d4h, last sec: 5.2%, 5 sec: 4.7%, 60 sec: 4.8%, 300 sec: 4.9%
OK (I should have pointed out earlier that what I was testing was pinging a VIP). Looks like the link is working quite well now... just something haywire on the Foundry.
1) Ping from management interface on Cisco to management interface on Foundry
Success rate is 99 percent (99999/100000), round-trip min/avg/max = 1/1/1972 ms
2) Ping from management interface on Foundry to management interface on Cisco
Success rate is 100 percent (100000/100000), round-trip min/avg/max=0/0/1983 ms.
3) Ping from management interface on Cisco to VIP on Foundry
Success rate is 99 percent (4990/5000), round-trip min/avg/max = 1/1/932 ms
4) Ping from host off Cisco to management on Foundry
Packets: Sent = 94, Received = 94, Lost = 0 (0% loss)
5) Ping from host on Cisco to VIP on Foundry
Packets: Sent = 115, Received = 103, Lost = 12 (10% loss),
12-19-2013 10:56 AM
edited
12-18-2013 05:34 PM
Jon is correct to point at the 6148 as a potential bottleneck. Although the ports are gig, the architecture of the card isn't designed to support high bandwidth utilization on multiple ports (as noted by Leo), either ingress or egress. As Jon noted, that card is classic bus, which has many ramifications. Any classic bus card in a 6500 chassis drops the whole chassis performance from 30 Mpps to 15 Mpps (actually it's even worse than the numeric half; the fabric 30 Mpps applies to any size frame), and if you are moving data between a classic-bus-only card and a fabric-only card, such as your 6148 <> 6748, the transfer has to jump through the sup, as the two cards don't share a common data bus.
The 6148 card also has limited hardware buffers: 1 MB per 8 ports, vs. even the later 6148A, which has 2.67 MB per port.
What you should do is analyze how busy all your copper ports are and move the "busiest" 48 to the 6748, alternating by 24, i.e. busiest in g4/1, next in g4/25, then g4/2 and next in g4/26, etc.
Then take the remaining busiest (49th) and pair it with the least busy, and use ports g1/1 and g1/2. Then take the next busiest (50th) and least busy pair and use g2/1 and g2/2. Then the next busiest and least busy pair go to g1/9 and g1/10. The next pair g2/9 and g2/10. Next pair g1/17 and g1/18. Etc.
Once you hit every group of 8 on both cards, start again, still pairing next busiest with least busy, starting with ports g1/3 and g1/4, etc.
The foregoing will distribute your usage about as optimally as possible on your hardware. For better performance, you would need to consider hardware changes, such as either replacing the 6148s with better (i.e. fabric-capable) line cards, or pulling them from the chassis and "fanning out" your 6748 card's ports (EtherChanneled) to external switches.
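(To gauge "busiest", the standard interface load and drop counters are usually enough; for example, something like the following on the 6500 - the interface number is just an example and exact output varies by IOS version:
show interfaces summary
show interfaces counters
show interfaces GigabitEthernet1/1 | include rate
Compare the input/output rates and the output-drop counts across the copper ports and rank them from there.)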
12-18-2013 05:44 PM
Joseph / Leo
I just edited my post about the shared bus, but I take Joseph's point about communication between classic and fabric cards. I was more concerned with the 8:1 oversubscription ratio, as I see this as the limiting factor, i.e.
2 x WS-X6148 at 8:1 means 6 Gbps per card, so 12 Gbps, which is nowhere near the 32 Gbps, so I think the cards themselves are the limiting factor, together with the buffer sizes.
My question is about the port groupings. The docs say the WS-X6148 only has 2 port groupings: 1 - 24 and 25 - 48. Do you know how that ties in with the oversubscription ratio? I always thought the 8:1 ratio meant each group of 6 ports shares an ASIC and that the ASIC provided a 1 Gbps connection.
Do you think it's a typo in the docs and the WS-X6148 actually has the same port groupings as the WS-X6148A?
Jon
12-18-2013 06:45 PM
Jon, I too suspect the principal bottleneck is the port group. I recall on some 6500 line cards (6148s?) that share a port group (per ASIC), when you see (egress?) drops, you see the same count against all ports within the same group.
With the 8:1 oversubscription, and with only 16 KB of receive buffering per port (on the 6148), it would seem easy for there to be ingress drops. (Which is why I detailed how to better utilize the available copper ports.)
Yeah, I agree one 6148's 6 gig ASICs won't overrun the classic bus, but there are two, and I'm unsure what the FW card connections are. If you think there's still lots of headroom, remember the 32 Gbps classic bus is 16 Gbps (duplex). So in theory those cards may oversubscribe the bus, as could the 6748 alone, but like you, I think the issue is the port groups on the 6148.
Where did you see the reference for just two port groups for the 6148?
This reference: http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00801751d7.shtml#ASIC, which also describes the likely problem (and similar port loading as a solution), refers to 8 port groups for the 6148.
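If it helps narrow this down, the 6500 can also report whether inter-card traffic is using the fabric or the shared bus, and how loaded each is; something along these lines should show it (command availability depends on the supervisor and IOS release):
show fabric switching-mode
show fabric utilization all
show catalyst6000 traffic-meter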
12-19-2013 03:57 AM
Joseph
See this link -
I suspect it may be a typo, especially considering Ryan posted this -
0. ifOutDiscards (ports 17-24) = 151293200
which looks very much like a port grouping to me.
As for the FWSM, if I remember correctly I think that uses a dedicated connection to the switch fabric. But I agree, if there is a lot of traffic between the WS-X6148s and the WS-X6748, that could also have an impact on the shared bus.
Jon
12-19-2013 09:08 AM
Well, there's certainly a discrepancy between Cisco documents, as the reference I provided says 6 groups for the 6148.
12-19-2013 10:57 AM
Ryan
Okay, I can't say how the Foundry works, but when you ping from a host to a host, those packets should be hardware switched. When you ping an actual IP configured on the 6500, those packets have to be handled by the main CPU, so if the switch is busy they can be delayed or dropped.
So when you are testing performance, always try to ping from a connected device to another connected device rather than to IPs actually configured on the switches themselves.
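If you want to confirm that on the 6500, one quick check is to watch the CPU processes while the loss is happening; pings to the switch's own IP are punted to the CPU (typically the IP Input process), so something like this will show whether that process is spiking (standard IOS commands, output varies by release):
show processes cpu sorted | exclude 0.00
show processes cpu history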
Still, it looks better, as you say. I would try some ping tests between end devices, i.e. PCs, servers, etc., to get an idea of how well the link is performing.
Jon
12-19-2013 11:29 AM
OK, from a host on the Cisco to the firewall off the Foundry:
Packets: Sent = 10000, Received = 9994, Lost = 6 (0% loss)
So there is a bit of loss, but then from the same host to the VIP on the Foundry:
Packets: Sent = 10000, Received = 9931, Lost = 69 (0% loss)
CPU sitting at 24% as it runs. Do you concur the issue is on the Foundry?
12-19-2013 11:48 AM
Ryan
It is looking like the more likely problem, but I cannot say for sure how the Foundry processes pings to its own IP addresses.
Note also that if this is a LAN environment between devices connected to the switches, you really shouldn't see any packet loss.
We have already extensively covered what it could be on the 6500, so no need to repeat that other than to recommend moving all switch uplinks to the WS-X6748 to take some of the strain off the WS-X6148s, and perhaps moving other things around between ports if needed.
Unfortunately I can't help you with the other switch as I have no experience with them. I do not know if it could be caused by a similar sort of thing or not. I don't even know how much of the traffic is hardware switched compared to software switched.
As already mentioned though, if you are still getting performance issues then by all means look at the switches' CPU stats, but bear in mind, certainly from the 6500 perspective, that the CPU could be quite high and this does not necessarily reflect how well it is hardware switching packets.
So if the CPU is high and you ping an actual switch IP, it is not surprising to see packets dropped, and that is why a better test is end-to-end device connectivity.
Jon
