03-08-2008 07:04 AM - edited 03-05-2019 09:37 PM
I have two Dell 2950-III servers, each with dual quad-core
3.0GHz processors, 8GB RAM and a 1TB SATA drive. Dell_1 has an
IP address of 192.168.1.10/24 and Dell_2 has an IP address
of 192.168.1.20/24. Both Dells are connected to
a Cisco Catalyst 2960 switch over copper Gig. I am running
Red Hat Linux ES 3 on these servers. I hard-coded the
interfaces to 1000/full.
When I perform FTP between the servers, I can get about
800Mbps throughput. That's the good part.
Now, I have a Cisco 2621 (64MB RAM/16MB flash) with both
F0/0 and F0/1 connected to the Catalyst 2960. The router is
running IOS version 12.3(24). I set the interfaces on both
the router and the Catalyst to 100 full-duplex.
I gave F0/0 192.168.1.1/24 and F0/1 192.168.2.1/24. I
gave Dell_2 192.168.2.10/24 with its default gateway set to
192.168.2.1. Dell_1's default gateway is 192.168.1.1.
My FTP transfer peaks at 5Mbps between
Dell_1 and Dell_2 across the 2621. The CPU on the Cisco
2621 peaks at 99% utilization. I see no
errors on either the Catalyst switchports or the router
interfaces. I thought I could get much better than
5Mbps throughput out of the Cisco 2621. With
either SecureFTP (sFTP) or SecureCopy (scp), the throughput drops to 2Mbps.
In other words, it gets worse.
Does anyone know what throughput to expect from a Cisco 2621 router?
IOS on the router is c2600-ik9o3s3-mz.123-24a.bin.
Thanks.
03-09-2008 11:36 AM
What do you think might account for all the broadcasts seen on fastE 1/0? They look to be about 21% of the inbound packet count on that interface.
I'm wondering whether the router, as a host processing broadcasts, is what might be consuming the CPU.
03-09-2008 02:36 PM
The Linux servers are running Samba (aka Microsoft
file-sharing services). They are sending
broadcast WINS and other chatty Microsoft
protocols. That should NOT have any effect on
the router.
If what you said were true, the same thing would
apply if I replaced the router with a
Checkpoint firewall. When I replace the
router with a Checkpoint firewall, I get wire
speed file transfer at 90 Mbps. I would think
that I would get better throughput on the
router than on a firewall, since firewalls are
stateful by nature while the router is
just forwarding packets. That's my assumption.
However, the opposite is true. Weird.
03-09-2008 02:52 PM
Ok, let's review what you are seeing.
1) You mentioned you are peaking at 5Mbps
2) The router's CPU is pegged at 100%
3) Based on links provided by other members, it seems IP Input is the cause of the high CPU utilization
4) The link lists oversubscription of the router as one of the causes
5) The router is rated at 12Mbps for IP-only traffic with a 64-byte packet size
Now... Based on the output you posted from one of the interfaces.
5 minute input rate 4170000 bits/sec, 390 packets/sec
5 minute output rate 123000 bits/sec, 217 packets/sec
Let's do the math to see what the packet size is, using the input rate values.
4170000 / 390 = 10692.31 bits per packet, which we convert to Bytes
10692.31 / 8 = 1336 Bytes average for each packet.
Your packets are over 20 times larger than the 64-byte size used for the spec rate, so you should reduce the rated figure accordingly.
We come to the conclusion that a 2621 maxes out at 5Mbps when the average packet size is 1336 Bytes.
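For anyone who wants to repeat this arithmetic on other interface counters, here is a minimal Python sketch. The function name is just an illustrative choice, and the two rate pairs are the ones quoted from the interface output above.

```python
def average_packet_bytes(bits_per_sec: float, packets_per_sec: float) -> float:
    """Average packet size in bytes implied by an interface's rate counters."""
    if packets_per_sec <= 0:
        raise ValueError("packets_per_sec must be positive")
    # bits per second / packets per second = bits per packet; / 8 = bytes
    return bits_per_sec / packets_per_sec / 8


# 5 minute input rate 4170000 bits/sec, 390 packets/sec
avg_in = average_packet_bytes(4_170_000, 390)
# 5 minute output rate 123000 bits/sec, 217 packets/sec
avg_out = average_packet_bytes(123_000, 217)

print(f"average input packet:  {avg_in:.1f} bytes")
print(f"average output packet: {avg_out:.1f} bytes")
```

The same function applied to the output rate gives a much smaller average, which would be consistent with that direction carrying mostly acknowledgements, though that is only a guess from the counters.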
HTH,
__
Edison.
03-09-2008 03:01 PM
My understanding is that, as packet size increases with a constant PPS rate, effective transfer bandwidth should increase. The increase isn't exactly linear, but the table I provided in a prior post (from Cisco) has the required PPS rates to obtain Ethernet gig line rates for different packet sizes. (You can scale it by dividing by 10 for Fast Ethernet.)
03-09-2008 03:10 PM
Is that based on tests you've done?
The PPS rating for a 2621 is 25,000 based on a 64-byte packet size. That's how they get the 12.80Mbps figure.
The math is as follows:
25,000 * 64 = 1,600,000 Bytes Per Second
Convert to Bits
1600000 * 8 = 12,800,000 Bits Per Second
You are saying that if you have a packet 20 times bigger, you still maintain the same PPS? I doubt it.
I suggest the OP lower the packet size at the application layer and see if it improves the throughput.
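The conversion above can be written as a small Python sketch. The function name is made up for illustration; the 25,000 PPS and 64-byte values are the ones quoted in this post. The second call only shows what a constant PPS rate would imply at the 1336-byte average seen earlier, which is exactly the assumption being questioned here, not a measured or rated value.

```python
def throughput_mbps(pps: float, packet_bytes: int) -> float:
    """Throughput in Mbps for a given packet rate and packet size."""
    return pps * packet_bytes * 8 / 1_000_000


# Rated figure: 25,000 PPS at 64-byte packets
rated = throughput_mbps(25_000, 64)

# Hypothetical only: the same 25,000 PPS held at 1336-byte packets.
# The result exceeds the 100 Mbps interface speed, so the link itself
# would become the limit long before this number.
if_pps_constant = throughput_mbps(25_000, 1336)

print(f"rated at 64 bytes:           {rated:.1f} Mbps")
print(f"constant PPS at 1336 bytes:  {if_pps_constant:.1f} Mbps")
```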
03-09-2008 03:31 PM
Personally tested, no. Nor could I guarantee, nor do I actually know, a 2621 will maintain its PPS rate regardless of actual packet size. However, effective bandwidth usually jumps with increased packet size.
PPS for 64 byte packets is often quoted since, for IP, it normally represents the worst case for providing line rate. (For 100 Mbps Ethernet, 64 byte packets require about 148,809 PPS; 1500 byte packets only require about 8,100 PPS.)
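Those line-rate figures can be reproduced with a short Python sketch. It assumes standard Ethernet framing: each frame costs an extra 20 bytes on the wire (8 bytes of preamble plus 12 bytes of inter-frame gap), and a 1500 byte IP packet travels in a 1518 byte frame (14 byte header plus 4 byte FCS). The constant and function names are illustrative.

```python
# Per-frame wire overhead outside the frame itself (assumed standard Ethernet):
# 8 bytes preamble/SFD + 12 bytes inter-frame gap
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Packets per second needed to fill a link with frames of a given size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


fast_ethernet = 100_000_000  # 100 Mbps

small = line_rate_pps(fast_ethernet, 64)    # minimum-size Ethernet frame
large = line_rate_pps(fast_ethernet, 1518)  # frame carrying a 1500 byte IP packet

print(f"64 byte frames:   {small:,.0f} PPS")
print(f"1518 byte frames: {large:,.0f} PPS")
```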
The table I drew the gig Ethernet PPS rates from can be found: http://www.cisco.com/en/US/products/hw/modules/ps2643/products_white_paper09186a0080091db8.shtml
What's interesting in Table 1 and Table 2 is that we do see the actual PPS rate fall with increased packet sizes, but the necessary PPS often decreases even more. So, the graphs show a higher percentage of theoretical line rate being achieved as packet size increases. (Your mileage, err bandwidth, might vary.)
03-09-2008 04:07 PM
You are basing this argument on the 7500 with a VIP? LOL - talk about comparing apples and oranges.
The bottom line is this: we've circulated the router performance document many times. No need to repost it. That document publishes the numbers and the values that were used to come up with them.
On a regular 2621 router (No VIP, No Layer3 Distributed Switching), you get a maximum of 12.80Mbps based on 25k PPS * 64Byte Size.
03-09-2008 04:59 PM
Edison,
I don't disagree about the math on how the reference sheet you refer to arrives at 12 Mbps for the 2621 for fast path switching. Nor do I know anything beyond the sheet's 25 Kpps rating for a 2621's fast path switching; except for the much, much lower rating for process switching.
I wasn't directly comparing a 2621 against a 7500, either. What I was arguing was that effective bandwidth normally improves as packet size increases (if for no other reason than the improved ratio between actual payload and packet/frame overhead). What the 7500 VIP document showed was both the reduced PPS requirement to obtain line rate as packet size increases and one example of the impact this has on a particular piece of hardware.
Bottom line: 12.8 Mbps for 64 byte packets should represent the worst case, not necessarily the best case.
I see David did try your suggestion to decrease packet size to 64 bytes and saw a 5:1 performance reduction. Although I wouldn't have been able to predict from my argument whether there would be any reduction, I'm not surprised there was. I would have been surprised if there was an improvement.
[edit]
PS:
BTW: I am surprised from time to time. ;)
03-09-2008 03:12 PM
"It is sending broadcast WINS and other chatty Microsoft protocols. That should NOT have any effect on the router."
Well, the link both Glen and I provided contains, under:
IP Input
Traffic that cannot be interrupt-switched arrives
Broadcast traffic
Check the number of broadcast packets in the show interfaces command output. If you compare the amount of broadcasts to the total amount of packets that were received on the interface, you can gain an idea of whether there is an overhead of broadcasts. If there is a LAN with several switches connected to the router, then this can indicate a problem with Spanning Tree.
So, I'm not certain it wouldn't have any effect on the router. (What's the CPU load on the router when there is no transfer?)
With regard to router vs. your Checkpoint firewall, perhaps apples to oranges comparison of hardware capabilities?
One thing you might also try, if your IOS supports it, is using CoPP.
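On the broadcast check quoted above (comparing broadcasts to total received packets from show interfaces), the comparison is just a ratio. Here is a minimal Python sketch; the function name is illustrative and the counter values are hypothetical, chosen only to land near the roughly 21% figure mentioned earlier in the thread, not taken from this router.

```python
def broadcast_share(input_packets: int, broadcasts: int) -> float:
    """Fraction of received packets that were broadcasts."""
    if input_packets <= 0:
        raise ValueError("input_packets must be positive")
    return broadcasts / input_packets


# Hypothetical counter values, NOT taken from the router in this thread.
share = broadcast_share(input_packets=100_000, broadcasts=21_000)
print(f"broadcast share of input packets: {share:.1%}")
```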
03-09-2008 04:05 PM
Very interesting discussion.
"If there is a LAN with several switches connected to the router,
then this can indicate a problem with Spanning Tree."
There are NO other switches. The test was done on a single
Catalyst 2960 switch.
"(What's the CPU load on the router when there is no transfer?)"
The load is 0% when there is no transfer
"The PPS rated for a 2621 is 25,000 based on 64Byte
packet size. That's how they get the 12.80Mbps figure."
Well, I used a program called "Iperf" to test the throughput.
When I lower the packet size to 64 bytes, the throughput goes
from 5Mbps down to 1Mbps.
Any more thoughts or ideas folks?
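For comparison against the 25,000 PPS rating, the packet rates implied by these two measurements can be worked out with a small Python sketch. It assumes the sizes are the full packet sizes on the wire (with Iperf the configured size may actually be the payload length, which would shift the numbers somewhat), and the function name is illustrative.

```python
def implied_pps(throughput_bps: float, packet_bytes: int) -> float:
    """Packets per second implied by a measured throughput and packet size."""
    return throughput_bps / (packet_bytes * 8)


# Measured in this thread: about 1 Mbps at 64 bytes, about 5 Mbps at ~1336 bytes
pps_small = implied_pps(1_000_000, 64)
pps_large = implied_pps(5_000_000, 1336)

print(f"64 byte test:   ~{pps_small:,.0f} PPS")
print(f"1336 byte test: ~{pps_large:,.0f} PPS")
```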
03-09-2008 04:21 PM
Ok, clear the counters, run an FTP session, then post the show buffers output from this router.
Thanks
03-09-2008 04:45 PM
C2621#sh buffers
Buffer elements:
1118 in free list (1000 max allowed)
4698929 hits, 0 misses, 1119 created
Public buffer pools:
Small buffers, 104 bytes (total 84, permanent 50, peak 114 @ 09:38:29):
73 in free list (20 min, 150 max allowed)
5951510 hits, 925 misses, 196 trims, 230 created
131 failures (0 no memory)
Middle buffers, 600 bytes (total 81, permanent 25, peak 104 @ 23:34:47):
79 in free list (10 min, 150 max allowed)
768179 hits, 310 misses, 158 trims, 214 created
168 failures (0 no memory)
Big buffers, 1536 bytes (total 63, permanent 50, peak 63 @ 00:00:31):
29 in free list (5 min, 150 max allowed)
2230804 hits, 301 misses, 6 trims, 19 created
152 failures (0 no memory)
VeryBig buffers, 4520 bytes (total 11, permanent 10, peak 11 @ 23:35:43):
11 in free list (0 min, 20 max allowed)
127 hits, 25 misses, 1 trims, 2 created
25 failures (0 no memory)
Large buffers, 5024 bytes (total 1, permanent 0, peak 1 @ 23:35:43):
1 in free list (0 min, 10 max allowed)
1 hits, 24 misses, 8 trims, 9 created
24 failures (0 no memory)
Huge buffers, 18024 bytes (total 1, permanent 0, peak 1 @ 23:35:43):
1 in free list (0 min, 4 max allowed)
0 hits, 24 misses, 8 trims, 9 created
24 failures (0 no memory)
Interface buffer pools:
CD2430 I/O buffers, 1536 bytes (total 0, permanent 0):
0 in free list (0 min, 0 max allowed)
0 hits, 0 fallbacks
Header pools:
Header buffers, 0 bytes (total 137, permanent 128, peak 137 @ 23:35:59):
9 in free list (10 min, 512 max allowed)
125 hits, 3 misses, 0 trims, 9 created
0 failures (0 no memory)
128 max cache size, 128 in cache
73 hits in cache, 0 misses in cache
Particle Clones:
1024 clones, 0 hits, 0 misses
Public particle pools:
F/S buffers, 256 bytes (total 384, permanent 384):
128 in free list (128 min, 1024 max allowed)
256 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
256 max cache size, 256 in cache
0 hits in cache, 0 misses in cache
Normal buffers, 1548 bytes (total 512, permanent 512):
384 in free list (128 min, 1024 max allowed)
320 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
128 max cache size, 128 in cache
0 hits in cache, 0 misses in cache
Private particle pools:
FastEthernet0/0 buffers, 1548 bytes (total 192, permanent 192):
0 in free list (0 min, 192 max allowed)
192 hits, 0 fallbacks
192 max cache size, 128 in cache
677145 hits in cache, 0 misses in cache
FastEthernet0/1 buffers, 1548 bytes (total 192, permanent 192):
0 in free list (0 min, 192 max allowed)
192 hits, 0 fallbacks
192 max cache size, 128 in cache
6021707 hits in cache, 0 misses in cache
C2621#
03-09-2008 05:48 PM
You have some misses but nothing to be concerned about.
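To put "some misses" in proportion, here is a rough Python sketch that computes the miss ratio per public pool. The hit and miss counts are copied from the show buffers output above; the function name and the choice of pools are just for illustration, and this is not an official threshold check.

```python
# (hits, misses) copied from the public buffer pools in the output above
pools = {
    "Small": (5_951_510, 925),
    "Middle": (768_179, 310),
    "Big": (2_230_804, 301),
}


def miss_ratio(hits: int, misses: int) -> float:
    """Share of buffer requests that missed the free list."""
    total = hits + misses
    return misses / total if total else 0.0


ratios = {name: miss_ratio(h, m) for name, (h, m) in pools.items()}
for name, ratio in ratios.items():
    print(f"{name:<6} buffers: {ratio:.4%} of requests missed")
```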
I find it interesting that once you changed the packet size from a 1336 Byte average to 64 Bytes, the throughput dropped from 5Mbps to 1Mbps.
The 1Mbps figure is listed as the maximum throughput for a 2621 when using process switching.
You mentioned you have CEF enabled on the interfaces; can we quickly verify that with show ip cef output?
Thanks
03-09-2008 05:08 PM
David,
The point about broadcasts was their possible impact to the router's performance. The Cisco mention of Spanning Tree concerns a possible source.
Other ideas? Yep, one interesting one, I think.
Can you place your 2621 behind the Checkpoint firewall so that the 2621 doesn't see any broadcasts from the destination network?
[edit]
PS:
There's one possible, very simple explanation: the PPS rating we're using doesn't accurately reflect the performance of the box.
However, I still suspect the high process switching ratio is causing the box to peak its CPU too soon. The issue is finding the cause and, once we identify it, a workaround.
03-09-2008 06:17 PM
"Can you place your 2621 behind the Checkpoint firewall so that the 2621 doesn't see any broadcasts from the destination network?"
Yes, I can do that, but I don't think that will
help because Checkpoint ClusterXL itself also
uses multicast/broadcast between cluster members
to maintain state, called Cluster Control
Protocol (CCP). Similar to VRRP, but not quite.
Therefore, the router would see this traffic
as well.
Edison,
I just logged off from my test network, but I
can assure you that I have CEF enabled as follows:
config t
ip cef
interface F0/0
ip cef
interface F0/1
ip cef
end
wr mem
Regarding the 1Mbps throughput, unless I am
mistaken, the smaller the packet size, e.g.
64 bytes, the higher the CPU usage on the router.
The larger the packet size, let's say
1300 bytes, the lower the CPU usage on the router,
unless you exceed 1460 or 1480 bytes.
Therefore, what I am seeing at 64 bytes, 1Mbps
throughput, is normal, isn't it?