Maximum throughput on Cisco 2621 router

cisco24x7
Level 6

I have two Dell 2950-III servers, each with dual quad-core 3.0 GHz processors, 8 GB RAM, and a 1 TB SATA drive. Dell_1 has an IP address of 192.168.1.10/24 and Dell_2 has an IP address of 192.168.1.20/24. Both Dells are connected to a Cisco Catalyst 2960 switch over copper Gigabit. I am running Red Hat Linux ES 3 on these servers, and I hard-coded the interfaces to 1000/full.

When I run FTP between the servers, I get about 800 Mbps throughput. That's the good part.

Now, I have a Cisco 2621 (64 MB RAM/16 MB flash) with both F0/0 and F0/1 connected to the Catalyst 2960. The router is running IOS version 12.3(24). I set both the router interfaces and the Catalyst ports to 100/full duplex. I gave F0/0 192.168.1.1/24 and F0/1 192.168.2.1/24. I gave Dell_2 192.168.2.10/24 with a default gateway of 192.168.2.1. Dell_1's default gateway is 192.168.1.1.

My FTP transfer peaks at 5 Mbps between Dell_1 and Dell_2 across the 2621, and the CPU on the 2621 peaks at 99% utilization. I see no errors on either the Catalyst switch ports or the router interfaces. I thought I could get much better than 5 Mbps out of the Cisco 2621. With either Secure FTP (sftp) or Secure Copy (scp), the throughput drops to 2 Mbps. In other words, it gets worse.

Does anyone know what the maximum throughput of a Cisco 2621 router is?

IOS on the router is c2600-ik9o3s3-mz.123-24a.bin.

Thanks.

31 Replies

What do you think might account for all the broadcasts seen on fastE 1/0? They look to be about 21% of the inbound packet count on that interface.

I'm wondering whether the router, as a host processing broadcasts, is what might be consuming the CPU.

The Linux servers are running Samba (i.e., Microsoft file-sharing services), which sends broadcast WINS and other chatty Microsoft protocols. That should NOT have any effect on the router.

If what you said were true, the same thing would apply if I replaced the router with a Checkpoint firewall. When I replace the router with a Checkpoint firewall, I get wire-speed file transfer at 90 Mbps. I would have thought I would get better throughput on the router than on a firewall, since firewalls are stateful by nature while the router is just forwarding packets. That's my assumption. However, the opposite is true. Weird.

Ok, let's review what you are seeing.

1) You mentioned you are peaking at 5 Mbps

2) The router's CPU is pegged at 100%

3) Based on the links provided by other members, it seems IP Input is the cause of the high CPU utilization

4) The link lists oversubscription of the router as one of the causes

5) The router is rated at 12 Mbps when using IP only and a 64-byte packet size

Now... Based on the output you posted from one of the interfaces.

5 minute input rate 4170000 bits/sec, 390 packets/sec

5 minute output rate 123000 bits/sec, 217 packets/sec

We do the math to see what the packet size is, taking the input rate values.

4170000 / 390 = 10692.31 bits per packet; we convert that to bytes

10692.31 / 8 = 1336 bytes average for each packet.

You are sending over 20 times the packet size used for the spec rate, so the rated figure should be reduced accordingly.

We come to the conclusion that a 2621 maxes out at 5 Mbps when the average packet size is 1336 bytes.
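
The arithmetic above can be reproduced with a short script. This is only a sketch: the function name is made up for illustration, and the inputs are the 5 minute rates quoted from the interface output in this post.

```python
def average_packet_bytes(bits_per_sec: float, packets_per_sec: float) -> float:
    """Average packet size in bytes implied by a bit rate and a packet rate."""
    return bits_per_sec / packets_per_sec / 8


# 5 minute input rate 4170000 bits/sec, 390 packets/sec
input_avg = average_packet_bytes(4_170_000, 390)
# 5 minute output rate 123000 bits/sec, 217 packets/sec
output_avg = average_packet_bytes(123_000, 217)

print(f"input:  {input_avg:.0f} bytes/packet")
print(f"output: {output_avg:.0f} bytes/packet")
print(f"input packets are about {input_avg / 64:.1f}x the 64-byte reference size")
```

The output direction works out to roughly 71 bytes per packet, which would be consistent with it carrying mostly TCP acknowledgements for the transfer, although that is an inference and not something shown in the posted output.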

HTH,

__

Edison.

My understanding is that, as packet size increases at a constant PPS rate, effective transfer bandwidth should increase. The increase isn't exactly linear, but the table I provided in a prior post (from Cisco) has the required PPS rates to obtain Ethernet gig line rates for different packet sizes. (You can scale it down, dividing by 10, for Fast Ethernet.)

Is that based on tests you've done?

The PPS rating for a 2621 is 25,000, based on a 64-byte packet size. That's how they get the 12.80 Mbps figure.

Math as follows:

25,000 * 64 = 1,600,000 Bytes Per Second

Convert to Bits

1600000 * 8 = 12,800,000 Bits Per Second
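
The same conversion, written as a small helper. This is a sketch for checking the figures; the function name is illustrative, and the 25,000 PPS and 64-byte values are the ones quoted in this post.

```python
def throughput_bps(packets_per_sec: int, packet_bytes: int) -> int:
    """Bits per second carried at a given packet rate and packet size."""
    return packets_per_sec * packet_bytes * 8


rated_bps = throughput_bps(25_000, 64)
print(f"{rated_bps:,} bits per second = {rated_bps / 1_000_000:.2f} Mbps")
```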

You are saying that if you have a packet 20 times bigger, you still maintain the same PPS? I doubt it.

I suggest the OP lower the packet size at the application layer and see whether it improves the throughput.

Personally tested, no. Nor could I guarantee, nor do I actually know, a 2621 will maintain its PPS rate regardless of actual packet size. However, effective bandwidth usually jumps with increased packet size.

PPS for 64-byte packets is often quoted because, for IP, it normally represents the worst case for providing line rate. (100 Mbps Ethernet requires about 148,809 PPS with 64-byte packets; 1500-byte packets require only about 8,100 PPS.)
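
Those line-rate figures can be checked with the sketch below. It assumes the standard Ethernet per-frame overhead of 20 bytes on the wire (8 bytes of preamble/SFD plus a 12-byte inter-frame gap), treats 64 bytes as the full frame size, and treats 1500 bytes as the IP packet carried in a 1518-byte frame (14-byte header plus 4-byte FCS). The names are illustrative.

```python
# Bytes on the wire per frame beyond the frame itself:
# 8 bytes preamble/SFD + 12 bytes inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Packets per second needed to fill a link with frames of one size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


FAST_ETHERNET_BPS = 100_000_000

pps_min_frame = line_rate_pps(FAST_ETHERNET_BPS, 64)
pps_full_frame = line_rate_pps(FAST_ETHERNET_BPS, 1500 + 14 + 4)

print(f"64-byte frames:    {pps_min_frame:,.0f} PPS")
print(f"1500-byte packets: {pps_full_frame:,.0f} PPS")
```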

The table I drew the gig Ethernet PPS rates from can be found: http://www.cisco.com/en/US/products/hw/modules/ps2643/products_white_paper09186a0080091db8.shtml

What's interesting in table 1 and table 2, we do see the actual PPS rate fall with increased packet sizes, but the necessary PPS often decreases even more. So, the graphs show a higher percentage of theoretical line rate being achieved as packet size increases. (Your mileage, err bandwidth, might vary.)

You are basing this argument on the 7500 with a VIP? LOL - talk about comparing apples and oranges.

The bottom line is this: we've circulated the router performance document many times. No need to repost it. That document publishes the numbers and the values that were used to come up with them.

On a regular 2621 router (no VIP, no Layer 3 distributed switching), you get a maximum of 12.80 Mbps based on 25k PPS * 64-byte packets.

Edison,

I don't disagree about the math on how the reference sheet you refer to arrives at 12 Mbps for the 2621 for fast path switching. Nor do I know anything beyond the sheet's 25 Kpps rating for a 2621's fast path switching; except for the much, much lower rating for process switching.

I wasn't directly comparing a 2621 against a 7500, either. What I was arguing was that effective bandwidth normally improves as packet size increases (if for no other reason than the improved ratio of actual payload to packet/frame overhead). What the 7500 VIP document showed was both the reduced PPS requirement to obtain line rate as packet size increases and one example of the impact that has on a particular piece of hardware.

Bottom line: 12.8 Mbps for 64-byte packets should represent the worst case, not necessarily the best case.

I see David did try your suggestion to decrease the packet size to 64 bytes and saw a 5:1 performance reduction. Although I wouldn't have been able to predict from my argument whether there would be any reduction, I'm not surprised there was. I would have been surprised if there had been an improvement.

[edit]

PS:

BTW: I am surprised from time to time. ;)

"It is sending broadcast WINS and other chatty Microsoft protocols. That should NOT have any effect on the router."

Well, the link that both Glen and I provided contains the following under:

IP Input

Traffic that cannot be interrupt-switched arrives

Broadcast traffic

Check the number of broadcast packets in the show interfaces command output. If you compare the amount of broadcasts to the total amount of packets that were received on the interface, you can gain an idea of whether there is an overhead of broadcasts. If there is a LAN with several switches connected to the router, then this can indicate a problem with Spanning Tree.
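
As a sketch of the comparison Cisco describes, the broadcast share is just the broadcast counter divided by the total input packet counter from show interfaces. The counter values below are hypothetical placeholders chosen to match the roughly 21% mentioned earlier in the thread; they are not taken from the router's actual output, and the function name is illustrative.

```python
def broadcast_share(broadcasts: int, total_input_packets: int) -> float:
    """Fraction of received packets that were broadcasts."""
    if total_input_packets <= 0:
        raise ValueError("total_input_packets must be positive")
    return broadcasts / total_input_packets


# Hypothetical counters, for illustration only.
share = broadcast_share(broadcasts=2_100, total_input_packets=10_000)
print(f"broadcasts are {share:.0%} of input packets")
```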

So, I'm not certain it wouldn't have any effect on the router. (What's the CPU load on the router when there is no transfer?)

With regard to the router vs. your Checkpoint firewall, perhaps it's an apples-to-oranges comparison of hardware capabilities?

One thing you might also try, if your IOS supports it, is using CoPP.

Very interesting discussion.

"If there is a LAN with several switches connected to the router,

then this can indicate a problem with Spanning Tree."

There are NO other switches. The test was done on a single Catalyst 2960 switch.

"(What's the CPU load on the router when there is no transfer?)"

The load is 0% when there is no transfer

"The PPS rated for a 2621 is 25,000 based on 64Byte

packet size. That's how they get the 12.80Mbps figure."

Well, I used a program called "Iperf" to test the throughput. When I lower the packet size to 64 bytes, the throughput goes from 5 Mbps down to 1 Mbps.
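
For comparison with the 25,000 PPS rating, here is a rough sketch of the packet rates these two results imply. It assumes the quoted sizes (1336 bytes average for the FTP test, 64 bytes for the Iperf test) are the on-wire packet sizes, which may not be exactly true for Iperf since its size setting may refer to payload only. The function name is illustrative.

```python
def implied_pps(throughput_bps: float, packet_bytes: int) -> float:
    """Packet rate implied by an observed throughput and packet size."""
    return throughput_bps / (packet_bytes * 8)


RATED_PPS = 25_000  # fast-switching rating quoted for the 2621 in this thread

pps_large = implied_pps(5_000_000, 1336)
pps_small = implied_pps(1_000_000, 64)

print(f"5 Mbps at 1336 bytes: {pps_large:,.0f} PPS")
print(f"1 Mbps at 64 bytes:   {pps_small:,.0f} PPS")
print(f"share of rated PPS:   {pps_large / RATED_PPS:.1%} and {pps_small / RATED_PPS:.1%}")
```

Under those assumptions, both results sit far below the quoted rating.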

Any more thoughts or ideas folks?

OK, clear the counters, run an FTP session, then post the show buffers output from this router.

Thanks

C2621#sh buffers
Buffer elements:
     1118 in free list (1000 max allowed)
     4698929 hits, 0 misses, 1119 created

Public buffer pools:
Small buffers, 104 bytes (total 84, permanent 50, peak 114 @ 09:38:29):
     73 in free list (20 min, 150 max allowed)
     5951510 hits, 925 misses, 196 trims, 230 created
     131 failures (0 no memory)
Middle buffers, 600 bytes (total 81, permanent 25, peak 104 @ 23:34:47):
     79 in free list (10 min, 150 max allowed)
     768179 hits, 310 misses, 158 trims, 214 created
     168 failures (0 no memory)
Big buffers, 1536 bytes (total 63, permanent 50, peak 63 @ 00:00:31):
     29 in free list (5 min, 150 max allowed)
     2230804 hits, 301 misses, 6 trims, 19 created
     152 failures (0 no memory)
VeryBig buffers, 4520 bytes (total 11, permanent 10, peak 11 @ 23:35:43):
     11 in free list (0 min, 20 max allowed)
     127 hits, 25 misses, 1 trims, 2 created
     25 failures (0 no memory)
Large buffers, 5024 bytes (total 1, permanent 0, peak 1 @ 23:35:43):
     1 in free list (0 min, 10 max allowed)
     1 hits, 24 misses, 8 trims, 9 created
     24 failures (0 no memory)
Huge buffers, 18024 bytes (total 1, permanent 0, peak 1 @ 23:35:43):
     1 in free list (0 min, 4 max allowed)
     0 hits, 24 misses, 8 trims, 9 created
     24 failures (0 no memory)

Interface buffer pools:
CD2430 I/O buffers, 1536 bytes (total 0, permanent 0):
     0 in free list (0 min, 0 max allowed)
     0 hits, 0 fallbacks

Header pools:
Header buffers, 0 bytes (total 137, permanent 128, peak 137 @ 23:35:59):
     9 in free list (10 min, 512 max allowed)
     125 hits, 3 misses, 0 trims, 9 created
     0 failures (0 no memory)
     128 max cache size, 128 in cache
     73 hits in cache, 0 misses in cache

Particle Clones:
     1024 clones, 0 hits, 0 misses

Public particle pools:
F/S buffers, 256 bytes (total 384, permanent 384):
     128 in free list (128 min, 1024 max allowed)
     256 hits, 0 misses, 0 trims, 0 created
     0 failures (0 no memory)
     256 max cache size, 256 in cache
     0 hits in cache, 0 misses in cache
Normal buffers, 1548 bytes (total 512, permanent 512):
     384 in free list (128 min, 1024 max allowed)
     320 hits, 0 misses, 0 trims, 0 created
     0 failures (0 no memory)
     128 max cache size, 128 in cache
     0 hits in cache, 0 misses in cache

Private particle pools:
FastEthernet0/0 buffers, 1548 bytes (total 192, permanent 192):
     0 in free list (0 min, 192 max allowed)
     192 hits, 0 fallbacks
     192 max cache size, 128 in cache
     677145 hits in cache, 0 misses in cache
FastEthernet0/1 buffers, 1548 bytes (total 192, permanent 192):
     0 in free list (0 min, 192 max allowed)
     192 hits, 0 fallbacks
     192 max cache size, 128 in cache
     6021707 hits in cache, 0 misses in cache
C2621#

You have some misses but nothing to be concerned about.
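
To put "some misses" in proportion, here is a rough sketch that pulls the hit and miss counters for the three busiest public pools out of the pasted output and reports misses as a fraction of hits plus misses. The parsing is a simple illustration written for this output format, not a general show buffers parser, and the choice of ratio is my own assumption about what is useful to compare.

```python
import re

# Excerpt of the show buffers output posted above (public pools only).
SHOW_BUFFERS = """\
Small buffers, 104 bytes (total 84, permanent 50, peak 114 @ 09:38:29):
     5951510 hits, 925 misses, 196 trims, 230 created
Middle buffers, 600 bytes (total 81, permanent 25, peak 104 @ 23:34:47):
     768179 hits, 310 misses, 158 trims, 214 created
Big buffers, 1536 bytes (total 63, permanent 50, peak 63 @ 00:00:31):
     2230804 hits, 301 misses, 6 trims, 19 created
"""

POOL_RE = re.compile(r"^(\w+) buffers, \d+ bytes")
STATS_RE = re.compile(r"(\d+) hits, (\d+) misses")


def miss_ratios(text: str) -> dict[str, float]:
    """Map each pool name to misses / (hits + misses)."""
    ratios: dict[str, float] = {}
    pool = None
    for line in text.splitlines():
        header = POOL_RE.match(line)
        if header:
            pool = header.group(1)
            continue
        stats = STATS_RE.search(line)
        if stats and pool:
            hits, misses = int(stats.group(1)), int(stats.group(2))
            total = hits + misses
            ratios[pool] = misses / total if total else 0.0
            pool = None  # only take the first stats line after each header
    return ratios


ratios = miss_ratios(SHOW_BUFFERS)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4%} misses")
```

All three pools come out well under a tenth of a percent.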

I find it interesting that once you changed the packet size from a 1336-byte average to 64 bytes, the number dropped from 5 Mbps to 1 Mbps.

The 1 Mbps figure is listed as the maximum throughput for a 2621 when using process switching.

You mentioned you have CEF enabled on the interfaces. Can we quickly verify that with the show ip cef output?

Thanks

David,

The point about broadcasts was their possible impact on the router's performance. Cisco's mention of Spanning Tree concerns one possible source of them.

Other ideas? Yep, one interesting one, I think.

Can you place your 2621 behind the Checkpoint firewall so that the 2621 doesn't see any broadcasts from the destination network?

[edit]

PS:

There's one possible, very simple explanation: the PPS rating we're using doesn't accurately reflect the performance of the box.

However, I still suspect the high process-switching ratio is causing the box to peak its CPU too soon. The issue is finding the cause and, if we identify it, a workaround.

"Can you place your 2621 behind the Checkpoint firewall so that the 2621 doesn't see any broadcasts from the destination network?"

Yes, I can do that, but I don't think it will help, because the members of a Checkpoint ClusterXL cluster also use multicast/broadcast to talk to each other and maintain state, via the Cluster Control Protocol (CCP). It is similar to VRRP, but not quite the same. Therefore, the router would see this traffic as well.

Edison,

I just logged off from my test network, but I can assure you that I have CEF enabled as follows:

config t

ip cef

interface F0/0

ip cef

interface F0/1

ip cef

end

wr mem

Regarding the 1 Mbps throughput: unless I am mistaken, the smaller the packet size (say 64 bytes), the higher the CPU usage. The larger the packet size, let's say 1300 bytes, the lower the CPU load on the router, unless you exceed 1460 or 1480 bytes. Therefore, what I am seeing, 1 Mbps throughput at 64 bytes, is normal, isn't it?