Due to the limitations of the BPL devices we use in our ISP backbone, we have to handle Network Address Translations centrally. At our perimeter point, we need a router that can
1) Terminate MetroEthernet at the outside FastEthernet interface (easy)
2) Perform well in a router-on-a-stick scenario for 32 VLANs at the inside interface (**)
3) Handle Network Address Translations for about 1000+ clients (**)
4) Perform IPS and firewall duties
(**) I removed a Cisco 2651XM router, running the latest IOS and configured as router-on-a-stick, from a location where 500+ NATs were occurring and the CPU was hitting 100%, rendering the device unresponsive. The issue may be caused by the two points marked (**) above. In its place I attached a simple device called Netasq, which runs on the FreeBSD platform, with the same configuration, and it performs great at 4% CPU. Maybe it was a bug; I called TAC, but the device was EOL, and I opened a topic in NetPro but got no solution.
Waiting for suggestions
The ISR is not a distributed platform like the 7600/GSRs. With NAT being a highly CPU-intensive feature, I doubt an ISR can handle 1000+ simultaneous sessions. Chances are it will hit high CPU utilization even before reaching 1000 sessions.
I guess the ISR2 is much improved in terms of performance.
Thank you for your response. Hearing this from a Cisco employee now made me think...
1) How can a $2k FreeBSD-based device handle the load mentioned above at 10% CPU utilization with all UTM applications included, whereas a $5k Cisco device that has its own IOS cannot? As a CCSP/CCNP network engineer who has dedicated 7+ years to Cisco, I cannot accept this.
Can you please raise this issue with a marketing engineer?
I would also like to see the Cisco marketing engineer's answer to your question, but I would like to share my own view as well.
First, the ISR routers are devices with a low-performance CPU when comparing them to the usual workstation/server processors from Intel or AMD. You cannot expect that a device running an embedded processor clocked at most somewhere around 1 GHz can beat a strong Intel/AMD machine with lots of memory and large caches.
Second, they are, as their name suggests, "integrated services routers", i.e. universal devices capable of performing many diverse networking functions, and that is true. However, even if a device can provide a particular service, that does not mean it has unlimited power for providing it. Likewise, if a device supports various features, that does not necessarily mean you can have all of them turned on and expect them all to perform well under high load. The ISR routers are very flexible; however, they are still considered to be, at least from the throughput point of view, low-end routers. Their strength is versatility, not raw throughput.
Third, for larger NAT deployments, the ASA is usually recommended instead of ISR routers (note that ASA boxes run Intel processors and Linux-based OS), as it should be capable of handling so many NAT flows and translations.
I would indeed like to read the marketing engineer's response, as the pricing of Cisco products is a topic in itself. At the same time, I think another point is to be considered here: the ISRs simply do not seem to be targeted at the particular application you are trying to use them for.
First, NAT on IOS doesn't have "sessions", it has "entries". Normal Internet use can generate many tens and even hundreds of entries per user. That is not a problem per se.
Then, router (or ASA) CPU is not directly affected by the number of NAT entries. There is only some background housekeeping work involved.
What CPU is affected by, is the amount of traffic. That is, you can max out CPU with 10 entries, or have plenty left with 10,000.
Remember, network performance is primarily measured in packets per seconds, not size of this or that table.
With the large memory sizes of today, huge tables are rarely an issue.
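The argument above can be put into a rough back-of-the-envelope model. This is only a sketch: the clock speed and the cycles-per-packet figure are made-up assumptions for illustration, not measurements of any Cisco or other device, and the function names are hypothetical. The point it shows is that the number of NAT entries does not appear in the load formula at all, only the packet rate does.

```python
def cpu_utilization(pps, cycles_per_packet, cpu_hz):
    """Fraction of CPU used: (packets/s * cycles/packet) / (cycles/s)."""
    return pps * cycles_per_packet / cpu_hz

# Hypothetical numbers for illustration only, not measured on any router.
CPU_HZ = 300e6            # assumed 300 MHz embedded CPU
CYCLES_PER_PACKET = 6000  # assumed software cost of routing plus NAT rewrite

def describe(pps, nat_entries):
    # nat_entries is deliberately unused: in this model the table size
    # does not enter the load calculation, only the packet rate does.
    load = cpu_utilization(pps, CYCLES_PER_PACKET, CPU_HZ)
    return min(load, 1.0)  # a CPU cannot be more than 100% busy

few_entries_heavy = describe(pps=60_000, nat_entries=10)
many_entries_light = describe(pps=2_000, nat_entries=10_000)
print(f"10 entries at 60k pps:    {few_entries_heavy:.0%}")
print(f"10,000 entries at 2k pps: {many_entries_light:.0%}")
```

With these assumed figures, the box with 10 entries is saturated while the one with 10,000 entries is nearly idle, which is the situation described above.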
Thank you for your valuable responses, everyone. If you would like a little background on the issue, following is the link
As I remember, the ASA had an OS called Finesse; I think it doesn't run on top of Linux. I didn't look it up on Google, though...
pbevilacqua, the packets-per-second point of view sounds reasonable to me. Let me work through the relationship below.
One interface has 20 Mbit MetroEthernet configured. How many packets per second must the router process to fully utilize 20 Mbit?
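As a rough worked answer to that question: the packet rate depends heavily on frame size. The sketch below assumes the 20 Mbit/s is counted the way Ethernet line rate is counted, i.e. including the 8-byte preamble and 12-byte inter-frame gap per frame; a provider may police the service differently, so treat the numbers as an approximation. The function name is hypothetical.

```python
# Per-frame bytes on the wire that are not part of the frame itself:
# 8 bytes preamble/SFD + 12 bytes inter-frame gap (standard Ethernet).
WIRE_OVERHEAD_BYTES = 20

def packets_per_second(link_bps, frame_bytes):
    """Packets per second needed to fill link_bps with frames of frame_bytes."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire

LINK_BPS = 20_000_000  # 20 Mbit/s, one direction
for frame in (64, 512, 1518):
    pps = packets_per_second(LINK_BPS, frame)
    print(f"{frame:>5}-byte frames: {pps:,.0f} pps")
```

So filling 20 Mbit/s takes roughly 1,600 pps with full-size frames but close to 30,000 pps with minimum-size frames, per direction.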
Let's say model X of a router is introduced to the market with two 100 Mbit full-duplex NICs. Can we then assume that this router has enough processing power, in PPS (packets per second), to stably utilize the interfaces it has?
When I think of a simply configured router, the CPU deals with packets by (excluding minor operations) 1) deciding where to route the packet and 2) switching the packet. So routed and switched packets do have a unit of measure, PPS. When I read your post, I understand NAT as just a matter of memory: the more memory you have, the more entries you can have in the NAT table. But isn't NAT the operation of manipulating each packet to change the source IP? That is yet another table to look up from the CPU's standpoint. Without NAT, it just checks the routing table and then the forwarding table; with NAT, it has to spend cycles on the NAT table, the routing table and the forwarding table. From my point of view, seeing the term PPS applied to NAT as well indicates that it is CPU intensive. In addition, and correct me if I am wrong, as this is just my thinking: once NAT is enabled, each packet must be decapsulated and re-encapsulated up to Layer 4, since source and destination ports are needed in the NAT table. I mean, when I run a show nat with some extra parameter, I can see outside local, outside global, inside local, inside global and ports, whereas a router with NAT disabled only has to operate at L3, excluding inspections etc.
"NAT on IOS doesn't have "sessions", have "entries". Normal Internet use can generate many tens and even hundreds of entries per user. That is not a problem per-se." (Is it per-session?) Can you elaborate on this statement?
The process of table lookup is not a big deal. It uses a hashing search algorithm, per Knuth's fundamental work.
What matters, is how many times per second it has to be repeated, not how big the table is.
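To make the entries-versus-lookups distinction concrete, here is a toy port-address-translation table. It is a minimal sketch and not how IOS implements NAT: the class name, the port allocation scheme and the addresses are all assumptions for illustration. A Python dict stands in for the hashed table, so each lookup costs about the same whether the table holds one entry or thousands, while the amount of work grows with the number of packets.

```python
from itertools import count

class PatTable:
    """Toy PAT table (hypothetical); the dict gives hashed lookups."""

    def __init__(self, outside_ip, first_port=1024):
        self.outside_ip = outside_ip
        self._next_port = count(first_port)  # naive sequential port allocator
        self._entries = {}  # (proto, inside_ip, inside_port) -> outside_port
        self.lookups = 0    # incremented once per translated packet

    def translate(self, proto, src_ip, src_port):
        """Return the (ip, port) the packet leaves with; called per packet."""
        self.lookups += 1
        key = (proto, src_ip, src_port)
        port = self._entries.get(key)
        if port is None:
            # First packet of a flow creates an entry; later packets reuse it.
            port = next(self._next_port)
            self._entries[key] = port
        return self.outside_ip, port

    def __len__(self):
        return len(self._entries)

table = PatTable("203.0.113.1")
# One flow sending 1000 packets: a single entry, but 1000 lookups.
for _ in range(1000):
    table.translate("tcp", "10.0.0.5", 40000)
# 1000 flows sending one packet each: 1000 more entries, 1000 more lookups.
for offset in range(1000):
    table.translate("udp", "10.0.0.6", 50000 + offset)
print(len(table), "entries,", table.lookups, "lookups")
```

Both loops cost the same number of lookups even though the first adds one entry and the second adds a thousand.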
"What matters, is how many times per second it has to be repeated, not how big the table is"
As a conclusion, what model of router do you suggest?
Would like to hear your opinion about the issue linked above, when you have time.
"Also, I recommend you use a L3 switch, not router on the stick."
This was my very first suggestion, but the network admin before me had already made his wrong choice.
"What BW are getting from ISP?"
Thanks for the documentation; however, according to it, the current 2651XM router suits our needs. Yet it maxes out the CPU.
"An important part of this profession is to fix other people's mistakes."
Well said, mate, but before taking action you have to successfully answer your CEO's question, which goes as follows:
"You say that the router-on-a-stick design is inappropriate and the current Cisco device hits 100% CPU. You want me to buy another Cisco device (an L3 switch). Cisco router: $2000+. Cisco L3 switch: $3000+. Then, my dear network admin, tell me: in another campus, how can a $1000 device called Netasq, running on BSD and configured as router-on-a-stick with a $250 L2 3Com switch, work flawlessly at 5-10% CPU utilization?"
On that performance sheet, they use uni-directional traffic, so if you have 20 Mbps of bi-directional traffic, that equates to 40.
Then you have to allow some safety margin.
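The sizing rule above amounts to simple arithmetic, sketched here. The 30% safety margin is an arbitrary assumption for illustration; pick whatever headroom suits your environment. The function name is hypothetical.

```python
def required_forwarding_mbps(per_direction_mbps, safety_margin=0.3):
    """Uni-directional figure to look for on a performance sheet.

    Bi-directional traffic is counted twice, then headroom is added.
    safety_margin=0.3 (30%) is an assumed value, not a recommendation.
    """
    unidirectional_equivalent = 2 * per_direction_mbps
    return unidirectional_equivalent * (1 + safety_margin)

# 20 Mbps in each direction -> 40 Mbps uni-directional equivalent,
# plus the assumed 30% headroom.
print(f"{required_forwarding_mbps(20):.0f} Mbps")
```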
Regarding the device they are using now, try copying a huge file between gigabit-equipped servers on different VLANs. Perhaps throw some ACLs in there too. Compare the time to the same copy made between servers in the same VLAN.
Then, try having a disk crash on that box and see what you are left with.
And, from what I understand, you may be very well set with a 3560-8PC, less than $1,000 street price.
Anybody can do IT/networking cheaply and creatively. But this forum is titled "NetPro", and as such we try to remain.