
ping drop issue

shamsul77
Level 1

I would like some feedback. I currently have two Cisco 7200 series routers running in our datacenter. The router specs are below:

 

3 FastEthernet interfaces

3 Gigabit Ethernet interfaces

2045K bytes of NVRAM.

250880K bytes of ATA PCMCIA card at slot 2 (Sector size 512 bytes).

65536K bytes of Flash internal SIMM (Sector size 512K).

 

We are running HSRP across both routers. Right now we are seeing a lot of ping drops to our website and services. My analysis is below:

  1. No ping drops when I ping within the local (internal) network.
  2. Intermittent ping drops occur when I ping our website over the internet.
  3. I pinged my ISP peering IP continuously and saw no ping drops.
    interface GigabitEthernet0/1
    description Cogent BGP
    ip address 38.88.XXX.XXX 255.255.255.XXX
    ip accounting output-packets
    ip flow ingress
    duplex full
    speed 1000
    media-type sfp
    no negotiation auto
    no cdp enable
  4. Intermittent ping drops occur when I ping the Cisco 7200 router's public interface:
    interface GigabitEthernet0/2
    description TransAct Internal
    ip address 208.97.2XX.XX 255.255.2XX.0  <--------dropping many ping
    ip flow ingress
    duplex full
    speed auto
    media-type sfp
    no negotiation auto
    standby 1 ip 208.97.219.10
    standby 1 preempt
    !

I have attached the "show interface" output from both routers for more details. Any feedback is highly appreciated.


Palani Mohan
Cisco Employee

Shamsul

Your real problem is “we are encountering a lot of ping drop to our website and services”. For now, I recommend ignoring any tests/results that target the router itself. Traffic destined to the router is NOT handled the same way as traffic transiting the router.

Your routers (Trading_7206B NPE-G1 and Trading_7206A NPE-G2) are ancient compared to what has been available for the past 10 years or so. In addition, the packet sizes you are dealing with are small (less than 200 bytes) and very small (less than 100 bytes). What this means is that the throughput is low and yet CPU utilization is likely high.

Now, we don’t know anything about the “features configured”. Without knowing this, it is hard to say what performance you can expect from your old routers.

What is the contracted bandwidth with the service providers?
Do you expect traffic to be made of small/very small pkts in both directions?
Do you track CPU utilization? If yes, what is it?

Are you familiar with IP SLA? If yes, kindly consider leveraging it to establish a baseline for capacity planning purposes.
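
As a rough sketch of the IP SLA suggestion above: a simple ICMP echo probe could be set up along the following lines. This assumes an IOS release that uses the `ip sla` syntax (older 12.4 releases use `ip sla monitor` with a different sub-command form). The operation number 10 and the target 192.0.2.1 are placeholders, not values from this thread.

```
! Hypothetical operation number (10) and target (192.0.2.1); replace with your own
ip sla 10
 icmp-echo 192.0.2.1 source-interface GigabitEthernet0/1
 frequency 30
!
ip sla schedule 10 life forever start-time now
!
! Afterwards, from exec mode, review round-trip times and failure counts:
! show ip sla statistics 10
```

Probing a target beyond the router, rather than the router's own interface, measures transit traffic, which is what matters for the website and services.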

I hope this helps … Palani

Bandwidth for

Router A: 500 Mbps

Router B: 1000 Mbps 

Attached is router A and B bandwidth utilization.

Also attached is router cpu utilization for router A and B.

Hi Shamsul

7206-A has an NPE-G2 CPU whereas 7206-B has an NPE-G1. The NPE-G2 is a much faster CPU than the NPE-G1. Your NPE-G1 based router is being pushed to its limits during what I am guessing are peak business hours. Look at the "past 72 hrs" data. You are touching the 100% mark consistently. When the CPU is pegged at 100%, chances are that it is dropping traffic, which would produce the symptom you described originally (“we are encountering a lot of ping drop to our website and services”).

With no features configured, you can expect 500Mbps (NPE-G1) and 1Gbps (NPE-G2). Reference doc is here. What this document does not (explicitly) say is that when max rated pkts are being pushed, CPU will be at near 100%. Looking at the show int data, NPE-G1 is pushing 500Mbps on one interface.

At a minimum, consider upgrading the NPE-G1 to a G2. For the immediate/long term, you need to think in terms of upgrading to the ASR 1000 (ASR1k) platform.

I hope this helps ... Palani

Hi Palani

I'm a bit unclear on this part:

With no features configured, you can expect 500Mbps (NPE-G1) and 1Gbps (NPE-G2). Reference doc is here. What this document does not (explicitly) say is that when max rated pkts are being pushed, CPU will be at near 100%. Looking at the show int data, NPE-G1 is pushing 500Mbps on one interface. 

Would you mind elaborating further on the part above?

Just to confirm:

Router A NPE-G2 = 500 Mbps

Cisco 7206VXR (NPE-G2) processor (revision A) with 1966080K/65536K bytes of memory.
Processor board ID 26825181

Router B NPE-G2 = 1000 Mbps

Cisco 7206VXR (NPE-G2) processor (revision A) with 1966080K/65536K bytes of memory.
Processor board ID 33678460

Thanks

Hi Shamsul

One of the two routers is seen approaching 100% utilization. If/when that happens, chances are high that traffic is entering the router at a higher rate than the router can handle. Under such circumstances, it is natural to expect packets to be dropped, and when that happens, services are going to be impacted.

If you see the router CPU at 100% on a regular basis, you have no option other than to consider upgrading the router platform.
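
One way to check whether the CPU is regularly hitting 100%, sketched with standard IOS exec commands (output format varies by release, and GigabitEthernet0/2 is simply the interface named earlier in this thread):

```
! Exec-mode commands, not configuration
! ASCII graphs of CPU load over the last 60 seconds, 60 minutes and 72 hours
show processes cpu history
! Current per-process CPU usage, busiest processes first
show processes cpu sorted
! Traffic rates and drop counters on the affected interface
show interfaces GigabitEthernet0/2 | include rate|drops
```

If the CPU peaks line up with the times the ping drops are reported, that supports the capacity explanation.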

My bad for stating that one of them is an NPE-G1. Both are indeed NPE-G2. With no features configured, the NPE-G2 is capable of handling 1Gbps throughput. Kindly consider including the output of show run.

Are you using any network monitoring applications to track CPU and interface usage, such as Infovista or Concord/eHealth?

Kind regards ... Palani

Attached is sh run output

Hi Shamsul

Thank you. Other than Netflow (ip flow ingress), this router is simply expected to forward pkts, as fast as it can. So, you can expect performance close to 1Gbps.

I see you are exporting Netflow data to 208.97.XXX.227. Kindly consider looking at data exported/analyzed at the time you experienced service interruption. If the volume is higher than 1Gbps, then you can interpret that it is time to consider upgrading the router.
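
In addition to the collector, the router itself can give a quick view of the NetFlow data. A minimal sketch, assuming traditional NetFlow as configured here with `ip flow ingress`:

```
! Exec-mode commands, not configuration
! Flow cache summary: packet size distribution, protocols and active flows
show ip cache flow
! Export statistics: destination, flows exported, and any export failures
show ip flow export
```

The packet size distribution at the top of `show ip cache flow` is also a way to confirm how much of the traffic is made up of small packets.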

Another thing you can do is make the following changes to the config:

!
line con 0
exec prompt timestamp
!
line vty 0 4
exec prompt timestamp
!
process cpu threshold type total rising 90 interval 10 falling 60 interval 5
process cpu threshold type interrupt rising 80 interval 10 falling 60 interval 5
process cpu threshold type process rising 60 interval 10 falling 50 interval 5
!

When the CPU crosses 90%, you can expect to see a log (Sample output is shown below):

%SYS-1-CPURISINGTHRESHOLD: Threshold: Total CPU
Utilization(Total/Intr): 90%/85%, Top 3 processes(Pid/Util):  61/2%, 125/1%, 2/0%

The commands log the following:
     -Total CPU utilization
     -Amount of CPU spent servicing interrupts
     -Top 3 processes by PID, with the % CPU utilized by each

The next time you experience a service interruption (“we are encountering a lot of ping drop to our website and services”), please check the router logs. If you see a log message like the one shown above, that is confirmation that this router is ready to be upgraded.
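
To find those messages after the fact, something like the following should work, assuming logs are kept in the router's local buffer. The buffer size of 64000 bytes is only an example value, not something taken from this router's config.

```
! Configuration: keep log messages in a local buffer (example size in bytes)
logging buffered 64000
!
! Exec mode: show only the CPU threshold messages
show logging | include CPURISING|CPUFALLING
```

With `exec prompt timestamp` enabled as above, the time of the check is printed along with the output, which makes it easier to line up with the reported service interruption.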

I hope this helps .... Palani

Thank you, sir. Great information, much appreciated.

I will follow the steps you outlined and try to convince my boss to upgrade the routers.

Thanks
