Solved: High CPU usage on 3560CG-8

ggilley · ‎07-13-2012

I'm seeing very high CPU usage on my new 3560CG-8-PC switch. It's at 61%. Any suggestions how to chase down what's causing it?

I'm also seeing a lot of rpf-fail drops. What causes them, and how can I prevent them?

Supervisor TxQueue Drop Statistics
Queue 0: 0
Queue 1: 0
Queue 2: 0
Queue 3: 0
Queue 4: 0
Queue 5: 0
Queue 6: 0
Queue 7: 0
Queue 8: 0
Queue 9: 0
Queue 10: 0
Queue 11: 0
Queue 12: 0
Queue 13: 0
Queue 14: 4373069
Queue 15: 0

Thanks,

Greg

Giuseppe Larosa · ‎07-14-2012

Hello Greg,

Is there IP multicast traffic in your network?

Multicast traffic is checked to see whether it arrives on the interface that the local node would use to reach the source. This check is called Reverse Path Forwarding (RPF). An RPF failure means the traffic arrived on a different interface, so it has to be dropped.

If the amount of multicast traffic is high, the high CPU usage may be caused by traffic being sent to the CPU just to be dropped.
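To make the RPF check described above concrete, here is a minimal sketch in Python. The routing table, prefixes, and interface names are hypothetical and exist only for illustration; a real switch performs this lookup against its own unicast routing table, mostly in hardware.

```python
from ipaddress import ip_address, ip_network

# Hypothetical unicast routing table: prefix -> interface used to reach it.
ROUTES = {
    ip_network("10.1.0.0/16"): "Gi0/1",
    ip_network("10.2.0.0/16"): "Gi0/2",
    ip_network("0.0.0.0/0"): "Gi0/8",  # default route, so every source matches
}


def rpf_interface(source: str) -> str:
    """Return the interface of the longest-prefix route toward the source."""
    addr = ip_address(source)
    matches = [net for net in ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


def rpf_check(source: str, arrival_interface: str) -> bool:
    """Pass only if the packet arrived on the interface used to reach its source."""
    return rpf_interface(source) == arrival_interface
```

With this table, a multicast packet from 10.1.5.9 passes the check when it arrives on Gi0/1 and fails (and would be dropped) when it arrives on Gi0/2.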

see

http://www.cisco.com/en/US/tech/tk828/technologies_tech_note09186a0080094b55.shtml#noforwardrpffail

Hope to help

Giuseppe

ggilley · ‎07-22-2012

Thanks for the tip, but I'm confused. This is a switch, not a router. Why would it be doing RPF on multicast traffic?

Am I wrong about what Queue14 is?

Thanks,

Greg

Giuseppe Larosa · ‎07-23-2012

Hello Greg,

It is a multilayer switch, so it can perform multicast routing, and if it is configured for that, the RPF check is part of its job.

However, queue 13 is the rpf-check queue. Queue 14 is:

14:dstats

You are right: your queue drops are not related to the RPF check.

Hope to help

Giuseppe

ggilley · ‎07-24-2012

I'm very confused. When I log into the switch, the CPU usage drops, and it stays low for as long as I'm logged in. Once I log out, it goes back up to 60%. How do I diagnose something like this when logging in changes the behavior?

Thanks,

Greg

11111 11111 11111
9777779999900000888889999988888000007777788888777777777788
100
90
80
70
60
50
40
30
20 *****
10 **********************************************************
0....5....1....1....2....2....3....3....4....4....5....5....
0 5 0 5 0 5 0 5 0 5
CPU% per second (last 60 seconds)

1111111115666565566566666666565566665555565666656665566666
3366746768021839900932222010839912218999908020090108803110
100
90
80
70
60 *################################################
50 *################################################
40 *################################################
30 *################################################
20 *** ***#################################################
10 ##########################################################
0....5....1....1....2....2....3....3....4....4....5....5....
0 5 0 5 0 5 0 5 0 5
CPU% per minute (last 60 minutes)
* = maximum CPU% # = average CPU%

6667666666667776666766767676666676666666699676767766666666666666766666
3353554973237419339164556634764457363934429713080145834863633743244758
100 *
90 **
80 * * * * **
70 **** ** **** *** ***** ** ** * * **** **** ** ** * * * ***
60 ##########**#####################*#######**###########################
50 ##########*##############################*############################
40 ##########*##############################*############################
30 ##########*##############################*############################
20 ######################################################################
10 ######################################################################
0....5....1....1....2....2....3....3....4....4....5....5....6....6....7.
0 5 0 5 0 5 0 5 0 5 0 5 0
CPU% per hour (last 72 hours)
* = maximum CPU% # = average CPU%
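The two digit rows at the top of each history graph are read vertically: the upper row holds the tens digit and the lower row the ones digit of each column's value. Below is a small Python sketch for decoding them. The function name is made up for illustration, and it assumes both rows kept their column alignment when pasted, which is not guaranteed once forum formatting has collapsed leading spaces (a blank in the tens row means a value below 10). A reading of 100% would need a third row, which this sketch does not handle.

```python
def decode_history_header(tens_row: str, ones_row: str) -> list[int]:
    """Combine the tens and ones rows of a CPU history graph into values."""
    if len(tens_row) != len(ones_row):
        raise ValueError("rows differ in length; column alignment was lost")
    values = []
    for tens, ones in zip(tens_row, ones_row):
        # A blank in the tens row means the value is below 10.
        tens_digit = 0 if tens == " " else int(tens)
        values.append(tens_digit * 10 + int(ones))
    return values
```

For example, `decode_history_header("1156", "3302")` yields `[13, 13, 50, 62]`.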

Giuseppe Larosa · ‎07-24-2012

Hello Greg,

That is interesting.

I'm sorry for the basic suggestion, but I would start with

show proc cpu sorted 1min

I know that CPU usage drops while you are connected, but if the load is caused by a process you should still be able to find some trace of it.

Hope to help

Giuseppe

ggilley · ‎07-24-2012

Yeah, I tried that, but it doesn't show anything interesting. I did it for 5min.

Greg

PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
138 22498296 25292782 889 1.89% 1.77% 1.76% 0 Hulc LED Process
64 9004438 50466203 178 1.09% 0.74% 0.76% 0 RedEarth Rx Mana
4 9448557 524570 18012 0.29% 0.53% 0.73% 0 Check heaps
3 302089 17805 16966 0.00% 1.51% 0.43% 0 crypto sw pk pro
199 4189228 4665167 897 0.29% 0.38% 0.40% 0 Spanning Tree
82 2276428 30782775 73 0.19% 0.19% 0.23% 0 HLFM address lea
8 1988987 17974 110659 0.00% 0.18% 0.14% 0 Licensing Auto U
84 1396506 30813111 45 0.00% 0.09% 0.09% 0 HLFM address ret
265 812825 10609471 76 0.00% 0.07% 0.09% 0 MDFS RP process
254 795937 2390329 332 0.00% 0.08% 0.09% 0 Marvell wk-a Pow
97 549106 3354204 163 0.19% 0.06% 0.07% 0 hpm main process
128 395411 5284951 74 0.00% 0.05% 0.05% 0 Hulc Storm Contr
184 319545 10567666 30 0.00% 0.03% 0.04% 0 MDFS MFIB Proces
100 402606 1077963 373 0.00% 0.05% 0.02% 0 hpm counter proc
286 72 206 349 0.19% 0.08% 0.02% 1 Virtual Exec
181 234477 977617 239 0.29% 0.07% 0.01% 0 IP Input
36 87268 1078009 80 0.09% 0.04% 0.01% 0 Per-Second Jobs
243 360321 2179690 165 0.00% 0.02% 0.00% 0 DHCPD Receive
63 386547 50446694 7 0.19% 0.02% 0.00% 0 RedEarth Tx Mana
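One way to sanity-check a listing like this is to add up the per-process percentages and compare the sum with the total the switch reports. The sketch below is a hypothetical helper, not a Cisco tool: the regular expression assumes the whitespace-separated column layout shown above, and it skips the header and any line that does not match.

```python
import re

# One process row: PID, Runtime(ms), Invoked, uSecs, 5Sec, 1Min, 5Min, TTY, name.
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<runtime>\d+)\s+(?P<invoked>\d+)\s+(?P<usecs>\d+)\s+"
    r"(?P<five_sec>[\d.]+)%\s+(?P<one_min>[\d.]+)%\s+(?P<five_min>[\d.]+)%\s+"
    r"(?P<tty>\d+)\s+(?P<name>.+?)\s*$"
)


def parse_rows(text: str) -> list[dict]:
    """Parse process rows; non-matching lines (such as the header) are ignored."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        row = match.groupdict()
        for column in ("five_sec", "one_min", "five_min"):
            row[column] = float(row[column])
        rows.append(row)
    return rows


def total_cpu(rows: list[dict], column: str = "five_min") -> float:
    """Sum one percentage column across all parsed processes."""
    return sum(row[column] for row in rows)
```

For the listing above, the 5Min column adds up to about 5%, so no individual process accounts for a 60% reading. Since the listing was captured while logged in, when the reported usage is already low, that comparison is only suggestive.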

nkarpysh · ‎07-24-2012

Hi Greg,

I'm not sure which IOS release you are running, but the problem looks very close to the cosmetic defect below:

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCth24278

Symptoms:

Catalyst 2960S switch may report elevated CPU utilization (e.g., 50%) under normal conditions.

Conditions:

ALL the following conditions MUST match.

- This issue is observed in 2960S even without any configuration and connection.
- This issue occurs when the telnet/console session is idle. When a telnet/console session to the switch is established, the CPU utilization falls to normal baseline levels. As long as a telnet/console session remains active, the CPU utilization remains normal.
- This issue is usually observed by "show process cpu history"

for e.g.,

2960S_A>en
2960S_A# show process cpu history
44444444445555555555555554444444444555555555555555
8888888899999888884444433333000009999988888000002222222222
100
90
80
70
60
50 **************************************************
40 **************************************************
30 **************************************************
20 **************************************************
10 **********************************************************
0....5....1....1....2....2....3....3....4....4....5....5....
0 5 0 5 0 5 0 5 0 5
CPU% per second (last 60 seconds)

The above graph indicates the CPU utilization was ~50% and went down to 8% when console session up.

Workaround:

None. This issue is cosmetic.

Further Problem Description:

On this platform, CPU utilization software accounting is performed incorrectly resulting in misleading levels. This is true ONLY when the CPU is idle, there is no console/VTY activity, or there are no packets sent to CPU.

When console/VTY activity is present or packets are sent to CPU, the CPU utilization software accounting is correct.

There should NOT be any performance impact due to this bug.

This issue may be seen in Catalyst 2960 platforms also.

The bug is fixed in 12.2(58)SE1 and 12.2(55)SE3, or later releases.

HTH

Nik

ggilley · ‎07-25-2012

That's it! No wonder I couldn't figure out what was going on.

Now if I could just figure out how to install 12.2(55)EX3, I could go on with my life. It always fails when I try to install :-(

Thanks!

Greg

P.S. And thanks Giuseppe for trying to help