56648 Views · 20 Helpful · 37 Replies

High CPU usage - 3750-X stack

johnramz
Level 1

We recently configured a stack of four 48-port 3750-X switches and are noticing high CPU usage. The "Hulc LED Process" seems pretty high.

This has coincided with our VMware servers getting slow and non-responsive at times; perhaps a coincidence, I'm not sure.

Below are some outputs that might help diagnose it.

Thanks

John

System image file is "flash:/c3750e-ipbasek9-mz.122-58.SE2/c3750e-ipbasek9-mz.122-58.SE2.bin"

show inventory output:

NAME: "1", DESCR: "WS-C3750X-48"
PID: WS-C3750X-48T-S   , VID: V02  ,

NAME: "Switch 1 - Power Supply 0", DESCR: "FRU Power Supply"
PID: C3KX-PWR-350WAC   , VID: V02L ,

NAME: "2", DESCR: "WS-C3750X-48"
PID: WS-C3750X-48T-S   , VID: V02

NAME: "Switch 2 - Power Supply 0", DESCR: "FRU Power Supply"
PID: C3KX-PWR-350WAC   , VID: V02D ,

NAME: "3", DESCR: "WS-C3750X-48"
PID: WS-C3750X-48T-S   , VID: V02

NAME: "Switch 3 - Power Supply 0", DESCR: "FRU Power Supply"
PID: C3KX-PWR-350WAC   , VID: V02L ,

NAME: "4", DESCR: "WS-C3750X-48"
PID: WS-C3750X-48T-S   , VID: V02

NAME: "Switch 4 - Power Supply 0", DESCR: "FRU Power Supply"
PID: C3KX-PWR-350WAC   , VID: V02L ,

SWITCH#sh processes cpu sorted

CPU utilization for five seconds: 61%/5%; one minute: 50%; five minutes: 49%

 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 168   260466386    44948517       5794 14.53% 13.98% 13.70%   0 Hulc LED Process
 231    97586088    27253906       3580  4.95%  4.73%  4.64%   0 Spanning Tree
 213    63106121   154928892        407  4.15%  3.89%  3.91%   0 IP Input
 284    70113217    34537588       2030  3.51%  3.98%  4.17%   0 RARP Input
   4     6663412      421278      15817  3.03%  0.43%  0.32%   0 Check heaps
 374     9872291    10805181        913  3.03%  0.77%  0.62%   0 IP SNMP
 376    11142951     5370604       2074  3.03%  0.73%  0.66%   0 SNMP ENGINE
  12    35389011    32152175       1100  2.87%  2.08%  2.20%   0 ARP Input
 128    34962407     3622140       9652  2.07%  1.69%  1.63%   0 hpm counter proc
  85    49034286     8536062       5744  1.91%  2.44%  2.44%   0 RedEarth Tx Mana
 107    25127806    46459053        540  1.27%  1.10%  0.93%   0 HLFM address lea
 174        2412        1714       1407  0.95%  0.39%  0.25%   1 SSH Process
 220     6423643    12634764        508  0.79%  0.70%  0.56%   0 ADJ resolve proc
 181     6913179     2890070       2392  0.63%  0.31%  0.36%   0 HRPC qos request
 375     1681949     5000777        336  0.47%  0.08%  0.07%   0 PDU DISPATCHER
  84    10180707    12623537        806  0.47%  0.30%  0.37%   0 RedEarth I2C dri

        1
      666666096996666666666666659666667666666666666666666766676666666656666666
      249363098992351145264823289455360612252332233522344115537230141392553343
  100       ** **               *
   90       ** **               *
   80       ** **               *
   70   * * *****  *   * * *    * ** ***   *       *     * ****         **
   60 **********************************************************************
   50 ######################################################################
   40 ######################################################################
   30 ######################################################################
   20 ######################################################################
   10 ######################################################################
     0....5....1....1....2....2....3....3....4....4....5....5....6....6....7..
               0    5    0    5    0    5    0    5    0    5    0    5    0
                   CPU% per hour (last 72 hours)
                  * = maximum CPU%   # = average CPU%

      455555555554444444444555554444455555555555555555555444444444
      922222111118888866666000009999911111555554444422222444448888
  100
   90
   80
   70
   60                                     *****
   50 ***************************************************     **
   40 **********************************************************
   30 **********************************************************
   20 **********************************************************
   10 **********************************************************
     0....5....1....1....2....2....3....3....4....4....5....5....6
               0    5    0    5    0    5    0    5    0    5    0
                CPU% per second (last 60 seconds)

      565756555555555555555555555555555556555555555555565555565556
      518841757869248569271526666733778330496833777819929379701861
  100
   90
   80    *
   70    *
   60 **** *******  **** * * *****  ***  * ***  **** **** **** *
   50 ##########################################################
   40 ##########################################################
   30 ##########################################################
   20 ##########################################################
   10 ##########################################################
     0....5....1....1....2....2....3....3....4....4....5....5....6
               0    5    0    5    0    5    0    5    0    5    0
                CPU% per minute (last 60 minutes)
               * = maximum CPU%   # = average CPU%

37 Replies

The switches I have are all new and do not have any modules in them, so it is a bit strange.

Do you know which key features I will be missing by not installing a 15.x version and instead downgrading to 12.x? I know there is Feature Navigator, but if you know them off the top of your head, let me know. Thank you.

Cisco Feature Navigator:  www.cisco.com/go/fn

Jason Aarons
Level 6

I experienced this problem too, at a large manufacturing plant during a recent upgrade. The Hulc CPU issue occurred with c3750e-universalk9-mz.122-58.SE2.bin. We upgraded to the latest c3750e-universalk9-mz.152-1.E.bin and immediately hit a bug with IPC (stack communications) that made the stack non-responsive; the console would stop responding, with tracebacks. We backed out of that fast!

We then downgraded to c3750e-universalk9-mz.122-55.SE8.bin and, voilà, the Hulc high CPU went away.
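For anyone who ends up doing the same, a minimal sketch of one way to push an image to a whole stack, assuming the .tar bundle so it lands on every member in one step (the TFTP server address is a placeholder; with a plain .bin you would copy it to each member's flashN: and set the boot variable instead):

  archive download-sw /overwrite /reload tftp://192.0.2.10/c3750e-universalk9-tar.122-55.SE8.tar

The /overwrite flag replaces the running image, and /reload restarts the stack once the copy verifies.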

We opened SR628298899, and TAC kept implying that 15% Hulc wasn't the problem (show log had tracebacks for Hulc as well). We spent about two weeks and lots of labor hours troubleshooting this. Nothing could ever be found wrong, yet we had intermittent outages (4-5 minutes every couple of hours) throughout the week. There were no issues with the MAC address table, ARP, spanning tree, CEF, queue drops, or anything else. Wireshark on the clients was clean; the traffic just seemed to vanish. Clients were unable to ping their default gateway (SVI), the problem would fix itself after 5-10 minutes, then reappear hours later. We went through several TAC engineers before trying the IOS downgrade. At first we thought the stack master might be bad and went down that route, which didn't help. Also, the IOS upgrades took forever (20 minutes per switch), which surprised many; I wasn't expecting that long of an outage.
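For reference, checks along these lines are what that kind of troubleshooting usually involves (a sketch; the MAC address, IP, VLAN, and prefix below are placeholders, not values from our network):

  ! Is the client's MAC still being learned?
  show mac address-table address 0011.2233.4455
  ! Does the gateway SVI still resolve ARP for the client side?
  show ip arp 10.1.1.1
  ! Any spanning-tree topology changes or blocked ports on the user VLAN?
  show spanning-tree vlan 10
  ! Is the subnet programmed in CEF?
  show ip cef 10.1.1.0 255.255.255.0
  ! Any drops counted at the port ASIC?
  show platform port-asic stats drop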

The only thing I can tell is that the Hulc process is now using 0.00% CPU, whereas before the downgrade it was using 15%. Whatever the problem was, only a downgrade fixed it.

Before the downgrade:

  ERROR: Total CPU Utilization is at 99% for the past 5 seconds, which is
  very high (>90%).
  This can cause the following symptoms:
    - Input queue drops
    - Slow performance
    - Slow response in Telnet or unable to Telnet to the router
    - Slow response on the console
    - Slow or no response to ping
    - Router doesn't send routing updates
  The following processes are causing excessive CPU usage:
    PID   CPU Time   Process
    170   13.35      Hulc LED Process
    395   13.18      SSH Process
    399   10.54      SSH Process


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
After the downgrade:

MICUSRTR1MDF1#show proc cpu | inc hulc
103           9        34        264  0.00%  0.00%  0.00%   0 HRPC hulc misc r
109           0       142          0  0.00%  0.00%  0.00%   0 hulc_xps_process
354           8         5       1600  0.00%  0.00%  0.00%   0 hulc cfg mgr mas
355        7231        40     180775  0.00%  0.00%  0.00%   0 hulc running con
MICUSRTR1MDF1#

The Hulc high CPU bug matches:

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtn42790

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCti61764

Fixed. show ver now reports:

c3750e-universalk9-mz.122-55.SE8.bin

  WS-C3750X-48P      12.2(55)SE8           C3750E-UNIVERSALK9-M
  WS-C3750X-24P      12.2(55)SE8           C3750E-UNIVERSALK9-M
  WS-C3750X-24P      12.2(55)SE8           C3750E-UNIVERSALK9-M

Jason Aarons, CCIE No. 38564

We upgraded to the latest c3750e-universalk9-mz.152-1.E.bin and immediately hit a bug with IPC (stack communications) that made the stack non-responsive; the console would stop responding, with tracebacks.

You are not the first to try 15.2(1)E and roll back.

Also, the IOS upgrades took forever (20 minutes per switch), which surprised many; I wasn't expecting that long of an outage.

Hmmmm ... maybe the bootstrap was upgraded during the process.

The only thing I can tell is that the Hulc process is now using 0.00% CPU, whereas before the downgrade it was using 15%. Whatever the problem was, only a downgrade fixed it.

In my personal opinion, the best IOS versions for the 3750-series family of switches range from 12.2(55)SE6 up to 12.2(55)SE8.

If you need to go to 15.X, then test IOS version 15.0(2)SE4 first.  DO NOT even try 15.0(2)SE5!
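If you do stage a test image, a cheap sanity check before reloading is to confirm the file hash and the boot variable (the filename here is illustrative; compare the hash against the one published on Cisco's download page):

  verify /md5 flash:/c3750e-universalk9-mz.150-2.SE4.bin
  show boot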

Why not 15.0(2)SE5?

Why not 15.0(2)SE5?

Because 15.0(2)SE5 is full of "surprises".  Bad ones.

If you have a spare 3750X, then load this one up (with config), leave it for one week, and verify the behaviour of this IOS.
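One way to watch it over that week without sitting on the console is a small EEM applet; a sketch, assuming EEM and SNMP are available on your IOS train (the 80% threshold and 60-second poll are arbitrary choices; the OID is cpmCPUTotal5minRev from CISCO-PROCESS-MIB):

  event manager applet HIGH-CPU
   event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.8.1 get-type exact entry-op ge entry-val 80 poll-interval 60
   action 1.0 syslog msg "5-minute CPU average crossed 80 percent"

Each time the 5-minute average crosses the threshold you get a timestamped syslog entry to correlate against any outages.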

I loaded this IOS onto my spare 3750X, and within a few days the memory/CPU were going nuts. Epic fail. ROLL BACK!

In our production network we run several 3750X switches with this software, and within three weeks no CPU or memory craziness has been identifiable (checked with Nagios every 3 minutes).
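For anyone wanting to replicate that kind of polling, the switch-side piece is just a read-only SNMP community restricted to the poller (the community string, ACL number, and host address are placeholders); Nagios then reads cpmCPUTotal5minRev (1.3.6.1.4.1.9.9.109.1.1.1.1.8) on whatever interval you choose:

  access-list 10 permit 192.0.2.50
  snmp-server community MONITOR-RO ro 10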

adventurer
Level 1

I have loaded 15.0(2)SE4 into a 3750X-48T-L and a 3560X-24T-S.

Surprisingly, the 3750X shows an average of 30%, but the 3560X is perfectly happy at 10%.

I always thought the code between these two series was similar; how could it make such a difference?

Regards.
