56648 Views · 20 Helpful · 37 Replies

High CPU usage - 3750-X stack

johnramz
Level 1

We recently configured a stack of four 48-port 3750-X switches and are noticing high CPU usage. The "Hulc LED Process" seems pretty high.

This has coincided with our VMware servers getting slow and non-responsive at times; perhaps a coincidence, I'm not sure.

Below are some outputs that might help diagnose it.

Thanks

John

System image file is "flash:/c3750e-ipbasek9-mz.122-58.SE2/c3750e-ipbasek9-mz.122-58.SE2.bin"

show inventory output:

NAME: "1", DESCR: "WS-C3750X-48"
PID: WS-C3750X-48T-S   , VID: V02  ,

NAME: "Switch 1 - Power Supply 0", DESCR: "FRU Power Supply"
PID: C3KX-PWR-350WAC   , VID: V02L ,

NAME: "2", DESCR: "WS-C3750X-48"
PID: WS-C3750X-48T-S   , VID: V02

NAME: "Switch 2 - Power Supply 0", DESCR: "FRU Power Supply"
PID: C3KX-PWR-350WAC   , VID: V02D ,

NAME: "3", DESCR: "WS-C3750X-48"
PID: WS-C3750X-48T-S   , VID: V02

NAME: "Switch 3 - Power Supply 0", DESCR: "FRU Power Supply"
PID: C3KX-PWR-350WAC   , VID: V02L ,

NAME: "4", DESCR: "WS-C3750X-48"
PID: WS-C3750X-48T-S   , VID: V02

NAME: "Switch 4 - Power Supply 0", DESCR: "FRU Power Supply"
PID: C3KX-PWR-350WAC   , VID: V02L ,

SWITCH#sh processes cpu sorted

CPU utilization for five seconds: 61%/5%; one minute: 50%; five minutes: 49%

 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 168   260466386    44948517       5794 14.53% 13.98% 13.70%   0 Hulc LED Process
 231    97586088    27253906       3580  4.95%  4.73%  4.64%   0 Spanning Tree
 213    63106121   154928892        407  4.15%  3.89%  3.91%   0 IP Input
 284    70113217    34537588       2030  3.51%  3.98%  4.17%   0 RARP Input
   4     6663412      421278      15817  3.03%  0.43%  0.32%   0 Check heaps
 374     9872291    10805181        913  3.03%  0.77%  0.62%   0 IP SNMP
 376    11142951     5370604       2074  3.03%  0.73%  0.66%   0 SNMP ENGINE
  12    35389011    32152175       1100  2.87%  2.08%  2.20%   0 ARP Input
 128    34962407     3622140       9652  2.07%  1.69%  1.63%   0 hpm counter proc
  85    49034286     8536062       5744  1.91%  2.44%  2.44%   0 RedEarth Tx Mana
 107    25127806    46459053        540  1.27%  1.10%  0.93%   0 HLFM address lea
 174        2412        1714       1407  0.95%  0.39%  0.25%   1 SSH Process
 220     6423643    12634764        508  0.79%  0.70%  0.56%   0 ADJ resolve proc
 181     6913179     2890070       2392  0.63%  0.31%  0.36%   0 HRPC qos request
 375     1681949     5000777        336  0.47%  0.08%  0.07%   0 PDU DISPATCHER
  84    10180707    12623537        806  0.47%  0.30%  0.37%   0 RedEarth I2C dri

        1
      666666096996666666666666659666667666666666666666666766676666666656666666
      249363098992351145264823289455360612252332233522344115537230141392553343
  100       ** **               *
   90       ** **               *
   80       ** **               *
   70   * * *****  *   * * *    * ** ***   *       *     * ****         **
   60 **********************************************************************
   50 ######################################################################
   40 ######################################################################
   30 ######################################################################
   20 ######################################################################
   10 ######################################################################
     0....5....1....1....2....2....3....3....4....4....5....5....6....6....7..
               0    5    0    5    0    5    0    5    0    5    0    5    0
                   CPU% per hour (last 72 hours)
                  * = maximum CPU%   # = average CPU%

      455555555554444444444555554444455555555555555555555444444444
      922222111118888866666000009999911111555554444422222444448888
  100
   90
   80
   70
   60                                     *****
   50 ***************************************************     **
   40 **********************************************************
   30 **********************************************************
   20 **********************************************************
   10 **********************************************************
     0....5....1....1....2....2....3....3....4....4....5....5....6
               0    5    0    5    0    5    0    5    0    5    0
                CPU% per second (last 60 seconds)

      565756555555555555555555555555555556555555555555565555565556
      518841757869248569271526666733778330496833777819929379701861
  100
   90
   80    *
   70    *
   60 **** *******  **** * * *****  ***  * ***  **** **** **** *
   50 ##########################################################
   40 ##########################################################
   30 ##########################################################
   20 ##########################################################
   10 ##########################################################
     0....5....1....1....2....2....3....3....4....4....5....5....6
               0    5    0    5    0    5    0    5    0    5    0
                CPU% per minute (last 60 minutes)
               * = maximum CPU%   # = average CPU%

37 Replies

The switches I have are all new and do not have any modules in them, so it is a bit strange.

Do you know which key features I will be missing by not installing a 15.x version and instead downgrading to 12.x? I know there is Feature Navigator, but if you know them off the top of your head, let me know. Thank you.

Cisco Feature Navigator:  www.cisco.com/go/fn

Jason Aarons
Level 6

I experienced this problem too, at a large manufacturing plant during a recent upgrade. The Hulc CPU issue occurred with c3750e-universalk9-mz.122-58.SE2.bin. We upgraded to the latest c3750e-universalk9-mz.152-1.E.bin and immediately hit a bug with IPC (stack communications) that made the stack non-responsive; the console would stop responding, with tracebacks. We backed out of that fast!

We then downgraded to c3750e-universalk9-mz.122-55.SE8.bin and, voilà, the Hulc high CPU went away.
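For anyone who ends up doing the same, a minimal sketch of one way to push an image to a whole stack, assuming the .tar bundle so it lands on every member in one step (the TFTP server address is a placeholder; with a plain .bin you would copy it to each member's flashN: and set the boot variable instead):

  archive download-sw /overwrite /reload tftp://192.0.2.10/c3750e-universalk9-tar.122-55.SE8.tar

The /overwrite flag replaces the running image, and /reload restarts the stack once the copy verifies.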

We opened SR628298899, and TAC kept implying that 15% Hulc wasn't the problem (show log had tracebacks for Hulc as well). We spent about two weeks and lots of labor hours troubleshooting this. Nothing could ever be found wrong, yet we had intermittent outages (4-5 minutes every couple of hours) throughout the week. There were no issues with the MAC address table, ARP, spanning tree, CEF, queue drops, or anything else. Wireshark on the clients was clean; the traffic just seemed to vanish. Clients were unable to ping their default gateway (SVI), the problem would fix itself after 5-10 minutes, then reappear hours later. We went through several TAC engineers before trying the IOS downgrade. At first we thought the stack master might be bad and went down that route, which didn't help. Also, the IOS upgrades took forever (20 minutes per switch), which surprised many; I wasn't expecting that long of an outage.
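For reference, checks along these lines are what that kind of troubleshooting usually involves (a sketch; the MAC address, IP, VLAN, and prefix below are placeholders, not values from our network):

  ! Is the client's MAC still being learned?
  show mac address-table address 0011.2233.4455
  ! Does the gateway SVI still resolve ARP for the client side?
  show ip arp 10.1.1.1
  ! Any spanning-tree topology changes or blocked ports on the user VLAN?
  show spanning-tree vlan 10
  ! Is the subnet programmed in CEF?
  show ip cef 10.1.1.0 255.255.255.0
  ! Any drops counted at the port ASIC?
  show platform port-asic stats drop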

The only thing I can tell is that the Hulc process is now using 0.00% CPU, whereas before the downgrade it was using 15%. Whatever the problem was, only a downgrade fixed it.

Before the downgrade:

  ERROR: Total CPU Utilization is at 99% for the past 5 seconds, which is
  very high (>90%).
  This can cause the following symptoms:
    - Input queue drops
    - Slow performance
    - Slow response in Telnet or unable to Telnet to the router
    - Slow response on the console
    - Slow or no response to ping
    - Router doesn't send routing updates
  The following processes are causing excessive CPU usage:
    PID   CPU Time   Process
    170   13.35      Hulc LED Process
    395   13.18      SSH Process
    399   10.54      SSH Process


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
After the downgrade:

MICUSRTR1MDF1#show proc cpu | inc hulc
103           9        34        264  0.00%  0.00%  0.00%   0 HRPC hulc misc r
109           0       142          0  0.00%  0.00%  0.00%   0 hulc_xps_process
354           8         5       1600  0.00%  0.00%  0.00%   0 hulc cfg mgr mas
355        7231        40     180775  0.00%  0.00%  0.00%   0 hulc running con
MICUSRTR1MDF1#

The Hulc high CPU bug matches:

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtn42790

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCti61764

Fixed. show ver now reports:

c3750e-universalk9-mz.122-55.SE8.bin

  WS-C3750X-48P      12.2(55)SE8           C3750E-UNIVERSALK9-M
  WS-C3750X-24P      12.2(55)SE8           C3750E-UNIVERSALK9-M
  WS-C3750X-24P      12.2(55)SE8           C3750E-UNIVERSALK9-M

Jason Aarons, CCIE No. 38564

We upgraded to the latest c3750e-universalk9-mz.152-1.E.bin and immediately hit a bug with IPC (stack communications) that made the stack non-responsive; the console would stop responding, with tracebacks.

You are not the first to try 15.2(1)E and roll back.

Also, the IOS upgrades took forever (20 minutes per switch), which surprised many; I wasn't expecting that long of an outage.

Hmmmm ... maybe the bootstrap was upgraded during the process.

The only thing I can tell is that the Hulc process is now using 0.00% CPU, whereas before the downgrade it was using 15%. Whatever the problem was, only a downgrade fixed it.

In my personal opinion, the best IOS versions for the 3750-series family of switches range from 12.2(55)SE6 up to 12.2(55)SE8.

If you need to go to 15.X, then test IOS version 15.0(2)SE4 first.  DO NOT even try 15.0(2)SE5!
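If you do stage a test image, a cheap sanity check before reloading is to confirm the file hash and the boot variable (the filename here is illustrative; compare the hash against the one published on Cisco's download page):

  verify /md5 flash:/c3750e-universalk9-mz.150-2.SE4.bin
  show boot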

Why not 15.0(2)SE5?

Why not 15.0(2)SE5?

Because 15.0(2)SE5 is full of "surprises".  Bad ones.

If you have a spare 3750X, then load this one up (with config), leave it for one week, and verify the behaviour of this IOS.
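One way to watch it over that week without sitting on the console is a small EEM applet; a sketch, assuming EEM and SNMP are available on your IOS train (the 80% threshold and 60-second poll are arbitrary choices; the OID is cpmCPUTotal5minRev from CISCO-PROCESS-MIB):

  event manager applet HIGH-CPU
   event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.8.1 get-type exact entry-op ge entry-val 80 poll-interval 60
   action 1.0 syslog msg "5-minute CPU average crossed 80 percent"

Each time the 5-minute average crosses the threshold you get a timestamped syslog entry to correlate against any outages.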

I loaded this IOS onto my spare 3750X, and within a few days the memory/CPU were going nuts. Epic fail. ROLL BACK!

In our production network we run several 3750X switches with this software, and within three weeks no CPU or memory craziness has been identifiable (checked with Nagios every 3 minutes).
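For anyone wanting to replicate that kind of polling, the switch-side piece is just a read-only SNMP community restricted to the poller (the community string, ACL number, and host address are placeholders); Nagios then reads cpmCPUTotal5minRev (1.3.6.1.4.1.9.9.109.1.1.1.1.8) on whatever interval you choose:

  access-list 10 permit 192.0.2.50
  snmp-server community MONITOR-RO ro 10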

adventurer
Level 1

I have loaded 15.0(2)SE4 into a 3750X-48T-L and a 3560X-24T-S.

Surprisingly, the 3750X shows an average of 30%, but the 3560X is perfectly happy at 10%.

I always thought the code between these two series was similar; how could it make such a difference?

Regards.
