Memory Allocation Error with IOS 15.0(2)

stefan.graber
Level 1

Hi,

I updated a switch (C3560G-48PS-S) from 12.2(58)SE2 to IOS 15.0(2)SE1. Some time after the upgrade, the switch began sending memory allocation error messages to our syslog server at regular intervals (every 30 seconds).

I wasn't able to connect to the switch over SSH anymore. It wasn't even possible to access the CLI over the console cable (error message: "Low on memory; try again later"). After a reboot of the switch, it went fine for some hours, but the error appeared again. It seems that the switching process still works fine as there aren't any complaints of users about network issues.

Important: I installed the same IOS version on another switch of type C2960-24PC-L, and the memory allocation error appeared there after some hours as well. I thought the issue might be solved in the newest IOS release for that particular device, but even with IOS 15.0(2)SE2 on the C2960-24PC-L the memory allocation error happens again. Only the traceback is slightly different.

Does anyone else have the same issue with IOS 15.0(2)? Could someone give me a hint on how to solve it?

Thanks.

Error Message of C3560G-48PS-S with IOS 15.0(2)SE1

031513: Feb 22 11:00:26.848: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x2C13A88, alignment 0

Pool: Processor  Free: 693180  Cause: Memory fragmentation

Alternate Pool: None  Free: 0  Cause: No Alternate pool

-Process= "CDP Protocol", ipl= 0, pid= 205

-Traceback= 1FBAAF4z 2BF7F08z 2BFEAE4z 2C13A8Cz 1EB4DF4z 1EB8A0Cz 1EB8B00z 1E82974z 1A2ABF0z 1A2F1B0z 12EA4FCz 12EE85Cz 19EDC84z 19E83D8z

Error Message of C2960-24PC-L with IOS 15.0(2)SE1

113163: Feb 21 08:53:33.218: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x1441F44, alignment 0

Pool: Processor  Free: 421908  Cause: Memory fragmentation

Alternate Pool: None  Free: 0  Cause: No Alternate pool

-Process= "CDP Protocol", ipl= 0, pid= 179

-Traceback= E08B40z 14263C4z 142CFA0z 1441F48z D02E40z D06A58z D06B4Cz CD09C0z 880BF0z 87E4C4z 87E600z 8845F0z C53660z C5397Cz C4FB20z 14A10ACz

Error Message of C2960-24PC-L with IOS 15.0(2)SE2

011537: Feb 22 17:01:49.415: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x1442328, alignment 0

Pool: Processor  Free: 329472  Cause: Memory fragmentation

Alternate Pool: None  Free: 0  Cause: No Alternate pool

-Process= "CDP Protocol", ipl= 0, pid= 179

-Traceback= E08F18z 14267A8z 142D384z 144232Cz D0321Cz D06E34z D06F28z CD0D
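For anyone collecting these messages on a syslog server, here is a minimal Python sketch that pulls the key fields out of a %SYS-2-MALLOCFAIL report. The regular expressions are only modelled on the three reports quoted above and are an assumption, not an official format definition; `parse_mallocfail` is a hypothetical helper name. It also flags the detail visible in all three reports: the pool had more free bytes than the 65536 requested, which fits the stated cause "Memory fragmentation" (no single contiguous block was large enough).

```python
import re

# Patterns modelled on the MALLOCFAIL reports quoted in this thread (assumption:
# other IOS releases may format these lines differently).
MALLOCFAIL_RE = re.compile(
    r"%SYS-2-MALLOCFAIL: Memory allocation of (?P<requested>\d+) bytes failed "
    r"from (?P<caller>0x[0-9A-Fa-f]+), alignment (?P<alignment>\d+)"
)
# Anchored at line start so the "Alternate Pool:" line is not matched.
POOL_RE = re.compile(
    r"^\s*Pool: (?P<pool>\S+)\s+Free: (?P<free>\d+)\s+Cause: (?P<cause>.+)$",
    re.MULTILINE,
)
PROCESS_RE = re.compile(r'-Process= "(?P<process>[^"]+)"')


def parse_mallocfail(text):
    """Return a dict summarising one MALLOCFAIL report, or None if none is found."""
    head = MALLOCFAIL_RE.search(text)
    pool = POOL_RE.search(text)
    if head is None or pool is None:
        return None
    proc = PROCESS_RE.search(text)
    requested = int(head.group("requested"))
    free = int(pool.group("free"))
    return {
        "requested": requested,
        "caller": head.group("caller"),
        "pool": pool.group("pool"),
        "free": free,
        "cause": pool.group("cause").strip(),
        "process": proc.group("process") if proc else None,
        # True when the pool had enough free bytes in total, i.e. the failure
        # is about contiguity rather than the raw amount of free memory.
        "free_exceeds_request": free > requested,
    }


# The first report from the C3560G-48PS-S, copied from the post above.
SAMPLE = """031513: Feb 22 11:00:26.848: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x2C13A88, alignment 0
Pool: Processor  Free: 693180  Cause: Memory fragmentation
Alternate Pool: None  Free: 0  Cause: No Alternate pool
-Process= "CDP Protocol", ipl= 0, pid= 205
"""

report = parse_mallocfail(SAMPLE)
```

Calling `parse_mallocfail` on each report in a syslog file would give a quick time series of the `free` value, which may help show how the pool shrinks before the switch becomes unreachable.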


jawad-mukhtar
Level 4

Leo Laohoo
Hall of Fame

Unless you critically need features found in these two versions, try 12.2(55)SE6 or SE7

Sent from Cisco Technical Support Nintendo App

Basically I don't need any additional features of IOS 15.x, but the latest IOS of 12.x was already published in July 2011. Due to the security policy of my company I've to update the operating systems of our switches. Based on the download section it seems that Cisco publishes only newer versions of 15.x and 12.x won't be supported anymore.

Nevertheless, I have rolled the C3560G-48PS-S back to IOS 12.2(58)SE2 in the meantime and it works fine again. I'll try to isolate the issue on the C2960-24PC-L, which still runs the newest IOS. Maybe I'll be able to capture the output of the commands Nick listed somehow.

Nicholas Oliver
Cisco Employee

Stefan,

Let's focus on one of these, as the issue on all of them is likely the same.  Could you please upload the following outputs from one of these devices when it is seeing these memory allocation failures:

-show version

-show mem all totals

-show mem sum

-show proc mem

What was the version that was previously running on these devices?  Based on your description, this issue is only seen AFTER the upgrade, correct?

-Nick

Nick,

Correct, the issue happens after the upgrade. I have now rolled the C3560G-48PS-S back to IOS 12.2(58)SE2 and it works fine again, so it definitely seems that the memory allocation error is caused by the IOS upgrade. The C2960-24PC-L still runs IOS 15.0(2)SE2 and produces the errors mentioned above.

Unfortunately I'm not able to access the switch remotely to enter the commands, and it isn't even possible to connect to the console. When I try the console cable, the message "Low on memory; try again later" appears. The only possibility to get access to the command line would be to reboot the switch and connect to it immediately. Unfortunately, at that point the memory allocation errors don't appear, as the switch runs for some hours without any issues.

I guess the output of the commands after a reboot wouldn't be useful, right?

Thanks. Stefan

Nick,

I'm able to provide at least some information, since we save the output of 'show tech-support' weekly. The upgrade happened a few days before the last save, and the switch was already producing memory allocation error messages at that time. While looking at the output I found that the log is filled with another message:

010225: Feb 22 16:13:50.670: AAA/ATTR(00000000): cannot alloc new sublist

Any idea about that error? Did something change in the authentication method between IOS 12.x and 15.x?

Stefan

Stefan,

Running the commands immediately after a reboot would provide us with a baseline.  After the reboot, how many hours does it take before you start seeing memory allocation failures?  These low end switches operate within very tight tolerances with regards to memory usage, and a small leak can result in this type of behavior.  Looking at the output you provided there are two processes that I am concerned with:

PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process

152   0  321956340  318071124    3874320          0          0 Auth Manager

179   0    8375196    3826736    1471724          0          0 CDP Protocol

If we knew that it was say 4 hours before the issue occurred after a reboot, and we could capture the commands immediately after a reboot, and then say every hour after that until the problem occurred that would help.

-Nick

Nick,

Good idea to have a baseline. I gathered the corresponding output yesterday: I rebooted the C2960-24PC-L again and, as before, the switch didn't produce any error messages for a couple of hours. Because it isn't possible to access the switch anymore once the memory allocation errors start, I used a scheduled task to save the outputs to our server every hour. So here we go...

Please find attached all outputs of the requested commands. It took around four and a half hours until the error messages appeared again: the reboot took place at 6:32am and the first error appeared at 11:12am (local time). The following graphs additionally show the steady increase in used memory in the I/O and processor pools of that switch. Based on the output of 'show memory process' I also made some charts that show how memory usage develops for 'Auth Manager' and 'CDP Protocol' (see PDF).

Please let me know if I shall provide additional information. Thanks a lot for your help.

Stefan
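The hourly captures described above lend themselves to a simple comparison. The following Python sketch is one possible way to do it, not the method actually used in this thread: it parses the per-process rows of `show processes memory` output (column order PID, TTY, Allocated, Freed, Holding, Getbufs, Retbufs, Process, as in the rows Nick quoted) and ranks processes by how much their Holding value grew between two captures. The function names `parse_holding` and `holding_growth` are hypothetical.

```python
import re

# One data row: seven integer columns followed by the process name.
# Header and pool-summary lines do not start with a digit, so they are skipped.
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.+?)\s*$"
)


def parse_holding(output):
    """Map (pid, process name) -> Holding bytes for one captured output."""
    holding = {}
    for line in output.splitlines():
        match = ROW_RE.match(line)
        if match:
            pid = int(match.group(1))
            name = match.group(8)
            # Keyed by pid AND name because several pseudo-processes
            # (*Init*, *Dead*, *MallocLite*) share PID 0.
            holding[(pid, name)] = int(match.group(5))
    return holding


def holding_growth(before, after):
    """Return [((pid, name), delta_bytes)] sorted by largest growth first."""
    growth = []
    for key, after_bytes in after.items():
        # A process missing from the earlier capture counts as starting at 0.
        delta = after_bytes - before.get(key, 0)
        growth.append((key, delta))
    growth.sort(key=lambda item: item[1], reverse=True)
    return growth
```

Feeding the text of the first capture after the reboot and a later capture into `parse_holding`, then passing both results to `holding_growth`, should put a leaking process at the top of the list.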

Stefan,

The issue definitely appears to be a leak.  If you look at the memory there is a large increase (relatively speaking) in:

    PC          Total   Count  Name

0x00D03218    4065960      62  AAA AttrL Sub

In the capture at 0 hours this is holding about 65K, however after 6 hours it's risen to 4 MB.  As I mentioned before a small leak can have a large impact on a switch like this because it doesn't have much memory free to begin with.  This issue looks very similar to one that we fixed in 15.0(2)SE:

CSCty49762

EAP Framework and AAA AttrL Sub Uses All Process Memory

This bug deals with AAA and dot1x authentications.  However this bug is already fixed in your release.  We need to get a new bug opened for this issue and attempt to reproduce it in our labs.  Whenever it is convenient for you, I would suggest opening a TAC case, you can do that here:

http://tools.cisco.com/ServiceRequestTool/create/launch.do

When you go through that process it will ask you for your CCO user id, and you will have to select a tech and subtech that describe the problem.  Please use:

TECH: Router and IOS Architecture

SUBTECH: Memory Allocation Failure

PROBLEM CODE: Software Failure

If you have any problems in this process, let me know.

-Nick
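To put the growth of "AAA AttrL Sub" described above into numbers, here is a rough Python sketch of the arithmetic. The end value 4065960 bytes and the 6 hour interval come from Nick's post; the start value of 65000 bytes is only an approximation of "about 65K", and both function names are hypothetical. The estimate assumes linear growth, which the thread does not confirm.

```python
def leak_rate(bytes_start, bytes_end, hours):
    """Average growth in bytes per hour between two captures."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (bytes_end - bytes_start) / hours


def hours_until_exhausted(free_bytes, rate):
    """Hours until free_bytes is consumed at the given rate, or None if not growing."""
    if rate <= 0:
        return None
    return free_bytes / rate


# "About 65K" at 0 hours (approximated as 65000) and 4065960 bytes after 6 hours.
rate = leak_rate(65_000, 4_065_960, 6)
print(f"approximate leak rate: {rate:.0f} bytes/hour")
```

`hours_until_exhausted` only gives an upper bound when fed the processor pool's free bytes from a baseline capture: the MALLOCFAIL messages earlier in the thread show 65536-byte allocations failing while several hundred kilobytes were still free, so fragmentation can trigger failures well before the pool reaches zero.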

Hi Nick,

Thanks for analyzing the outputs and for the information about the memory leak. Unfortunately I'm not able to open a TAC case myself because of our maintenance contract, but I have already forwarded the information to our Cisco partner with a reference to this forum thread. As soon as I get an update on the issue, I'll let you know. There are probably others out there with the same issue on their switches.

Stefan

This also appears to be an issue in 15.0(2)SE2 on the 3750-E. I was able to get this yesterday:

sh processes memory sorted

Processor Pool Total:  175321384 Used:  163314948 Free:   12006436

      I/O Pool Total:   16777216 Used:   12939452 Free:    3837764

Driver te Pool Total:    4194304 Used:     106740 Free:    4087564

PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process

211   0  345172288  149244784   89995048      25380          0 Auth Manager   

   0   0  119855256   53253192   60774588          0          0 *Init*         

   0   0  683544756  670127884    6809520   14732139    1472786 *Dead*         

  93   0    5137024    1504012    2914280      44196          0 Stack Mgr Notifi

214   0 1248266660  331598488    2437228  635292072          0 CDP Protocol   

435   0    2170332     214372    1962324          0          0 EIGRP-IPv4     

282   0     978660       1476     952616          0          0 IPC LC Message H

   0   0          0          0     656760          0          0 *MallocLite*   

288   0     549328       7932     535484          0          0 IP RIB Update  

343   0  180616076  178111144     439804          0          0 hulc running con

   1   0   76233364   75865200     397164          0          0 Chunk Manager  

  64   0     368236        600     377796          0          0 EEM ED Identity

205   0     297412      10660     295704          0          0 HL2MCM         

204   0     294980       8236     294980          0          0 HL2MCM         

370   0     265100          0     275260     100548          0 EEM ED Syslog  

  30   0     290688          0     268584          0          0 IPC Seat RX Cont

384   0     196344          0     203504          0          0 EEM Server     

298   0     289752      76232     190556       2268          0 DHCPD Receive  

247   0    1056852     432180     169488      55836          0 802.1x switch  

415   0   15259740      13604     167108      10152          0 IGMP Input

Now I can no longer SSH or console into the switch. I get login prompts, but neither local nor AAA authentication works and no TACACS request is being sent to ACS. We upgraded three stacks for testing before deployment, and this is the only one that seems to be having the issue. It was right at a week of uptime. I opened a TAC case a short while ago.

I've created a TAC case (last week) after I got three different type of Tracebacks from a 3750G.  TAC engineer finally confirmed on Friday that they found the bug I was hitting.

No ETA as to when an engineering fix is going to appear.

Thanks for your replies. I'm glad to hear that I'm not the only one with this issue.

I have created a TAC case in the meantime as well. Unfortunately I haven't been able to provide Cisco with the requested outputs yet, as the memory allocation errors haven't reappeared since the last reboot of the switch. Nevertheless, the memory usage is increasing steadily, so I guess it's just a matter of time until it happens again.

Stefan

We are also having this issue with 15.0(2)SE2 on two different 3750 switch stacks. We upgraded our 3 HQ stacks over the weekend and now I cannot connect remotely to two of them. The other works fine. Very strange.

We will be rolling back this weekend.

Any suggestions on a good, working IOS to go to?
