03-06-2013 12:45 AM - edited 03-07-2019 12:05 PM
This is fyi.
We have a serious memory leak problem on our C3750 stacks. We have hundreds of such stacks, but the problem appears only on those with many features enabled, such as many VLANs, many subnets, HSRP, STP root bridge, etc.
The symptom is that we first lose SSH, and an error message like this appears in the log:
Feb 13 13:36:29.336: %AAA-3-ACCT_LOW_MEM_UID_FAIL: AAA unable to create UID for incoming calls due to insufficient processor memory
then we lose telnet (we normally do not use telnet, but it is enabled on those stacks),
then we are not able to log in through the console, getting errors like this on the console:
%% Low on memory; try again later
and then the switch loses its L3 and L2 functionality and needs to be restarted.
The whole process takes some time, about two weeks. It develops slowly; it is not a sudden strike.
We reported this to Cisco about a month ago. They tried to match it to these bugs
but it looks like it is a new bug.
We tried various IOS v15 versions, but that did not help. We provided a lot of information from running switches and the Cisco development team is involved, but the root cause is not known yet. Cisco's advice was to reboot the switch regularly.
Internally we decided to downgrade IOS on one of these switches. We went to 122-58.SE2 on 23 Feb and have not had any issues since. Of course there is no guarantee that we will not, but so far we are happy and are going to downgrade the other ones as well.
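For anyone who wants to check whether a stack is trending the same way, here is a minimal sketch of exec-mode commands to run every few days and compare. The choice of commands is an assumption, not something taken from the TAC case, and the exact output layout varies by release.

```
! Uptime, to relate the memory figures to time since the last reload
show version | include uptime
! Processor pool totals; watch whether Free(b) keeps shrinking between samples
show memory statistics
! Per-process memory, largest first; a steadily growing Holding value is the suspect
show processes memory sorted
```

If free processor memory drops steadily between samples while traffic and configuration stay the same, the stack is probably on the path described above.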
05-29-2013 07:17 AM
Hello,
Is there any update on this case? We did an update to the latest IOS v15 from v12 and now we have exactly the same issue.
Do we need to downgrade, or is there a real solution available? I cannot believe Cisco hasn't provided a solution yet, and we don't want to downgrade if it is not necessary.
Tnx!
05-29-2013 04:36 PM
Unless you have an operational feature requirement to run 15.0, avoid this version.
I would recommend using 12.2(55)SE7.
If you have 802.1x, avoid 15.0 at all costs.
Sent from Cisco Technical Support Nintendo App
05-29-2013 05:17 PM
Sad to see you are still dealing with this type of issue, as this has been going on in the 3750 since early 12.X code. We had this issue with something like 12.2.35 SE about 5 years ago.
05-29-2013 08:05 PM
Hi Glen,
Just want to let you (and anyone else reading this thread) know that I've started rolling BACK my fleet of 3560 and 3750 from 15.0(2)SE2 to 12.2(55)SE7.
I'm hitting multiple bugs in this version with our implementation of 802.1x.
05-31-2013 12:10 AM
Cisco has been working on the TAC case since February and has not yet given us a clear answer on what the root cause is and how to remedy it.
It definitely depends on the number of MAC addresses: one of the stacks never had a problem until we added three more members and connected more devices, and then it hit the issue very fast. We do not use dot1x, but one of the suspected processes on Cisco's radar was, and maybe still is, the HULC DOT1X Process.
We took our own internal measures and downgraded a couple of the switches we had the problem with from v15 to 122-58.SE2, and we have not had any trouble since. Generally speaking, if we experience this trouble on a C3750 stack with v15, we collect the info from it and send it to Cisco, but immediately downgrade it to 122-58.SE2, so honestly Cisco does not have much time to dig into it more deeply. They tried to simulate it in their lab, but were not able to reproduce it. I understand that, as the issue surely develops in our environment only under very specific conditions: we have hundreds of similar systems and have hit the issue on 6 of them so far.
We have reported several other issues with v15 on C3750 to Cisco, some cosmetic, some more serious. I think that because v12 is an older software train, many bugs have been reported and fixed for it; it is a very mature version of IOS. The v15 versions, on the other hand, are younger and will follow new business demands and include new features and enhancements. But for the time being we are okay with what is available in v12, so we can use it.
06-03-2013 02:24 PM
The fix below has fixed all our problems so far:
Apparently, in the IOS there's an "Auth manager" that can monitor all
sessions in a switch. Starting from the 15.0 IOS stream this feature is
enabled by default. As the bug describes:
"Auth Manager continues to hold more memory in Processor Pool locking
out access to the 3750 stack unless switch is rebooted"
To prevent this 'leakage' of memory to the processor, you have to
disable this session monitoring by issuing the following command:
no macro auto monitor
Memory and CPU have been very stable so far.
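Entered from the CLI, the workaround above would look roughly like the sketch below. Only `no macro auto monitor` comes from the post; the verification and save steps are assumed good practice, and whether the line shows up in the running configuration may depend on the release.

```
configure terminal
 no macro auto monitor
end
! Assumed check: see whether the setting is reflected in the running configuration
show running-config | include macro auto
! Save so the change survives a reload
copy running-config startup-config
```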
12-28-2015 11:34 AM
Thanks arseus001, the scenario you provided solved my trouble with the C3750X memory leak.
06-04-2013 12:45 PM
I had memory leaks and high CPU issues with both v15 and 12.2.58 on our 3750/3750X stacks. Followed Leo's recommendation (12.2.55) a couple of months back and have been solid since.
06-04-2013 11:47 PM
'no macro auto monitor' did not help in our case.
We have the latest news from Cisco TAC:
---------I am writing to update you about my findings. Yesterday I worked on another memory leak on 3750 switches and found other bugs related to memory leaks in v15 software for this platform:
http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCud60602
http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCuf32893
Both of them have their root cause in the bugs below (confirmed by DE in other SRs to be duplicates of them):
http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCub85948
http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCue92705
When we add the previously found bugs CSCuc03649 and CSCtt96255, I believe the risk of a memory leak on 15.0.2-SE is quite big. The strange thing is that these bugs are noted as found in v15 for the 3750 but are not fixed in those versions (they are fixed only in 15.3 and 15.2, which are not available for the 3750).
I am asking the development team why the fix for this bug is not present in the 15.0.2-SE software train.
06-04-2013 11:56 PM
Vlad,
I stand by my recommendation. Avoid using 12.2(58)SE, 15.0(1)SE and 15.0(2)SE. Downgrade to 12.2(55)SE7 instead.
If you really, really have an operational feature requirement in 15.0, then consider 15.0(2)SE2.
WARNING: If you plan to use 15.0(2)SE2 on 3560 and 3750, make sure you are not using Dot1x. If you are using Dot1x, then 15.0(2)SE2 will not be a good version for you.
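A quick way to see whether a given switch falls under this warning is to check the running release and whether 802.1X is configured. This is only a sketch: the include patterns are assumptions and may need adjusting for your configuration.

```
! Running release
show version | include IOS
! Global and per-port 802.1X configuration lines, if any
show running-config | include dot1x
```

No output from the second command suggests Dot1x is not configured on that switch.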
06-07-2013 05:31 AM
Cisco informs us that the bugs CSCud60602, CSCuf32893, CSCub85948 and CSCue92705 are fixed in the most recent version,
15.0(2)SE3, released on 5 June.
We will give it a try and will upgrade one of the problematic switches to this version.
06-08-2013 12:25 AM
15.0(2)SE3 released on 5 June.
BREAKING NEWS: DO NOT, under any circumstances, upgrade to 15.0(2)SE3. If you have TACACS configured and your switch boots up on this version, you will NOT be able to access your switch via Telnet, SSH or console.
The only way is to "break-in" via the "Mode" button.
There are two methods to take back control of your switch. And they are:
1. Easy Method:
2. Slightly Easy Method
Hope this helps.
11-13-2013 10:58 AM
I'm seeing this and just upgraded to 15.0(2)SE4. Looks like the bug is still around.
11-13-2013 01:38 PM
I'm seeing this and just upgraded to 15.0(2)SE4.
It shouldn't be re-appearing. I've got fleets of 15.0(2)SE4 and I can confirm the issue has been fixed.