Did not find anything in Bug Tool or these forums for this so posting as a new question.
Environment is multiple stacks of 2960X-48FPD-L switches in different IDFs running IOS 15.0(2)EX5. They were installed new and configured by another party about one month ago. Current configuration is attached to this post.
The switch stack fails to rebuild/sync after a CLI reload command issued via a telnet session. The IP stack appears not to initialize, so no network connectivity comes up, even though, per the onsite staff, the master switch system LEDs are green and one of the uplinks is connected to that switch.
The system LEDs of the other switches are amber.
The remedy for the time being is for someone onsite to power cycle all the switches within the 20-second master re-election window.
I have not yet had the opportunity to go onsite to console into one of the master switches to reload and monitor the process from that perspective.
One telltale that makes me wonder about EX5 is the reload command as documented in the 15.0(2)EX Switch Manager command reference here:
should prompt with,
"Proceed to reload the whole Stack? [confirm]"
but actually only prompts with,
"Proceed with reload? [confirm]"
as if the command was 'reload slot 1' instead of just 'reload'.
I have not found anything missing or wrong in the config as far as configuration of the stack is concerned, other than that priorities and IDs were all assigned automatically/by default.
If EX5 is the issue I'm not keen on downgrading to EX3 because the uplinks are SFP-10G-LRM which had more issues with EX3 than with EX5.
Thanks in advance for your time.
Unfortunately, nearly all the IOS for the 2960X is a potential land mine. The quality control & testing of IOS before releases go public has dropped to an unprecedented low. Bugs that would've been picked up internally were left to users to find, test and report to Cisco. In my humble opinion, users are now the "official" IOS testers.
Have you tried other releases, like 15.2(2a)E1 or 15.2(3)E?
The console error message essentially says the FlexStack adapters are not Cisco authentic (they are the real deal). Only occurs on soft reloads and, in the given situation I'm facing, occurs 100% of the time. Power-cycle reloads are fine.
The TAC case is open but not going anywhere fast. They have not even suggested upgrading yet, which is bothersome in its own right.
Wishing my SEs were still selling designs using 2960S's. For my market it was the ideal switch, and it was rock-solid from day one when it was new 5 years ago.
This is a major issue. I think you should escalate the TAC ticket; otherwise, as you already know, they won't pay attention to your issue.
They actually are paying attention. Downgrading to EX3 would probably resolve this specific issue, but the most viable operation of the SFP-10G-LRM GBICs in use for this network is with EX5.
I suspect the lack of upgrade option means they don't know of a fix in the 15.2 train.
Now that they have the information they asked for I suspect it will now be escalated to the DE group. Even then we are probably looking at a week or two for a fix.
Have you tried cold booting the switch? I know it is a horrible workaround, but I ran circles around this problem on a switch I had upgraded (well, downgraded) from 15.2(3)E that was hitting this bug. Warm boots seemed to do nothing to clear the "%ILET-1-AUTHENTICATION_FAIL: This switch may not have been manufactured by Cisco" message (and so forth). Once I pulled the power cable from the switch and waited several seconds before reapplying power, the switch came up cleanly. Currently, Cisco has safe harbored 15.0(2a)EX5. Did you happen to catch which bootloader combos with 15.0(2a)EX5 seemed to have the most consistent problems with this bug?
The campus I work at is currently in the process of rolling out about 120 of these switches. We originally decided to stick with 15.0(2)EX5 until we hit an OIR problem with SFPs that TAC almost immediately acknowledged as a new development for that specific image. We as a group decided to try out 15.2(3)E and quickly found out that the bug still existed. By that point, TAC was aware of the problem, and over the weekend (2 days after opening the ticket on a Friday, if I remember right), 15.0(2a)EX5 was released.
Here is a small sampling of the switches that we have rolled out and what their bootloader + running IOS is. Note: There are a few "stable" cases running 15.2(3)E, but they will be rolled back during maintenance windows in the next few weeks. Everything deployed since 15.0(2a)EX5 became available has been running without issue on that IOS. I don't know how useful this data is, but I can continue to build a list. So far it looks like every switch I have that shipped from the factory with 15.0(2)EX5 and is being upgraded to 15.0(2a)EX5 has the 15.2(2r)E1 bootloader, with a few cases of the 15.0(2r)EX3 bootloader.
Does anyone have information on when an IOS upgrade will trigger a bootloader upgrade? One thing I can't put a finger on right now is that during the last few rack/stack/configs of new stacks I have noticed that some of the switches running 15.0(2)EX5 from the factory that are being upgraded to 15.0(2a)EX5 are receiving a bootloader upgrade while the vast majority are not.
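For anyone else building a bootloader + IOS list like the one described above, here is a minimal Python sketch of one way to tally which combinations show the soft-reload failure. The function name `tally_by_combo` and the record layout are my own assumptions, and the example rows are placeholders for illustration only, not measurements from this thread.

```python
from collections import defaultdict

def tally_by_combo(records):
    """Count switches and failures per (bootloader, IOS) combination.

    Each record is a hypothetical (bootloader, ios, failed) tuple, where
    failed is True if the switch showed the authentication error after a
    soft reload.
    """
    totals = defaultdict(lambda: [0, 0])  # combo -> [switch count, failure count]
    for bootloader, ios, failed in records:
        entry = totals[(bootloader, ios)]
        entry[0] += 1
        if failed:
            entry[1] += 1
    return {combo: tuple(counts) for combo, counts in totals.items()}

# Placeholder rows for illustration only; not data from this thread.
example_records = [
    ("15.2(2r)E1", "15.0(2a)EX5", False),
    ("15.2(2r)E1", "15.0(2a)EX5", True),
    ("15.0(2r)EX3", "15.0(2a)EX5", False),
]

for combo, (count, failures) in sorted(tally_by_combo(example_records).items()):
    print(combo, "switches:", count, "failures:", failures)
```

Filling the list from real `show version` output per stack member would make it easier to see whether one bootloader version lines up with the failures.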
The console error message essentially says the FlexStack adapters are not Cisco authentic
This is a known issue with the Flexstack module.
What I normally would do is "slap" the side of the module several times and re-insert it. I have never had to RMA a FlexStack module.
Original TAC case was seemingly resolved with release of IOS 15.0(2a)EX5. However, in reality with some versions of firmware/BOOTLDR the failure still occurs.
New details on the error include this tidbit from the real console after soft reload:
POST: ACT2 Authentication : Begin
POST: ACT2 Authentication : End, Status Failed
FlexStack Module SmartChip Authentication Failed
This later leads to the logged error message:
ILET-1-DEVICE_AUTHENTICATION_FAIL: The FlexStack Module inserted in this switch may not have been manufactured by Cisco or with Cisco's authorization.
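If you are collecting console captures from many stacks, a rough Python sketch like the following could flag the ones showing these messages. The matched strings are the ones quoted in this thread; the function name `find_auth_failures` and the labels are hypothetical, and the exact message wording may differ between releases.

```python
import re

# Strings quoted in this thread; the dictionary keys are hypothetical labels.
FAILURE_PATTERNS = {
    "act2_post_failed": re.compile(r"POST: ACT2 Authentication\s*: End, Status Failed"),
    "smartchip_failed": re.compile(r"FlexStack Module SmartChip Authentication Failed"),
    "ilet_logged": re.compile(r"ILET-1-(?:DEVICE_)?AUTHENTICATION_FAIL"),
}

def find_auth_failures(console_text):
    """Return a sorted list of the failure indicators found in console_text."""
    hits = set()
    for line in console_text.splitlines():
        for label, pattern in FAILURE_PATTERNS.items():
            if pattern.search(line):
                hits.add(label)
    return sorted(hits)

sample = """POST: ACT2 Authentication : Begin
POST: ACT2 Authentication : End, Status Failed
FlexStack Module SmartChip Authentication Failed
"""

print(find_auth_failures(sample))
```

An empty list only means none of these strings appeared in the capture; it does not prove the stack came up cleanly.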
I re-opened the TAC case, and now Cisco is obviously stalling: asking for proof of the origin/purchase of the switch and the FlexStack module, and just barely servicing the case every 72 hours.
The sad thing about this field notice is that it came nearly 2 years after the problem was originally identified. Also, my Cisco partner organization must have hit the lotto on the percentages, because we eventually identified 80% of over 100 2960Xs ordered/installed in a 3-month window as having bad FlexStack modules. Worst of all, while we can replace the physical modules, there is no recourse for recovering the cost of labor to do so.
And those are the directions from Cisco:
In my case it was just a power cycle that solved it. For now.
Except now, as of the August 2016 update to FN - 63972, you can open a TAC case and get the FlexStack module RMA'd free of charge (aka Fix-on-Failure).