Is anyone else running IOS XE 16.12.3a on 3850 switches?
Here is an issue we are facing. I cannot find any documentation of a bug in this code, and it's still the recommended release to go to...
Issue: PoE stops functioning on random ports but works on others. When it fails, PoE will not work for Avaya phones, cameras, Cisco phones, or Cisco APs (3602, 3702, 3802).
Workaround: Reboot the switch, downgrade, or find a port that will provide PoE.
We upgraded and tested on several stacks for a month or two with no issues before deploying to approximately 30 stacks of 3850s. After the mass deployment we began to see PoE issues on switches, which seem to be triggered when removing or adding a PoE device. Once the condition has been triggered, it does not go away until the switch is rebooted or downgraded. Logs will state "Controller port error, interface x/x/x, power given, but machine power good wait timer timed out."
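For anyone who wants to check their own stacks for the same condition, here is a rough Python sketch that scans collected syslog lines for the message above and counts hits per interface. The function name `affected_ports`, the sample lines, and the interface names are all made up for illustration, and the regex assumes the wording quoted above (it also tolerates a colon after the interface name); adjust it to whatever your logs actually show.

```python
import re
from collections import Counter

# Assumption: the message wording matches what is quoted in this post.
# Capitalisation and the separator after the interface name may differ
# between releases, so the pattern is case-insensitive and accepts ',' or ':'.
PATTERN = re.compile(
    r"Controller port error,\s*interface\s+(?P<intf>\S+?)[,:]?\s+power given",
    re.IGNORECASE,
)


def affected_ports(log_lines):
    """Return a Counter of interface name -> number of matching log lines."""
    counts = Counter()
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group("intf")] += 1
    return counts


# Hypothetical sample lines, not taken from a real switch.
sample = [
    "Controller port error, interface Gi1/0/12, power given, "
    "but machine power good wait timer timed out",
    "Interface Gi1/0/3, changed state to up",
    "Controller port error, interface Gi2/0/5, power given, "
    "but machine power good wait timer timed out",
]

for intf, hits in sorted(affected_ports(sample).items()):
    print(f"{intf}: {hits} controller port error(s)")
```

Feeding it the lines from a syslog server export (one string per line) should give a quick list of ports that have hit the condition, which makes it easier to see whether the problem follows a particular stack member or port range.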
I have found similar issues or bugs in older code versions. Have we regressed?
Did this happen after the upgrade to 16.12.3a?
In my opinion PoE is a hardware function, not directly OS related, but...
there may be a change in defaults for CDP/LLDP that can trigger your issue.
I did encounter an incompatibility between Cisco PoE+ and a (heated) external camera housing.
That was not IOS version related, but it behaved differently on different platforms.
The initial PoE current draw was too high, so the Cisco switch shut down the port temporarily, and this process looped.
The vendor claimed that how to handle this is unspecified behaviour within the PoE+ standard.
Yes, it was after upgrading. I suspected that at first and actually RMA'd the switch, but the issue started popping up weeks later across multiple sites.
Downgrading to 16.9.5, upgrading to 16.12.4, or rebooting resolves the issue. I have not run either code long enough yet to see if it will happen again. I have had to do this to about 8 switch stacks so far.
These are similar bugs, but they state the issue was fixed in prior releases. As we all know, a bug can regress and reappear in later code.
The 3650/3850 has some design defects, and one of them is a hardware bug known as the MOSFET issue (which is what the two bugs you've mentioned describe).
Hard reboot (pull the power) is only a workaround.
The only way to fix this is to RMA the appliance.
Thanks for the response.
I do not consider this to be a hardware issue when a reboot or a downgrade/upgrade resolves it. If it were a hardware issue, it should pop right back up after performing the steps above. I would also have expected this bug to show up in the many 3.x.x upgrades we have performed on these switches over the past 5 years. The issue has appeared on 3850 switches manufactured from 2013 through 2018, and only after upgrading to 16.12.3a. This code version also has a lot of SNMP issues that prevent you from polling certain OIDs.
I am going to deploy 16.12.4 to a larger sample pool to see if the issue recurs or can be triggered.
Well, it was deployed to a larger sample, and the issue appeared in a stack running 16.12.4 after 4 weeks.
Is no one else experiencing this issue?
What code is everyone else running on their 3850s?