
Cisco 3850 16.12.3a POE issues

AdamF1
Level 1

Good morning,

 

Is anyone else running 16.12.3a IOS on 3850 switches?

 

Here is an issue we are facing. I cannot find any documentation of a bug in this code, and it's still the recommended release to go to...

 

Issue: POE stops functioning on random ports but works on others. On the affected ports, POE will not work for Avaya phones, cameras, Cisco phones, or Cisco APs (3602, 3702, 3802).

 

Workaround: Reboot the switch, downgrade, or find a port that will provide POE.

 

We upgraded and tested on several stacks for a month or two with no issues before deploying to approximately 30 stacks of 3850s. After the mass deployment we began to see POE issues on switches, which seem to be triggered when removing or adding a POE device. Once the condition has been triggered, it does not go away until the switch is rebooted or downgraded. Logs will state: "Controller port error, interface x/x/x, power given, but machine power good wait timer timed out."
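For anyone trying to see which ports have hit this condition, here is a minimal Python sketch that scans saved log text for the message quoted above and counts hits per interface. The pattern is built only from the wording in this post; the exact syslog prefix, the interface naming, and the sample lines are assumptions made up for illustration, so adjust them to what your switches actually print.

```python
import re
from collections import Counter

# Pattern based on the message text quoted in the post. The real syslog
# prefix and punctuation on a switch may differ (assumption).
PATTERN = re.compile(
    r"Controller port error,\s*interface\s+([^\s,:]+)", re.IGNORECASE
)


def affected_interfaces(log_text: str) -> Counter:
    """Count 'Controller port error' occurrences per interface name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real switch.
    sample = (
        "Controller port error, interface Gi2/0/14, power given, "
        "but machine power good wait timer timed out\n"
        "Some unrelated log line\n"
        "Controller port error, interface Gi3/0/7, power given, "
        "but machine power good wait timer timed out\n"
    )
    for interface, count in affected_interfaces(sample).most_common():
        print(f"{interface}: {count}")
```

Note that this only helps for the variant of the problem that logs an error. Ports that silently stop supplying power will not show up here.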

 

I have found similar issues or bugs in older code releases. Have we regressed?


Hi Art,

 

How did your testing with 16.12.5b go? I am looking to upgrade 3850s that have been plagued with POE issues for a while now. This thread went quiet, so I am hoping this means good news. Thanks.


@Tmac71 wrote:

How did your testing with 16.12.5b go?


Currently have about ten stacks on 16.12.5b.  

So far, only two stacks (3850) are having SNMP and PoE issues after, get this, 6 DAYS uptime!

I have been running 16.12.5b on 8 stacks for weeks. I have a couple of stacks that I have run it on for 10 weeks with no issues.

I have it running on several heavy POE switch stacks that contain over 48 POE cameras.

 

I have not encountered any SNMP or POE issues thus far. I will deploy to a wider range of switches in the next week and most likely set this as the golden image.

We have been running 16.12.5b code on 34 Catalyst 3850s. Some upgrades only took place 4 days ago, while others have been running for 15 days now. So far no issues have been reported, but as you may remember, the issue only shows up when new PoE devices are added, which doesn't happen daily on an already set-up network.

I am wondering if whoever experienced the POE issue opened a TAC case and let Cisco investigate more. We are getting close to September, when Cisco will stop development on the 16.12 track, so whatever needs to be fixed has to be troubleshot now.

We will be doing more upgrades and should end up with approximately 140 3850 switches on this code soon. I reminded everyone on my team not to reboot a switch if the POE issue shows up, so that TAC can look at it.


@Art Astafiev wrote:

I am wondering if whoever experienced the POE issue opened a TAC case and let Cisco investigate more.


TAC Case created and troubleshooting with TAC. 

Additional info:  Aside from PoE not being delivered the two switch members (from two different stacks) are both showing "Unknown" power supply details. 

I believe both (and the same) stack members (from two different stacks) are hitting a very familiar 16.12.4 PoE/SNMP bug.

We have an urgency to get 16.12.X bugs sorted because 16.12.6 is about to come out (first week of September) and this is the last-and-final firmware for the 3650/3850.  

More bad news for anyone following this thread... I upgraded 13 more stacks to this code last night, and on the last one, 3 random ports on the 2nd switch stopped supplying POE to 2 APs and 1 phone. The odd thing is that I'm not getting any type of error this time around; the port just shows down and nothing brings it up. I moved the devices to different ports and they all came back up, still with no error in the logs. I will attempt to reboot this stack tonight to see if that resolves it, as I have a test AP plugged into one of the bad ports.

 

Leo- are you seeing the bad machine power in your logs for the stack that is experiencing the issue?


@AdamF1 wrote:

Leo- are you seeing the bad machine power in your logs for the stack that is experiencing the issue?


No, I am not seeing any logs.  
I purposely got one of my colleagues to "witness" this.  He plugged in the phone himself and confirms what I am seeing:  Ports on one particular switch member do not provide power.  Move the phone to a different switch (of the same stack) and the phone powers up.  Move the phone back to the same switch (but a different port) and no power.  No errors or logs observed.  Nothing.  Bouncing the port makes no change in behaviour.

@AdamF1, could you raise a TAC Case with what you are seeing?  I urgently want this friggin bug fixed before 16.12.6 is released in September 2021.

I'd open a case if I could, but we don't pay for SmartNet on edge switching.

I am able to move my devices to other ports on the same switch and they work fine; it just doesn't work on these 3 particular ports, which are grouped together. Other than that, my issue sounds very similar to yours: I see nothing in the logs, and the port just does nothing.

Tmac71
Level 1

I understand that the 16.12.x train has been problematic, especially with POE. Were there problems with the 16.9 train as well? I have seen some feedback regarding 16.9.6 being a good, stable version. Can anyone here attest to that? I am moving up from 16.6.7


@Tmac71 wrote:

I understand that the 16.12.x train has been problematic, especially with POE. Were there problems with the 16.9 train as well? I have seen some feedback regarding 16.9.6 being a good, stable version. Can anyone here attest to that? I am moving up from 16.6.7


I have been "holding my breath" for a long time and my frustration has gone past the "threshold".  

My answer to this question is this:  If the stack can run 3.6.X, do it.  Downgrade to the final version of 3.6.X and stay happy. 

I still have stacks on 3.6.X with an uptime of >3 years.  No problem with PoE at all.  No crash.  NOTHING.  The stacks on 3.6.X are "boring" but very stable.  

16.9.6 has been decent code for us; it's what I rolled back to after I deployed 16.12.x. I'd recommend the 16.9.x code, as it was in 16.12 that Cisco changed something that just gummed up the 3850s. I ran the 16.12 code fine on my 9300s and 9500s, as it appears the code was actually written for their ASICs.

 

The downside to running 3.6.x is that there are security vulnerabilities that you won't be able to address.

I agree about the security vulnerabilities with 3.6.X; however, it is up to the operator to determine whether they are affected and whether a workaround is possible/applicable.

@AdamF1, if you have multiple stacks with this issue, can you reboot (use the "reload" command) the affected switch member (not the entire stack) and see if this fixes it?  If not, cold reboot (pull the power cable) the affected stack member (again, not the entire stack).

Another thing: can you check whether the output of the command "sh env all" shows the same problematic switch member with "Unknown" power supply details, like this:

 

SW  PID                 Serial#      Status           Sys Pwr  PoE Pwr  Watts
--  ------------------  -----------  ---------------  -------  -------  -------
1A  PWR-C1-1100WAC      DTN2028XXXX  OK               Good     Good     1100
1B  PWR-C1-1100WAC      DTN2028XXXX  OK               Good     Good     1100
2A  PWR-C1-1100WAC      DTN2030XXXX  OK               Good     Good     1100
2B  PWR-C1-1100WAC      DTN2028XXXX  OK               Good     Good     1100
3A  Unknown             Unknown      OK               Good     Good     Unknown
3B  Unknown             Unknown      OK               Good     Good     Unknown
4A  PWR-C1-1100WAC      DTN2030XXXX  OK               Good     Good     1100
4B  PWR-C1-1100WAC      DTN2028XXXX  OK               Good     Good     1100
5A  PWR-C1-1100WAC      DTN2030XXXX  OK               Good     Good     1100
5B  PWR-C1-1100WAC      DTN2030XXXX  OK               Good     Good     1100
6A  PWR-C1-1100WAC      DTN2028XXXX  OK               Good     Good     1100
6B  PWR-C1-1100WAC      DTN2028XXXX  OK               Good     Good     1100
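
If you have many stacks to check, a rough Python sketch like the one below can flag the "Unknown" power supply rows in saved "sh env all" output. It assumes the power supply table looks like the one pasted above (whitespace-separated columns, slot label such as 3A in the first column, PID in the second); the function name is made up for this example, and how you collect the output from each stack is left out.

```python
from __future__ import annotations


def unknown_power_supplies(show_env_output: str) -> list[str]:
    """Return power supply slots (e.g. '3A') whose PID column reads 'Unknown'."""
    slots = []
    for line in show_env_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        slot, pid = fields[0], fields[1]
        # Data rows start with a slot label: member number plus 'A' or 'B'.
        # Header and separator rows fail this check and are skipped.
        if slot[:-1].isdigit() and slot[-1] in ("A", "B") and pid == "Unknown":
            slots.append(slot)
    return slots


if __name__ == "__main__":
    # Rows copied from the output in this post.
    sample = """\
SW  PID                 Serial#      Status           Sys Pwr  PoE Pwr  Watts
--  ------------------  -----------  ---------------  -------  -------  -------
2B  PWR-C1-1100WAC      DTN2028XXXX  OK               Good     Good     1100
3A  Unknown             Unknown      OK               Good     Good     Unknown
3B  Unknown             Unknown      OK               Good     Good     Unknown
"""
    print(unknown_power_supplies(sample))
```

The slot labels give you the stack member number directly, so the result tells you which member to look at for the PoE problem.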

 

16.9.5 and 16.9.6 are pretty solid from everything I have seen.  I know 16.9.7 is out but I haven't had a need to upgrade to that.  I have downgraded multiple stacks from 16.12.4 and 16.12.5 down to 16.9.6 and seen all the POE issues disappear.


@Wisco6977 wrote:

16.9.5 and 16.9.6 are pretty solid from everything I have seen.  I know 16.9.7 is out but I haven't had a need to upgrade to that.  I have downgraded multiple stacks from 16.12.4 and 16.12.5 down to 16.9.6 and seen all the POE issues disappear.


I originally upgraded from 16.9.X to 16.12.X due to a bug which causes the stack to crash.  
TAC recommended 16.12.X because the bug will not be fixed in 16.9.X.  
Now, I am forever stuck in this [expletive] train-wreck.  

oscar.garcia88
Level 1

Thanks for the comments.
