HarryKemp0646
Beginner

SG350X-24 switch overheating, shuts down ports under low stress

I have been receiving the following error on my SG350X-24 managed switch. It only occurs when we plug two devices into the switch's 10GbE ports (RJ45, not SFP+). Even when no data is being transferred over the 10GbE lines, the fans go from level 0 through 1, 2, 3 and 4 to level 5, followed shortly after by the following warning:

 

%Environment-A-ENV-MONITOR-CHNG: sensor 4 (Phy) temperature 99C .Temperature reached the critical threshold, All ports will be shut down in 2 minutes

 

The ports then shut down. The switch is cool to the touch during this whole process.

 

The switches are in a 25°C environment with good airflow.

 

I have two of these switches, linked together with an SFP+ cable. When I run the same test with one device connected to one RJ45 10GbE port on each switch, instead of both devices on the same switch, there is no problem.

 

Any ideas? I found another topic reporting similar issues with an SG500 a few years ago. It looks like that was resolved with a firmware update, but I can't find any info about my switch in particular. https://bst.cloudapps.cisco.com/bugsearch/bug/CSCux37567/?rfs=iqvred

 

thanks!

 

14 REPLIES
HarryKemp0646
Beginner

Here's the rest of the logs leading up to the error. Maybe this helps!

 

%Environment-A-ENV-MONITOR-CHNG: sensor 4 (Phy) temperature 99C .Temperature reached the critical threshold, All ports will be shut down in 2 minutes

2147483468 2019-May-24 14:54:44 Informational %Environment-I-FANS-SPEED-CHNG: FAN’S speed level - 3 changed to level - 2.
2147483469 2019-May-24 14:53:49 Informational %Environment-I-FANS-SPEED-CHNG: FAN’S speed level - 4 changed to level - 3.
2147483470 2019-May-24 14:53:28 Informational %Environment-I-FANS-SPEED-CHNG: FAN’S speed level - 5 changed to level - 4.
2147483471 2019-May-24 14:52:43 Alert %Environment-A-ENV-MONITOR-CHNG: sensor 4 (Phy) temperature 93C .Temperature out from the critical threshold, enable all ports
2147483472 2019-May-24 14:52:30 Warning %LINK-W-Down: te1/0/2
2147483473 2019-May-24 14:52:02 Alert %Environment-A-ENV-MONITOR-CHNG: sensor 4 (Phy) temperature 99C .Temperature reached the critical threshold, All ports will be shut down in 2 minutes
2147483474 2019-May-24 14:51:17 Alert %Environment-A-ENV-MONITOR-CHNG: sensor 4 (Phy) temperature 96C .Temperature reached the warning threshold
2147483475 2019-May-24 14:50:21 Informational %Environment-I-FANS-SPEED-CHNG: FAN’S speed level - 4 changed to level - 5.
2147483476 2019-May-24 14:50:16 Informational %Environment-I-FANS-SPEED-CHNG: FAN’S speed level - 3 changed to level - 4.
2147483477 2019-May-24 14:50:11 Informational %Environment-I-FANS-SPEED-CHNG: FAN’S speed level - 2 changed to level - 3.
2147483478 2019-May-24 14:50:08 Warning %STP-W-PORTSTATUS: te1/0/2: STP status Forwarding
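
For anyone who wants to line up the fan changes against the Phy sensor readings, here is a minimal Python sketch that parses log lines in the shape pasted above. The regular expressions, the `parse_events` helper and the field names are assumptions based only on these pasted lines, not on any documented Cisco log format, so they may need adjusting for other firmware versions.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the lines pasted in this thread:
# "<index> <YYYY-Mon-DD> <HH:MM:SS> <severity> %<TAG>: <message>"
LINE_RE = re.compile(
    r"^(?P<index>\d+)\s+"
    r"(?P<date>\d{4}-[A-Za-z]{3}-\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<severity>\w+)\s+"
    r"%(?P<tag>[\w-]+):\s*(?P<message>.*)$"
)
TEMP_RE = re.compile(r"sensor (?P<sensor>\d+) \((?P<name>\w+)\) temperature (?P<temp>\d+)C")
FAN_RE = re.compile(r"level - (?P<old>\d+) changed to level - (?P<new>\d+)")


def parse_events(text):
    """Return the recognised log lines as dicts, oldest first."""
    events = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip blank lines and anything not in the assumed shape
        # %b expects English month abbreviations (the default C time locale).
        when = datetime.strptime(f"{m['date']} {m['time']}", "%Y-%b-%d %H:%M:%S")
        event = {"when": when, "severity": m["severity"], "tag": m["tag"]}
        temp = TEMP_RE.search(m["message"])
        fan = FAN_RE.search(m["message"])
        if temp:
            event.update(kind="temperature", sensor=int(temp["sensor"]),
                         celsius=int(temp["temp"]))
        elif fan:
            event.update(kind="fan", old_level=int(fan["old"]),
                         new_level=int(fan["new"]))
        else:
            event["kind"] = "other"
        events.append(event)
    return sorted(events, key=lambda e: e["when"])


# A few lines copied from the log above.
SAMPLE = """\
2147483473 2019-May-24 14:52:02 Alert %Environment-A-ENV-MONITOR-CHNG: sensor 4 (Phy) temperature 99C .Temperature reached the critical threshold, All ports will be shut down in 2 minutes
2147483474 2019-May-24 14:51:17 Alert %Environment-A-ENV-MONITOR-CHNG: sensor 4 (Phy) temperature 96C .Temperature reached the warning threshold
2147483475 2019-May-24 14:50:21 Informational %Environment-I-FANS-SPEED-CHNG: FAN’S speed level - 4 changed to level - 5.
2147483478 2019-May-24 14:50:08 Warning %STP-W-PORTSTATUS: te1/0/2: STP status Forwarding
"""

if __name__ == "__main__":
    for e in parse_events(SAMPLE):
        if e["kind"] == "temperature":
            print(e["when"].time(), f"sensor {e['sensor']}: {e['celsius']}C")
        elif e["kind"] == "fan":
            print(e["when"].time(), f"fan level {e['old_level']} -> {e['new_level']}")
```

Sorting by timestamp puts the events in chronological order, which is easier to read than the newest-first order the switch shows.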

This seems to be a bug in the 2.5.x code line.  I had the same issue after upgrading from the 2.4 code line.  Rolling back to the 2.4 line made everything happy again.

marce1000
VIP Advisor

 

 - Switch temperature does not correlate with network traffic volume on ports. Make sure the device really is in an appropriate operating environment with respect to temperature.

 M.

I can confirm this is a bug. I had to roll back the firmware upgrade on a couple of them today because of this. One of them was the only device in the rack in a 68°F room. This is on 2.5.0.83 at least.

I'm looking into this switch. I have a couple of questions:

1. If the 10GbE RJ45 ports are not used (only the 1GbE and 10GbE SFP+ ports are used), how loud is the fan?

2. Some folks mentioned that when the 10GbE RJ45 ports are used, the fan gets very loud. Is this still true? Or has a firmware upgrade resolved this issue?

Same problem here.

I tried to build a stack using two 10 Gbit RJ45 ports (three switches, ring topology). The switches ran the latest firmware, 2.5.5.47; I also tried 2.5.0.92. The fans get loud when two 10 Gbit RJ45 ports are used, about 8500 rpm or so. Environmental conditions are OK and the switches are cold.

 

Downgrading to 2.4.0.94 resolved this issue, so it seems the whole 2.5.X.X branch is still bugged.
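
If you manage several of these switches, a small Python sketch like the one below can flag which firmware versions fall in the range people here have reported trouble with. The boundaries are assumptions drawn only from posts in this thread (issue reported on 2.5.0.83, 2.5.0.92 and 2.5.5.47, not reported on 2.4.0.94, and 2.5.7.85 suggested by Cisco staff), not from any Cisco advisory. The function names are made up for illustration.

```python
def parse_version(text):
    """Turn '2.5.0.83' into (2, 5, 0, 83) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


# Assumption: taken from the release that Cisco staff suggested in this thread.
SUGGESTED_RELEASE = parse_version("2.5.7.85")


def thread_status(version_text):
    """Classify a firmware version against what was reported in this thread."""
    version = parse_version(version_text)
    if version[:2] < (2, 5):
        return "2.4-or-earlier"        # thread: downgrading to 2.4.0.94 avoided the issue
    if version < SUGGESTED_RELEASE:
        return "2.5-before-2.5.7.85"   # thread: issue reported on several 2.5.x builds
    return "2.5.7.85-or-later"         # thread: release suggested by Cisco staff


if __name__ == "__main__":
    for v in ["2.4.0.94", "2.5.0.83", "2.5.5.47", "2.5.7.85"]:
        print(v, "->", thread_status(v))
```

Comparing tuples of integers rather than raw strings matters once a field reaches two digits, since a plain string comparison would sort "10" before "7".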

freeurmind
Beginner


Hello,

I had this issue too, but sorry folks, this is not a bug. The 2.5.X.X branch checks a temperature sensor that wasn't checked in previous versions. The root cause is that one chipset has a small heat sink that cannot dissipate all the heat the chipset generates. I use my switch in my homelab; I replaced that chipset's heat sink and all the switch temperature sensors are OK now. I think this is a design defect on the manufacturer's side, and the only way to solve the issue is to replace that heat sink.

I would suggest that all of you open a case with Cisco to request a new unit; maybe newer revisions of this switch don't have this issue. I bought my Cisco switch on eBay and have no support from Cisco, so I had to solve this issue myself.

 

This is the heat sink that cannot dissipate all the heat:

signal-2020-11-16-223243.jpeg


The heat sink I used to replace it, and a new fan I installed next to the new heat sink:

 

signal-2020-11-20-174627.jpeg

 

The temperatures before replacing that heat sink:

imageold.png

 

The switch temperatures now:

image.png

Corsair
Beginner

Dealing with the same issue. I've swapped out 3 switches so far and purchased 2 new ones, same model SG-350X. ALL of them have had this same issue in a stack, but only when plugging in the 10Gb ports. To make matters worse, it's impossible to RMA these "limited lifetime warranty" switches without a SmartNet contract. They don't even offer anywhere to open a support ticket for them.

SchmifloAT
Beginner

Hi,

We are a public school in Austria and bought 14 SG350X-48P switches during summer 2020 as access switches for our network. I can confirm that after connecting 2x 10GE interfaces, the switch fan goes from fan level 0 to level 5 within a few seconds, and in our case it reports a fan error because the RPM is too high. After downgrading to 2.4.0.94 the fan stays at level 2 and everything seems stable. With the 2.4.0.x firmware there is no PHY sensor data available. In the next few days I will check the sensor data after patching to the latest 2.5.5.x firmware. Is anyone from Cisco looking at this error, or are Cisco Small Business users not worth any effort? For us this bug is very annoying, because even at fan level 2 these PoE switches are already too loud for a study environment.

 

SG350X-Sensors-2.4.0.94.png

Any ideas from Cisco? Because of this problem we are currently planning not to buy SG350X switches as our access switches anymore. We will buy Ubiquiti or something similar. Does anyone have good suggestions for 24-port PoE access switches with 10GE uplinks (min. 2)?

@HarryKemp0646 @SchmifloAT @Corsair @freeurmind @e.kudryashov 

 

Please upgrade the firmware to the 2.5.7.85 release (https://software.cisco.com/download/home/286305760/type/282463181/release/2.5.7.85), and if the overheating issue still persists, you must raise a case with Cisco STAC on one of the following numbers:

https://www.cisco.com/c/en/us/support/web/tsd-cisco-small-business-support-center-contacts.html 

Engineers will further investigate and take the proper actions.

 

Regards,

Martin

 

Hi all, 

 

Can you please share the serial numbers (you may do it by PM if you wish)?

 

Thanks, 

Martin

SchmifloAT
Beginner

Hi,

Thanks, I will test tomorrow and then give you a follow-up.

 

Cheers,

Florian

freeurmind
Beginner

@Martin Aleksandrov 

 

Hello,

 

I applied the latest firmware update (2.5.7.85) on my SG350X-24 with SN DNI240705UK and haven't had any additional issues with this update so far. I saw that this latest firmware update has raised the temperature limit for sensor 4 to 103 degrees Celsius. I had to add a fan pointing at the heat sink of the Marvell 88X3220-BTH4 chipset to avoid reaching the temperature limit.


Cheers