04-07-2013 10:53 AM - edited 03-07-2019 12:41 PM
Hello,
we have two 6509E which are configured in VSS mode with the following modules in each:
1 48 CEF720 48 port 10/100/1000mb Ethernet
2 8 DCEF2T 8 port 10GE
3 4 CEF720 4 port 10-Gigabit Ethernet
5 5 Supervisor Engine 2T 10GE w/ CTS (Acti VS-SUP2T-10G
I need to know whether the temperatures of the attached modules are normal or whether there is a problem, and if so, how I can solve it.
Another question, please: the status of one of the supervisor engines is "MajFail", as shown below:
Mod Online Diag Status
---- -------------------
1 Pass
2 Pass
3 Pass
5 Major Error
How can I solve that problem? Is it a hardware failure?
Thanks a lot in advance.
BR,
Heba
04-07-2013 04:07 PM
Hi,
The operating temperature for a 6500 line card should be between 32 and 104°F (0 to 40°C). One of your line cards (module 1) and module 5, which is the Sup, are running above these numbers. Is the room temperature fine, or is it hot?
If the room temperature is fine, then you need to open a ticket with TAC so they can provide a solution or replace the Sup and the line cards.
HTH
04-07-2013 04:27 PM
In a normally operating system you should not see modules that hot. The major fail error is probably because you have environmental alarms for the box. Check the logs on the switch and check the fans, but most likely the place where this switch is located is much too hot for it, and you need to get more air conditioning in the room.
04-07-2013 05:04 PM
Hi,
The temperatures on this module look okay, and they are under the defined thresholds.
Environmental Conditions:
• Operating temperature: 32 to 104°F (0 to 40°C)
• Storage temperature: -40 to 167°F (-40 to 75°C)
• Relative humidity: 10 to 90 percent, noncondensing
• Regulatory compliance
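As a quick sanity check on the quoted figures, the Fahrenheit and Celsius values can be verified with a small Python sketch, and a room reading can be compared against the operating range. This assumes the 0 to 40°C range describes the ambient (room or inlet air) temperature rather than the ASIC sensors. The names `fahrenheit_to_celsius` and `within_operating_range` are hypothetical helpers, not part of any Cisco tool.

```python
# Operating range in Celsius, taken from the environmental conditions quoted above.
OPERATING_RANGE_C = (0.0, 40.0)


def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32) * 5 / 9


def within_operating_range(ambient_c, limits=OPERATING_RANGE_C):
    """Return True if an ambient reading (Celsius) lies inside the range, limits included."""
    low, high = limits
    return low <= ambient_c <= high


if __name__ == "__main__":
    # The Fahrenheit limits quoted above, converted for comparison.
    for deg_f in (32, 104, -40, 167):
        print(deg_f, "F =", fahrenheit_to_celsius(deg_f), "C")
    # Hypothetical room readings, for illustration only.
    for room_c in (22.0, 45.0):
        print(room_c, "C within operating range:", within_operating_range(room_c))
```

Both limits are treated as inclusive here, which is an assumption; the quoted specification does not say how the boundary values are handled.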
Please find below the default alarm temperatures for the ASICs. The ASIC
temperatures seen in your case are normal and are within the permissible range.
module 5 asic-1 temperature: 24C
threshold #1 for module 5 asic-1 temperature:
(sensor value >= 90C) is system minor alarm
threshold #2 for module 5 asic-1 temperature:
(sensor value >= 105C) is system major alarm
module 5 asic-2 temperature: 24C
threshold #1 for module 5 asic-2 temperature:
(sensor value >= 90C) is system minor alarm
threshold #2 for module 5 asic-2 temperature:
(sensor value >= 105C) is system major alarm
module 5 asic-3 temperature: 23C
threshold #1 for module 5 asic-3 temperature:
(sensor value >= 90C) is system minor alarm
threshold #2 for module 5 asic-3 temperature:
(sensor value >= 105C) is system major alarm
module 5 asic-4 temperature: 24C
threshold #1 for module 5 asic-4 temperature:
(sensor value >= 90C) is system minor alarm
threshold #2 for module 5 asic-4 temperature:
(sensor value >= 105C) is system major alarm
(Cmd: 'show environment alarm thresholds module 5' )
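The comparison this output describes (a reading is a minor alarm at or above threshold #1 and a major alarm at or above threshold #2) can be sketched in Python. This is a minimal illustration that assumes the text format is exactly as pasted above. `parse_thresholds` and `classify` are hypothetical helper names, not part of IOS or any Cisco library.

```python
import re

# Sample copied from the output quoted above (first two sensors of module 5).
SAMPLE = """\
module 5 asic-1 temperature: 24C
threshold #1 for module 5 asic-1 temperature:
(sensor value >= 90C) is system minor alarm
threshold #2 for module 5 asic-1 temperature:
(sensor value >= 105C) is system major alarm
module 5 asic-2 temperature: 24C
threshold #1 for module 5 asic-2 temperature:
(sensor value >= 90C) is system minor alarm
threshold #2 for module 5 asic-2 temperature:
(sensor value >= 105C) is system major alarm
"""

# Assumed line formats, based only on the pasted output.
READING = re.compile(r"^module (\d+) (\S+) temperature: (-?\d+)C$")
LIMIT = re.compile(r"^\(sensor value >= (-?\d+)C\) is system (minor|major) alarm$")


def parse_thresholds(text):
    """Return {(module, sensor): {"value": int, "minor": int, "major": int}}."""
    sensors = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = READING.match(line)
        if match:
            current = (int(match.group(1)), match.group(2))
            sensors[current] = {"value": int(match.group(3))}
            continue
        match = LIMIT.match(line)
        if match and current is not None:
            # Attach the limit to the most recently seen sensor reading.
            sensors[current][match.group(2)] = int(match.group(1))
    return sensors


def classify(entry):
    """Apply the ">=" rule from the output: check major first, then minor."""
    if "major" in entry and entry["value"] >= entry["major"]:
        return "major alarm"
    if "minor" in entry and entry["value"] >= entry["minor"]:
        return "minor alarm"
    return "normal"


if __name__ == "__main__":
    for (module, sensor), entry in parse_thresholds(SAMPLE).items():
        print(module, sensor, entry["value"], classify(entry))
```

With the readings quoted above (23C to 24C against 90C and 105C limits), every sensor falls in the "normal" branch.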
My explanation:
1. If you use the "show environment alarm threshold module 5" command,
you will see that the low threshold for asic-1, asic-2, asic-3, and asic-4
is 90C. These are the fabric ASICs on the module, and these temperatures are
internal to the ASIC and are therefore expected to be much higher than the
inlet and outlet temperatures measured on the surface of the Supervisor main
board.
2. Notice the asic-1, asic-2, asic-3, and asic-4 temperatures. These are measured by sensors that are part
of the fabric ASICs on the line card, and these temperatures are expected to be much
higher than either the inlet or outlet temperatures (even in this case, where there is no traffic
traversing the switch).
The thresholds cannot be changed.
3. ASICs have heat sinks on top of them (if you look at a 6k module, you will see big black bricks on top of key ASICs), and it is normal for them to run at a high temperature, as the heat sinks will take care of them.
Below are the things that can be checked at the DC:
==================================================
1. Is there enough space for heat dissipation?
2. Verify the temperature setting at the location of the device to ensure that it is adequately cooled.
3. Airflow is another thing to consider. If the device is surrounded by many other devices and doesn't have enough space for airflow (in and out), the ASIC sensor readings can rise a bit.
B) Regarding the module showing the major alarm:
Please do the following:
1. Re-seat the card and check whether the error still occurs.
2. If the error persists, set the boot-up level to complete and then reload only the card.
3. If the reseat and the reload do not fix it, it could be a hardware issue; replace the card through RMA.
HTH
Regards
Inayath
*Please rate if this info is useful.