09-05-2011 02:23 AM - edited 03-07-2019 02:02 AM
Hi everyone
We have a Cisco Catalyst 4500E in the core, and a couple of days ago the connection went down, causing the whole network to have a spasm (very slow, intermittent pings and no access to the core servers).
In the logs I found these threshold violation warnings/errors:
Log Buffer (4096 bytes):
Temperature low alarm; Operating value: -121.3 C, Threshold value: -4.0 C.
Sep 2 13:45:31.886: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Tx power high warning; Operating value: -0.5 dBm, Threshold value: -1.0 dBm.
Sep 2 13:45:31.886: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Temperature high alarm; Operating value: 127.5 C, Threshold value: 74.0 C.
Sep 2 13:55:31.987: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Tx power high warning; Operating value: -0.5 dBm, Threshold value: -1.0 dBm.
Sep 2 13:55:31.987: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Temperature high alarm; Operating value: 118.9 C, Threshold value: 74.0 C.
Sep 2 14:05:32.088: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Tx power high warning; Operating value: -0.5 dBm, Threshold value: -1.0 dBm.
Sep 2 14:05:32.088: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Temperature high alarm; Operating value: 111.1 C, Threshold value: 74.0 C.
Sep 2 14:15:32.189: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Tx power high warning; Operating value: -0.7 dBm, Threshold value: -1.0 dBm.
Sep 2 14:15:32.189: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Temperature high alarm; Operating value: 103.9 C, Threshold value: 74.0 C.
Sep 2 14:25:32.289: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Tx power high warning; Operating value: -0.8 dBm, Threshold value: -1.0 dBm.
Sep 2 14:25:32.289: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Temperature high alarm; Operating value: 96.0 C, Threshold value: 74.0 C.
Sep 2 14:35:32.390: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Tx power high warning; Operating value: -0.9 dBm, Threshold value: -1.0 dBm.
Sep 2 14:35:32.390: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Temperature high alarm; Operating value: 87.0 C, Threshold value: 74.0 C.
Sep 2 14:45:32.491: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: Temperature high alarm; Operating value: 79.2 C, Threshold value: 74.0 C.
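To make a log like the one above easier to scan, here is a minimal Python sketch that pulls the port, parameter, operating value and threshold out of each THRESHOLD_VIOLATION line. The regular expression is written only against the message format quoted here, and the names `VIOLATION_RE`, `parse_violations` and the `excess` field are illustrative choices, not anything provided by Cisco.

```python
import re

# Matches the message body of the THRESHOLD_VIOLATION lines quoted above.
# The group names are illustrative choices, not Cisco terminology.
VIOLATION_RE = re.compile(
    r"%SFF8472-5-THRESHOLD_VIOLATION: (?P<port>[^:\s]+): "
    r"(?P<param>.+?) (?P<level>high|low) (?P<kind>alarm|warning); "
    r"Operating value: (?P<value>-?\d+(?:\.\d+)?) (?P<unit>[^,\s]+), "
    r"Threshold value: (?P<threshold>-?\d+(?:\.\d+)?)"
)


def parse_violations(lines):
    """Return one dict per matching log line; non-matching lines are skipped."""
    events = []
    for line in lines:
        match = VIOLATION_RE.search(line)
        if match is None:
            continue
        event = match.groupdict()
        event["value"] = float(event["value"])
        event["threshold"] = float(event["threshold"])
        # Signed distance past the threshold: positive for a high violation,
        # negative for a low one.
        event["excess"] = round(event["value"] - event["threshold"], 1)
        events.append(event)
    return events


if __name__ == "__main__":
    sample = [
        "Sep 2 13:45:31.886: %SFF8472-5-THRESHOLD_VIOLATION: Te1/1: "
        "Temperature high alarm; Operating value: 127.5 C, Threshold value: 74.0 C.",
    ]
    for event in parse_violations(sample):
        print(event["port"], event["param"], event["value"], event["excess"])
```

Sorting or plotting the parsed temperature values over time makes the pattern in the log (a steady fall of several degrees every ten minutes) much easier to spot than reading the raw lines.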
The temporary fix was to take the GBIC module out and put it back in. But this morning the errors reappeared, and I am worried that the network might go down again.
The connection is on MM fibre.
Anyone seen this before? Any ideas?
Thanks
Elena
09-05-2011 08:08 AM
This could be bug related.
I had the exact same errors on 3560s and discovered I was hitting a bug.
Upgrading IOS resolved the issue.
Another possible problem could be the distance between devices.
What 10Gb modules are you using and what is the distance between both devices?
09-05-2011 08:46 AM
Hi
I am on the latest IOS, and the module is an X2-10GB-SR. The distance is only a couple of meters (basically in the room next door).
Thanks
Elena
09-05-2011 09:01 AM
Hello,
Can you please provide the following output:
sh inventory
sh interfaces te1/1 transceiver detail
sh idprom int te1/1
Thanks,
Richard
09-05-2011 09:08 AM
Hi Richard
Please see output below:
#sh inventory
NAME: "Switch System", DESCR: "Cisco Systems, Inc. WS-C4506-E 6 slot switch "
PID: WS-C4506-E , VID: V02 , SN: FOX1420GU8G
NAME: "Linecard(slot 1)", DESCR: "Supervisor 6L-E 10GE (X2), 1000BaseX (SFP) with 2 10GE X2 ports"
PID: WS-X45-SUP6L-E , VID: V01 , SN: JAE1346NXQN
NAME: "TenGigabitEthernet1/1", DESCR: "10Gbase-SR"
PID: X2-10GB-SR , VID: V02 , SN: FNS12380J55
NAME: "Linecard(slot 2)", DESCR: "10/100/1000BaseT (RJ45) with 48 10/100/1000 baseT ports"
PID: WS-X4548-GB-RJ45 , VID: V06 , SN: JAE14130MR0
NAME: "Linecard(slot 3)", DESCR: "10/100/1000BaseT (RJ45) with 48 10/100/1000 baseT ports"
PID: WS-X4548-GB-RJ45 , VID: V06 , SN: JAE14130HP5
NAME: "Fan", DESCR: "FanTray"
PID: WS-X4596-E , VID: V03 , SN: FOX1419GAKF
NAME: "Power Supply 1", DESCR: "Power Supply ( AC 1000W )"
PID: PWR-C45-1000AC , VID: V05 , SN: AZS14310A0S
NAME: "Power Supply 2", DESCR: "Power Supply ( AC 1000W )"
PID: PWR-C45-1000AC , VID: V05 , SN: AZS14310A13
#sh int te1/1 transceiver detail
ITU Channel not available (Wavelength not available),
Transceiver is internally calibrated.
mA: milliamperes, dBm: decibels (milliwatts), NA or N/A: not applicable.
++ : high alarm, + : high warning, - : low warning, -- : low alarm.
A2D readouts (if they differ), are reported in parentheses.
The threshold values are calibrated.
                                  High Alarm  High Warn  Low Warn   Low Alarm
            Temperature           Threshold   Threshold  Threshold  Threshold
Port        (Celsius)             (Celsius)   (Celsius)  (Celsius)  (Celsius)
---------   ------------------    ----------  ---------  ---------  ---------
Te1/1       -47.8                 74.0        70.0       0.0        -4.0
                                  High Alarm  High Warn  Low Warn   Low Alarm
            Voltage               Threshold   Threshold  Threshold  Threshold
Port        (Volts)               (Volts)     (Volts)    (Volts)    (Volts)
---------   ---------------       ----------  ---------  ---------  ---------
Te1/1       3.27                  N/A         N/A        N/A        N/A
            Optical               High Alarm  High Warn  Low Warn   Low Alarm
            Transmit Power        Threshold   Threshold  Threshold  Threshold
Port        (dBm)                 (dBm)       (dBm)      (dBm)      (dBm)
---------   -----------------     ----------  ---------  ---------  ---------
Te1/1       -3.8                  2.9         -1.0       -7.3       -11.3
            Optical               High Alarm  High Warn  Low Warn   Low Alarm
            Receive Power         Threshold   Threshold  Threshold  Threshold
Port        (dBm)                 (dBm)       (dBm)      (dBm)      (dBm)
-------     -----------------     ----------  ---------  ---------  ---------
Te1/1       -1.4                  2.9         -1.0       -9.9       -13.9
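As a way of reading the tables above, the following Python sketch compares one reading against its four thresholds and returns the marker from the CLI legend (`++`, `+`, `-`, `--`). The function name `classify` is hypothetical, and the use of strict comparisons is an assumption, since the output does not say whether a reading exactly equal to a threshold counts as a violation.

```python
from typing import Optional


def classify(
    value: float,
    high_alarm: Optional[float],
    high_warn: Optional[float],
    low_warn: Optional[float],
    low_alarm: Optional[float],
) -> str:
    """Return '++', '+', '-', '--', or '' when the reading is within range.

    A threshold shown as N/A in the CLI output is passed as None and ignored.
    Strict comparisons are an assumption, not confirmed by the source.
    """
    if high_alarm is not None and value > high_alarm:
        return "++"
    if low_alarm is not None and value < low_alarm:
        return "--"
    if high_warn is not None and value > high_warn:
        return "+"
    if low_warn is not None and value < low_warn:
        return "-"
    return ""


if __name__ == "__main__":
    # Readings and thresholds copied from the Te1/1 tables above.
    print("temperature:", repr(classify(-47.8, 74.0, 70.0, 0.0, -4.0)))
    print("tx power:   ", repr(classify(-3.8, 2.9, -1.0, -7.3, -11.3)))
    print("rx power:   ", repr(classify(-1.4, 2.9, -1.0, -9.9, -13.9)))
```

With the values in this output, only the temperature reading falls outside its thresholds; the transmit and receive power readings sit between their low and high warning levels.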
#sh idprom int te1/1
X2 Serial EEPROM Contents:
Non-Volatile Register (NVR) Fields
X2 Version :0x1E = MSA Version 0x1E
NVR Size in bytes :0x100
Number of bytes used :0x100
Basic Field Address :0xB
Customer Field Address :0x77
Vendor Field Address :0xA7
Extended Vendor Field Address :0x0
Reserved :0x0
Transceiver type :0x2 =X2
Optical connector type :0x1 =SC
Bit encoding :0x1 =NRZ
Normal BitRate in multiple of 1M b/s :0x2848
Protocol Type :0x1 =10GbE
Standards Compliance Codes :
10GbE Code Byte 0 :0x1 =10GBASE-SR
10GbE Code Byte 1 :0x0
SONET/SDH Code Byte 0 :0x0
SONET/SDH Code Byte 1 :0x0
SONET/SDH Code Byte 2 :0x0
SONET/SDH Code Byte 3 :0x0
10GFC Code Byte 0 :0x0
10GFC Code Byte 1 :0x0
10GFC Code Byte 2 :0x0
10GFC Code Byte 3 :0x0
Transmission range in 10m :0x1E
Fibre Type :
Fibre Type Byte 0 :0x1 =MM, Generic
Fibre Type Byte 1 :0x0 =Unspecified
Centre Optical Wavelength in 0.01nm steps - Channel 0 :0x1 0x4C 0x8
Centre Optical Wavelength in 0.01nm steps - Channel 1 :0x0 0x0 0x0
Centre Optical Wavelength in 0.01nm steps - Channel 2 :0x0 0x0 0x0
Centre Optical Wavelength in 0.01nm steps - Channel 3 :0x0 0x0 0x0
Package Identifier OUI :0xC6420
Transceiver Vendor OUI :0x269800
Transceiver vendor name :CISCO-FINISAR
Part number provided by transceiver vendor :X2-10GB-SR
Revision level of part number provided by vendor :A
Vendor serial number :FNS12380J55
Vendor manufacturing date code :2010100801
Current Reference :
5V Stressed Environment Reference 100 percent=1A :0x1
3.3V Stressed Environment Reference 100 percent=2A :0x1
APS Stressed Environment Reference 100 percent=2A :0x4
Normal APS Voltage :0x4
Digital Optical Monitoring Capability Byte :0xC1
Digital Optical Monitoring :Implemented
Low Power Start-up(LPS) Mode Capability Bit :0x1
Reserved :0x0
Basic Field Checksum :0x11
Customer Writable Area :
0x00: 58 32 2D 31 30 47 42 2D 53 52 20 20 20 20 20 20
0x10: 20 56 30 32 20 46 4E 53 31 32 33 38 30 4A 35 35
0x20: 31 30 2D 32 32 30 35 2D 30 32 20 20 41 30 20 20
Vendor Specific :
0x00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x30: 00 00 00 00 11 6C 7E 5D 2F 1C FA 54 93 BF E5 58
0x40: 35 7F EA 00 4C 00 00 00 00 00 00 00 00 00 78 DE
0x50: 89 C2 00 00 00 00 00 00 00
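For anyone wanting to cross-check the raw fields in this idprom dump, here is a small Python sketch that decodes a few of them using the units stated in the field labels (1 Mb/s, 10 m, 0.01 nm) and treats the Customer Writable Area as ASCII. Reading the multi-byte fields as big-endian is an assumption based on the order in which the bytes are printed, and the variable names are illustrative.

```python
def be_int(byte_values):
    """Combine bytes into one unsigned integer, most significant byte first."""
    return int.from_bytes(bytes(byte_values), "big")


# Raw values copied from the dump above.
bit_rate_mbps = 0x2848                              # "Normal BitRate in multiple of 1M b/s"
range_m = 0x1E * 10                                 # "Transmission range in 10m"
wavelength_nm = be_int([0x01, 0x4C, 0x08]) / 100    # "in 0.01nm steps", Channel 0

customer_area_hex = (
    "58 32 2D 31 30 47 42 2D 53 52 20 20 20 20 20 20 "
    "20 56 30 32 20 46 4E 53 31 32 33 38 30 4A 35 35 "
    "31 30 2D 32 32 30 35 2D 30 32 20 20 41 30 20 20"
)
# bytes.fromhex ignores the spaces between byte pairs.
customer_area = bytes.fromhex(customer_area_hex).decode("ascii")

if __name__ == "__main__":
    print("bit rate (Mb/s):", bit_rate_mbps)
    print("range (m):", range_m)
    print("wavelength (nm):", wavelength_nm)
    print("customer area:", repr(customer_area))
```

The decoded customer area repeats the part number, version and serial number that `sh inventory` reports for TenGigabitEthernet1/1, which is a quick way to confirm that both outputs describe the same module.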
09-05-2011 09:38 AM
Hello,
This seems strange. Avago transceivers (CSCsz81516) show the same type of issue as you describe, but this shouldn't be happening with transceivers that have FNS serial numbers like the one above. It is possible that the GBIC has become faulty. Can you please try swapping it for a new one and check whether you still see this problem?
Thanks,
Richard
*Please rate useful posts
09-05-2011 03:45 PM
I don't know if this is related, but whenever I get Avago transceivers I always send them back to Cisco in their original packaging.
I don't want to waste (down) time with the Avago-manufactured products.
09-06-2011 06:34 AM
Hi guys
Thanks for your ideas/suggestions. I guess the next step is to get the GBIC replaced.
I will post whether that's fixed the problem or not.
Thanks
Elena
12-12-2011 12:52 PM
Hello,
I am facing the same issue with a 4948 switch.
Dec 12 10:46:00: %SFF8472-5-THRESHOLD_VIOLATION: Te1/50: Temperature high alarm; Operating value: 114.8 C, Threshold value: 74.0 C.
Dec 12 13:56:02: %SFF8472-5-THRESHOLD_VIOLATION: Te1/49: Temperature low alarm; Operating value: -64.3 C, Threshold value: -4.0 C.
I am really amazed to see the high temperature at 114.8 C and the low temperature at -64.3 C. Does this really mean the 10Gig module is being heated up or cooled down?
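One way to judge whether such readings are physically meaningful is to look at how the value is encoded. The Python sketch below assumes the module reports temperature in the SFF-8472 style that the log facility name suggests: a signed 16-bit two's-complement number in units of 1/256 C. Whether these particular X2 modules use exactly that encoding is not confirmed anywhere in this thread, and `decode_temperature` is a hypothetical helper name.

```python
def decode_temperature(msb: int, lsb: int) -> float:
    """Decode a two-byte temperature reading.

    Assumption: signed 16-bit two's complement, 1/256 C per count,
    most significant byte first (the SFF-8472 convention).
    """
    raw = int.from_bytes(bytes([msb, lsb]), "big", signed=True)
    return raw / 256


if __name__ == "__main__":
    print("typical reading:", decode_temperature(0x19, 0x00))
    print("largest value:  ", decode_temperature(0x7F, 0xFF))
    print("smallest value: ", decode_temperature(0x80, 0x00))
```

Under that assumption the sensor can only express values between roughly -128 C and +128 C, so readings such as 127.5 C or -121.3 C sit at the very edge of the scale. That would be more consistent with a bad sensor readout than with a module that is really that hot or cold, but it is only a hint, not a diagnosis.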
Thanks & Regards
Ahmed...
12-12-2011 01:44 PM
That sounds like an IOS bug. Open a TAC case.
12-13-2011 03:14 AM
Hi
I have replaced the 10Gb module that was being reported as over- and under-heated, and that has fixed the problem.
From what I have read, this is usually a software problem on the GBIC itself, and I guess you could send it back to Cisco. Meanwhile, it's worth keeping a spare.
Elena