10-16-2020 02:13 PM - edited 07-05-2021 12:39 PM
Hello all, I've got another question. My 5520 has been flooding syslog with this message:
*OctTempMonitor: env_monitor.c:101 Check sensor fail: Octeon Temperature, errno 0, errstring Success
So far I haven't seen where it negatively affects the performance of the device. The operating temperatures seem ok (it's in a well cooled server room):
Operating Environment............................ Commercial (10 to 35 C)
Internal Temp Alarm Limits....................... 10 to 38 C
Internal Temperature............................. +15 C
Fan Status....................................... OK
I haven't been able to locate this error online yet. Has anyone ever seen it before? Is it cause for alarm?
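For anyone who wants to sanity check readings like the ones above from a saved copy of the output, here is a minimal Python sketch. It assumes the output has been pasted into a string in the same "Label.... value" layout shown in this post. The function names `parse_fields` and `temperature_in_limits` are made up for illustration and are not part of any Cisco tool.

```python
import re

# Sample text copied from the output in this post.
SAMPLE = """\
Operating Environment............................ Commercial (10 to 35 C)
Internal Temp Alarm Limits....................... 10 to 38 C
Internal Temperature............................. +15 C
Fan Status....................................... OK
"""


def parse_fields(text):
    """Split 'Label........ value' lines into a {label: value} dict."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(.*?)\.{2,}\s*(.*)$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def temperature_in_limits(fields):
    """Return True if the internal temperature lies inside the alarm limits."""
    limits = re.match(r"(-?\d+) to (-?\d+) C", fields["Internal Temp Alarm Limits"])
    reading = re.match(r"([+-]?\d+) C", fields["Internal Temperature"])
    low, high = int(limits.group(1)), int(limits.group(2))
    return low <= int(reading.group(1)) <= high


if __name__ == "__main__":
    fields = parse_fields(SAMPLE)
    print("Fan status:", fields["Fan Status"])
    print("Temperature within alarm limits:", temperature_in_limits(fields))
```

This only confirms that the reported temperature is inside the configured limits. It says nothing about why the sensor check itself is failing.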
10-16-2020 10:35 PM
- This looks like a bug. If you haven't already done so, consider upgrading to the advisory release, provided one is available and an upgrade is feasible. Check whether the problem persists afterwards.
M.
10-17-2020 06:32 AM
We've actually got a TAC case open for this on 8540 at the moment. Wireless Network Business Unit dev team are thinking about it ...
We think it's a WLC software bug because a reboot clears the problem. It appears to me that the WLC loses its connection with the CIMC (or IMM as they still call it). The CIMC can still see the sensors fine. A side effect is that, because the WLC can't read the CIMC sensors, it sets the fans to full speed to prevent possible overheating, which increases wear on the fans.
If it's the same problem that we're seeing then you'll get zero response to the show imm chassis commands:
(wlc) >show imm chassis fan
(wlc) >show imm chassis fan-profile
(wlc) >show imm chassis temperature
(wlc) >
But on CIMC:
WLC-CIMC# scope sensor
WLC-CIMC /sensor # show fan
Name                 Sensor Status        Reading    Units      Min. Warning    Max. Warning    Min. Failure    Max. Failure
-------------------- -------------------- ---------- ---------- --------------- --------------- --------------- ---------------
FAN1_SPEED           Normal               17100      RPM        1600            N/A             1200            N/A
FAN2_SPEED           Normal               16000      RPM        1600            N/A             1200            N/A
FAN3_SPEED           Normal               16000      RPM        1600            N/A             1200            N/A
FAN4_SPEED           Normal               17100      RPM        1600            N/A             1200            N/A
FAN5_SPEED           Normal               16000      RPM        1600            N/A             1200            N/A
FAN6_SPEED           Normal               17100      RPM        1600            N/A             1200            N/A
WLC-CIMC /sensor #
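If you capture the CIMC fan table to a text file, a rough Python sketch like the one below can turn it into structured data and flag any fan reading under its minimum warning threshold. It assumes each fan row has exactly eight whitespace-separated fields in the order shown above. The helper names (`parse_fan_rows`, `fans_below_warning`) are hypothetical.

```python
# Fan rows copied from the CIMC output in this post.
SAMPLE = """\
FAN1_SPEED Normal 17100 RPM 1600 N/A 1200 N/A
FAN2_SPEED Normal 16000 RPM 1600 N/A 1200 N/A
FAN3_SPEED Normal 16000 RPM 1600 N/A 1200 N/A
FAN4_SPEED Normal 17100 RPM 1600 N/A 1200 N/A
FAN5_SPEED Normal 16000 RPM 1600 N/A 1200 N/A
FAN6_SPEED Normal 17100 RPM 1600 N/A 1200 N/A
"""


def to_int(value):
    """Convert a threshold to int, treating 'N/A' as no threshold."""
    return None if value == "N/A" else int(value)


def parse_fan_rows(text):
    """Parse fan rows; lines that do not look like fan rows are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 8 or not parts[0].startswith("FAN"):
            continue
        name, status, reading, units, min_w, max_w, min_f, max_f = parts
        rows.append({
            "name": name,
            "status": status,
            "reading": int(reading),
            "units": units,
            "min_warning": to_int(min_w),
            "max_warning": to_int(max_w),
            "min_failure": to_int(min_f),
            "max_failure": to_int(max_f),
        })
    return rows


def fans_below_warning(rows):
    """Return the names of fans reading below their minimum warning value."""
    return [
        row["name"]
        for row in rows
        if row["min_warning"] is not None and row["reading"] < row["min_warning"]
    ]


if __name__ == "__main__":
    rows = parse_fan_rows(SAMPLE)
    for row in rows:
        print(row["name"], row["status"], row["reading"], row["units"])
    print("Below warning threshold:", fans_below_warning(rows))
```

With the readings above, every fan is far above the 1600 RPM warning threshold, which fits the observation that the fans are being driven at full speed.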
You might also see RAID failures which Cisco have previously told us are hardware failures (RMA) but which I suspect might be linked with the same problem because they also clear after reboot:
RAID Volume Status
Drive 0.......................................... Good
Drive 1.......................................... Bad
They're considering re-opening https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvd63113 for it, although they closed the original bug because they couldn't reproduce the problem or work out what triggers it.
@marce1000 there is no fix for it at this point, but a reboot will usually clear it if it's not a genuine hardware failure.
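To confirm whether a reboot really cleared the condition, one option is to count how often the message appears in syslog collected before and after the reboot. The Python sketch below is only an illustration: it matches on the message text quoted at the top of this thread, and the second sample line is a made-up placeholder for unrelated log traffic. How you collect the lines (file, syslog server export, etc.) is up to your environment.

```python
# Substring taken from the message quoted in the original post.
PATTERN = "Check sensor fail: Octeon Temperature"

# Two copies of the real message plus one hypothetical unrelated line.
SAMPLE_LINES = [
    "*OctTempMonitor: env_monitor.c:101 Check sensor fail: Octeon Temperature, errno 0, errstring Success",
    "some unrelated log line (hypothetical placeholder)",
    "*OctTempMonitor: env_monitor.c:101 Check sensor fail: Octeon Temperature, errno 0, errstring Success",
]


def count_sensor_failures(lines):
    """Count log lines that contain the Octeon temperature sensor failure."""
    return sum(1 for line in lines if PATTERN in line)


if __name__ == "__main__":
    print("Sensor failure messages:", count_sensor_failures(SAMPLE_LINES))
```

If the count drops to zero after a reboot and stays there, that supports the software bug theory rather than a hardware fault.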
10-20-2020 09:56 AM
Thank you for this feedback. I'm not well versed in IMM and CIMC; however, I have tried the imm commands you shared and I get no output. In fact, there's a noticeable pause at the CLI.
How can we move forward if there's no workaround or plans to release bug-fixes?
04-23-2021 08:26 PM
Hello, were you able to get to the bottom of this? I haven't seen a change since, and I'm wondering if it could possibly cause some issue down the road.