
how to read the ucs tech support logs

MOHAN.RAJ1
Level 1

Hi Team

Can you please tell me how to read the UCS tech support logs to identify hardware issues?

2 Accepted Solutions


Kirk J
Cisco Employee

Greetings.

Your question is probably something that is hard to tackle in this format.

Because hardware and firmware change so rapidly, so do the various diagnostic capabilities (and how, when, and where they write to logs). I think it would be nearly impossible to maintain an up-to-date, comprehensive guide to tech support file contents and how to interpret them.

That said, some logs are common to both blade/UCSM and standalone C-Series servers, such as the SEL logs, which generally record major errors such as DIMM failures, hard drive failures, etc.

There are guides for general troubleshooting such as http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/ts/guide/UCSTroubleshooting/UCSTroubleshooting_chapter_0111.html

There are also courses that map to the DCUCT 642-035 exam (http://www.cisco.com/c/en/us/training-events/training-certifications/exams/current-list/dcuct.html) that go into some detail on reading logs, among other things.

Are there particular hardware issues you are trying to look for? Do you have some kind of polling app that scans the files, for which you are trying to set up alerts?

Thanks,

Kirk


Hi

UCS is agnostic to the OS; however, this document

http://www.cisco.com/c/en/us/support/docs/servers-unified-computing/ucs-manager/116349-technote-product-00.html

provides information on how to extract the OS drivers, which should match those found in

http://www.cisco.com/web/techdoc/ucs/interoperability/matrix/matrix.html

Walter.


6 Replies

Hi Kirk

Thanks for providing complete details about tech support logs.

Yesterday one of our B230M3 blades went into a hung state. It is running VMware ESXi 5.5 U2. I reviewed the faults, SEL, and event logs but was unable to find any information.
Finally, I generated tech support logs from UCS Manager, opened sam_techsupportinfo in Notepad, and found inventory details with status for the FIs, chassis, blades, etc.
If possible, can you please briefly explain how to check for driver or firmware incompatibilities from the sam_techsupportinfo files?
Are there any other steps to troubleshoot further?

Thanks in Advance.

 

Hi Kirk and Walter

Wishing you a happy new year 2016.

Thanks again for the wonderful information and the troubleshooting steps.

I will go through the reference documents and follow the information.

Greetings.

We frequently see tickets/requests where the OS may have an old driver that needs to be updated, or some service/process simply crashed/froze in the OS itself, with no actual hardware issues.

For VMware, it is very helpful to have an external syslog server configured and a dump file location defined, as you will frequently get more diagnostic information from the OS. See http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1000328.
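As a rough sketch of the syslog and dump file suggestion above, the following shell functions wrap the `esxcli` commands typically used on ESXi 5.x for this. The log host address is hypothetical (a documentation-range IP), the function names are made up for illustration, and the exact steps for your environment may differ, so treat the linked VMware KB as the authority.

```shell
# Sketch only: point an ESXi host at a remote syslog server and show the
# core dump target. LOGHOST is a hypothetical address; use your own server.
LOGHOST="udp://192.0.2.10:514"

configure_remote_syslog() {
  # $1 is the remote log host URI, e.g. udp://host:514 or tcp://host:514
  esxcli system syslog config set --loghost="$1" &&
  esxcli system syslog reload &&
  # Open the outbound syslog firewall rule so messages can leave the host
  esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true &&
  esxcli network firewall refresh
}

show_dump_target() {
  # Shows the configured and active core dump partition, if any
  esxcli system coredump partition get
}

# Usage, from an SSH session on the ESXi host:
#   configure_remote_syslog "$LOGHOST"
#   show_dump_target
```

Nothing runs until one of the functions is called, so the snippet can be pasted into a session and reviewed first.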

As Walter mentioned, you want to have the correct drivers that match the firmware.

The primary drivers you are normally concerned with for our blade servers are the local RAID controller (LSI), enic, and fnic.

To see the current driver version for each of those, run the following from a PuTTY/SSH session to the ESXi management IP:

# vmkload_mod -s fnic

# vmkload_mod -s enic

# vmkload_mod -s megaraid_sas
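If you want all three versions in one pass, the commands above can be wrapped in a small loop. This is a minimal sketch: the `driver_versions` function name is made up for illustration, and it assumes the `vmkload_mod -s` output contains a line with the word "Version", which may vary between ESXi builds.

```shell
# Sketch only: print the first "Version" line reported for each module.
# Assumes `vmkload_mod -s <module>` prints a line containing "Version".
driver_versions() {
  for mod in "$@"; do
    # Keep the first matching line and strip leading whitespace
    ver=$(vmkload_mod -s "$mod" 2>/dev/null | grep -i 'version' | head -n 1 | sed 's/^[[:space:]]*//')
    echo "$mod: ${ver:-not loaded or not found}"
  done
}

driver_versions fnic enic megaraid_sas
```

The reported versions can then be compared against the interoperability matrix for your UCS firmware release.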

You may want to open a ticket with VMware to have someone check whether anything in the logs points to software or hardware issues. You may also want to open a TAC ticket to confirm you don't have hardware issues.

Thanks,

Kirk

Keny Perez
Level 8

Hello,

As Kirk said, we do not have a cookbook that shows where to look for specific issues (and that is kept up to date all the time), but this post is very helpful if you want to get more familiar with the logs and files you will find in the show tech:

https://supportforums.cisco.com/document/66296/how-read-ucs-b-series-tech-support-files-ucsm-detail

HTH,

-Kenny
