I want to rewrite our current monitoring solution so that it gets UCS faults from Intersight rather than polling individual UCSM instances.
The fault summary must have the following format:
DC - Server ChassisId/SlotId - Short fault description
For example:
LON - Server 3/6 - DIMM A0/A1 failed
The description must contain the full error text.
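To make the target format concrete, here is a minimal Python sketch of the alert I want to end up with. The function name `build_alert` and the `summary`/`description` keys are my own placeholders, not part of any Cisco SDK.

```python
def build_alert(dc, chassis_id, slot_id, short_description, full_text):
    """Build one alert in the 'DC - Server ChassisId/SlotId - Short fault description' format.

    All parameter names are placeholders chosen for this sketch.
    """
    summary = f"{dc} - Server {chassis_id}/{slot_id} - {short_description}"
    # The description carries the full, untruncated error text.
    return {"summary": summary, "description": full_text}


alert = build_alert(
    dc="LON",
    chassis_id=3,
    slot_id=6,
    short_description="DIMM A0/A1 failed",
    full_text="(full error text as returned by the fault source)",
)
print(alert["summary"])
```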
I've done this for individual UCSM instances using PowerShell. However, the Intersight PowerShell module, like the Python one, is just a wrapper around the REST API endpoints, so there is no way to pipe objects into each other the way I can in PowerShell against UCSM:
Get-UcsBlade | where SlotId -eq 3/8 | Get-UcsFaults
In general, there were two ways to get faults:
Store Get-Faults output and then group by Dn
Get-UcsBlade | where (Get-Faults)
That is, return all blades whose fault list is non-empty.
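For reference, the first approach (store all faults, then group by Dn) looks roughly like this in Python. It is only a sketch: it assumes UCSM-style DNs of the form sys/chassis-N/blade-M/..., and the sample records, including the fault codes, are invented for illustration.

```python
import re
from collections import defaultdict

# Assumption: fault DNs start with the DN of the blade they belong to,
# e.g. "sys/chassis-3/blade-6/board/memarray-1/mem-1/fault-F0000".
BLADE_DN = re.compile(r"^(sys/chassis-(\d+)/blade-(\d+))(?:/|$)")


def group_faults_by_blade(faults):
    """Group fault records by the blade DN that prefixes their own DN."""
    grouped = defaultdict(list)
    for fault in faults:
        match = BLADE_DN.match(fault.get("Dn", ""))
        if match:  # faults on FIs, chassis, etc. are skipped
            grouped[match.group(1)].append(fault)
    return dict(grouped)


# Invented sample data, not real UCSM output.
sample_faults = [
    {"Dn": "sys/chassis-3/blade-6/board/memarray-1/mem-1/fault-F0000", "Descr": "DIMM A0/A1 failed"},
    {"Dn": "sys/chassis-3/blade-6/fault-F0001", "Descr": "second fault on the same blade"},
    {"Dn": "sys/chassis-1/blade-2/fault-F0001", "Descr": "fault on another blade"},
    {"Dn": "sys/switch-A/fault-F0002", "Descr": "FI fault, not a blade"},
]

grouped = group_faults_by_blade(sample_faults)
for blade_dn, blade_faults in grouped.items():
    print(blade_dn, len(blade_faults))
```

The keys of the result are exactly the blades with a non-empty fault list, so the same structure covers the second approach as well.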
The problem for now is that querying the Intersight fault/Instances endpoint returns faults without any reference to the physical blade they belong to.
Ideally, to minimize the number of queries sent to Intersight, I would retrieve all faults together with their parents/ancestors, i.e. the blades, which should also carry the ServiceProfile information (just the name, if associated) and the datacenter as a tag.
So far I can't find any common fields between the faults and the physical (blade) or network-element (FI) summaries that would let me map one output onto another, even if I had to run 3 separate queries.
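If such a shared key existed, the client-side mapping I have in mind would be a plain join, sketched below in Python on hand-made dictionaries. Every field name here (`AncestorMoId`, `Moid`, `ChassisId`, `SlotId`, `Tags`, `Description`) and the `Datacenter` tag key is an assumption on my part; I have not confirmed that the Intersight responses actually contain them in this shape.

```python
def tag_value(record, tag_key):
    """Return the value of a tag from a list of {'Key': ..., 'Value': ...} dicts, or None."""
    for tag in record.get("Tags", []):
        if tag.get("Key") == tag_key:
            return tag.get("Value")
    return None


def join_faults_to_blades(faults, blades,
                          fault_key="AncestorMoId",  # assumed field on the fault
                          blade_key="Moid",          # assumed field on the blade
                          dc_tag="Datacenter"):      # assumed tag key
    """Map each fault onto its blade and render the required summary line."""
    blades_by_id = {b[blade_key]: b for b in blades if blade_key in b}
    lines = []
    for fault in faults:
        blade = blades_by_id.get(fault.get(fault_key))
        if blade is None:
            continue  # fault does not belong to a known blade
        dc = tag_value(blade, dc_tag) or "UNKNOWN"
        lines.append(
            f"{dc} - Server {blade['ChassisId']}/{blade['SlotId']} - {fault['Description']}"
        )
    return lines


# Hand-made sample records, not real Intersight output.
blades = [
    {"Moid": "b1", "ChassisId": 3, "SlotId": 6,
     "Tags": [{"Key": "Datacenter", "Value": "LON"}]},
]
faults = [
    {"AncestorMoId": "b1", "Description": "DIMM A0/A1 failed"},
    {"AncestorMoId": "unknown", "Description": "fault with no matching blade"},
]

lines = join_faults_to_blades(faults, blades)
print(lines)
```

With this approach two or three bulk queries would be enough, as long as the fault records expose some identifier that also appears on the blade or FI records.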