Over the last few days I've been frantically trying to find a list of all the events, alerts, and alarms that can be triggered by the UCS infrastructure (5108 chassis, half-size blades, IO modules, 6120 fabric interconnects, etc.).
So far to no avail, as this information doesn't seem to be readily available on the Cisco website.
What I'm really looking for is a list of the SNMP and syslog events that will be triggered. A categorization of their criticality levels would be a massive bonus.
Ideally something like this: http://www.veeam.com/nworks/overview/data/collectorEvents.html
Thanks in advance for your time.
While not exactly a detailed list, the UCSM GUI Configuration Guide lists the classes of items that will be alerted on: http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/gui/config/guide/b_GUI_Config_Guide.html
Look at the last chapter on statistics.
Thank you for your reply.
Unfortunately, my requirements are more in-depth: I need to know which SNMP traps will be sent (the 6120s use the Nexus 5000 MIBs) and which syslog events will be triggered.
As a result, I was looking for something more along the lines of http://www-europe.cisco.com/en/US/docs/switches/datacenter/sw/system_messages/reference/sl_nxos_book.html
in terms of the level of detail needed.
Unfortunately, I've chased this up with Cisco SEs, and UCS Manager does not currently send SNMP traps for the UCS chassis (fans, power), though this is expected in an upcoming version due for release in June.
As a result, I need to look at the syslog events generated by the UCS Manager application and write the regexes that will be used to raise events for the important ones. Unfortunately, these messages are not documented either, and that documentation is also expected around the time of the next release.
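For anyone taking the same route, the regex approach could be sketched roughly as below in Python. This is only a minimal sketch under stated assumptions: the rule names, the patterns, and the `classify` helper are hypothetical placeholders, not documented UCS Manager message formats, and would need to be replaced with patterns built from messages actually captured from the system. The only standard part is the decoding of the leading `<PRI>` value, where the severity is the PRI value modulo 8.

```python
import re
from typing import NamedTuple, Optional

# Standard syslog severity names, indexed by severity number (0-7).
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]

# ASSUMPTION: placeholder rules for illustration only. These are NOT
# documented UCS Manager messages; replace them with patterns derived
# from real captured syslog output.
RULES = [
    ("psu_power_off",
     re.compile(r"power supply (?P<psu>\d+) in chassis (?P<chassis>\d+) power: off", re.I)),
    ("fan_problem",
     re.compile(r"\bfan\b.*\b(?:failed|inoperable|degraded)\b", re.I)),
]

# A syslog line on the wire normally starts with "<PRI>".
PRI_RE = re.compile(r"^<(\d{1,3})>")


class Event(NamedTuple):
    rule: str       # name of the rule that matched
    severity: str   # decoded syslog severity, or "unknown"
    fields: dict    # named groups captured by the pattern


def classify(line: str) -> Optional[Event]:
    """Return an Event for the first matching rule, or None if nothing matches."""
    severity = "unknown"
    body = line
    m = PRI_RE.match(line)
    if m:
        # PRI = facility * 8 + severity, so severity is PRI modulo 8.
        severity = SEVERITIES[int(m.group(1)) % 8]
        body = line[m.end():]
    for name, pattern in RULES:
        hit = pattern.search(body)
        if hit:
            return Event(name, severity, hit.groupdict())
    return None
```

A line such as `<187>Apr 15 10:48:00 ucs-a Power supply 2 in chassis 1 power: off` (a made-up example, with a hypothetical host name) would be classified as `psu_power_off` with severity `err`; anything that matches no rule returns `None` and can be ignored or logged for later review.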
It would be great to know how you currently handle monitoring and alerting for the compute infrastructure. I understand that the UCS Manager application will display these events, but automatic ticket generation and event correlation seem unachievable at the moment.
We don't currently monitor any UCS components since we haven't purchased any of them. My shop is currently evaluating blade systems, so we have immersed ourselves fairly well in all the documentation. Good to know that the chassis doesn't alarm just yet.
I'm using the Microsoft SCOM management pack for monitoring, which gets its information over HTTP. I couldn't find a way to export all the different alerts it produces, but there are plenty of them, covering hardware failures, down links, discoveries, etc., and even errors at the VMware level if you have Microsoft SCVMM. Might be useful for you.
Here is a sample alert.
|Date and Time:||4/15/2010 10:48:00 AM|
|Property Name||Property Value|
|Description||Power supply 2 in chassis 1 power: off|
Thank you for your reply.
Are you currently capturing the syslog messages generated by your platform?
In the environment I'm working in, I don't have any capability other than syslog or SNMP. I've been told that syslog will be more useful and has coverage beyond the 6120s, but I still need to validate that.
If anyone has any useful syslog messages, that would be great. Really looking for things like:
Examples would be greatly appreciated.
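In case it helps anyone collect their own samples in the meantime, here is a minimal sketch of a UDP syslog collector in Python that appends each raw message to a file. The port (5514), the file name, and the `format_record` and `serve` helpers are assumptions made for illustration; standard syslog uses UDP port 514, which usually needs elevated privileges to bind, and the UCS side would have to be configured to send to whatever address and port is chosen.

```python
import socketserver
from datetime import datetime, timezone


def format_record(data: bytes, sender_ip: str) -> str:
    """Turn one raw datagram into a single log line, leaving the message text as sent."""
    text = data.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp} {sender_ip} {text}"


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        line = format_record(data, self.client_address[0])
        with open(self.server.capture_path, "a", encoding="utf-8") as fh:
            fh.write(line + "\n")


def serve(host="0.0.0.0", port=5514, capture_path="ucs_syslog_samples.log"):
    """Listen until interrupted, appending every received message to capture_path."""
    # ASSUMPTION: port and file name are arbitrary illustrative defaults.
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.capture_path = capture_path
        server.serve_forever()
```

Calling `serve()` and leaving it running while faults are deliberately triggered (pulling a power supply, for example) should give a file of real messages to build patterns from.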
We actually ended up pulling the plug on SCOM. It was hitting the management port so often that it was freezing the HTTP port and requiring reboots of the interconnect.