on 05-27-2014 10:41 PM
We are pleased to release Cisco UCS Manager Plugin 0.9.4 for Nagios
Nagios is an open source computer system monitoring, network monitoring and infrastructure monitoring software application. Nagios offers monitoring and alerting services for servers, switches, applications, and services.
The solution provides end users with two components.
Supported Nagios Versions:
This plugin is supported on Nagios Core version 3.2 and higher versions.
Supported Cisco UCS Manager Releases:
This plugin is supported on Cisco UCS Manager Releases 2.1, 2.2 and 3.0.
Software Requirements for Release 0.9(4):
This release of the plugin requires Python SDK 0.8.3 to communicate with UCS Manager. Click here to download Python SDK Release 0.8(3) for UCS Manager.
New Features in Release 0.9(4):
Plugin Enhancements:
Auto-Discovery Add-On Enhancements:
Important Note regarding Backward Compatibility:
If you are using the add-on and upgrading from an older release of the plugin to release 0.9.4, you must re-discover all the domains using the new add-on to create new service definition files, because the 'onlyFaults' flag in the 'cisco_ucs_nagios' script has been changed to 'faultDetails'.
Upgrading from Release 0.9.2 to Release 0.9.3 or above:
New Features in Release 0.9(3):
New Features in Release 0.9(2):
Refer to the attached user guide for installation/upgrade and usage details.
For any queries/feedback on Cisco UCSM Plugin for Nagios 3.x, please add a discussion to the Cisco Developed Integrations sub-space on Cisco UCS Communities.
Louis,
This appears to be a standalone rack server. There are two issues. First, the plugin here is for UCSM-managed servers; for standalone rack servers, you need the following:
Cisco IMC Plugin for Nagios (version 0.9.2)
Second, even with that plugin, the minimum firmware it works with is 1.5.x.
--Arun S
Is there any way to filter out alerts such as this?
WARNING - sys/chassis-1/blade-4/board/memarray-1/mem-1-DIMM A0 on server 1/4 has an invalid FRU
==== Fault # 1 ====
Dn : sys/chassis-1/blade-4/board/memarray-1/mem-1/fault-F0502
Descr : DIMM A0 on server 1/4 has an invalid FRU
severity : warning
Cause : identity-unestablishable
Type : equipment
Created : 2013-11-01T16:56:53.872
I need to do this temporarily until we can update the inventory catalog.
--samd
Hi Sam,
You need to update the variable SKIP_FAULT_LIST in the configuration file "cisco_ucs_nagios.cfg". You will find this file in the same location where your Cisco UCS plugin is installed.
In your case, you can add the 'Code' attribute to this list as Code:F0502, like this:
SKIP_FAULT_LIST=Lc:suppressed,Type:fsm,Severity:info,Severity:condition,Code:F0502
This will filter out all faults with fault code F0502.
This is also covered in section 6.3 of the user guide, under the heading "Skipping Faults".
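To illustrate how an Attribute:Value skip list of this shape can be applied, here is a minimal Python sketch. It is not the plugin's actual code: the function names, the dictionary representation of a fault, the case-insensitive matching, and the fault code F9999 are all assumptions made for the example.

```python
def parse_skip_list(value):
    """Parse 'Attr:Value,Attr:Value' into a set of lowercased (attr, value) pairs."""
    pairs = set()
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        attr, _, val = item.partition(":")
        pairs.add((attr.strip().lower(), val.strip().lower()))
    return pairs


def should_skip(fault, skip_pairs):
    """Return True if any attribute/value of the fault matches a skip entry."""
    return any(
        (str(attr).lower(), str(val).lower()) in skip_pairs
        for attr, val in fault.items()
    )


# The SKIP_FAULT_LIST value suggested in this reply.
skip_pairs = parse_skip_list(
    "Lc:suppressed,Type:fsm,Severity:info,Severity:condition,Code:F0502"
)

# Hypothetical faults, modelled as plain dictionaries for illustration only.
faults = [
    {"Code": "F0502", "Severity": "warning", "Type": "equipment"},
    {"Code": "F9999", "Severity": "major", "Type": "equipment"},  # made-up code
]

remaining = [f for f in faults if not should_skip(f, skip_pairs)]
print(remaining)
```

Under these assumptions, only the fault whose Code is F0502 is dropped; the other one is still reported.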
Regards
- Prateek
Thanks that fixed it! --samd
Hi,
I get this error running the script.
It is with
python 2.7
Nagios core 3.2.3
Redhat 5.5
#######
Traceback (most recent call last):
File "./installer.py", line 467, in <module>
extracted_name = name_list[0] + "_" + name_list[1] + "_" + name_list[2]
IndexError: list index out of range
[root@opsviewprim10 dave]#
##########
thanks
Hello Ian,
It seems to be an issue with the directory from which you are running the installer script.
Can you please provide a listing of the directory from which you are executing the script?
The directory should look similar to the output below.
[root@nagios-centos cisco-ucs-nagios-0.9.3]# ls -lrt
total 92
-rwxr-xr-x 1 root root 32379 Jan 29 15:01 installer.py
-rw-r--r-- 1 root root 3456 Jan 29 15:01 INSTALL
-rw-r--r-- 1 root root 49473 Jan 29 15:01 cisco-ucs-nagios-0.9.3.tar.gz
The "installer.py" script must be executed from within this directory, not from outside it. Also, the name of the tar.gz file in the directory must not be changed.
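A quick way to verify the layout before running the installer is a small pre-flight check such as the Python sketch below. It only encodes the two conditions described above (installer.py present, and a tarball that still follows the cisco-ucs-nagios-<version>.tar.gz naming). The function names are hypothetical, and the idea that the IndexError comes from the installer splitting the tarball name is an assumption, not something confirmed from the installer source.

```python
import glob
import os


def find_plugin_tarballs(directory="."):
    """Return tarballs in `directory` that follow the shipped naming convention."""
    pattern = os.path.join(directory, "cisco-ucs-nagios-*.tar.gz")
    return sorted(glob.glob(pattern))


def preflight(directory="."):
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    if not os.path.isfile(os.path.join(directory, "installer.py")):
        problems.append("installer.py not found in %s" % os.path.abspath(directory))
    if not find_plugin_tarballs(directory):
        problems.append("no cisco-ucs-nagios-*.tar.gz found (renamed or missing?)")
    return problems


# Check the current working directory, i.e. where installer.py would be run from.
for issue in preflight(os.getcwd()):
    print("PROBLEM: " + issue)
```

If this prints nothing when run from the directory containing installer.py, the directory matches the expected layout.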
thanks
Thanks for this plugin, it is working quite well. A couple of questions though ...
Is there a way to query multiple classes at a time? I'd like to set up a service such as ucs_Servers - it doesn't matter to us if the server is Blade or Rack and we don't want to monitor every single server as an independent unit. I've tried many things but haven't come up with a way to query both in a single plugin call. For example this would be nice:
cisco_ucs_nagios -u user -p pass -H ucs-host -t class -q ComputeBlade,ComputeRackUnit --inHierarchical --onlyFaults
Above doesn't work of course. I have to set up 1 service for each type of server (ucs_Blades and ucs_Rack).
Also, in the plugin output there seems to almost always be many lines of inventory-type information, for example:
sys/rack-unit-14:OK - Model : UCSC-C240-M3L,Name : xxx
When there are over 30 servers in a UCS domain this becomes a bit verbose (especially when trying to pick out the single CRITICAL that might be mixed in the middle). Is there any way to suppress output other than an overall OK, or the specific WARNINGs and CRITICALs?
Thanks,
Fraser
Hello Fraser,
Thanks for using the plugin and providing the valuable suggestions.
Query 1 : Using multiple classes at a time?
The plugin was designed to work on one class at a time, so this is not possible in the current plugin, and a separate service must be created for each class.
We will look at use cases where querying multiple classes at once would be useful.
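Until then, one possible workaround is a thin wrapper that calls the plugin once per class and reports the worst result as a single service. The Python sketch below is only an illustration: the wrapper and its function names are hypothetical, it assumes the plugin follows the standard Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and ranking UNKNOWN above WARNING is a design choice rather than a rule. The demo uses a fake runner so that it runs without a UCS domain.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}
# "Worst wins" ordering, least to most severe (a design choice).
SEVERITY_ORDER = [OK, WARNING, UNKNOWN, CRITICAL]


def rank(code):
    """Map an exit code to its severity rank; unexpected codes count as UNKNOWN."""
    return SEVERITY_ORDER.index(code if code in SEVERITY_ORDER else UNKNOWN)


def run_plugin(class_name, base_args):
    """Run the plugin for one class; return (exit_code, output_text)."""
    cmd = list(base_args) + ["-t", "class", "-q", class_name]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return proc.returncode, out.decode("utf-8", "replace")


def check_classes(class_names, runner):
    """Call runner(name) -> (exit_code, text) per class; return combined (code, text)."""
    worst = OK
    lines = []
    for name in class_names:
        code, text = runner(name)
        if code not in LABELS:
            code = UNKNOWN
        if rank(code) > rank(worst):
            worst = code
        lines.append("[%s] %s: %s" % (LABELS[code], name, text.strip()))
    summary = "%s - checked %s" % (LABELS[worst], ", ".join(class_names))
    return worst, "\n".join([summary] + lines)


def fake_runner(name):
    """Stand-in for run_plugin with canned, made-up results."""
    canned = {
        "ComputeBlade": (OK, "all blades OK"),
        "ComputeRackUnit": (CRITICAL, "sys/rack-unit-14 has a fault"),
    }
    return canned[name]


code, text = check_classes(["ComputeBlade", "ComputeRackUnit"], fake_runner)
print(text)
```

In real use, the runner would wrap run_plugin with the same arguments as a normal service definition (for example cisco_ucs_nagios, -u, -p, -H, --inHierarchical and --onlyFaults, as in the command quoted earlier in this thread), and the wrapper would exit with the combined code.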
Query 2: Is there any way to suppress output other than an overall OK, or the specific WARNINGs and CRITICALs?
The plugin prints inventory information when there are no faults on the object. This inventory information is controlled by the "Inv_<class_name>" entry in the plugin CFG file "cisco_ucs_nagios.cfg". If there is no such entry for a class, all attributes of that class are displayed.
If you don't want lengthy inventory information, you can limit the attributes by adding an entry for the class to the configuration file with a minimal set of attributes.
To skip specific WARNING and CRITICAL faults, we have provided "SKIP_FAULT_LIST" in the plugin CFG file.
You need to specify an attribute of the "FaultInst" class in the format below:
SKIP_FAULT_LIST=<Class_Attribute>:<Attribute_Value>
SKIP_FAULT_LIST=Lc:suppressed,Type:fsm,Severity:info,Severity:condition
Regarding your point that WARNING and CRITICAL entries get mixed in among the "OK" entries when a class has many objects: we will take it up and look at how to provide a better view.
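In the meantime, the plugin output could be post-processed to hide the OK lines. The Python sketch below is a rough illustration, not part of the plugin: it assumes each object is reported on its own line in the "<dn>:OK - ..." form quoted earlier in this thread, and the CRITICAL line in the sample is made up to show the effect.

```python
import re

# Matches lines such as "sys/rack-unit-14:OK - Model : ..." (assumed format).
OK_LINE = re.compile(r"^\S+:OK - ")


def drop_ok_lines(plugin_output):
    """Remove per-object OK lines and append a count of how many were hidden."""
    kept = []
    ok_count = 0
    for line in plugin_output.splitlines():
        if OK_LINE.match(line):
            ok_count += 1
        elif line.strip():
            kept.append(line)
    kept.append("(%d OK object(s) not shown)" % ok_count)
    return "\n".join(kept)


sample = "\n".join([
    "sys/rack-unit-13:OK - Model : UCSC-C240-M3L,Name : xxx",
    "sys/rack-unit-14:CRITICAL - hypothetical fault text",
    "sys/rack-unit-15:OK - Model : UCSC-C240-M3L,Name : xxx",
])
print(drop_ok_lines(sample))
```

Anything that does not look like an OK line is kept, so unexpected output (errors, tracebacks) is never hidden by the filter.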
Regards,
Prateek
Thanks Prateek.
I think another option to accomplish some of what I want would be if the plugin accepted alternate config files. I could then query everything but use CLASS_FILTER_LIST in a check-specific config file to control what specific hardware my check returns (i.e. ComputeBlade,ComputeRackUnit for "servers" check).
Regards,
Fraser
Hello,
I am hoping that someone can help me with this.
I am using the plugin in Icinga, but I got this output:
Error is of Type : URLError
Message >> <urlopen error [Errno 13] Permission denied>
Error while trying to run the UCS Nagios monitoring service. Check for Nagios logs as it may help finding error details.
Exception:
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/nagios/plugins/cisco_ucs_nagios", line 1221, in
args_object.proxy)
File "/usr/lib/python2.7/site-packages/UcsSdk/UcsHandle.py", line 362, in Login
response = self.AaaLogin(username, password, dumpXml)
File "/usr/lib/python2.7/site-packages/UcsSdk/UcsHandle.py", line 2373, in AaaLogin
response = self.XmlQuery(method, WriteXmlOption.Dirty, dumpXml)
File "/usr/lib/python2.7/site-packages/UcsSdk/UcsHandle.py", line 223, in XmlQuery
f = opener.open(req)
File "/usr/lib64/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open
raise URLError(err)
URLError:
I used v8.4 of the UCS Python SDK, hoping it would still work.
Thank you in advance!