
ASR9000/XR NAG killer, how to suppress annoying syslog messages.


If a syslog message appears frequently, it can constantly fill the log buffer and your syslog servers. For example, if you pull the bridge domain MAC tables via Cisco Prime, the RESYNC messages appear constantly.

This guide provides some tips on how to set up logging suppression for these unneeded messages, keeping them out of syslog and the log buffer.

Configuration

!! enter config mode... 
conf t
!! this removes the correlator so you can edit it...
no logging correlator apply rule kill-annoyances all-of-router

!!! define the rule
logging correlator rule kill-annoyances type nonstateful
  timeout 600000
!!! this is the "root cause" one... make sure you pick something that happens frequently
  rootcause PLATFORM ENVMON FAN_FAIL

!!! these are all the NON root cause events. this is what gets squashed along with the root cause.
!!! add things here that you want squashed.
  nonrootcause
  alarm PLATFORM ENVMON FAN_CLEAR
  alarm PLATFORM ENVMON FANTRAY_FAIL
  alarm PLATFORM ENVMON ENV_CONDITION
  alarm PLATFORM ENVMON FANTRAY_CLEAR
 
!!! root cause timeout, again ten minutes (older releases capped timeouts at 600000 ms)
  timeout-rootcause 600000

!!! this re-applies the correlator
logging correlator apply rule kill-annoyances all-of-router
!!! now commit the thing
commit
!!! done...
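
To check that the rule was accepted after the commit, something like the following can be run from exec mode. This is only a sketch: the available keywords and the output format vary by IOS XR release, so confirm them with "?" help on your own system.

```
! list the configured correlator rules
show logging correlator rule all
! show the correlator buffer size and how much of it is in use
show logging correlator info
```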

Explanation and detail

 

  logging correlator rule fan type nonstateful
  timeout 600000
  rootcause PLATFORM ENVMON FAN_FAIL
  nonrootcause
  alarm PLATFORM ENVMON FAN_CLEAR
  alarm PLATFORM ENVMON FANTRAY_FAIL
  alarm PLATFORM ENVMON ENV_CONDITION
  alarm PLATFORM ENVMON FANTRAY_CLEAR
  !
  timeout-rootcause 600000
!
logging correlator apply rule fan
  all-of-router
!


This essentially says the following:


1) a message of format "PLATFORM-ENVMON-FAN_FAIL" is a 'root cause' event.
the timeout for root cause events is set to 600000ms (ten minutes), so
no matter how many of these events I see, I will only actually throw a
syslog every ten minutes.

2) underneath this 'root cause' event are a number of 'nonrootcause'
events.  If I see any of these events within the timeout specified (again,
ten minutes) of a 'root cause' I will also suppress these messages -- the
theory here is that I already know the root cause and don't want to clutter
myself up with all the side effects.   

3) this particular "correlator rule" is applied to the whole router (you
*can* do all sorts of funky stuff with where you apply it if you want).

4) in real environments the idea is to have lots of different correlators
for different events... 
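
As a sketch of point 4, a second, independent rule could sit alongside the fan rule. The rule name "link-flap", the one-minute timeout and the choice of messages are illustrative assumptions, not part of the original configuration. Check the category, group and code of the messages in your own logs before using them.

```
! hypothetical second rule: treat link down/up as the root cause
! and squash the line protocol messages that follow it
logging correlator rule link-flap type nonstateful
  timeout 60000
  rootcause PKT_INFRA LINK UPDOWN
  nonrootcause
  alarm PKT_INFRA LINEPROTO UPDOWN
!
logging correlator apply rule link-flap all-of-router
commit
```

Each rule keeps its own timeout and its own list of non root cause alarms, so unrelated events do not need to share one large rule.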


Limitations:

5) you can now set timeouts up to 7200000 ms (two hours, a LONG time...)

6) the only really annoying part is that you have to unapply the rule
before you can edit it... so the process is "unapply rule, commit, change
rule, apply rule, commit" instead of just "change rule".  


7) if you want to see the messages that got suppressed/correlated, use the
"show logging correlator buffer all-in-buffer" command -- and sit back and
be amazed at how much console bandwidth you've saved.  ;-)

8) Suppressed messages are not sent to the syslog server and are not sent to CTY/VTY but will appear in a separate suppress buffer.
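
The edit workflow from point 6 and the buffer check from point 7 can be sketched as follows, using the kill-annoyances rule from the Configuration section. The new timeout value of 1200000 ms (twenty minutes) is only an example and assumes your release accepts values above ten minutes.

```
conf t
! step 1: unapply the rule and commit
no logging correlator apply rule kill-annoyances all-of-router
commit
! step 2: change the rule (here: raise the timeout to twenty minutes)
logging correlator rule kill-annoyances type nonstateful
  timeout 1200000
!
! step 3: re-apply the rule and commit again
logging correlator apply rule kill-annoyances all-of-router
commit
end
! afterwards, look at what has been correlated/suppressed so far
show logging correlator buffer all-in-buffer
```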

Comments

Thanks for the explanation, it is very clear, was really helpful to me.

 

regards
