
IOS-XR 5.1.3 SP2 - %OS-RT_CHECK-3-INCONSISTENCY_DETECTED : ipv4-unicast detected inconsistency with

I'm curious if anyone else has seen this message logged after an upgrade or downgrade to 5.1.3 with SP2:

%OS-RT_CHECK-3-INCONSISTENCY_DETECTED : ipv4-unicast detected inconsistency with 1 entries for scan-id N

We were told by TAC this is a cosmetic issue only and not to worry. However, the engineer inside me wants to know what the router is upset about and how to suppress the log message. I'd also like to ask Cisco to create a bug fix for 5.1.3 to resolve the log message, if it is indeed purely cosmetic.

Thanks!

-ben

10 Replies

Hi Benjamin,

 

Couple of comments from my side:

-) IOS-XR is in general much more chatty than IOS. For us engineers that is nice in most situations, but it can be annoying at times, since cosmetic issues like this one show up more often.

-) The engineer inside me wants a clean OS running without cosmetic bugs, mostly because I like to avoid customer X telling me that "those nasty error messages never appear on vendor Y's equipment..." IMHO the only way to get there is to be decisive with TAC and tell them to open a DDTS for this.

-) If you are already in touch with TAC, I recommend letting them open a DDTS. I don't think someone from Cisco will do that for you because of a posting in this forum.

-) Bugs that are cosmetic in nature, or that have simple workarounds, are generally not subject to SMUs, or at least the chance of getting an SMU is very low.

 

Florian

The DDTS that fixes this is CSCug62553.

There have been SMU requests for this, but they were declined because some other users have complained about the number of SMUs out there, so now everything is under heavy scrutiny. It's never easy :)

Since this is indeed a simple cosmetic issue, there is no need to file a case, DDTS or SMU request per se, but there are some options.

You can use the logging correlator to suppress these messages, or use the clear log message command to (periodically) remove some erroneous messages from the log buffer.

Also wanted to mention that RCC/LCC for XR is rather rigid. It doesn't add much value and was something that was done for one user group back in the early GSR XR days (carried over from the IOS conversion). We are working on something much more comprehensive called XR blackhole detection that can pinpoint precisely where in the system a possible drop or inconsistency may be, and possibly resolve it as well (if it is not a software bug).

With that said, there is no need to use RCC/LCC. If you still like it, use the logging correlator or the clear log option to suppress and/or remove the messages.

cheers!

xander

Thanks for the explanations.

Can you explain how we would implement the workaround and the fix you suggest? How do we obtain and install the fix for DDTS CSCug62553? How do we configure logging correlation?

Thanks again.

-ben

I took a stab at a logging suppression config (I've never done this before).  How does this look?  I haven't had a chance to test.

config
logging suppress rule OS-RT_CHECK-3-INCONSISTENCY_DETECTED
alarm OS RT_CHECK INCONSISTENCY_DETECTED
commit

logging suppress apply rule OS-RT_CHECK-3-INCONSISTENCY_DETECTED
all-of-router
commit

end

show logging correlator rule

Hi Ben, the DDTS is fixed in 5.2.x and later. There is no SMU planned for prior releases.

you can schedule a periodic selective clear of the log buffer via:

RP/0/RSP0/CPU0:A9K-BNG#clear log events delete [option] [field]

the logging correlator, here is a little write up on that (copy/paste from my kb)

 

XR nag Killer

The short version  (i.e. here's the code to make it happen!)

!! enter config mode... 

conf t

!! this removes the correlator so you can edit it...
no logging correlator apply rule kill-annoyances all-of-router

!!! define the rule
logging correlator rule kill-annoyances type nonstateful
  timeout 600000

!!! this is the "root cause" one... make sure you pick something that happens frequently
rootcause PLATFORM ENVMON FAN_FAIL

!!! these are all the NON root cause events. this is what gets squashed along with the root cause.
!!! add things here that you want squashed.
  nonrootcause
  alarm PLATFORM ENVMON FAN_CLEAR
  alarm PLATFORM ENVMON FANTRAY_FAIL
  alarm PLATFORM ENVMON ENV_CONDITION
  alarm PLATFORM ENVMON FANTRAY_CLEAR
 
!!! timeouts are currently maxed at ten minutes... (smu anyone?)
  timeout-rootcause 600000

!!! this re-applies the correlator
 logging correlator apply rule kill-annoyances all-of-router

!!! now commit the thing

commit

!!! done...

On a somewhat related note, if anyone is not already familiar with the
 "logging correlator" function -- it can be used to greatly reduce the
 amount of "noise" generated by all these various little things that are
 broken (like single fan tray systems!)
 
 An example config that I have on my box is as follows:
 

 logging correlator rule fan type nonstateful
  timeout 600000
  rootcause PLATFORM ENVMON FAN_FAIL
  nonrootcause
  alarm PLATFORM ENVMON FAN_CLEAR
  alarm PLATFORM ENVMON FANTRAY_FAIL
  alarm PLATFORM ENVMON ENV_CONDITION
  alarm PLATFORM ENVMON FANTRAY_CLEAR
  !
  timeout-rootcause 600000
!
logging correlator apply rule fan
  all-of-router
!

Which essentially says the following:

1) a message of format "PLATFORM-ENVMON-FAN_FAIL" is a 'root cause' event.
the timeout for root cause events is set to 600000ms (ten minutes), so
no matter how many of these events I see, I will only actually throw a
syslog every ten minutes.

2) underneath this 'root cause' event are a number of 'nonrootcause'
events.  If I see any of these events within the timeout specified (again,
ten minutes) of a 'root cause' I will also suppress these messages -- the
theory here is that I already know the root cause and don't want to clutter
myself up with all the side effects.    In reality we're just hacking the
correlator to get rid of messages, but hey -- it works.  ;-)

3) this particular "correlator rule" is applied to the whole router (you
*can* do all sorts of funky stuff with where you apply it if you want).

4) in real environments the idea is to have lots of different correlators
for different events... but what I do is basically maintain a great big
list of known syslog messages that I don't want to have splattering my
screen, and the correlator gobbles them all up for me.

Limitations:

5) UPDATE: you can now set timeouts up to 7200000 milliseconds (two hours, a LONG time...)

6) the only really annoying part is that you have to unapply the rule
before you can edit it... so the process is "unapply rule, commit, change
rule, apply rule, commit" instead of just "change rule".  But hey, it's
better than nothing.

7) if you want to see the messages that got suppressed/correlated, use the
"show logging correlator buffer all-in-buffer" command -- and sit back and
be amazed at how much console bandwidth you've saved.  ;-)
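
The "unapply rule, commit, change rule, apply rule, commit" sequence from point 6 is easy to get wrong by hand, so here is a minimal Python sketch that just renders the command lines in that order. The command strings are copied from the config above; the function name `correlator_edit_sequence` and the idea of generating the lines with a script are my own assumptions, not anything XR ships with. It only builds text, it does not talk to a router.

```python
from typing import List


def correlator_edit_sequence(rule_name: str, rule_body: List[str]) -> List[str]:
    """Render the unapply / commit / change / apply / commit sequence.

    rule_name and rule_body are supplied by the caller; the command syntax
    is taken from the example config in this post.
    """
    apply_cmd = f"logging correlator apply rule {rule_name} all-of-router"
    return [
        f"no {apply_cmd}",          # unapply the rule so it can be edited
        "commit",
        f"logging correlator rule {rule_name} type nonstateful",
        *[f" {line}" for line in rule_body],   # rule contents, indented
        apply_cmd,                  # re-apply the rule
        "commit",
    ]


if __name__ == "__main__":
    body = [
        "timeout 600000",
        "rootcause PLATFORM ENVMON FAN_FAIL",
        "nonrootcause",
        "alarm PLATFORM ENVMON FAN_CLEAR",
        "timeout-rootcause 600000",
    ]
    print("\n".join(correlator_edit_sequence("fan", body)))
```

Paste the printed lines into config mode after checking them against your own release.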

Hope people find this helpful...

config example courtesy of LJ Wobker.

xander

The bug is fixed in later code. As Xander mentioned, this is cosmetic, so there is no SMU for this issue.

 

As for suppression, the format is like this, taking %ROUTING-LDP-4-DUP_ADDRS as the example message:
logging suppress rule NODUPMESSAGES
alarm ROUTING LDP DUP_ADDRS
!
logging suppress apply rule NODUPMESSAGES
all-of-router
!

 

So in your case the message is %OS-RT_CHECK-3-INCONSISTENCY_DETECTED

Which translates to 'alarm OS RT_CHECK INCONSISTENCY_DETECTED'
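
If you have a list of messages to squash, the translation above can be scripted. This is a rough Python sketch, assuming the message tag always has the form %CATEGORY-GROUP-SEVERITY-CODE as in the two examples in this thread. The function names are made up for illustration, and the output simply mirrors the suppress rule layout shown above.

```python
import re

# Matches the %CATEGORY-GROUP-SEVERITY-CODE tag seen in the examples above.
TAG_RE = re.compile(
    r"%(?P<category>[A-Z0-9_]+)-(?P<group>[A-Z0-9_]+)"
    r"-(?P<severity>[0-7])-(?P<code>[A-Z0-9_]+)"
)


def alarm_line(message: str) -> str:
    """Return 'alarm CATEGORY GROUP CODE' for the first tag found in message."""
    match = TAG_RE.search(message)
    if match is None:
        raise ValueError(f"no %CATEGORY-GROUP-SEVERITY-CODE tag in: {message!r}")
    # The severity digit is dropped; the alarm line uses only the other three fields.
    return f"alarm {match['category']} {match['group']} {match['code']}"


def suppress_rule(rule_name: str, message: str) -> str:
    """Render a suppress rule plus its apply block in the layout shown above."""
    return "\n".join([
        f"logging suppress rule {rule_name}",
        alarm_line(message),
        "!",
        f"logging suppress apply rule {rule_name}",
        "all-of-router",
        "!",
    ])


if __name__ == "__main__":
    print(suppress_rule("NO-RT-CHECK", "%OS-RT_CHECK-3-INCONSISTENCY_DETECTED"))
```

The rule name NO-RT-CHECK is just a placeholder; pick whatever fits your naming convention.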

 

Thanks,

Sam

I've run into the same issue on an ASR 9001 running 5.3.2:

RP/0/RSP0/CPU0:AGG-01#sh install active summ
Tue Dec 29 09:37:57.431 UTC
Default Profile:
SDRs:
Owner
Active Packages:
disk0:asr9k-mini-px-5.3.2
disk0:asr9k-fpd-px-5.3.2
disk0:asr9k-mcast-px-5.3.2
disk0:asr9k-mgbl-px-5.3.2
disk0:asr9k-mpls-px-5.3.2
disk0:asr9k-k9sec-px-5.3.2
disk0:asr9k-px-5.3.2.CSCur86301-1.0.0
disk0:asr9k-px-5.3.2.CSCuv49399-1.0.0
disk0:asr9k-px-5.3.2.CSCuv76327-1.0.0
disk0:asr9k-px-5.3.2.CSCuv83731-1.0.0
disk0:asr9k-px-5.3.2.CSCuv87607-1.0.0
disk0:asr9k-px-5.3.2.CSCuv98171-1.0.0
disk0:asr9k-px-5.3.2.CSCuw01521-1.0.0
disk0:asr9k-px-5.3.2.CSCuw18466-1.0.0
disk0:asr9k-px-5.3.2.CSCuw26855-1.0.0
disk0:asr9k-px-5.3.2.CSCuw28784-1.0.0
disk0:asr9k-px-5.3.2.CSCuw36143-1.0.0
disk0:asr9k-px-5.3.2.CSCuw39764-1.0.0

Here is the log message being generated:


RP/0/RSP0/CPU0:Dec 29 09:34:28.306 UTC: rt_check_mgr[1189]: %OS-RT_CHECK-3-INCONSISTENCY_DETECTED : ipv4-unicast detected inconsistency with 1 entries for scan-id 28

Here is my suppression work-around:

logging suppress rule SUPPRESS-DDTS_CSCug62553
alarm OS RT_CHECK INCONSISTENCY_DETECTED

logging suppress apply rule SUPPRESS-DDTS_CSCug62553
all-of-router

I've run into the same issue on an ASR 9010 running 5.3.3:

RP/0/RSP0/CPU0:ASR9010#sh install active summ
Mon Feb 22 15:01:13.585 UTC
Default Profile:
SDRs:
Owner
Active Packages:
disk0:asr9k-k9sec-px-5.3.3
disk0:asr9k-9000v-nV-px-5.3.3
disk0:asr9k-asr901-nV-px-5.3.3
disk0:asr9k-bng-px-5.3.3
disk0:asr9k-mcast-px-5.3.3
disk0:asr9k-mini-px-5.3.3
disk0:asr9k-mpls-px-5.3.3
disk0:asr9k-px-5.3.3.CSCux95550-1.0.0
disk0:asr9k-px-5.3.3.CSCuy04039-1.0.0
disk0:asr9k-px-5.3.3.CSCuy18293-1.0.0
disk0:asr9k-px-5.3.3.CSCuy18354-1.0.0

Here is the log message being generated:

RP/0/RSP0/CPU0:Feb 22 14:59:20.611 UTC: rt_check_mgr[1184]: %OS-RT_CHECK-3-INCONSISTENCY_DETECTED : ipv4-unicast detected inconsistency with 1 entries for scan-id 7268

ben.wiechman

6.1.2 and still going strong... 


RP/0/RSP0/CPU0:ASR9010#admin show install active summ
Fri Jun 23 22:22:39.066 UTC
Default Profile:
SDRs:
Owner
Active Packages:
disk0:asr9k-9000v-nV-px-6.1.2
disk0:asr9k-doc-px-6.1.2
disk0:asr9k-fpd-px-6.1.2
disk0:asr9k-k9sec-px-6.1.2
disk0:asr9k-mcast-px-6.1.2
disk0:asr9k-mgbl-px-6.1.2
disk0:asr9k-mini-px-6.1.2
disk0:asr9k-mpls-px-6.1.2
disk0:asr9k-services-px-6.1.2
RP/0/RSP0/CPU0:2017 Jun 19 13:09:22.055 UTC: rt_check_mgr[1200]: %OS-RT_CHECK-3-INCONSISTENCY_DETECTED : ipv4-unicast detected inconsistency with 1 entries for scan-id 79089 
RP/0/RSP0/CPU0:2017 Jun 19 13:10:35.318 UTC: rt_check_mgr[1200]: %OS-RT_CHECK-3-INCONSISTENCY_DETECTED : ipv4-unicast detected inconsistency with 1 entries for scan-id 79090
RP/0/RSP0/CPU0:2017 Jun 19 13:11:48.582 UTC: rt_check_mgr[1200]: %OS-RT_CHECK-3-INCONSISTENCY_DETECTED : ipv4-unicast detected inconsistency with 1 entries for scan-id 79091
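
For anyone who wants to track how often this fires, here is a small Python sketch that pulls the scan-id, entry count and time of day out of lines like the ones above and reports the gap between consecutive messages. The regular expression is written against the log format pasted in this thread only, and the names `parse_scans` and `intervals_ms` are hypothetical helpers, so treat it as a starting point rather than a tested tool.

```python
import re
from typing import List, NamedTuple

# Pattern for the rt_check_mgr message as it appears in this thread.
# Only the time of day is captured, since some outputs omit the year.
RT_CHECK_RE = re.compile(
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\.(?P<ms>\d{3}) UTC: "
    r"rt_check_mgr\[\d+\]: %OS-RT_CHECK-3-INCONSISTENCY_DETECTED : "
    r"(?P<table>\S+) detected inconsistency with (?P<entries>\d+) entries "
    r"for scan-id (?P<scan_id>\d+)"
)


class Scan(NamedTuple):
    time_ms: int   # milliseconds since midnight
    table: str     # e.g. "ipv4-unicast"
    entries: int   # number of inconsistent entries reported
    scan_id: int


def parse_scans(lines: List[str]) -> List[Scan]:
    """Extract one Scan per matching log line; other lines are skipped."""
    scans = []
    for line in lines:
        m = RT_CHECK_RE.search(line)
        if m is None:
            continue
        seconds = (int(m["h"]) * 60 + int(m["m"])) * 60 + int(m["s"])
        scans.append(Scan(seconds * 1000 + int(m["ms"]), m["table"],
                          int(m["entries"]), int(m["scan_id"])))
    return scans


def intervals_ms(scans: List[Scan]) -> List[int]:
    """Gaps between consecutive messages; assumes they fall on the same day."""
    return [b.time_ms - a.time_ms for a, b in zip(scans, scans[1:])]
```

Feed it the output of the log buffer split into lines, then look at whether the scan-ids advance by one and how far apart the messages are.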

Hi Ben, it may be an actual inconsistency that is detected, but since it is only 1 entry, it could also be a false positive.

There are a few debugs and show commands available to help identify the precise culprit for this message. In effect we need to get the specifics of this scan-id and see what it was doing and where it found an issue.

Just to note, some false positives were observed in per-CE label assignment recently in XR 6.

Anyway, here is some material that may help identify where it came from.

xander

Debugging Help

Check the state of the processes involved - ipv4_rib, ipv6_rib, mpls_lsd, fib_mgr and rt_check_mgr

Debugs to turn on

debug rcc lib
debug rcc background
debug rcc error
debug rcc manager
debug cef errors level 2 loc <>
debug rib rcc
debug mpls lsd rcc

show commands

sh rcc ipv4/v6 unicast statistics summary
sh lcc ipv4/v6 unicast statistics summary
sh rcc ipv4/v6 unicast statistics scan-id <>
sh lcc ipv4/v6 unicast statistics scan-id <>

sh rcc/lcc ipv4/v6 unicast statistics (this command displays the summary and log together)

Commands to start on demand scan

sh rcc ipv4/v6 unicast all [vrf <>] (check all routes under the vrf)
sh rcc ipv4/v6 unicast <prefix/mask> [vrf <>] (check a specific route under the vrf)

sh lcc ipv4/v6 unicast all (check all labels and TE interface)
sh lcc ipv4/v6 unicast label/tunnel-interface <> (check a specific label/tunnel-interface)

Commands to configure BG scan

conf t
 rcc ipv4/v6 unicast enable
 rcc ipv4/v6 unicast period <>
 lcc ipv4/v6 unicast enable
 lcc ipv4/v6 unicast period <>

The period here is the time taken by the producers in between batch updates given to FIB.

Example:

RP/0/0/CPU0:head(config)#rcc ipv4 unicast ?

  enable  Enable background scan

  period  Period of checks in milliseconds

RP/0/0/CPU0:head(config)#rcc ipv4 unicast period ?

  <500-600000>  Period between buffers in scans in milliseconds

The default period is 15000 ms.
Default route churn is 100 routes/sec.
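
As a sanity check before pasting a background scan config, the range and default from the help output above can be encoded in a few lines of Python. This is only a sketch: the function name is made up, and applying the same 500-600000 ms range to ipv6 is an assumption on my part, since the help text shown is for ipv4.

```python
from typing import List

# Values taken from the CLI help output and the default quoted above.
RCC_PERIOD_MIN_MS = 500
RCC_PERIOD_MAX_MS = 600000
RCC_PERIOD_DEFAULT_MS = 15000


def rcc_background_scan_config(afi: str = "ipv4",
                               period_ms: int = RCC_PERIOD_DEFAULT_MS) -> List[str]:
    """Render the two RCC background scan lines after validating the period.

    Assumption: ipv6 accepts the same period range as the ipv4 help text shows.
    """
    if afi not in ("ipv4", "ipv6"):
        raise ValueError(f"unsupported address family: {afi!r}")
    if not RCC_PERIOD_MIN_MS <= period_ms <= RCC_PERIOD_MAX_MS:
        raise ValueError(
            f"period must be {RCC_PERIOD_MIN_MS}-{RCC_PERIOD_MAX_MS} ms, got {period_ms}"
        )
    return [
        f"rcc {afi} unicast enable",
        f"rcc {afi} unicast period {period_ms}",
    ]


if __name__ == "__main__":
    print("\n".join(rcc_background_scan_config("ipv4", 30000)))
```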

Other handy show commands

show rib tables summary
show route summary
sh mpls label table summary
sh cef summary
sh mpls forwarding

sh mpls traffic-eng/lsd forwarding

sh cef vrf <> [ipv4|ipv6] <prefix/mask> backup location <>

For details of these commands, please refer to the respective module-specific wikis.

To create inconsistency

debug cef ipv4/6 test inconsistency vrf <vrf> <acl> loc <>

This command will drop the route updates based on the configured acl.
One can configure the acl accordingly to drop all route updates or for a specific prefix.

debug cef mpls test inconsistency all loc <>

This command will drop all the label updates in cef.

Adding a new route/label with this command enabled will trigger a missing route/label situation in FIB.
Modifying an existing route/label with this command enabled will trigger a mismatch in the information between RIB|LSD/FIB for the route/label.
Deleting an existing route/label with this command enabled will trigger a stale route/label in FIB.
