6509-E - high peak utilisation

paul driver · ‎02-28-2017

All

Can someone confirm what I am looking at here - I cannot see the wood for the trees so to speak!

We have 2 x 6509-E with CPU utilisation peaks occurring on just one chassis (the primary).
What's vexing is that I cannot see this occurring on the specific DFC modules.

We are also getting constant EARL messages pertaining to the DFC modules in slots 6/7/8, which I am led to believe are cosmetic, but could they make the CPU utilization peak or increase?

So my queries would be:

Is it possible that these two issues are related?

How do I correlate the line card and the chassis (sup card) buffer readouts?

dCEF is enabled on all line cards, and at present all traffic is traversing this primary chassis.

res
Paul

Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

Julio E. Moisa · ‎02-28-2017

Hi

Please check this link:

https://quickview.cloudapps.cisco.com/quickview/bug/CSCsz52301

My suggestion is to open a case with Cisco TAC.

>> Mark as helpful or answered if the reply resolved your question; this helps other community members with future queries. <<

paul driver · ‎03-01-2017

Hello

Thanks for the link. This seems a viable suggestion, although I am thinking it must be more of an issue isolated to these two interconnected chassis.

We have 80+ of these 6500s with MSFC3/Sup720 and DFC line cards within the estate. The majority have the same hardware and, I suspect, the same software, and as far as I am aware all are running with no issues.

I cross-referenced a couple of other chassis that have the same modules and the same IOS versions/feature sets, and they're not reporting or logging any problems.

The interfaces show no buffering of packets to support my initial theory of packets getting punted and process switched, which would indeed increase the CPU utilization. It also looks like CEF is doing its job, and there aren't really very high traffic loads traversing the interconnects, whether switched or routed.

The SP and RP CPU processes on both the primary and secondary chassis show high peak utilization, but the average utilization is below 30%. The secondary isn't doing much; it is basically redundant at this time, as all traffic is hitting the primary chassis.
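
As a rough illustration of why a high peak and a low average are not contradictory, here is a minimal Python sketch. The sample values, the function name `summarize_cpu`, and the 80% threshold are all made up for illustration; they are not taken from these chassis.

```python
from statistics import mean

def summarize_cpu(samples, peak_threshold=80):
    """Return the average, the peak, and the fraction of samples at or above the threshold."""
    if not samples:
        raise ValueError("no samples")
    above = sum(1 for s in samples if s >= peak_threshold)
    return {
        "average": mean(samples),
        "peak": max(samples),
        "fraction_at_peak": above / len(samples),
    }

# Hypothetical CPU readings in percent: mostly idle with two short bursts.
samples = [12, 15, 10, 95, 14, 11, 13, 98, 12, 10]
summary = summarize_cpu(samples)
print(summary)
```

Two brief bursts near 100% barely move the average, so the peak figure alone says little about sustained load.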

On the other hand, the Cisco bug document you posted does highlight the same interrupts I have seen in the readouts.

Do you think these constant EARL messages pertaining to a few of the DFC modules could be involved in what I am seeing?

res
Paul


Reza Sharifi · ‎03-01-2017

Hi Paul,

Regarding the error message in the log, it appears that this error is in the SX revision, and you are running SXJ.

BTW, this IOS is over 8 years old now. Also, here is the output from the document I found.

The switch reports this error message:

EARL_L3_ASIC-SP-3-INTR_WARN: EARL L3 ASIC: Non-fatal interrupt [chars]

This example shows the console output that is displayed when this problem occurs:

Apr 20 17:53:38: %EARL_L3_ASIC-SP-3-INTR_WARN: EARL L3 ASIC: 
           Non-fatal interrupt Packet Parser block interrupt
Apr 20 19:13:05: %EARL_L3_ASIC-SP-3-INTR_WARN: EARL L3 ASIC:
           Non-fatal interrupt Packet Parser block interrupt

Description

The error message %EARL_L3_ASIC-SP-3-INTR_WARN indicates that the Enhanced Address Recognition Logic (EARL) Layer 3 (L3) application-specific integrated circuit (ASIC) detected an unexpected non-fatal condition. This indicates that a bad packet, probably a packet which contains a Layer 3 IP checksum error, was received and dropped. The cause of the issue is a device on the network that sends out bad packets. These issues, among others, can cause the bad packets:

Bad NICs
Bad NIC drivers
Bad applications

In older Cisco IOS Software releases, these packets are normally dropped without being logged. The feature of logging error messages about this problem is a feature found in Cisco IOS Software Release 12.2SX and later.

link:

http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/41265-186-ErrormsgIOS-41265.html
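
To see how often these messages appear and which interrupt they report, an exported log can be tallied offline. The following is a minimal Python sketch, not a Cisco tool. It assumes the message format shown in the console output above, and it assumes that the field between the facility and the severity (SP in the example) identifies the reporting processor. The names `PATTERN` and `count_interrupts` are hypothetical.

```python
import re
from collections import Counter

# Matches the message format quoted above; the interrupt description is free text.
# Assumption: the field after the facility name (e.g. "SP") names the reporting processor.
PATTERN = re.compile(
    r"%EARL_L3_ASIC-(?P<source>[A-Z0-9]+)-3-INTR_WARN: EARL L3 ASIC:\s*"
    r"Non-fatal interrupt (?P<interrupt>.+)"
)

def count_interrupts(log_text):
    """Count EARL L3 non-fatal interrupt messages by (source, interrupt description)."""
    # Join wrapped continuation lines (those starting with whitespace) onto the previous line.
    joined = re.sub(r"\s*\n[ \t]+", " ", log_text)
    counts = Counter()
    for line in joined.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[(match["source"], match["interrupt"].strip())] += 1
    return counts

# The two sample lines from the Cisco document quoted above.
sample_log = """\
Apr 20 17:53:38: %EARL_L3_ASIC-SP-3-INTR_WARN: EARL L3 ASIC:
           Non-fatal interrupt Packet Parser block interrupt
Apr 20 19:13:05: %EARL_L3_ASIC-SP-3-INTR_WARN: EARL L3 ASIC:
           Non-fatal interrupt Packet Parser block interrupt
"""

for key, count in count_interrupts(sample_log).items():
    print(key, count)
```

Comparing the timestamps of the counted messages with the times of the CPU peaks would be one way to check whether the two are related.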

As for high CPU, looking at the attachment, it appears that module 6 has the highest average CPU.

HTH