6509 Logs

Hi Team,

Please help me identify the problem behind the events below (6509 switch).

XXXXXXXXX#sh diagnostic events module 5
Diagnostic events (storage for 500 events, 8 events recorded)
Event Type (ET): I - Info, W - Warning, E - Error

Time Stamp         ET [Card] Event Message
------------------ -- ------ --------------------------------------------------
01/27 20:26:56.456 I  [5]    Diagnostics Passed
01/27 20:28:49.992 E  [5]    TestErrorCounterMonitor: ID:42 IN:0 PO:255 RE:200
                              RM:255 DV:1 EG:2 CF:1 TF:1
01/27 20:40:35.192 E  [5]    TestErrorCounterMonitor: ID:42 IN:0 PO:255 RE:200
                              RM:255 DV:65535 EG:2 CF:1 TF:2

1 Reply

InayathUlla Sharieff
Cisco Employee

What is TestErrorCounterMonitor?

TestErrorCounterMonitor is a non-disruptive background health-monitoring process that periodically polls the error counters and interrupt counters of each line card and supervisor module in the system. The event means it has detected that an error counter on the specified module exceeded a threshold. Specific data about that counter is sent in a separate system message, which includes the ASIC and register of the counter and the error count.
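To compare several of these events side by side, it can help to pull the KEY:VALUE pairs out of the log text. The following is only a minimal Python sketch: `parse_counter_event` is a hypothetical helper, not a Cisco tool, and it deliberately does not interpret the two-letter keys, since their meanings are not given in this thread.

```python
import re

# Matches KEY:VALUE pairs such as "ID:42" or "DV:65535".
PAIR = re.compile(r"\b([A-Z]{2}):(\d+)")


def parse_counter_event(text):
    """Return the KEY:VALUE pairs of one TestErrorCounterMonitor event as ints.

    Hypothetical helper: only the text after the test name is scanned, so the
    timestamp is never mistaken for a field. Returns {} if the test name is absent.
    """
    _, _, payload = text.partition("TestErrorCounterMonitor:")
    return {key: int(value) for key, value in PAIR.findall(payload)}


# Second error event from the post, including its wrapped continuation line.
event = (
    "01/27 20:40:35.192 E  [5]    TestErrorCounterMonitor: ID:42 IN:0 PO:255 RE:200\n"
    "                              RM:255 DV:65535 EG:2 CF:1 TF:2"
)
fields = parse_counter_event(event)
print(fields)
```

Parsing both error events this way makes it easy to see which values changed between them (here DV and TF) before decoding the counter.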

These error messages may be generated either by the supervisor itself or on behalf of any of the classic line cards present in the chassis.

How to proceed from here?

In the 'show module' output, check whether any classic line cards are installed. If so, check for CRC errors on the interfaces of those classic line cards. If any classic card is found to be faulty, it has to be replaced. If no CRC errors are found on the interfaces of any classic line card, then the problem is most likely with the supervisor itself.
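The triage above can be written down as a small decision function. This is only a sketch under stated assumptions: `suspect_component` and its input are hypothetical, the CRC totals are assumed to have been collected by hand from the interface counters of each classic line card, and treating any non-zero count as suspicious is a simplification (in practice you would look for counters that keep incrementing).

```python
def suspect_component(classic_card_crc):
    """Name the component to investigate first.

    classic_card_crc is a hypothetical, manually built mapping of
    slot number -> total CRC errors on that classic line card's interfaces.
    An empty mapping means no classic line cards are installed.
    """
    # Assumption: any non-zero CRC total marks the card as suspect.
    faulty = sorted(slot for slot, crc in classic_card_crc.items() if crc > 0)
    if faulty:
        return "classic line card(s) in slot(s) " + ", ".join(map(str, faulty))
    # No classic cards, or none of them show CRC errors.
    return "supervisor"


print(suspect_component({3: 0, 4: 1287}))  # made-up counts for illustration
print(suspect_component({}))
```

The slot numbers and counts in the usage lines are invented for illustration and do not come from the switch in this thread.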

When these error messages are decoded, the result may point to bad packet CRCs.

Example:

"HY_FD_PG_PG_BAD_PKT_CRC The number of packets with bad_pkt_crc found."

Sometimes, if no classic line cards are present, simply reseating the supervisor fixes the issue.

Why do we suspect a fault in the classic line cards rather than the supervisor?

Hyperion on the supervisor is most likely just detecting the errors via the counter HY_FD_PG_PG_BAD_PKT_CRC ("The number of packets with bad_pkt_crc found"). The FD in the register name refers to the old Medusa ASIC, so the errors relate to the old bus. This is why we need to look at the classic line cards.

A few Cisco defects:

These errors can be false positives on certain 67xx cards. Please verify whether you are hitting any of these bugs.

CSCua09073  TestErrorCounterMonitor is not available in 7600 boxes for 6708 LC.

CSCtq73026  6708 port ASIC generates txCRC

CSCtl77057  TestErrorCounterMonitor can generate false positive on 67XX cards

As a last resort, after verifying that there are no issues with the classic line cards or with the supervisor, replacing the chassis is the final solution.

Regards,

Inayath
