07-13-2007 02:06 AM
I'm seeing events from two of my switches which show critical alarms for high error rates on a few ports, but investigation of the ports shows nothing. No errors via "show int", no errors via "show counters int" or "show int counters errors". Nothing appears to be wrong, but CiscoWorks opened and closed events for one of these interfaces 17 times overnight.
I've tried removing the device from DFM and rediscovering it, but it still happens, so I don't think it is corruption. When looking at the events, I notice they show high values for InputPacketErrorRate (69000+ pps on a tengig interface). I don't see any errors on the interface on the other side of the link that match the timestamps, but the opposing interfaces log the same kinds of InputPacketErrorRate events at other times.
I'm having trouble getting our support provider to believe there may be an issue as they only trust in the CLI.
Does anyone know what this counter in DFM is showing? I'd like to hook it up to a graph in OpenView to prove it isn't CiscoWorks making it up.
07-13-2007 09:55 AM
It depends on the type of device, but objects you should be looking at are ifInErrors, locIfInRunts, locIfInGiants, locIfInCRC, locIfInFrame, locIfInIgnored, locIfInAbort, as well as the following for ethernet interfaces:
dot3StatsFCSErrors
dot3StatsInternalMacReceiveErrors
dot3StatsAlignmentErrors
dot3StatsFrameTooLongs
The default DFM polling interval is 240 seconds, so look to see if they are increasing over that polling interval. DFM calculates a HighErrorRate event using:
HighErrorRate =
    CurrentUtilization > MinimumUtilization &&
    (InputPacketErrorPct > ErrorThreshold ||
     OutputPacketErrorPct > ErrorThreshold);
Where the PacketErrorPcts are:
PacketErrorPct =
    (PacketErrorRate / PacketRate) * 100;
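To check the numbers by hand, the formula above can be sketched in Python. This is only an illustration, not DFM's code: the function names, the 32-bit wrap handling, and the choice to return 0 percent when there is no traffic are all assumptions. The rate helper divides the change in an SNMP counter by the 240-second polling interval mentioned above.

```python
def rate_from_counters(prev, curr, interval_s=240, counter_bits=32):
    """Per-second rate from two counter samples, allowing for one counter wrap.

    Assumption: the counter is an unsigned 32-bit SNMP Counter32.
    """
    delta = (curr - prev) % (2 ** counter_bits)
    return delta / interval_s


def packet_error_pct(error_rate, packet_rate):
    """PacketErrorPct = (PacketErrorRate / PacketRate) * 100.

    Assumption: report 0.0 when there is no traffic, to avoid dividing by zero.
    """
    if packet_rate <= 0:
        return 0.0
    return error_rate / packet_rate * 100.0


def high_error_rate(current_util, min_util, in_err_pct, out_err_pct, err_threshold):
    """True when utilization is above the minimum and either error percentage exceeds the threshold."""
    return current_util > min_util and (
        in_err_pct > err_threshold or out_err_pct > err_threshold
    )


# A counter that jumps from 0 to 16777216 within one 240-second poll
jump_rate = rate_from_counters(0, 16777216)
print(round(jump_rate))  # 69905
```

A single jump of that size in one polling cycle works out to roughly 69905 pps, which is in the same range as the 69000+ pps reported earlier in this thread, although nothing here confirms that this is how DFM arrived at its figure.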
08-15-2007 12:11 PM
Hi,
I've got the same issue with tengig interfaces. The CLI doesn't report any errors, but DFM reports input errors. However, this only happens if the tengig interfaces are grouped as an EtherChannel.
My guess is that tengig interfaces are not correctly supported by DFM, because there is no dedicated interface group for them. Thresholds for interface utilization and error rate are taken from the 10/100 interface group. I'm using LMS 2.6.
Does it work better with LMS 3.0?
08-15-2007 12:24 PM
Which ASL file contains the above formula for calculating the PacketErrorPct?
Thanks!
08-15-2007 12:27 PM
None. This is part of the binary code. This algorithm has not changed in LMS 3.0. You will have to create (or modify) a custom group to handle the polling and thresholds for these TenGig etherchannels.
08-15-2007 10:28 PM
My issue turned out to be an IOS bug. My workaround is to disable DFM on the affected interfaces until an upgrade is completed. Changing performance counters wouldn't help, because it was reporting error rates like 9.49996868x10^9 pps, and I wanted to keep monitoring the interfaces that weren't displaying this error.
TAC sent me this:
CSCsb85024
WS-X6148A-GE-TX corrupt counter dot3StatsInternalMacReceiveErrors
which is a switching-related issue, not a CiscoWorks DFM issue.
You can check it on the following URL:
http://www.cisco.com/cgi-bin/Support/Bugtool/home.pl
As you can see in the error message attached to your problem description, you have IntMacRx-Err 16777216:
Port   SQETest-Err  Deferred-Tx  IntMacTx-Err  IntMacRx-Err  Symbol-Err
Gi6/1            0            0             0      16777216           0
You also have this WS-X6148A-GE-TX module in both of the show techs.
CSCsb85024
WS-X6148A-GE-TX corrupt counter dot3StatsInternalMacReceiveErrors
*************************************
Release Notes
When sending streams at full line rate 100M / Full Duplex, the 6148A module
would produce port counter errors on other ports other than those that were
connected to the IXIA traffic generator.
The only port counter error that would be valued is the
dot3StatsInternalMacReceiveErrors
IXIA streams were configured for a L3 subnet broadcast, or a L3 all FF's
broadcast.
Problem is easily reproducible, and appears to be cosmetic. No
system impact was noticed.
The problem was seen using a Supervisor2 and a pre-release of CatOS 8.5.1 8.5(0.171)JAC
Port   Align-Err    FCS-Err   Xmit-Err    Rcv-Err UnderSize
----- ---------- ---------- ---------- ---------- ---------
8/1            0          0          0   16777216         0
8/2            0          0          0   16777216         0
8/3            0          0          0          0         0
8/4            0          0          0 4278190080         0
8/5            0          0          0 4278190080         0
8/6            0          0          0 4278190080         0
8/7            0          0          0 4278190080         0
8/8            0          0          0 4278190080         0
8/9            0          0          0 4278190080         0
8/10           0          0          0 4278190080         0
8/11           0          0          0 4278190080         0
8/12           0          0          0 4278190080         0
8/13           0          0          0 4278190080         0
8/14           0          0          0 4278190080         0
8/15           0          0          0 4278190080         0
8/16           0          0          0 4278190080         0
8/17           0          0          0 4278190080         0
8/18           0          0          0          0         0
8/19           0          0          0          0         0
8/20           0          0          0          0         0
8/21           0          0          0          0         0
8/22           0          0          0          0         0
8/23           0          0          0          0         0
8/24           0          0          0          0         0
8/25           0          0          0          0         0
8/26           0          0          0          0         0
8/27           0          0          0          0         0
8/28           0          0          0          0         0
8/29           0          0          0          0         0
8/30           0          0          0          0         0
8/31           0          0          0          0         0
8/32           0          0          0          0         0
8/33           0          0          0          0         0
8/34           0          0          0          0         0
8/35           0          0          0          0         0
8/36           0          0          0          0         0
8/37           0          0          0          0         0
8/38           0          0          0          0         0
8/39           0          0          0          0         0
8/40           0          0          0          0         0
8/41           0          0          0          0         0
8/42           0          0          0          0         0
8/43           0          0          0          0         0
8/44           0          0          0          0         0
8/45           0          0          0          0         0
8/46           0          0          0          0         0
8/47           0          0          0          0         0
8/48           0          0          0          0         0
******************************************************
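The two non-zero values in the table above look less random when written in hexadecimal: each has only its top byte set. The short Python sketch below shows this. The helper names are hypothetical, and treating "only the top byte is set" as a sign of a corrupt counter is an assumption made for illustration; the bug report itself does not describe such a test.

```python
def as_hex32(value):
    """Format a counter value as an 8-digit, 32-bit hexadecimal string."""
    return f"0x{value & 0xFFFFFFFF:08X}"


def only_top_byte_set(value):
    """Heuristic (an assumption): non-zero value whose low 24 bits are all zero."""
    return value != 0 and (value & 0x00FFFFFF) == 0


for counter in (16777216, 4278190080):
    print(counter, as_hex32(counter), only_top_byte_set(counter))
# 16777216 0x01000000 True
# 4278190080 0xFF000000 True
```

A genuine error counter would be expected to move through arbitrary values as it increments, so a reading of exactly 2^24, or of 0xFF000000, is worth treating with suspicion before blaming the link.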
As you see this is fixed in IOS 12.2(18)SXF05, but you have only Version 12.2(18)SXF3 on both affected switches.
Please visit the following link and download the necessary IOS (SXF10 is the latest):
12.2.18-SXF10 (ED)
12.2.18-SXF9 (ED)
12.2.18-SXF8 (ED)
12.2.18-SXF7 (ED)
12.2.18-SXF6 (ED)
12.2.18-SXF5 (ED)
08-15-2007 10:29 PM
You could run the show interfaces counters errors command to see whether IntMacRx-Err is the counter going out of control on your switch.
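If there are many ports to check, captured output can be scanned with a small script. The Python sketch below assumes the output has the same shape as the Gi6/1 excerpt quoted earlier in this thread: a header line starting with "Port", followed by one row per port. The function name and the sample text are illustrative only, and real output may contain several tables with different headers.

```python
def counter_by_port(output, column="IntMacRx-Err"):
    """Return {port: value} for one column of 'show interfaces counters errors' output.

    Assumption: each table starts with a header line whose first field is 'Port',
    and data rows have the same number of whitespace-separated fields as the header.
    """
    results = {}
    headers = None
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Port":
            headers = fields  # a new table section begins here
            continue
        if headers and column in headers and len(fields) == len(headers):
            value = fields[headers.index(column)]
            if value.isdigit():
                results[fields[0]] = int(value)
    return results


sample = """\
Port SQETest-Err Deferred-Tx IntMacTx-Err IntMacRx-Err Symbol-Err
Gi6/1 0 0 0 16777216 0
Gi6/2 0 0 0 0 0
"""

# Keep only the ports whose counter is non-zero
suspect = {port: v for port, v in counter_by_port(sample).items() if v > 0}
print(suspect)  # {'Gi6/1': 16777216}
```

Running this against two captures taken a few minutes apart would show whether the counter is actually incrementing or is stuck at an implausible value.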