10-10-2011 03:59 AM - edited 03-07-2019 02:42 AM
I tried the newest IOS releases, 15.1(3)S0a and 12.2(33)SRE.
In both cases I sometimes get lower traffic on the interface and very high CPU, up to 100%.
After "clear cef linecard", traffic grows again and CPU drops to 0%.
#sh proc cpu s
CPU utilization for five seconds: 87%/83%; one minute: 91%; five minutes: 96%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
7 1711512 87883 19474 4.06% 0.84% 0.89% 0 Check heaps
245 5688 10115351 0 0.16% 0.12% 0.13% 0 Ethernet Msec Ti
210 2276 2538856 0 0.08% 0.02% 0.01% 0 IP ARP Retry Age
211 44772 267882 167 0.08% 0.04% 0.05% 0 IP Input
244 428 326057 1 0.08% 0.00% 0.00% 0 Ethernet Timer C
#clear cef linecard
#sh proc cpu s
CPU utilization for five seconds: 0%/0%; one minute: 3%; five minutes: 3%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
157 36588 1511 24214 0.23% 0.02% 0.00% 0 Per-minute Jobs
346 110540 43940 2515 0.15% 0.06% 0.05% 0 HIDDEN VLAN Proc
211 45600 280259 162 0.07% 0.04% 0.03% 0 IP Input
10-10-2011 05:22 AM
Hi Alex,
what do you mean by "i obtain lowing traffic on interface" ? Do you see drops? Can you document them?
how often do you see the problem?
how many boxes are affected?
Next time you see the problem can you take the following BEFORE clearing CEF.
show module
show version
show process cpu sorted
show ibc brief
show cef line
after you clear cef you wait 2-3 minutes and then
show process cpu s
show ibc
show cef line
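The collection steps above could be scripted so that nothing is forgotten when the problem appears. This is only a minimal Python sketch: `run` is a hypothetical callable that sends one command to the router and returns its output (how it connects is not covered here), and the 150-second default is an assumption chosen to fall inside the 2-3 minute wait.

```python
import time
from typing import Callable, Dict, List

# Commands copied from the lists in this post, kept as written.
BEFORE_CLEAR: List[str] = [
    "show module",
    "show version",
    "show process cpu sorted",
    "show ibc brief",
    "show cef line",
]
AFTER_CLEAR: List[str] = [
    "show process cpu s",
    "show ibc",
    "show cef line",
]


def collect(run: Callable[[str], str], commands: List[str]) -> Dict[str, str]:
    """Run each command through `run` and return {command: output}."""
    return {cmd: run(cmd) for cmd in commands}


def capture_around_clear(
    run: Callable[[str], str],
    wait_seconds: float = 150.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Dict[str, str]]:
    """Snapshot state, clear CEF, wait, then snapshot again."""
    before = collect(run, BEFORE_CLEAR)
    run("clear cef linecard")
    sleep(wait_seconds)  # assumed value inside the suggested 2-3 minutes
    after = collect(run, AFTER_CLEAR)
    return {"before": before, "after": after}
```

Passing `run` and `sleep` in as parameters keeps the sketch independent of any particular SSH or telnet library and lets it be tried against a fake router first.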
Let's see if I can get something useful from those outputs or else, as I wrote on the other thread, you'd better open a TAC case for deeper investigation.
Riccardo
10-12-2011 02:14 AM
Hi, Riccardo
The problem is very irregular: it can occur many times per day or not at all for a week.
No parameters change before and after the clear command.
>how many boxes are affected?
I do not understand. What is a box?
#sh mod
Mod Ports Card Type Model Serial No.
--- ----- -------------------------------------- ------------------ -----------
1 2 Route Switch Processor 720 (Active) RSP720-3CXL-GE
2 4 CEF720 4 port 10-Gigabit Ethernet WS-X6704-10GE
3 4 CEF720 4 port 10-Gigabit Ethernet WS-X6704-10GE
Mod MAC addresses Hw Fw Sw Status
--- ---------------------------------- ------ ------------ ------------ -------
1 5.7 12.2(33r)SRB 12.2(33)SRE5 Ok
2 2.7 12.2(14r)S5 12.2(33)SRE5 Ok
3 2.3 12.2(14r)S5 12.2(33)SRE5 Ok
Mod Sub-Module Model Serial Hw Status
---- --------------------------- ------------------ ----------- ------- -------
1 Policy Feature Card 3 7600-PFC3CXL 1.2 Ok
1 C7600 MSFC4 Daughterboard 7600-MSFC4 1.1 Ok
2 Centralized Forwarding Card WS-F6700-CFC 4.1 Ok
3 Centralized Forwarding Card WS-F6700-CFC 4.1 Ok
sh ver
Cisco IOS Software, c7600rsp72043_rp Software (c7600rsp72043_rp-IPSERVICESK9-M), Version 12.2(33)SRE5, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2011 by Cisco Systems, Inc.
Compiled Thu 15-Sep-11 01:11 by prod_rel_team
ROM: System Bootstrap, Version 12.2(33r)SRB4, RELEASE SOFTWARE (fc1)
BOOTLDR: Cisco IOS Software, c7600rsp72043_rp Software (c7600rsp72043_rp-IPSERVICESK9-M), Version 12.2(33)SRE5, RELEASE SOFTWARE (fc1)
OnePower uptime is 5 days, 3 hours, 39 minutes
Uptime for this control processor is 5 days, 3 hours, 37 minutes
System returned to ROM by reload (SP by reload)
System restarted at 11:35:01 YEKST Fri Oct 7 2011
System image file is "bootdisk:/c7600rsp72043-ipservicesk9-mz.122-33.SRE5.bin"
Last reload type: Normal Reload
Cisco CISCO7604 (M8500) processor (revision 2.0) with 1900544K/131072K bytes of memory.
Processor board ID FOX1326GDYU
BASEBOARD: RSP720
CPU: MPC8548_E, Version: 2.0, (0x80390020)
CORE: E500, Version: 2.0, (0x80210020)
CPU:1200MHz, CCB:400MHz, DDR:200MHz,
L1: D-cache 32 kB enabled
I-cache 32 kB enabled
Last reset from power-on
9 Virtual Ethernet interfaces
2 Gigabit Ethernet interfaces
8 Ten Gigabit Ethernet interfaces
3964K bytes of non-volatile configuration memory.
507024K bytes of Internal ATA PCMCIA card (Sector size 512 bytes).
Configuration register is 0x2102
show process cpu sorted | e 0.00% 0.00% 0.00%
CPU utilization for five seconds: 3%/2%; one minute: 5%; five minutes: 5%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
367 97620 1475871 66 0.07% 0.01% 0.00% 0 BGP I/O
306 3900 2421176 1 0.07% 0.02% 0.01% 0 TCP Timer
29 88848 674355 131 0.07% 0.01% 0.00% 0 IPC Seat Manager
438 210188 2401185 87 0.07% 0.03% 0.02% 0 Port manager per
47 244 444522 0 0.07% 0.00% 0.00% 0 GraphIt
245 6396 54021009 0 0.07% 0.08% 0.07% 0 Ethernet Msec Ti
248 2900 13539680 0 0.07% 0.01% 0.00% 0 IPAM Manager
59 80 44615 1 0.07% 0.00% 0.00% 0 Net Background
7 5529260 295431 18715 0.00% 0.90% 1.02% 0 Check heaps
2 17712 89010 198 0.00% 0.01% 0.00% 0 Load Meter
73 3040 4847 627 0.00% 0.01% 0.00% 1 Virtual Exec
210 2496 13539674 0 0.00% 0.02% 0.00% 0 IP ARP Retry Age
211 200952 2778465 72 0.00% 0.05% 0.07% 0 IP Input
294 116484 358092 325 0.00% 0.02% 0.02% 0 XDR mcast
346 254440 222523 1143 0.00% 0.05% 0.05% 0 HIDDEN VLAN Proc
444 878864 2088940 420 0.00% 0.05% 0.07% 0 BGP Router
499 7460608 36888 202250 0.00% 0.87% 1.22% 0 BGP Scanner
show cef line
Slot Flags
1/0 up
VRF IPv4:Default, 368765 routes
Slot I/Fs State Flags
1/0 6 Active sync, table-up
VRF IPv6:Default, 2 routes
Slot I/Fs State Flags
1/0 0 Active sync, table-up
show ibc brief
Interface information:
Interface IBC0/0(idb 0x150A73E8)
5 minute rx rate 11585000 bits/sec, 1529 packets/sec
5 minute tx rate 23542000 bits/sec, 3059 packets/sec
2970488354 packets input, 2217147403408 bytes
2970394762 broadcasts received
2969724007 packets output, 2214293033620 bytes
66688948 broadcasts sent
0 Bridge Packet loopback drops
2967333578 Packets CEF Switched, 0 Packets Fast Switched
0 Packets SLB Switched, 0 Packets CWAN Switched
Label switched pkts dropped: 0 Pkts dropped during dma: 1097
Invalid pkts dropped: 0 Pkts dropped(not cwan consumed): 0
IPSEC pkts: 4553242
Xconnect pkts processed: 0, dropped: 0
Xconnect pkt reflection drops: 0
Total paks copied for process level 0
Total short paks sent in route cache 438681305
Total throttle drops 0 Input queue drops 0
total spd packets classified (1269812 low, 1627108 medium, 48840 high)
total spd packets dropped (1097 low, 0 medium, 0 high)
spd prio pkts allowed in due to selective throttling (0 med, 0 high)
IBC resets = 1; last at 23:35:59.527 YEKST Sat Jul 15 2000
10-12-2011 02:18 AM
Alex,
have you rated/closed the other question yet?
Riccardo
10-12-2011 02:26 AM
Hi Alex,
As we can see, the high CPU is due to interrupts:
CPU utilization for five seconds: 87%/83%; one minute: 91%; five minutes: 96%
The value after the slash, 83%, is interrupt time, which is spent handling traffic sent to the CPU. So for some reason some traffic is getting software switched instead of being HW switched.
Since clear cef solves the issue, I guess some CEF/GRT mismatch is happening. Possibly some CEF routes are lost, making the router send traffic for those lost entries to the CPU.
First of all we need to understand what are those packets.
The best tool to sniff traffic reaching the CPU is netdr. It is safe to use during high CPU.
Configure it when you see the CPU going up again:
- debug netdr capture rx (let it run for a few seconds)
- show netdr capture
http://www.cisco.com/en/US/docs/routers/7600/ios/15S/configuration/guide/dos.html#wp1163918
You will see the packets arriving at the CPU. Then you will be able to check whether you have CEF entries for the destination IPs.
That will give more clues about the possible root cause.
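The check of destination IPs against CEF entries can be sketched offline in Python. This is a rough illustration under assumptions: the destination addresses and the prefix list are taken to be already extracted by hand into plain strings (no netdr or CEF output format is parsed here), and the function names are made up for this example.

```python
import ipaddress
from typing import Iterable, List, Optional


def longest_match(dest: str, prefixes: Iterable[str]) -> Optional[str]:
    """Return the most specific prefix containing dest, or None if none does."""
    addr = ipaddress.ip_address(dest)
    best = None
    for text in prefixes:
        net = ipaddress.ip_network(text, strict=False)
        if net.version == addr.version and addr in net:
            if best is None or net.prefixlen > best.prefixlen:
                best = net
    return str(best) if best is not None else None


def uncovered(dests: Iterable[str], prefixes: Iterable[str]) -> List[str]:
    """List the destinations that no prefix covers."""
    prefix_list = list(prefixes)
    return [d for d in dests if longest_match(d, prefix_list) is None]
```

The linear scan is slow for a full table of several hundred thousand routes, so it is only meant for a handful of captured destinations. A destination that is covered here but still hits the CPU points away from a plain missing route.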
Nik
10-12-2011 04:18 AM
Hi, Nikolay
The CPU is growing now:
sh proc cpu s
CPU utilization for five seconds: 41%/40%; one minute: 44%; five minutes: 43%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
244 1784 55116403 0 0.15% 0.10% 0.10% 0 Ethernet Msec Ti
498 388716 566545 686 0.07% 0.04% 0.06% 0 BGP Task
12 28076 190864 147 0.07% 0.00% 0.00% 0 ARP Input
4 0 174 0 0.00% 0.00% 0.00% 0 Retransmission o
5 0 3 0 0.00% 0.00% 0.00% 0 IPC ISSU Dispatc
6 0 1 0 0.00% 0.00% 0.00% 0 PF Redun ICC Req
I do not see that:
>So for some reason some traffic getting software switched instead of being HW switched.
All the show commands give me the picture that all traffic goes through CEF.
show ibc brief
Interface information:
Interface IBC0/0(idb 0x150A73E8)
5 minute rx rate 304992000 bits/sec, 48826 packets/sec
5 minute tx rate 621909000 bits/sec, 97638 packets/sec
4090902684 packets input, 2930329196392 bytes
4090652673 broadcasts received
4090450834 packets output, 2926531746102 bytes
107393 broadcasts sent
0 Bridge Packet loopback drops
4088722150 Packets CEF Switched, 0 Packets Fast Switched
0 Packets SLB Switched, 0 Packets CWAN Switched
Label switched pkts dropped: 0 Pkts dropped during dma: 0
Invalid pkts dropped: 0 Pkts dropped(not cwan consumed): 0
IPSEC pkts: 5277877
Xconnect pkts processed: 0, dropped: 0
Xconnect pkt reflection drops: 0
Total paks copied for process level 0
Total short paks sent in route cache 536662685
Total throttle drops 0 Input queue drops 0
total spd packets classified (787923 low, 1116063 medium, 54980 high)
total spd packets dropped (0 low, 0 medium, 0 high)
spd prio pkts allowed in due to selective throttling (0 med, 0 high)
IBC resets = 1; last at 10:53:29.327 YEKST Fri Oct 7 2011
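For comparison, the 5 minute inband rates in this output and in the earlier show ibc brief output from this thread (taken at about 3% CPU) can be put side by side. A minimal Python sketch follows. The regular expression matches only the rate lines exactly as they appear in these two outputs, and the function name is made up for the example.

```python
import re
from typing import Dict, Tuple

RATE = re.compile(r"5 minute (rx|tx) rate (\d+) bits/sec, (\d+) packets/sec")


def ibc_rates(text: str) -> Dict[str, Tuple[int, int]]:
    """Return {"rx": (bits_per_sec, packets_per_sec), "tx": (...)}."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in RATE.finditer(text)
    }


# Rate lines copied from the two outputs in this thread.
low_cpu = """
5 minute rx rate 11585000 bits/sec, 1529 packets/sec
5 minute tx rate 23542000 bits/sec, 3059 packets/sec
"""
high_cpu = """
5 minute rx rate 304992000 bits/sec, 48826 packets/sec
5 minute tx rate 621909000 bits/sec, 97638 packets/sec
"""

low = ibc_rates(low_cpu)
high = ibc_rates(high_cpu)
rx_pps_ratio = high["rx"][1] / low["rx"][1]
print(f"inband rx packets/sec grew by a factor of {rx_pps_ratio:.1f}")
```

With these figures the inband receive rate is roughly 30 times higher while the CPU is elevated, which would be consistent with more traffic reaching the RP, although the counters alone do not say which traffic it is.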
10-12-2011 04:39 AM
Hi Alex,
Actually these values:
41%/40% mean that the CPU is 41% busy in total, with 40% taken by traffic (interrupts) and only 1% by SW processes (STP, telnet, etc.). I don't mean that all traffic is hitting the CPU. It may be just some of it, but that is what is causing the spike. As I advised previously, you need to understand what traffic is hitting the CPU.
You can do either CPU SPAN or netdr as suggested before.
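The arithmetic behind the two percentages can be sketched in a few lines of Python. It assumes the header line has exactly the format shown in the outputs in this thread. The function name and dictionary keys are made up for the example.

```python
import re
from typing import Dict

CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def split_cpu(line: str) -> Dict[str, int]:
    """Split the five-second figure into interrupt and process shares."""
    m = CPU_LINE.search(line)
    if m is None:
        raise ValueError("not a CPU utilization line")
    total, interrupt, one_min, five_min = (int(g) for g in m.groups())
    return {
        "total": total,
        "interrupt": interrupt,          # value after the slash
        "process": total - interrupt,    # what is left for SW processes
        "one_minute": one_min,
        "five_minutes": five_min,
    }
```

For the line "CPU utilization for five seconds: 41%/40%; one minute: 44%; five minutes: 43%" this gives a process share of 1, matching the explanation above.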
Nik
01-31-2013 10:22 PM
Hello Alex,
Did you find the cause of the problem?
I have the same problem on SRE6 and SRE7a.
02-19-2013 01:16 AM
Same problem here with SRE6, RSP720-3CXL
CPU load in interrupts grows to 50-80%; "clear cef linecard" helps for a random time, from hours to days.
It is one of two almost identical boxes, and the configuration is also almost the same. The other box has no such problem.
show netdr captured gives lots of normal packets, which are supposed to be routed in hardware but for some reason spontaneously hit the CPU.
If anybody has a solution, please let me know. Thanks!
02-25-2013 08:07 AM
Alex,
Just to reiterate what Nikolay said before, CPU utilization on an RSP720 or a SUP720 under interrupts, as you see, is generally due to punted traffic. There will be no single solution for this type of problem as each situation is unique. The question that must be answered in a situation like this is WHY is the traffic being punted? The key to answering this question is always figuring out what that traffic is. A netdr capture or a span of the inband channel will help in this process, but this is not an easy process to track down why it is occurring. In your situation it sounds like we may be dealing with an issue in which the CEF forwarding table is either too large, and this is resulting in punts, or is somehow becoming stale and the clear allows it to be reprogrammed. This issue will not be easily troubleshot through posts to the support forums, though we can certainly help if you prefer to handle the issue in this way. My suggestion would be that you open a case with the TAC when this issue is being seen so that we can have a live view of what is being punted and assist in determining why.
If you open a case, the best tech/subtech to select for this type of problem is:
TECH: LAN Switching
SUBTECH: Cat 6000, 6500 Troubleshooting High CPU Running IOS
PROBLEM CODE: Error Messages, Logs, Debugs
The same answer goes for sbr@infonet.ee: if you are actively seeing this problem, engaging the TAC will get you the fastest resolution for this type of issue. If you would prefer to go through the support forums, we will need to see the packets being punted through a netdr capture as Nikolay described above.
-Nick
02-25-2013 10:24 AM
6 days ago I issued a command found in an internet forum discussion, "remote command switch test mls cef tcam-shadow off", and the CPU load is still at a normal 2-5%.
02-25-2013 12:17 PM
Alex,
Do you see anything in the output of 'show mls cef inconsistency' or 'show mls cef logg'? I can't come up with a valid reason why you should have to disable tcam-shadowing in order for the punts to stop. I expect there's a problem that is causing an inconsistency and resulting in punts to the RP CPU. I would not be content to leave it as is, and would be interested in investigating further the reason for these punts.
-Nick
02-25-2013 02:24 PM
show mls cef exception status is/was always FALSE; I checked it during high CPU load.
show mls cef logg is empty
#show mls cef inconsistency
Consistency Check Count : 52973
TCAM Consistency Check Errors : 0
SSRAM Consistency Check Errors : 0
I am still waiting for a high CPU load event since the reload 6 days ago. There was a single one, lasting about a minute, right after the reboot, but it went away after (or maybe it was a coincidence) "remote command switch test mls cef tcam-shadow off".
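If the consistency counters are collected repeatedly while waiting for the next event, a small Python sketch like the one below could flag non-zero error counts. It assumes the "name : value" layout shown in the output above. The function names are made up for the example.

```python
from typing import Dict


def parse_counters(text: str) -> Dict[str, int]:
    """Turn 'name : number' lines into a {name: number} dictionary."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value.strip())
    return counters


def has_errors(counters: Dict[str, int]) -> bool:
    """True if any counter whose name ends in 'Errors' is non-zero."""
    return any(v > 0 for k, v in counters.items() if k.endswith("Errors"))


# Sample copied from the output above.
sample = """\
Consistency Check Count : 52973
TCAM Consistency Check Errors : 0
SSRAM Consistency Check Errors : 0
"""
print(has_errors(parse_counters(sample)))
```

Lines without a numeric value after the colon, such as the command prompt line, are simply skipped.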
02-26-2013 09:07 AM
Alex,
The exception status would be an indication that we had filled the TCAM and had to punt everything. My suspicion is that we are not in an exception state where we are punting everything, but rather that there is an inconsistency causing certain types of traffic to be punted, when they shouldn't. It is possible that disabling tcam shadowing is avoiding this issue, by going directly to the source, rather than relying on the shadow copy to determine if forwarding can take place without a punt.
-Nick
03-05-2013 08:58 PM
Dear All,
Please kindly advise how to use "remote command switch test mls cef tcam-shadow off" in this case. I entered the command on the router, but there is no test option. See below for your reference:
remote command switch ?
LINE Remote command string
Best Regards