03-05-2013 03:05 AM - edited 03-04-2019 07:12 PM
Hi
I am observing strange behaviour: high CPU spikes once a week, at the very same time and in the same pattern. The logs are below, and the show version output is attached. The switch did not crash, but a lot of %SYS-3-CPUHOG messages appear, followed by a couple of %PLATFORM_UCAST-4-PREFIX messages at the end. Any idea?
Mar 5 10:25:38: %SYS-3-CPUHOG: Task is running for (2099)msecs, more than (2000)msecs (8/3),process = HL3U bkgrd process.
-Traceback= 0x1BA3250z 0x27A933Cz 0x27A9264z 0xE4ABD8z 0x1BA2300z 0x1BA3824z 0x27A933Cz 0x27A9264z 0x1FAB190z 0x1EFF338z 0x1F006B8z 0x1F00BE4z 0x1F01F14z 0x1FABCF8z 0x1FBAB58z 0x1FAFCDCz
Mar 5 10:25:48: %SYS-3-CPUHOG: Task is running for (2098)msecs, more than (2000)msecs (3/0),process = HL3U bkgrd process.
-Traceback= 0x1C21168z 0x1C214F8z 0x1BFE4F0z 0x4EFBA8z 0x4E5904z 0x4D944Cz 0x1FAB140z 0x1EFF338z 0x1F006B8z 0x1F00BE4z 0x1F01F14z 0x1FABCF8z 0x1FAF3ECz 0x1FBAC30z 0x1FAFCDCz 0x1FB06CCz
Mar 5 10:25:59: %SYS-3-CPUHOG: Task is running for (2098)msecs, more than (2000)msecs (6/4),process = HL3U bkgrd process.
-Traceback= 0x1FD2578z 0x2597634z 0x1FDE560z 0x1FBAFECz 0x1FAF4E8z 0x1FBAC30z 0x1FAFCDCz 0x1FB06CCz 0x1FB1524z 0x1FA4DA4z 0x2687C04z 0x2682358z
Mar 5 10:26:09: %SYS-3-CPUHOG: Task is running for (2106)msecs, more than (2000)msecs (27/5),process = HL3U bkgrd process.
-Traceback= 0x1C21368z 0x1BFE4D0z 0x4EFBA8z 0x4E5904z 0x4D944Cz 0x1FAB140z 0x1EFF338z 0x1F006B8z 0x1F00BE4z 0x1F01F14z 0x1FABCF8z 0x1FAF3ECz 0x1FBAC30z 0x1FAFCDCz 0x1FB06CCz 0x1FB1524z
Mar 5 10:26:19: %SYS-3-CPUHOG: Task is running for (2099)msecs, more than (2000)msecs (10/7),process = HL3U bkgrd process.
-Traceback= 0x1BFE4B0z 0x4EFBA8z 0x4E5904z 0x4D944Cz 0x1FAB140z 0x1EFF338z 0x1F006B8z 0x1F00BE4z 0x1F01F14z 0x1FABCF8z 0x1FBAB58z 0x1FAFCDCz 0x1FB06CCz 0x1FB1524z 0x1FA4DA4z 0x2687C04z
Mar 5 10:26:41: %PLATFORM_UCAST-4-PREFIX: One or more, more specific prefixes could not be programmed into TCAM and are being covered by a less specific prefix, and the packets may be software forwarded
Mar 5 10:28:59: %PLATFORM_UCAST-4-PREFIX: One or more, more specific prefixes could not be programmed into TCAM and are being covered by a less specific prefix, and the packets may be software forwarded
Switch Ports Model              SW Version            SW Image
------ ----- -----              ----------            ----------
*    1 18    WS-C3560E-12SD     15.0(2)SE1            C3560E-UNIVERSALK9-M
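To check whether the spikes really recur at the same time and with the same process, the %SYS-3-CPUHOG lines can be pulled out of a saved log and tabulated. The following is only a minimal sketch: the regular expression is written against the log lines quoted in this thread, and the function name `parse_cpuhog` and the sample list are made up for illustration.

```python
import re

# Pattern written against the %SYS-3-CPUHOG lines quoted in this thread;
# other IOS releases may format the message differently (assumption).
CPUHOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}): %SYS-3-CPUHOG: "
    r"Task is running for \((?P<ran>\d+)\)msecs, "
    r"more than \((?P<limit>\d+)\)msecs \(\d+/\d+\),"
    r"\s*process = (?P<process>.+?)\.?$"
)


def parse_cpuhog(lines):
    """Return one dict per CPUHOG line; traceback and other lines are skipped."""
    events = []
    for line in lines:
        match = CPUHOG_RE.match(line.strip())
        if match:
            events.append({
                "stamp": match.group("stamp"),
                "ran_ms": int(match.group("ran")),
                "limit_ms": int(match.group("limit")),
                "process": match.group("process"),
            })
    return events


# Two CPUHOG lines copied from the log above, plus a shortened traceback line.
sample = [
    "Mar 5 10:25:38: %SYS-3-CPUHOG: Task is running for (2099)msecs, "
    "more than (2000)msecs (8/3),process = HL3U bkgrd process.",
    "-Traceback= 0x1BA3250z 0x27A933Cz",
    "Mar 5 10:26:09: %SYS-3-CPUHOG: Task is running for (2106)msecs, "
    "more than (2000)msecs (27/5),process = HL3U bkgrd process.",
]

events = parse_cpuhog(sample)
for event in events:
    over = event["ran_ms"] - event["limit_ms"]
    print(event["stamp"], event["process"], f"+{over} ms over the limit")
```

Running the same function over the logs from several weeks and comparing the `stamp` values would show how exact the weekly pattern is.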
03-05-2013 03:35 AM
Hello Syerdumairali,
>> Mar 5 10:28:59: %PLATFORM_UCAST-4-PREFIX: One or more, more specific prefixes could not be programmed into TCAM and are being covered by a less specific prefix, and the packets may be software forwarded
This is much more important than the CPUHOG messages.
It is a sign that the switch is learning too many routes: there are too many CEF entries to be programmed into the TCAM for hardware-based forwarding.
You can check the total number of routes with
show ip route summary.
Now, the use of TCAM resources is decided by the SDM template in use. A routing SDM template may give you more routing entries and fewer entries for MAC addresses.
see
A reload is needed after an SDM template change.
If the SDM template cannot be changed, or it is already the routing template, you need to reduce the number of routes received by the switch using route summarization or other protocol-specific tools (like OSPF stub areas).
Hope to help
Giuseppe
03-05-2013 04:40 AM
Thanks Giuseppe,
The output of sh ip route summary and the CEF stats are attached. Could you point out whether the number of routes is the problem here? I can see there are far fewer routes than allowed (8k). Or am I looking at the wrong numbers?
swgw2#sh ip route summary
IP routing table name is default (0x0)
IP routing table maximum-paths is 32
Route Source    Networks    Subnets     Replicates  Overhead    Memory (bytes)
connected       1           115         0           6960        20416
static          0           0           0           0           0
ospf 1          83          605         0           53340       123840
  Intra-area: 496 Inter-area: 41 External-1: 0 External-2: 151
  NSSA External-1: 0 NSSA External-2: 0
internal        85                                              52424
Total           169         720         0           60300       196680
swgw2#sh ip cef switching statistics
       Reason                          Drop       Punt  Punt2Host
RP LES No route                         682          0          2
RP LES No adjacency                   13640          0        333
RP LES Incomplete adjacency         5272818          0          0
RP LES TTL expired                        0          0        177
RP LES IP options set                     0          0       4212
RP LES Features                           1          0          3
RP LES Neighbor resolution req       191820       6085          0
RP LES Total                        5478961       6085       4727
All    Total                        5478961       6085       4727
03-05-2013 06:47 AM
Hello Syedumairali,
Yes, the total number of routes looks to be less than 8k. This is strange: the error message is typically seen in a scenario like the one I described in my previous post.
>> IP routing table maximum-paths is 32
Even allowing for the multiple CEF entries used for equal-cost multipath, this cannot justify the error message.
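For reference, the arithmetic behind that comparison can be sketched as follows. This is only an illustration: `total_routes` is a made-up helper, it assumes the Total line lists Networks and then Subnets as in the output posted above, and `ROUTE_LIMIT` simply takes the "8k" figure mentioned in this thread as 8000 rather than a value read from the switch.

```python
def total_routes(summary_text):
    """Sum the Networks and Subnets columns of the Total line."""
    for line in summary_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "Total":
            networks, subnets = int(fields[1]), int(fields[2])
            return networks + subnets
    raise ValueError("no Total line found")


# Lines copied from the 'sh ip route summary' output in this thread.
summary = """\
connected 1 115 0 6960 20416
static 0 0 0 0 0
ospf 1 83 605 0 53340 123840
Total 169 720 0 60300 196680
"""

ROUTE_LIMIT = 8000  # assumption: the "8k" quoted in the thread, not a measured value

routes = total_routes(summary)
print(f"{routes} routes, {routes / ROUTE_LIMIT:.0%} of the assumed limit")
```

With the posted output this gives 169 + 720 = 889 routes, which is why the route count alone does not explain the message.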
Hope to help
Giuseppe
03-12-2013 03:25 AM
Hi Giuseppe,
I have again seen exactly the same messages at the very same time. I took a few traces and found that there is a machine in our network that scans all of the subnets (maybe an Exchange server).
Then I looked at the TCAM utilization (sh platform tcam uti), which shows that the unicast directly connected routes are fully utilized.
IPv4 unicast directly-connected routes: 2048/2048 2048/2048
sh platform ip unicast faile routes also shows a huge number of entries, over 2000.
sh platform ip unicast faile adjacencies also shows a high number of failures, with a strange message "ATM fail when added, still has ATM fail" for some of the entries.
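A quick way to spot an exhausted TCAM region in saved output is to compare the two number pairs on each utilization line. The sketch below assumes the first pair is the maximum masks/values and the second pair is the used masks/values; that column order is an assumption, so check it against the header of the real output. The helper name `parse_tcam_line` is made up for illustration.

```python
import re

# Matches lines of the form "<region name>: A/B C/D".
TCAM_RE = re.compile(r"^(?P<name>.+?):\s+(\d+)/(\d+)\s+(\d+)/(\d+)\s*$")


def parse_tcam_line(line):
    """Parse one utilization line; return None if the line does not match."""
    match = TCAM_RE.match(line.strip())
    if not match:
        return None
    # Assumed column order: max masks/values first, then used masks/values.
    max_masks, max_values, used_masks, used_values = map(int, match.group(2, 3, 4, 5))
    return {
        "name": match.group("name"),
        "max_values": max_values,
        "used_values": used_values,
        "full": used_values >= max_values,
    }


# The line quoted in the post above.
entry = parse_tcam_line("IPv4 unicast directly-connected routes: 2048/2048 2048/2048")
print(entry["name"], "FULL" if entry["full"] else "ok")
```

Applied to the line quoted above, the region is reported as full, which matches the observation that the directly connected routes are fully utilized.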
Questions :
Why is the TCAM for directly connected routes overutilized when a system is scanning the subnets?
Why the strange message "ATM fail when added, still has ATM fail"?
Do you still think the SDM template needs to be changed?
Regards,
Umair