09-29-2012 06:27 AM - edited 03-04-2019 05:42 PM
Hello everyone,
here is a brief description of the problem:
On a Cisco 7600 with a SUP720 BXL, CPU utilization suddenly increased from 35% to 98%.
This utilization occurred, and still occurs, on the SP CPU (not the RP), and it is interrupt based.
router-sp#sh proc cpu
CPU utilization for five seconds: 99%/82%; one minute: 98%; five minutes: 98%
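In this header line the first five-second figure is total CPU utilization and the figure after the slash is the part spent at interrupt level, so 99%/82% leaves about 17% for scheduled processes. A minimal Python sketch of that split, assuming the line has already been captured as a string (the function name `split_cpu_line` is made up for illustration):

```python
import re

def split_cpu_line(line):
    """Return (total, interrupt, process) percentages from a 'show proc cpu' header line."""
    m = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if m is None:
        raise ValueError("not a CPU utilization header line")
    total, interrupt = int(m.group(1)), int(m.group(2))
    # Process-level load is whatever is left after interrupt-level load.
    return total, interrupt, total - interrupt

line = "CPU utilization for five seconds: 99%/82%; one minute: 98%; five minutes: 98%"
print(split_cpu_line(line))
```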
I also collected the output of show ibc:
router-sp#show ibc
Interface information:
Interface IBC0/0(idb 0x44E47588)
Hardware is Mistral IBC (revision 5)
5 minute rx rate 24000 bits/sec, 44 packets/sec
5 minute tx rate 68000 bits/sec, 122 packets/sec
1186057298 packets input, 80461312123 bytes
1179948920 broadcasts received
3254930405 packets output, 224120381448 bytes
85350832 broadcasts sent
0 Inband input packet drops
0 Bridge Packet loopback drops
0 Packets CEF Switched, 0 Packets Fast Switched
0 Packets SLB Switched, 0 Packets CWAN Switched
IBC resets = 2; last at 00:28:58.792 CET Wed Nov 30 2011
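The lifetime counters in this output allow a quick sanity check of what kind of traffic reaches the inband channel. The sketch below only does arithmetic on the numbers shown above; how to interpret the result is an assumption. Small frames that are almost all counted as broadcasts would at least be consistent with control traffic rather than large data packets.

```python
# Counters copied from the "show ibc" output above.
packets_in = 1186057298
bytes_in = 80461312123
broadcasts_in = 1179948920

avg_bytes_per_packet = bytes_in / packets_in
broadcast_share = broadcasts_in / packets_in

# Cross-check against the 5 minute rx rate: 24000 bits/sec at 44 packets/sec.
avg_bytes_from_rate = 24000 / 8 / 44

print(f"average input packet size: {avg_bytes_per_packet:.1f} bytes")
print(f"broadcast share of input:  {broadcast_share:.2%}")
print(f"average size from rx rate: {avg_bytes_from_rate:.1f} bytes")
```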
Using SPAN, I collected the packets that were punted to the SP CPU.
Statistics showed that 96.82% of all packets were STP BPDUs; CDP packets came second with a 0.43% share.
The collected packets show that 70% of all STP BPDUs come from 2 source MAC addresses.
Since this layer 2 network consists of a few hundred Cisco switches (7600, 6500, 2960, 3560 and 3750 series, and Linksys), it is very hard to trace these 2 MAC addresses.
questions:
-----------------
1) Is there a way to trace these two MAC addresses/switches other than logging into every switch in the network?
The problem is that these 2 MAC addresses don't appear in the CAM or ARP tables on the network. Also, as far as I know, a big problem is that the 2960 and 3560 series use a special MAC address for the BID (bridge identifier), which can be seen only with "show version" (and this MAC is also the source MAC address for BPDUs).
2) Regarding the output of "show ibc", the traffic going to and from the SP CPU shouldn't be significant enough to explain a CPU utilization of 98%.
5 minute rx rate 24000 bits/sec, 44 packets/sec
5 minute tx rate 68000 bits/sec, 122 packets/sec
That leads me to the conclusion that this could be some kind of processing loop on the CPU and that a reload could help. Has anyone seen something like this? Am I misled, or can 122 BPDUs per second really overwhelm the CPU?
3) In general, has anyone experienced high SP CPU utilization on a SUP720, and what was usually the cause?
(and if someone can recommend what else I should look for)
Thanks in advance,
A.
10-04-2012 08:54 PM
The CPU is high due to interrupts.
Could you attach the following outputs:
1. Show tech
2. Show platform hardware capacity forwarding
3. Show mls stat
4. show ip traffic
10-05-2012 05:05 AM
10-05-2012 09:45 AM
Thank you
Can you get me the show tech from the RP?
Also get me the following outputs from the RP:
1. show mls rate
2. Show mls rate usage
Then log in to the SP:
1. Debug netdr capture rx
wait for 2 minutes
2. undebug all
3. term len 0
4. show netdr captured-packets
Send me the output of "show netdr captured-packets".
10-09-2012 05:38 AM
10-09-2012 08:44 AM
I checked the 4096 packets in the netdr capture. 3685 were STP packets, and most of them are coming from
00.15.2B.0D.62.A1 and 00.15.63.05.17.0D.
You need to find out where these MACs are coming from and whether there is a loop.
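The counting behind this kind of finding can be sketched in a few lines of Python. The STP share uses the two numbers quoted above; the `sample` list and the `top_sources` helper are hypothetical, and a real list of source MACs would have to be extracted from the capture first.

```python
from collections import Counter

def top_sources(src_macs, n=2):
    """Return the n most frequent source MACs with their share of all packets."""
    counts = Counter(mac.lower() for mac in src_macs)
    total = sum(counts.values())
    return [(mac, count / total) for mac, count in counts.most_common(n)]

# Share of STP packets reported for the netdr capture in this thread.
stp_share = 3685 / 4096
print(f"STP share of capture: {stp_share:.1%}")

# Hypothetical sample; a real list would come from parsing the capture.
sample = ["0015.2b0d.62a1"] * 5 + ["0015.6305.170d"] * 3 + ["aaaa.bbbb.cccc"] * 2
print(top_sources(sample))
```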
10-09-2012 12:28 PM
Hi
Thank you very much. The problem is that there are a few hundred switches in my network,
and those two MAC addresses are bridge identifiers.
I can't find them in the CAM tables, and the only way I can think of is to log in to every switch and issue show version.
Is there a better/easier way to find these addresses?
Regards
A.
10-10-2012 09:21 AM
Let us check whether there are any spanning-tree issues.
Can you get me:
1. Show spanning-tree detail
10-11-2012 06:19 AM
10-11-2012 07:01 AM
There are a lot of topology changes for many VLANs. Below are a few:
VLAN0005 is executing the rstp compatible Spanning Tree protocol
Bridge Identifier has priority 32768, sysid 5, address 0015.c630.8e00
Configured hello time 2, max age 20, forward delay 15, tranmsit hold-count 6
Current root has priority 0, address 0015.63f3.4180
Root port is 769 (TenGigabitEthernet7/1), cost of root path is 2
Topology change flag set, detected flag not set
Number of topology changes 309093 last change occurred 00:00:21 ago
from TenGigabitEthernet7/1
VLAN0013 is executing the rstp compatible Spanning Tree protocol
Bridge Identifier has priority 32768, sysid 13, address 0015.c630.8e00
Configured hello time 2, max age 20, forward delay 15, tranmsit hold-count 6
Current root has priority 0, address 0015.63f3.4180
Root port is 769 (TenGigabitEthernet7/1), cost of root path is 2
Topology change flag set, detected flag not set
Number of topology changes 309077 last change occurred 00:00:24 ago
from TenGigabitEthernet7/1
VLAN0014 is executing the rstp compatible Spanning Tree protocol
Bridge Identifier has priority 32768, sysid 14, address 0015.c630.8e00
Configured hello time 2, max age 20, forward delay 15, tranmsit hold-count 6
Current root has priority 0, address 0015.63f3.4180
Root port is 769 (TenGigabitEthernet7/1), cost of root path is 2
Topology change flag set, detected flag not set
Number of topology changes 309113 last change occurred 00:00:24 ago
from TenGigabitEthernet7/1
Run the command below several times and check whether the spanning-tree changes are continuous and whether they always come from Ten 7/1:
1. show spanning-tree detail | inc is exec|occurr|from
If the last change always comes from Ten 7/1, check what is connected to Ten 7/1 and run the same command there.
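When the output has to be compared across several runs, it can help to reduce each snapshot to one row per VLAN. The following is a rough Python sketch, assuming the text of "show spanning-tree detail" has been saved to a string and looks like the excerpts in this thread; the function name `topology_changes` is made up.

```python
import re

VLAN_RE = re.compile(r"^\s*(VLAN\d+) is executing")
CHANGES_RE = re.compile(r"Number of topology changes (\d+) last change occurred (\S+) ago")
FROM_RE = re.compile(r"^\s*from (\S+)")

def topology_changes(text):
    """Yield (vlan, change_count, last_change, from_interface) per VLAN block."""
    vlan = count = last = None
    for line in text.splitlines():
        if m := VLAN_RE.search(line):
            vlan, count, last = m.group(1), None, None
        elif m := CHANGES_RE.search(line):
            count, last = int(m.group(1)), m.group(2)
        elif (m := FROM_RE.search(line)) and vlan and count is not None:
            yield vlan, count, last, m.group(1)
            count = None  # wait for the next VLAN block

sample = """\
VLAN0005 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 309093 last change occurred 00:00:21 ago
          from TenGigabitEthernet7/1
VLAN0013 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 309077 last change occurred 00:00:24 ago
          from TenGigabitEthernet7/1
"""
for row in topology_changes(sample):
    print(row)
```

Comparing the change counts of two snapshots taken a few minutes apart shows which VLANs are still changing and from which interface.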
Thank you
Raju
10-11-2012 07:02 AM
Hello Antonio,
there are several VLANs with more than 300000 topology changes, with the last change less than a minute ago.
Example:
VLAN0005 is executing the rstp compatible Spanning Tree protocol
Bridge Identifier has priority 32768, sysid 5, address 0015.c630.8e00
Configured hello time 2, max age 20, forward delay 15, tranmsit hold-count 6
Current root has priority 0, address 0015.63f3.4180
Root port is 769 (TenGigabitEthernet7/1), cost of root path is 2
Topology change flag set, detected flag not set
Number of topology changes 309093 last change occurred 00:00:21 ago
>>> from TenGigabitEthernet7/1
VLAN0011 is executing the rstp compatible Spanning Tree protocol
Bridge Identifier has priority 32768, sysid 11, address 0015.c630.8e00
Configured hello time 2, max age 20, forward delay 15, tranmsit hold-count 6
Current root has priority 0, address 0015.63f3.4180
Root port is 769 (TenGigabitEthernet7/1), cost of root path is 2
Topology change flag set, detected flag not set
Number of topology changes 309079 last change occurred 00:00:24 ago
>>> from TenGigabitEthernet7/1
Interface Te7/1 points to the root bridge for these VLANs.
You should move to the root bridge and use the same command to identify the interface that received the TCN BPDU; repeating this recursively, you can find the device(s) causing this.
I would focus on one VLAN at a time, then repeat the search for the other two VLANs to see whether the resulting device is the same.
You will likely find the same device(s) causing TCs in multiple VLANs.
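The hop-by-hop search described here amounts to following the "from" interface until it no longer leads to another switch. Below is a small Python sketch of that walk for one VLAN. All switch names, ports and both lookup tables are invented for illustration; in practice each step means logging in to the next switch and reading its output.

```python
def trace_tc_source(start, last_tc_from, neighbor_on, max_hops=32):
    """Follow the 'from' interface switch by switch; return the path walked.

    last_tc_from: {switch: interface that last received a topology change}
    neighbor_on:  {(switch, interface): neighboring switch on that interface}
    """
    path = [start]
    current = start
    for _ in range(max_hops):
        iface = last_tc_from.get(current)
        nxt = neighbor_on.get((current, iface))
        if nxt is None or nxt in path:
            break  # edge port, unknown neighbor, or we looped back
        path.append(nxt)
        current = nxt
    return path

# Hypothetical topology for one VLAN; names and ports are invented.
last_tc_from = {"core": "Te7/1", "root": "Gi1/2", "dist1": "Gi0/5", "access9": "Fa0/17"}
neighbor_on = {
    ("core", "Te7/1"): "root",
    ("root", "Gi1/2"): "dist1",
    ("dist1", "Gi0/5"): "access9",
}
print(trace_tc_source("core", last_tc_from, neighbor_on))
```

The last switch in the returned path is the one whose "from" port has no switch behind it, which makes that port the first place to look.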
Hope to help
Giuseppe
10-12-2012 02:38 AM
Hi,
thank you both for your answers
but occasional TCNs shouldn't be the reason for such high CPU usage on the SUP720 BXL SP, should they?
It's a relatively large layer 2 network (a mixture of MST, PVST+ and CST), and from time to time I do trace the origins of TCN BPDUs (usually because of unknown unicast flooding problems).
Furthermore, even in quiet periods (when there is no TCN for 5-10 minutes) the SP CPU doesn't drop below 90%.
All other switches (2960, 6500 series) in the network also propagate those TCNs, but their CPU utilization is normal (below 40%).
I traced the last ten switches that originated TCNs, and none of them had the two MAC addresses found in the earlier debugging.
Regards,
A
10-12-2012 04:59 AM
It is possible that your netdr capture was taken during these topology flaps, so we got only STP packets in the capture.
Collect the netdr capture when STP is stable and send it to me. We will analyze it.
Also, could you please send me the following output:
1. Show mls cef exception status
10-15-2012 03:45 AM
Hi,
I captured packets while STP was stable. I also attached outputs which confirm that there were no STP topology changes during the capture.
But it seems that, again, there are only STP BPDU packets.
Here is the requested show output as well:
1) router#show mls cef exception status
Current IPv4 FIB exception state = FALSE
Current IPv6 FIB exception state = FALSE
Current MPLS FIB exception state = FALSE
regards,
A.
10-12-2012 05:13 AM
Hello Antonio,
>> I traced last ten switches that originated TCNs and none of these had the two MAC addresses found in the debugging earlier.
there are some strange facts in your issue:
a) analysis of packet capture of traffic punted to SP cpu shows a lot of STP messages sourced by two specific MAC addresses
00.15.2B.0D.62.A1 -----> 0015.2B0D.62A1 in IOS CLI format
00.15.63.05.17.0D -----> 0015.6305.170D in IOS CLI format
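The same conversion can be scripted when many addresses have to be searched for in CLI output. This is a minimal Python sketch; the helper name `to_ios_mac` is made up, and the result is lower-cased because that is how IOS prints MAC addresses elsewhere in this thread.

```python
import re

def to_ios_mac(mac):
    """Convert any common MAC notation to Cisco IOS dotted form (xxxx.xxxx.xxxx)."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))

for mac in ("00.15.2B.0D.62.A1", "00.15.63.05.17.0D"):
    print(mac, "->", to_ios_mac(mac))
```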
STP messages sourced by these two MAC addresses should be originated by adjacent switches, as STP BPDUs do not travel across the network but are generated hop by hop, switch by switch.
You say that the affected node's CAM table does not contain entries for the two MAC addresses listed above.
This is strange; they should be there.
Those two MAC addresses should not need to be searched for everywhere in the network; they should belong to switch ports of one or two adjacent switches.
Now, unless the switch does not perform MAC address learning on the source MAC addresses of STP BPDUs, the only reason for the MAC addresses not being in the CAM table is that the CAM table is full.
So I would check how many MAC addresses are in the CAM table.
b) the analysis of show spanning-tree detail shows that many (not all) VLANs have experienced a high number of STP topology changes (more than 300000), and that for each affected VLAN, such as VLANs 5, 11 and 13, the port on which the topology change was received is TenGigabitEthernet7/1, which is also the root port pointing to the root bridge.
The command details the root bridge-id and the designated port bridge-id, but not the designated port MAC address:
Port 769 (TenGigabitEthernet7/1) of VLAN0013 is root forwarding
Port path cost 2, Port priority 128, Port Identifier 128.769.
Designated root has priority 0, address 0015.63f3.4180
Designated bridge has priority 0, address 0015.63f3.4180
Designated port id is 128.2, designated path cost 0
Timers: message age 16, forward delay 0, hold 0
Number of transitions to forwarding state: 1
Link type is point-to-point by default, Peer is STP
BPDU: sent 5, received 7398335
Here, the source MAC address of these 7398335 received BPDUs is not shown.
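As a rough plausibility check on that counter, one can ask how long it would take to receive this many BPDUs at the normal rate. The sketch below assumes one BPDU per hello interval on this VLAN and uses the hello time of 2 seconds from the output in this thread; the port uptime itself is not given here, so the result can only be compared against it by whoever has access to the switch.

```python
received_bpdus = 7398335   # from "BPDU: sent 5, received 7398335"
hello_time_s = 2           # from "Configured hello time 2"

# Assumption: the designated bridge sends one BPDU per hello interval on this VLAN.
seconds = received_bpdus * hello_time_s
days = seconds / 86400
print(f"{received_bpdus} BPDUs at one per {hello_time_s} s is about {days:.0f} days")
```

If the port has been up for roughly that long, the counter reflects ordinary periodic BPDUs; a much shorter uptime would point to an excess BPDU rate on this port.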
Hope to help
Giuseppe