
6509 Sup2 - weird CPU usage and packet loss

smailmilak
Level 4

Hello,

we have a 6509 with the following modules:

Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  1    2  Catalyst 6000 supervisor 2 (Active)    WS-X6K-SUP2-2GE
  2    2  Catalyst 6000 supervisor 2 (Standby)   WS-X6K-SUP2-2GE
  3   16  16 port 1000mb GBIC ethernet           WS-X6416-GBIC
  4   16  16 port 1000mb GBIC ethernet           WS-X6416-GBIC
  5   48  SFM-capable 48 port 10/100/1000mb RJ45 WS-X6548-GE-TX
  6   48  SFM-capable 48 port 10/100/1000mb RJ45 WS-X6548-GE-TX
  7   48  48 port 10/100 mb RJ45                 WS-X6348-RJ-45
  8   48  48 port 10/100 mb RJ45                 WS-X6348-RJ-45

Mod Sub-Module                  Model           Serial           Hw     Status
--- --------------------------- --------------- --------------- ------- -------
  1 Policy Feature Card 2       WS-F6K-PFC2                     2.0     Ok
  1 Cat6k MSFC 2 daughterboard  WS-F6K-MSFC2                    1.2     Ok
  2 Policy Feature Card 2       WS-F6K-PFC2                     2.0     Ok
  2 Cat6k MSFC 2 daughterboard  WS-F6K-MSFC2                    1.2     Ok


The problem is that CPU usage is around 40%, even though the switch is only doing normal routing with BGP, and it has only a default route to the ISP rather than a full BGP routing table. No other services (QoS, ACLs, ...) are enabled.

There are at most 8000 subscribers (cable and WiFi). The link peaks at 800 Mbit/s, but normally utilization is around 500-600 Mbit/s.

Is the Supervisor too weak for this many subscribers?

I tried enabling NBAR on one interface and CPU usage jumped to 99%.

With NBAR enabled I found that there is a lot of BitTorrent traffic; it is in second place after HTTP.

We also see packet loss when pinging the ISP (directly connected over optical cable) from a Windows machine. About every 15th packet is lost.

Even stranger, I lose packets when I ping the Cat6509 from the same VLAN as the 6509 WAN interface (connected through a 3550 GE switch with CPU usage around 3%). It should not lose any packets there.

I have put the sh proc cpu output in the attachment. I ran the command after we disconnected the cable to the ISP for testing, so CPU usage is a little lower than usual.

I think that the SUP2 is just too weak for this kind of operation.

I would like to hear your opinions.

Thank you!

4 Replies

smailmilak
Level 4

Anyone??

Hello,

It sounds like your MSFC is being hit by a high level of traffic, possibly only for very short periods. If these are microbursts, they are difficult to trace.

You can SPAN the MSFC to a sniffer port as below and apply an ACL to filter unwanted traffic. This way the unwanted traffic will not hit the MSFC CPU (the filtering will be done on the PFC).

On SP:

remote login switch
test monitor add 1 rp-inband both
test monitor del 1 rp-inband both

(The "add" command starts mirroring the RP inband traffic; use the "del" command to remove it again when you have finished capturing.)

On RP:

Create a monitor session whose source is an admin-down/unused port, using the same session number as the SP monitor session.

Create a destination for the monitor session pointing at the sniffer port, again with the same session number as the SP session.

Hopefully you have a sniffer that can summarise top talkers. It is quite common to have unwanted multicast/broadcast traffic hammering your MSFC CPU.
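A minimal sketch of the two RP-side steps, assuming session number 1 (to match the "test monitor add 1" command on the SP). The interface names are placeholders, not taken from this chassis: GigabitEthernet3/16 stands for any unused port and GigabitEthernet3/15 for the port the sniffer is plugged into.

```
! RP side, global configuration mode.
! Gi3/16 = hypothetical unused port, kept admin down, used only as a dummy source.
! Gi3/15 = hypothetical port where the sniffer is connected.
interface GigabitEthernet3/16
 shutdown
!
monitor session 1 source interface GigabitEthernet3/16
monitor session 1 destination interface GigabitEthernet3/15
```

"show monitor session 1" should then list both ports, and "no monitor session 1" removes the session once the capture is done.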

I hope it helps.

Regards

Gonçalo

Thank you, I will try that as soon as possible.

With

"You can SPAN the MSFC to a sniffer port as below and apply an ACL to filter unwanted traffic. This way the unwanted traffic will not hit the MSFC CPU (the filtering will be done on the PFC)."

do you mean that I apply an ACL AFTER I have found out what traffic is overloading the MSFC?

on SP
remote login switch
test monitor add 1 rp-inband both
test monitor del 1 rp-inband both
ON RP

Is SP the Supervisor?

Is RP the MSFC?

Do you have a link to a configuration guide? I don't want to mess something up. I do not have any experience with the 6500 series.

Hello,

There is no formal documentation for this procedure (as far as I am aware). I learned this technique a while back.

There is a document highlighting the most common causes of high CPU utilization on the 6500. It can give you a pretty good idea:

http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00804916e0.shtml

Back to your other questions:

You will need to apply the ACL AFTER you have found out what traffic is overloading the MSFC. This is housekeeping once the SPAN traffic analysis is done.

SP = Switch Processor (the mainboard on the Sup card; it is used "mainly" for L2 services)

RP = Route Processor (the daughter card on the Sup card, i.e. the MSFC; it is used "mainly" for L3 routing)
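As an illustration of the ACL step, here is a rough sketch. It assumes the capture showed a flood towards one multicast group arriving on one VLAN interface. The group address 239.1.1.1, the ACL name DROP-UNWANTED and Vlan100 are all made up for the example; substitute whatever the sniffer actually shows.

```
! Hypothetical values throughout: adjust group, ACL name and VLAN to your capture.
ip access-list extended DROP-UNWANTED
 deny   ip any host 239.1.1.1
 permit ip any any
!
interface Vlan100
 ip access-group DROP-UNWANTED in
```

It is probably best to leave the "log" keyword off the deny entry, since logged ACL entries are handled in software and would put load back on the CPU.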

Good luck :-)

Gonçalo
