10-04-2010 09:38 AM - edited 03-06-2019 01:18 PM
We just replaced our core network with two 3560Gs. CORE1 is version 04; CORE2 is version 01. Both are running about 10 VLANs with HSRP and have 12 EIGRP peers. Both are running auto QoS. We're passing about 50 Mbps through CORE1 and 3 Mbps through CORE2.
CORE1's CPU ranges from 8-12%. CORE2's CPU was showing about 7% this weekend when network utilization was virtually nothing. Now, with 3 Mbps through it, the CPU is at about 25-30%. The IP Input process is at about 10-20%:
CPU utilization for five seconds: 17%/3%; one minute: 21%; five minutes: 26%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
190 7080239 17633742 401 7.02% 10.04% 13.88% 0 IP Input
CORE2#show ip int | i CEF
Shows the following for each interface:
IP CEF switching is enabled
IP CEF switching turbo vector
IP route-cache flags are Fast, CEF
There is no EIGRP reconvergence occurring; the queues are 0.
Could this difference be due to the different versions of the switches and is this cause for concern? I find it odd that CORE1's CPU utilization is minimal at ~8% under peak load and CORE2's (which is essentially idle) CPU utilization is ~20-30%.
10-05-2010 06:00 PM
Just enter the command "clear ip traffic" and hit Enter. No idea why that one was hidden.
HSRP should not really drive the ARP numbers.
If all the traffic coming through the core sees one of the routers as the best path toward the destination subnet, that router would ARP. But remember, ARPs are only for directly connected IPs.
Did you confirm at what rate the 'sh ip traffic' counters were going up, and which ones were increasing the most in correlation with the higher CPU?
I agree; get someone from TAC to take a closer look at the box with you.
Post the resolution back to the thread once it's figured out.
10-04-2010 10:00 AM
If you look at "show interface stat", can you identify which interface/VLAN is generating so many process-level packets? It's packets punted out of hardware to process level that drive the "IP Input" process up.
You may be able to catch and dump a few of them via "show buffers input-interface <interface> packet".
Then you have to do an analysis of those packets to see why they are not being hardware switched through the device.
It could be things such as TTL expired, packets to an IP address on the box, packets with options set, etc.
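One reading note on the 'show proc cpu' output posted earlier: the "X%/Y%" pair is total CPU versus interrupt-level CPU, so the gap between them approximates process-level work such as IP Input. A quick sketch using the numbers from this thread:

```python
# "show proc cpu" reports "total%/interrupt%"; the difference between the
# two is process-level (punted) work such as IP Input.
total_pct, interrupt_pct = 17, 3      # five-second pair from CORE2's output
process_level_pct = total_pct - interrupt_pct
print(process_level_pct)  # 14 -> a large punted share for only ~3 Mbps of traffic
```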
10-04-2010 10:50 AM
Okay, from what I can see from 'show inter stat', there are 2 L3 ports that connect to our voice service routers that handle PRI/CONF/XCODE/MTP. The counters appear to be incrementing more quickly on CORE2 than CORE1, so I believe the path through CORE2 is being chosen over the path through CORE1. I assume the voice packets fall under the process-switched "IP Packets with Options" counter? I'm seeing about 300 pps of voice traffic through that interface. I'm now separately graphing PRI/CONF/XCODE/MTP utilization so I can compare those graphs to the CORE2 CPU utilization to see if there's a correlation.
Any suggestions on how to improve performance, or should I leave it as-is? I suppose I could disable QoS going to the voice service routers since only voice traffic hits those links, but it would be better to leave things as-is for now since 20% CPU utilization is acceptable.
10-04-2010 10:55 AM
Unless you have some features configured that need to look in the payload those voice packets should be hardware switched through the device.
Did you try the "show buffers input-interface <interface> packet" command?
10-04-2010 11:10 AM
So on both 3560's:
0/13 = Voice Service Router 1 (PRI/XCODE/MTP/CONF)
0/14 = Voice Service Router 2 (PRI/XCODE/MTP/CONF)
CORE1:
GigabitEthernet0/13 is up, line protocol is up (connected)
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
GigabitEthernet0/14 is up, line protocol is up (connected)
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
CORE2:
GigabitEthernet0/13 is up, line protocol is up (connected)
Input queue: 44/75/0/0 (size/max/drops/flushes); Total output drops: 0
GigabitEthernet0/14 is up, line protocol is up (connected)
Input queue: 19/75/0/0 (size/max/drops/flushes); Total output drops: 0
The output from 'show buffers input-interface gig 0/13 packet' shows voice packets from the voice service router to various voice endpoints (phones, SIP gateway, etc.)
Gig 0/13 is configured as follows:
interface GigabitEthernet0/13
description TS-VSVC1-Gig0/1
no switchport
ip address 64.27.32.13 255.255.255.252
no ip redirects
no ip unreachables
ip pim sparse-dense-mode
ip igmp version 3
ip cgmp
load-interval 30
speed 1000
duplex full
srr-queue bandwidth share 1 30 35 5
queue-set 2
priority-queue out
mls qos trust dscp
auto qos trust
end
10-04-2010 11:12 AM
And I just noticed no QoS configured on the router side:
Voice Service Router 1:
interface GigabitEthernet0/1
description To TS-CORE2
ip address 64.27.32.14 255.255.255.252
no ip redirects
no ip unreachables
ip pim sparse-dense-mode
ip cgmp
load-interval 30
duplex full
speed 1000
end
And here is a packet in the buffer on CORE2:
CORE2#show buffers input-interface gig 0/13 packet
Buffer information for RxQ7 buffer at 0x4400A10
data_area 0x6A8A670, refcount 1, next 0x5253DE8, flags 0x200
linktype 7 (IP), enctype 1 (ARPA), encsize 14, rxtype 1
if_input 0x47F83B8 (GigabitEthernet0/13), if_output 0x0 (None)
inputtime 2d16h (elapsed 00:00:31.516)
outputtime 00:00:00.000 (elapsed never), oqnumber 65535
datagramstart 0x6A8A6B6, datagramsize 214, maximum size 2196
mac_start 0x6A8A6B6, addr_start 0x6A8A6B6, info_start 0x0
network_start 0x6A8A6C4, transport_start 0x6A8A6D8, caller_pc 0x185E23C
source: 10.14.10.131, destination: 10.10.10.24, id: 0x4B96, ttl: 254,
TOS: 184 prot: 17, source port 17186, destination port 17204
0: 0012DAD9 2DCA0013 7F1021B1 080045B8 ..ZY-J....!1..E8
16: 00C84B96 0000FE11 47240A0E 0A830A0A .HK...~.G$......
32: 0A184322 433400B4 00008000 3FB9BF3A ..C"C4.4....?9?:
48: 69482297 064BFF7E 7B7E7C7D 7D7C7F7E iH"..K.~{~|}}|.~
64: FF7F7FFD 7FFEFEFD FC7CFFFE 7D7D7CFE ...}.~~}||.~}}|~
80: 7E7B7F7D 7E7F7EFD 7FFE7E7B FD7F7C7E ~{.}~.~}.~~{}.|~
96: 7F7D7AFF 7E79FD7F 7CFCFFFE FE7EF97E .}z.~y}.||.~~~y~
112: 7DFB777F 7E79FC79 7CFD7BFC 7D7CF97C }{w.~y|y|}{|}|y|
128: 7DFB7DFE FD7A7EFE 7D7F7C7E FF7C7CFE }{}~}z~~}.|~.||~
144: FC7E7CFF FC7CFDFC 79FAFF79 FB7A7E7D |~|.||}|yz.y{z~}
160: 79FA777C FA7AF97F 7CF77BFF FC7DFB77 yzw|zzy.|w{.|}{w
176: 7CFC79FE 7B7BFD7A 7DFEFBFD 7AFBFA7D ||y~{{}z}~{}z{z}
192: 7BFEFE7C FE7C7CFE 7DFF7CFF FA7BFF7D {~~|~||~}.|.z{.}
208: 7AF97C7B FE7B00 zy|{~{.
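The dumped frame above can be decoded offline to test the "IP options" theory directly: if the IHL field is greater than 5, options are present. This is a hypothetical offline helper (not switch output) with the hex copied from the CORE2 buffer dump:

```python
# Decode the first bytes of the dumped frame to check the IP header length
# (IHL). IHL > 5 words would mean IP options are present, one of the known
# punt reasons. Hex copied from the CORE2 buffer dump above.
frame = bytes.fromhex(
    "0012DAD92DCA00137F1021B10800"                # Ethernet: dst MAC, src MAC, EtherType 0x0800
    "45B800C84B960000FE1147240A0E0A830A0A0A18"    # IPv4 header (20 bytes)
)
ip = frame[14:]                                   # skip the 14-byte Ethernet header
version, ihl = ip[0] >> 4, ip[0] & 0x0F
print("version:", version, "IHL:", ihl, "words")  # version: 4 IHL: 5 words
print("IP options present:", ihl > 5)             # False: plain 20-byte header
print("TTL:", ip[8], "proto:", ip[9])             # TTL: 254 proto: 17 (UDP) -- matches the dump
```

So at least for this buffered packet, no IP options are set; something else is driving the punt.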
10-04-2010 12:31 PM
What does 'sh ip cef 10.10.10.24 detail' say?
If you have a valid CEF rewrite for that ip address and you see a lot of those frames on the input queue you probably need to have TAC take a closer look at the hardware programming.
The frame appears to be a small UDP frame, and QoS should not impact it on ingress for Gig 0/13.
What interface does the 10.10.10.24 destination go through and what is the configuration on it?
Could you capture 'sh ip cef switching stat' to see if it gives an indication of punt reason for any traffic?
10-04-2010 01:26 PM
10.10.10.24 is the destination IP in the output I included, but the destination changes based on the call, device, etc. The source IP of 10.14.10.131 is consistent across the buffered packets and belongs to the device connected to the interfaces whose input queues are non-zero. So it seems to me that the packets from the voice service router may have IP options or something else set that's causing the 3560 to process-switch those packets. I noticed that on the voice service routers the only configuration relating to QoS is:
sccp ip precedence 3
I may also remove that line and then add 'auto qos voip' (which should default to untrust) on the interfaces to CORE1 and CORE2.
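For reference, a minimal sketch of what that interface change might look like (interface name from this thread; verify the auto-QoS defaults and keywords for your IOS release before applying, as this is an assumption, not tested config):

```
! Hedged sketch only -- on the 3560, "auto qos voip" takes a keyword
! (cisco-phone | cisco-softphone | trust); confirm against your release notes.
interface GigabitEthernet0/13
 auto qos voip trust
```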
Here is the output you requested:
CORE2#show ip cef 10.10.10.24 detail
10.10.10.24/32, epoch 2, flags attached
Adj source: IP adj out of Vlan2, addr 10.10.10.24 049461A0
Dependent covered prefix type adjfib cover 10.10.0.0/16
attached to Vlan2
Okay, so on our CORE2 we have:
interface Vlan2
description Voice/CallManager VLAN
ip address 10.10.10.8 255.255.0.0
ip helper-address 10.10.10.10
ip helper-address 10.10.10.14
no ip redirects
no ip unreachables
ip pim sparse-dense-mode
ip igmp version 3
ip cgmp
no ip mroute-cache
load-interval 30
standby 0 ip 10.10.10.1
standby 0 preempt
end
On CORE1 we have:
interface Vlan2
description Voice/CallManager VLAN
ip address 10.10.10.90 255.255.0.0 secondary
ip address 10.10.10.7 255.255.0.0
ip helper-address 10.10.10.10
ip helper-address 10.10.10.14
no ip redirects
no ip unreachables
ip pim sparse-dense-mode
no ip route-cache cef
no ip route-cache
ip cgmp
no ip mroute-cache
load-interval 30
standby 0 ip 10.10.10.1
standby 0 priority 105
standby 0 preempt
end
I'm not sure why the 'no ip route-cache cef' and 'no ip route-cache' statements are there. I inherited this config and haven't changed them. I think I'll remove them after-hours.
CORE2#show ip cef switching stat
Reason Drop Punt Punt2Host
RP LES Neighbor resolution req 37 0 0
RP LES Total 37 0 0
All Total 37 0 0
10-04-2010 02:07 PM
I'm having a hard time following your topology. Draw it out and show the source and destination and routers along the path.
You should never turn off CEF so do turn that back on.
If the problem still exists after that, post the topology.
Rodney
10-04-2010 02:16 PM
CORE1 --------------------------------- CORE2
gig0/13                               gig0/13
   |                                     |
   +---- gig0/0 --VOICERTR-- gig0/1 ----+
gig0/13 on both COREs are routed interfaces, and gig0/13 on both COREs shows a non-zero input queue as traffic increases. So, it appears that traffic from the voice router is being punted. The voice router routes packets that it sources to CORE2, which is why we're seeing a higher input queue depth and higher CPU on CORE2.
The voice router does not have QoS applied to gig0/0 or gig0/1 (though it should). The only QoS, etc. related entry on this router is:
sccp ip precedence 3
10-04-2010 02:27 PM
You said they are routed interfaces but posted the VLAN configuration which is what confused me.
10-04-2010 08:00 PM
Okay, I ran some additional tests this evening since traffic was low. With no traffic to the voice router connected to CORE2, you can see everything is idle:
CORE2: show proc cpu sort
CPU utilization for five seconds: 6%/0%; one minute: 21%; five minutes: 18%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
197 26737 2452 10904 0.47% 0.28% 0.33% 1 Virtual Exec
132 346717 9696901 35 0.15% 0.08% 0.07% 0 Hulc LED Process
204 2802256 5255833 533 0.15% 0.17% 0.24% 0 Spanning Tree
141 136523 104736 1303 0.15% 0.05% 0.04% 0 HRPC qos request
GigabitEthernet0/13 is up, line protocol is up (connected)
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Then, I set up a conference and enabled packet debugging. With one conference going:
CPU utilization for five seconds: 10%/1%; one minute: 28%; five minutes: 18%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
190 9787979 25177958 388 4.31% 16.49% 8.70% 0 IP Input
GigabitEthernet0/13 is up, line protocol is up (connected)
Input queue: 2/75/0/0 (size/max/drops/flushes); Total output drops: 0
So, it's obvious that traffic through this port is causing both the input queue depth and the CPU utilization to increase. (I'm sure the packet debug increases it as well, but the CPU increase along with the growing input queue are the same symptoms as earlier.)
So, I enabled 'debug ip packet' with an ACL matching the IP of the router connected to Gig0/13 and saw a lot of the following:
Oct 4 22:46:54: IP: tableid=0, s=10.14.10.132 (GigabitEthernet0/14), d=10.14.10.131 (GigabitEthernet0/13), routed via FIB
Oct 4 22:46:54: IP: s=10.14.10.132 (GigabitEthernet0/14), d=10.14.10.131 (GigabitEthernet0/13), len 200, output feature, Check hwidb(72), rtype 1, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
Oct 4 22:46:54: IP: s=10.14.10.132 (GigabitEthernet0/14), d=10.14.10.131 (GigabitEthernet0/13), g=64.27.32.14, len 200, forward
Oct 4 22:46:54: IP: s=10.14.10.132 (GigabitEthernet0/14), d=10.14.10.131 (GigabitEthernet0/13), len 200, sending full packet
Oct 4 22:46:54: IP: s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132, len 200, input feature, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
Oct 4 22:46:54: IP: tableid=0, s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132 (GigabitEthernet0/14), routed via FIB
Oct 4 22:46:54: IP: s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132 (GigabitEthernet0/14), len 200, output feature, Check hwidb(72), rtype 1, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
Oct 4 22:46:54: IP: s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132 (GigabitEthernet0/14), g=64.27.32.22, len 200, forward
Oct 4 22:46:54: IP: s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132 (GigabitEthernet0/14), len 200, sending full packet
Oct 4 22:46:54: IP: s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132, len 200, input feature, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
Oct 4 22:46:54: IP: tableid=0, s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132 (GigabitEthernet0/14), routed via FIB
Oct 4 22:46:54: IP: s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132 (GigabitEthernet0/14), len 200, output feature, Check hwidb(72), rtype 1, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
Oct 4 22:46:54: IP: s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132 (GigabitEthernet0/14), g=64.27.32.22, len 200, forward
Oct 4 22:46:54: IP: s=10.14.10.131 (GigabitEthernet0/13), d=10.14.10.132 (GigabitEthernet0/14), len 200, sending full packet
Oct 4 22:46:54: IP: s=10.14.10.132 (GigabitEthernet0/14), d=10.14.10.131, len 200, input feature, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
So, "routed via FIB" means the packets are CEF switched, correct? I'm stumped. Could this be a bug?
10-05-2010 07:53 AM
Can you post the configuration of the gig 0/13 and 0/14 interfaces?
Are they L3 ports or L2 ports inside of a VLAN?
I ask because " s=10.14.10.132 (GigabitEthernet0/14), d=10.14.10.131 (GigabitEthernet0/13)"
would be an ip packet coming in gig 0/14 that is having to be CEF forwarded out gig 0/13.
If they are on the same L3 interface, an ICMP redirect would be sent. I wonder if that is the forwarding scenario here, and whether, even with ip redirects turned off, the hardware isn't recognizing that and is still punting to try to get the redirect sent.
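The redirect condition above can be sanity-checked offline: a classic ICMP redirect applies only when a packet would leave via the same L3 subnet it arrived on. Addresses below come from this thread, except the Gig0/14 /30, which is inferred from the "g=64.27.32.22" next hop in the debug output (an assumption):

```python
import ipaddress

# A classic ICMP redirect fires only when ingress and egress share an L3
# subnet. Gig0/13's address is from the posted config; Gig0/14's /30 is
# inferred from the g=64.27.32.22 next hop in the debug (assumption).
gig0_13 = ipaddress.ip_interface("64.27.32.13/30")   # CORE2 Gig0/13
gig0_14 = ipaddress.ip_interface("64.27.32.21/30")   # CORE2 Gig0/14 (inferred)
same_subnet = gig0_13.network == gig0_14.network
print("same L3 subnet:", same_subnet)  # False -> a classic redirect shouldn't fire here
```

If that holds, any redirect-related punting would point at a hardware-programming quirk rather than expected behavior.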
10-05-2010 08:05 AM
I think what happened here was that one call was terminated on an MTP on the router on gig0/14 and then a second call on gig0/13. I conferenced the calls together to make sure we'd see activity, which is why you're seeing traffic between gig0/13 and gig0/14. The same issue occurs for traffic coming into gig0/13 or gig0/14 destined elsewhere.
I'm thinking about putting a sniffer on gig0/13 today to capture some of the UDP packets in question and see if they have any options set, etc. I'm not sure why they would, because the config on the voice routers (2811s) isn't that complex.
The config is as follows:
CORE2:
interface GigabitEthernet0/13
description TS-VSVC1-Gig0/1
no switchport
ip address x.x.x.x 255.255.255.252
no ip redirects
no ip unreachables
ip pim sparse-dense-mode
ip igmp version 3
ip cgmp
load-interval 30
speed 1000
duplex full
srr-queue bandwidth share 1 30 35 5
queue-set 2
priority-queue out
mls qos trust dscp
auto qos trust
end
CORE2:
interface GigabitEthernet0/14
description TS-VSVC2-Gig0/1
no switchport
ip address x.x.x.x 255.255.255.252
no ip redirects
no ip unreachables
ip pim sparse-dense-mode
ip igmp version 3
ip cgmp
load-interval 30
speed 1000
duplex full
srr-queue bandwidth share 1 30 35 5
queue-set 2
priority-queue out
mls qos trust dscp
auto qos trust
end
VSVC1:
interface GigabitEthernet0/1
description To TS-CORE2
ip address x.x.x.x 255.255.255.252
no ip redirects
no ip unreachables
ip pim sparse-dense-mode
ip cgmp
load-interval 30
duplex full
speed 1000
end
10-05-2010 08:16 AM
The next step would be to dig in to the hardware forwarding entries for those destinations on the 3560 to figure out why they are being punted. I don't see any of the obvious reasons there. I would suggest opening a TAC SR to take a deeper look.