02-05-2013 02:35 AM - edited 03-07-2019 11:30 AM
Dear all,
My 3750-E core stack is connected to the provider router and is the default gateway (DG) for the internal LAN. The CPU load is very high, even at night, but I have not found the cause.
I use an SVI to connect to the provider for HA reasons.
I sniffed the network but saw no excessive broadcast storms. There was a PBR configured, but deleting it made no difference.
Here is some information. Any suggestion or help would be much appreciated. The default route points to the provider's DG and not to an interface (I have already seen that troubleshooting hint).
switch Version
15.0(1)SE1
10#sh proc cpu so
 PID Runtime(ms)    Invoked  uSecs    5Sec    1Min    5Min TTY Process
  12  2318098713  431897532   5367  61.10%  61.65%  61.97%   0 ARP Input
10#sh platform tcam uti
CAM Utilization for ASIC# 0                      Max            Used
                                             Masks/Values    Masks/values
 Unicast mac addresses:                       6364/6364       1056/1056
 IPv4 IGMP groups + multicast routes:         1120/1120          1/1
 IPv4 unicast directly-connected routes:      6144/6144        504/504
 IPv4 unicast indirectly-connected routes:    2048/2048         88/88
 IPv4 policy based routing aces:               452/452          12/12
 IPv4 qos aces:                                512/512          21/21
 IPv4 security aces:                           964/964          37/37
Note: Allocation of TCAM entries per feature uses
a complex algorithm. The above information is meant
to provide an abstract view of the current TCAM utilization
10#sh ip arp sum
519 IP ARP entries, with 1 of them incomplete
10#sh sdm pref
The current template is "desktop default" template.
The selected template optimizes the resources in
the switch to support this level of features for
8 routed interfaces and 1024 VLANs.
number of unicast mac addresses: 6K
number of IPv4 IGMP groups + multicast routes: 1K
number of IPv4 unicast routes: 8K
number of directly-connected IPv4 hosts: 6K
number of indirect IPv4 routes: 2K
number of IPv4 policy based routing aces: 0
number of IPv4/MAC qos aces: 0.5K
number of IPv4/MAC security aces: 1K
02-05-2013 10:32 AM
Move it to 15.0(1)SE3, which is the last (and quite good) maintenance release of that train.
I assume this is caused by a lot of ARP traffic hitting the CPU (debug it if you can: logging console off, a large logging buffer, and pray). If it is not, CPU profiling needs to be done, which is out of scope for this forum, so you should involve TAC if you have support on those boxes.
Run sh ip traffic | b ARP six times, 10 seconds apart, and paste the output here.
02-07-2013 04:17 AM
Thanks,
OK, I will move to the new code as soon as I get a maintenance window. Until then, maybe we can find the issue here. I will probably need to travel on-site, which would be better for debugging anyway. I do have support, so I may open a TAC case if I get no solution here.
Here is the output from the show command.
10#sh ip traffic | b ARP
ARP statistics:
Rcvd: 1610376306 requests, 2132760356 replies, 60131 reverse, 0 other
Sent: 45612351 requests, 2136208385 replies (32271832 proxy), 0 reverse
Drop due to input queue full: 24655
10#sh ip traffic | b ARP
ARP statistics:
Rcvd: 1610377353 requests, 2132779412 replies, 60131 reverse, 0 other
Sent: 45612351 requests, 2136227471 replies (32271832 proxy), 0 reverse
Drop due to input queue full: 24655
10#sh ip traffic | b ARP
ARP statistics:
Rcvd: 1610378525 requests, 2132799094 replies, 60131 reverse, 0 other
Sent: 45612351 requests, 2136247202 replies (32271832 proxy), 0 reverse
Drop due to input queue full: 24655
10#sh ip traffic | b ARP
ARP statistics:
Rcvd: 1610379735 requests, 2132819566 replies, 60131 reverse, 0 other
Sent: 45612352 requests, 2136267698 replies (32271832 proxy), 0 reverse
Drop due to input queue full: 24655
10#sh ip traffic | b ARP
ARP statistics:
Rcvd: 1610380963 requests, 2132840172 replies, 60131 reverse, 0 other
Sent: 45612352 requests, 2136288347 replies (32271832 proxy), 0 reverse
Drop due to input queue full: 24655
10#sh ip traffic | b ARP
ARP statistics:
Rcvd: 1610382145 requests, 2132861528 replies, 60131 reverse, 0 other
Sent: 45612352 requests, 2136309732 replies (32271832 proxy), 0 reverse
Drop due to input queue full: 24655
10#sh ip traffic | b ARP
02-07-2013 05:44 AM
Hi Sebastian,
As far as I can see, you should log a TAC case for this. I believe there is some server that is sending ARP continuously; we have to find it and shut that link down. This requires running some debug commands, and it would be good to have TAC work with you on that.
Recommendation: 1) Check whether the ARP requests are being generated by a defective server NIC. 2) Look for an incorrectly configured host in the network that is generating a high volume of ARP requests.
Regards
Inayath
02-07-2013 06:01 AM
10#sh ip traffic | b ARP
ARP statistics:
Rcvd: 1610376306 requests, 2132760356 replies, 60131 reverse, 0 other
Sent: 45612351 requests, 2136208385 replies (32271832 proxy), 0 reverse
Drop due to input queue full: 24655
10#sh ip traffic | b ARP
ARP statistics:
Rcvd: 1610377353 (DELTA 1047) requests, 2132779412(DELTA 19056) replies, 60131 reverse, 0 other
Sent: 45612351 requests, 2136227471(DELTA 19086) replies (32271832 proxy), 0 reverse
Drop due to input queue full: 24655
So this makes:
+ 105pps RX ARP Requests
+ 1905pps RX ARP Replies
+ 1908pps TX ARP Replies
=> Explains CPU load
=> The switch sends almost no ARP requests (roughly 1 per 60 s), yet it receives about 1900 pps of ARP replies.
=> The switch receives about 100 pps of ARP requests, yet it sends about 1900 pps of ARP replies.
I cannot think of a scenario in which this combination can happen.
Since this is the core stack, did you check the routing on the lower-layer nodes (to be sure they don't route to an "interface")?
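For anyone who wants to repeat the rate arithmetic on their own samples, here is a minimal Python sketch. It takes the first two counter samples pasted in this thread and divides each delta by the sampling interval. The 10-second interval is an assumption based on what was requested (the real spacing between the two commands is not known), and the function names are made up for illustration.

```python
import re

# The first two "sh ip traffic | b ARP" samples, copied from this thread.
SAMPLE_1 = """\
Rcvd: 1610376306 requests, 2132760356 replies, 60131 reverse, 0 other
Sent: 45612351 requests, 2136208385 replies (32271832 proxy), 0 reverse
"""
SAMPLE_2 = """\
Rcvd: 1610377353 requests, 2132779412 replies, 60131 reverse, 0 other
Sent: 45612351 requests, 2136227471 replies (32271832 proxy), 0 reverse
"""

# Matches the request/reply counters on the Rcvd and Sent lines.
LINE_RE = re.compile(r"(Rcvd|Sent):\s+(\d+) requests, (\d+) replies")


def parse_arp_counters(text):
    """Return the ARP request/reply counters found in one sample."""
    counters = {}
    for direction, requests, replies in LINE_RE.findall(text):
        prefix = "rx" if direction == "Rcvd" else "tx"
        counters[prefix + "_requests"] = int(requests)
        counters[prefix + "_replies"] = int(replies)
    return counters


def arp_rates(before, after, interval_s):
    """Packets per second for each counter between two samples."""
    first = parse_arp_counters(before)
    second = parse_arp_counters(after)
    return {name: (second[name] - first[name]) / interval_s for name in first}


# Assumption: the two samples were taken 10 seconds apart.
rates = arp_rates(SAMPLE_1, SAMPLE_2, interval_s=10)
for name, pps in rates.items():
    print(f"{name}: {pps:.1f} pps")
```

The same function can be applied to each consecutive pair of the six samples to see whether the rates are steady over time.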
To really understand what these ARP packets are, do the following within a safe maintenance window:
1) no logging console
2) logging buffered 10000000
3) debug arp (1 second is enough)
4) undebug all
5) show log.
From analysis of the log you can get some idea of what this is all about at the network level.
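One way to go through a large debug buffer is to count ARP packets per source address, so the noisiest hosts stand out. The Python sketch below assumes debug arp lines of the form "IP ARP: rcvd rep src <ip> <mac>, dst <ip> <interface>"; the exact format may differ between releases, so adjust the regular expression to whatever your log actually shows. The sample lines and addresses are invented for illustration and are not taken from this switch.

```python
import re
from collections import Counter

# Assumed "debug arp" line format; verify against your own log output.
ARP_RE = re.compile(
    r"IP ARP: (rcvd|sent) (req|rep) src (\d+\.\d+\.\d+\.\d+) ([0-9a-fA-F.]+),"
)


def top_arp_sources(log_text, direction="rcvd", limit=5):
    """Count ARP packets per (type, source IP, source MAC) for one direction."""
    counts = Counter()
    for seen_direction, kind, ip, mac in ARP_RE.findall(log_text):
        if seen_direction == direction:
            counts[(kind, ip, mac.lower())] += 1
    return counts.most_common(limit)


# Hypothetical log excerpt (documentation addresses, not real data).
SAMPLE_LOG = """\
IP ARP: rcvd rep src 192.0.2.10 0000.5e00.5301, dst 192.0.2.1 Vlan10
IP ARP: rcvd rep src 192.0.2.10 0000.5e00.5301, dst 192.0.2.1 Vlan10
IP ARP: rcvd req src 192.0.2.20 0000.5e00.5302, dst 192.0.2.1 Vlan10
IP ARP: sent rep src 192.0.2.1 0000.5e00.5300, dst 192.0.2.20 0000.5e00.5302 Vlan10
"""

top = top_arp_sources(SAMPLE_LOG)
for (kind, ip, mac), count in top:
    print(f"{count:6d}  {kind}  {ip}  {mac}")
```

A handful of sources accounting for most of the received packets would point to specific hosts to inspect first.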
Looks like a nice nut to crack.
02-07-2013 06:31 AM
Thanks, I have already requested a maintenance window and a date for traveling on-site.
I will let you know whether the debug helps or what TAC says.
After troubleshooting I will also move to the new code, but not blindly, without knowing more details about the problem.
02-19-2013 09:44 AM
Sorry for the very late reply. The problem is fixed. It was Conficker once again... unbelievable... a virus that drives the CPU up to 68% with ARP requests.
The guys on site found it one day before I was due to travel to the location.
Thanks for all your replies.
Sebastian