Re: 3750x ping and cpu usage - Page 2

victor910 · ‎07-30-2020

Hi,

I have a couple of "problems".

First of all, can someone explain how I end up with 12% total CPU usage? I'm just curious:

switch#show processes cpu sorted
CPU utilization for five seconds: 12%/0%; one minute: 13%; five minutes: 11%
PID Runtime(ms)  Invoked  uSecs    5Sec    1Min    5Min  TTY  Process
75        19872     3283   6053   3.67%   3.63%   2.97%    0  RedEarth Tx Mana
74         8910     5058   1761   1.59%   1.62%   1.36%    0  RedEarth I2C dri
165         471      214   2200   1.43%   0.59%   0.13%    1  SSH Process
119        4245      507   8372   0.79%   0.79%   0.66%    0  hpm counter proc
159         309    11981     25   0.15%   0.05%   0.02%    0  Hulc LED Process
170         827      120   6891   0.15%   0.15%   0.15%    0  HQM Stack Proces
1             0        7      0   0.00%   0.00%   0.00%    0  Chunk Manager

For five seconds the sum should be 7.78%, or 8% rounded.

Or do we have this "smart" math, where every value is rounded up:

3.67 -> 4

1.59 -> 2

1.43 -> 2

0.79 -> 1

0.15 -> 1 (twice)

But either way:

4+2+2+1+1+1 = 11%, not 12%.
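
The two sums above can be checked with a short Python sketch. It uses only the six non-zero 5Sec values from the listing; the round-up rule is the guess made in this post, not documented IOS behaviour.

```python
import math

# 5Sec column of the non-zero rows in the "show processes cpu sorted" output above.
five_sec_percent = {
    "RedEarth Tx Mana": 3.67,
    "RedEarth I2C dri": 1.59,
    "SSH Process": 1.43,
    "hpm counter proc": 0.79,
    "Hulc LED Process": 0.15,
    "HQM Stack Proces": 0.15,
}

# Plain sum of the per-process values.
exact_sum = round(sum(five_sec_percent.values()), 2)

# Guessed "smart" math: round every value up to a whole percent first.
rounded_up_sum = sum(math.ceil(value) for value in five_sec_percent.values())

# Total from the "CPU utilization for five seconds" header line.
reported_total = 12

print(f"exact sum:      {exact_sum}%")
print(f"rounded-up sum: {rounded_up_sum}%")
print(f"reported total: {reported_total}%")
```

Neither sum reproduces the reported 12%, which is the discrepancy being asked about.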

Second question: why do these processes ALWAYS use my CPU?

RedEarth Tx Mana

RedEarth I2C dri

How do I fix this "problem"?

You will probably say this is "not affecting" traffic. OK, then my last and most important question.

I believe problem #3 comes from problems #1 and #2.

Question #3:

Why do I get unstable ping response times from this switch? The pinging device is directly connected to a switch port at 1 Gbit/s over an RJ45 cable.

Response from the switch:

Reply from 10.10.0.254: bytes=32 time<1ms TTL=255
Reply from 10.10.0.254: bytes=32 time<1ms TTL=255
Reply from 10.10.0.254: bytes=32 time<1ms TTL=255
Reply from 10.10.0.254: bytes=32 time<1ms TTL=255
Reply from 10.10.0.254: bytes=32 time<1ms TTL=255
Reply from 10.10.0.254: bytes=32 time<1ms TTL=255
Reply from 10.10.0.254: bytes=32 time=179ms TTL=255
Reply from 10.10.0.254: bytes=32 time=202ms TTL=255
Reply from 10.10.0.254: bytes=32 time=93ms TTL=255
Reply from 10.10.0.254: bytes=32 time=171ms TTL=255
Reply from 10.10.0.254: bytes=32 time=25ms TTL=255
Reply from 10.10.0.254: bytes=32 time<1ms TTL=255
Reply from 10.10.0.254: bytes=32 time<1ms TTL=255

The device is a Cisco 3750X.

Same behaviour under IOS:

15.2.4E10

15.0.2.SE12

12.2.55.SE13

Config is very close to factory default, nothing special.

Cisco gurus, you are welcome to weigh in.

Leo Laohoo · ‎07-31-2020

Ok, now post the config of the switch.

victor910 · ‎08-01-2020

switch#show running-config
Building configuration...

Current configuration : 3414 bytes
!
! Last configuration change at 23:35:28 OmniTZ Fri Jul 31 2020 by root
! NVRAM config last updated at 23:57:14 OmniTZ Fri Jul 31 2020 by root
!
version 12.2
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
service password-encryption
!
hostname switch
!
boot-start-marker
boot-end-marker
!
enable password 7
!
username root privilege 15 secret 5
!
!
aaa new-model
!
!
aaa authentication login LOL_USERS local
!
!
!
aaa session-id common
clock timezone OmniTZ 10
switch 1 provision ws-c3750x-24p
system mtu routing 1500
ip dhcp excluded-address 10.10.0.1 10.10.0.128
!
ip dhcp pool dhcp_pool
 network 10.10.0.0 255.255.0.0
 default-router 10.10.0.1
 domain-name lucas
 dns-server 10.10.0.1
!
!
ip domain-name lucas
ip name-server 10.10.0.1
!
!
!
license boot level ipservices
spanning-tree mode pvst
spanning-tree extend system-id
!
!
!
!
vlan internal allocation policy ascending
!
ip ssh version 2
!
!
!
interface FastEthernet0
 no ip address
 shutdown
!
interface GigabitEthernet1/0/1
!
interface GigabitEthernet1/0/2
 shutdown
!
interface GigabitEthernet1/0/3
 shutdown
!
interface GigabitEthernet1/0/4
 shutdown
!
interface GigabitEthernet1/0/5
 shutdown
!
interface GigabitEthernet1/0/6
 shutdown
!
interface GigabitEthernet1/0/7
 shutdown
!
interface GigabitEthernet1/0/8
 shutdown
!
interface GigabitEthernet1/0/9
 shutdown
!
interface GigabitEthernet1/0/10
 shutdown
!
interface GigabitEthernet1/0/11
 shutdown
!
interface GigabitEthernet1/0/12
 shutdown
!
interface GigabitEthernet1/0/13
 shutdown
!
interface GigabitEthernet1/0/14
 shutdown
!
interface GigabitEthernet1/0/15
 shutdown
!
interface GigabitEthernet1/0/16
 shutdown
!
interface GigabitEthernet1/0/17
 shutdown
!
interface GigabitEthernet1/0/18
 shutdown
!
interface GigabitEthernet1/0/19
 description vic
!
interface GigabitEthernet1/0/20
 description monitor
!
interface GigabitEthernet1/0/21
 description dankavlad
!
interface GigabitEthernet1/0/22
 shutdown
!
interface GigabitEthernet1/0/23
 description wifi
!
interface GigabitEthernet1/0/24
 shutdown
!
interface GigabitEthernet1/1/1
 description wan
 shutdown
!
interface GigabitEthernet1/1/2
 shutdown
!
interface GigabitEthernet1/1/3
 shutdown
!
interface GigabitEthernet1/1/4
 shutdown
!
interface TenGigabitEthernet1/1/1
 shutdown
!
interface TenGigabitEthernet1/1/2
 shutdown
!
interface Vlan1
 ip address 10.10.0.254 255.255.0.0
!
ip classless
!
no ip http server
no ip http secure-server
!
!
snmp-server community HomeVic RO
!
no vstack
!
line con 0
line vty 0 4
 exec-timeout 120 0
 privilege level 15
 transport input telnet ssh
line vty 5 15
 exec-timeout 120 0
 privilege level 15
 transport input telnet ssh
!
ntp clock-period 36026905
ntp server 10.10.0.1
end

Leo Laohoo · ‎08-01-2020

What are the response times when pinging 10.10.0.1?

victor910 · ‎08-01-2020

The response is ideal: 0.2 ms and always stable.

Leo Laohoo · ‎08-01-2020

Turn off the routing feature set and try again.

victor910 · ‎08-02-2020

The routing feature is disabled by default. Should I do anything else?

victor910 · ‎08-01-2020

I want to be very precise: can you provide the exact command to add to the config? Thanks.

Joseph W. Doherty · ‎08-01-2020

Yes, I agree interrupt CPU shows zero, but I mentioned the possibility of a bug, which if there is one, might lead to an incorrect display of the actual CPU usages. You, yourself, noted the CPU usage percentages don't correctly sum to the expected total. (BTW, in the past, I've seen 3750s not properly show interface stats, although they would show them for the corresponding ASICs. I.e. there's already a history of such bugs. [I suspect Cisco puts most QA against dealing with traffic, and network functions, rather than "cosmetic" issues.])

"this just theory."

Yes, indeed it is, since I don't have access to actual IOS source code, detailed hardware reference documentation, and your switch, with your traffic, using diagnostic hardware. The "theory" is based on what the usual approach is for operating system implementations. So, I cannot state it's 100% correct (which is why I wrote "in theory"), but if you want to place a wager that I'm incorrect, I'll give you good odds. ;)

"First of all, you still not understand different between ping and latency, like and 99.999999% peoples at this world :)."

Oh, could you then explain the difference?

"Any cheap device provide correct ping response but for cisco we need something special, common Friend, this device definitely has a problem, ping response 200ms not under loading - its problem, end of the story, or you not agree and still will statement this is normal? just curious."

Well, since you're curious. . .

Perhaps. Also perhaps because a "cheap" device doesn't do what this Cisco switch is doing. So, yes, it may not be a problem and (somewhat) normal for it.

Hmm, I see in a later posting you provide longer term CPU stats. Interesting how a "not under loading" device hits 78% usage. Even for much, much lower CPU usages, it doesn't mean something like ping times will always be good.

Also, on the subject of what perhaps 99.999999% of the network engineers of this world understand, or not (laugh): CPU usage (and network bandwidth usage) is a measure of utilization of capacity over some time. So if we have, say, 10% over five seconds, that means the CPU was 100% busy for a total of half a second (i.e. 500 ms) across the five seconds. What we don't know is how that half second of usage was distributed across the five seconds.

Suppose all the "work" was desired to be done during the same 100 ms. Of course, it cannot be, as only 100 ms of work can be done during 100 ms. So, what happens to the other 400 ms? Ideally, it's queued, normally prioritized too. In any case, the 500 ms of work is conducted during 500 ms. If part of the "work" was to respond to a "ping" request, as it's not very important, it might be the last "work" done. If so, your ping response would have an additional 500 ms beyond actual network latency. Yet, if the next ping hits the CPU when there is nothing else to do, you get an almost instant response.

Such is how you might see highly variable network response time from a Cisco switch. As such, it can be perfectly "normal".
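
A minimal Python sketch of that reasoning, assuming a single CPU that finishes all queued higher-priority work before it answers a ping. The 10%, five-second, and 500 ms figures come from the explanation above; the burst start time, the one-ping-per-second schedule, and the function names are hypothetical, and this is not how IOS actually schedules processes.

```python
def utilization(busy_ms: int, window_ms: int) -> float:
    """Average CPU utilization over a measurement window."""
    return busy_ms / window_ms


def ping_wait_ms(arrival_ms: int, burst_start_ms: int, burst_work_ms: int) -> int:
    """Time a ping request waits if all burst work is served before it."""
    burst_end_ms = burst_start_ms + burst_work_ms
    if burst_start_ms <= arrival_ms < burst_end_ms:
        return burst_end_ms - arrival_ms
    return 0  # CPU is idle, so the reply goes out almost immediately


WINDOW_MS = 5000       # five-second measurement window
BURST_WORK_MS = 500    # total work requested in one burst
BURST_START_MS = 1000  # hypothetical moment the burst arrives

average = utilization(BURST_WORK_MS, WINDOW_MS)

# One ping per second, at t = 0, 1000, 2000, 3000, 4000 ms.
waits = [ping_wait_ms(t, BURST_START_MS, BURST_WORK_MS)
         for t in range(0, WINDOW_MS, 1000)]

print(f"average utilization: {average:.0%}")
print(f"extra ping delay per request (ms): {waits}")
```

With these numbers the window averages 10%, yet the one ping that lands at the start of the burst is held for 500 ms while the other four see no added delay.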

Since the foregoing is a real possibility, that's one reason Cisco has provided SLA ping responders. (How it works: the device timestamps when the ping request was received, which gives it a way to calculate how long the request was held on the device, waiting, before the ping reply is sent.)
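
The timestamp idea can be sketched in a few lines of Python. This only illustrates the arithmetic described above and is not Cisco's IP SLA implementation or packet format; the function name and all timestamp values are hypothetical.

```python
def network_rtt_us(sent_us: int, reply_received_us: int,
                   request_arrived_us: int, reply_sent_us: int) -> int:
    """Round-trip time with the responder's holding time removed.

    sent_us and reply_received_us are read from the sender's clock;
    request_arrived_us and reply_sent_us from the responder's clock.
    Only differences taken on the same clock are used, so the two
    clocks do not need to be synchronized.
    """
    total_us = reply_received_us - sent_us
    held_us = reply_sent_us - request_arrived_us
    return total_us - held_us


# Hypothetical sample: the sender measures 179.2 ms in total, while the
# responder reports that it held the request for 179.0 ms.
rtt_us = network_rtt_us(
    sent_us=0,
    reply_received_us=179_200,
    request_arrived_us=5_000_000,
    reply_sent_us=5_179_000,
)
print(f"network round trip: {rtt_us / 1000} ms")
```

In this made-up sample nearly all of the measured time is holding time on the responder, and only 0.2 ms is left for the network itself.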

BTW, while you get your erratic ping times from the 3750, have you tried, concurrently, pinging "through" the 3750? What you might see is no degradation to the "through" pings. If there is none, this is because, again, the device is designed as a network switch, not as a ping responder.

Also BTW, years ago, I tried an experiment with a 3750G and a 2811, to see how low I could push RIP timers. When I got down to the one second range, the 3750G's CPU was running at 100% while the additional load on the 2811 barely registered. At first, I thought, why is a 16 Gbps switch CPU running flat out while a 100 Mbps router barely notices? When you think about it, the 16 Gbps switch has its ASICs for data forwarding, but only the CPU for control plane functions. The router uses its CPU for both data plane and control plane. So, likely (in theory - laugh), the 2811's CPU is much, much "faster" than the 3750G's, which would make sense, normally, as the ASICs should be doing 99.9% of all the work.

Oh, also I assume you're aware that many 3750s had the "issue" of high CPU caused by the HULC process. Cisco was a bit slow fixing this bug because it generally had no adverse impact to normal switch operations.

victor910 · ‎08-02-2020

Stop going deep into demagogy, it's not interesting at all. I deliberately won't explain to you what the difference is; stay in your illusion, I don't care.

OK, here is what I have.

I tried 10000000 IOS versions; it did not help.

Finally, I disconnected all cables from the switch.

I did a factory default via hardware reset!!!! and put all ports in a shutdown state.

And what do we have? Yes, still 10-12% CPU usage from RedEarth Tx Mana and RedEarth I2C dri. I enabled just one port, assigned an IP to Vlan1, and pinged: the ping is unreasonably unstable.

Conclusion: either this device or IOS is garbage, who cares which exactly. The next device will be from Juniper.

Joseph W. Doherty · ‎08-03-2020

"stop going deep in demagogy, not interesting at all, I special not explain to you what is different, will be in illusion, I don't care."

Okay, I understand - you're unable to explain. Such a disappointment. I, and perhaps the rest of the "99.999999%", I'm sure, would have benefited from your enlightenment.

Wish you luck with Juniper equipment (they too have their "foibles", or so has been my experience with their equipment). Perhaps you might do best with "Any cheap device", since they work well for you.

Leo Laohoo · ‎08-04-2020

¯\_(ツ)_/¯

Joseph W. Doherty · ‎08-04-2020

Laugh