11-06-2018 12:47 PM - edited 03-08-2019 04:33 PM
Hey all
I am having a problem with a 3850-24XS. All traffic stops for about 3 pings, then returns to normal. Running a constant ping from multiple endpoints on trunk-connected switches to multiple ports shows 3 ping timeouts every 5 to 45 minutes, which disrupts RDP sessions, SSH sessions, etc.
There is nothing in the logs. Everything was working fine for the last 6 months until yesterday. No changes were made to the switch in the last week.
Any tips on which debugging I can turn on to capture the problem?
!
version 16.8
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
service password-encryption
no platform punt-keepalive disable-kernel-core
!
hostname WK_ToR
!
!
vrf definition Mgmt-vrf
 !
 address-family ipv4
 exit-address-family
 !
 address-family ipv6
 exit-address-family
!
enable secret 5
!
no aaa new-model
switch 1 provision ws-c3850-24xs
!
ip routing
!
ip domain name p.local
!
cpp system-default
!
crypto pki trustpoint TP-self-signed-88678622
 enrollment selfsigned
 subject-name cn=IOS-Self-Signed-Certificate-88678622
 revocation-check none
 rsakeypair TP-self-signed-88678622
!
!
crypto pki certificate chain TP-self-signed-88678622
 certificate self-signed 01
!
!
diagnostic bootup level minimal
!
spanning-tree mode rapid-pvst
spanning-tree extend system-id
!
!
username jwadmin privilege 15 password 7
!
redundancy
 mode sso
!
!
class-map match-any system-cpp-police-topology-control
  description Topology control
class-map match-any system-cpp-police-sw-forward
  description Sw forwarding, L2 LVX data, LOGGING
class-map match-any system-cpp-default
  description EWLC control, EWLC data
class-map match-any system-cpp-police-sys-data
  description Learning cache ovfl, Crypto Control, Exception, EGR Exception, NFL SAMPLED DATA, Gold Pkt, RPF Failed
class-map match-any system-cpp-police-punt-webauth
  description Punt Webauth
class-map match-any system-cpp-police-l2lvx-control
  description L2 LVX control packets
class-map match-any system-cpp-police-forus
  description Forus Address resolution and Forus traffic
class-map match-any system-cpp-police-multicast-end-station
  description MCAST END STATION
class-map match-any system-cpp-police-multicast
  description Transit Traffic and MCAST Data
class-map match-any system-cpp-police-l2-control
  description L2 control
class-map match-any system-cpp-police-dot1x-auth
  description DOT1X Auth
class-map match-any system-cpp-police-data
  description ICMP redirect, ICMP_GEN and BROADCAST
class-map match-any system-cpp-police-stackwise-virt-control
  description Stackwise Virtual
class-map match-any system-cpp-police-control-low-priority
  description General punt
class-map match-any non-client-nrt-class
class-map match-any system-cpp-police-routing-control
  description Routing control and Low Latency
class-map match-any system-cpp-police-protocol-snooping
  description Protocol snooping
class-map match-any system-cpp-police-dhcp-snooping
  description DHCP snooping
!
policy-map system-cpp-policy
 class system-cpp-police-data
 class system-cpp-police-sys-data
 class system-cpp-police-sw-forward
 class system-cpp-police-multicast
 class system-cpp-police-multicast-end-station
 class system-cpp-police-punt-webauth
 class system-cpp-police-l2-control
 class system-cpp-police-stackwise-virt-control
 class system-cpp-police-routing-control
 class system-cpp-police-control-low-priority
 class system-cpp-police-l2lvx-control
 class system-cpp-police-topology-control
 class system-cpp-police-dot1x-auth
 class system-cpp-police-protocol-snooping
 class system-cpp-police-dhcp-snooping
 class system-cpp-police-forus
 class system-cpp-default
!
!
interface GigabitEthernet0/0
 vrf forwarding Mgmt-vrf
 ip address 10.7.0.40 255.255.0.0
 speed 1000
 negotiation auto
!
interface TenGigabitEthernet1/0/1
 no switchport
 ip address 10.1.0.2 255.255.254.0
!
interface TenGigabitEthernet1/0/2
 description IP_Phones_on_VLAN11
 switchport trunk allowed vlan 11
 switchport mode trunk
!
interface TenGigabitEthernet1/0/3
 description ESXi1
 switchport trunk allowed vlan 102,104,118
 switchport mode trunk
!
interface TenGigabitEthernet1/0/4
 description ESXi1
 switchport trunk allowed vlan 102,104,118
 switchport mode trunk
!
interface TenGigabitEthernet1/0/5
 description ESXi2
 switchport trunk allowed vlan 102,104,118
 switchport mode trunk
!
interface TenGigabitEthernet1/0/6
 description ESXi2
 switchport trunk allowed vlan 102,104,118
 switchport mode trunk
!
interface TenGigabitEthernet1/0/7
 description ESXi3
 switchport trunk allowed vlan 102,104,118
 switchport mode trunk
!
interface TenGigabitEthernet1/0/8
 description ESXi3
 switchport trunk allowed vlan 102,104,118
 switchport mode trunk
!
interface TenGigabitEthernet1/0/9
 description NH-JBOD-01_10G
 switchport access vlan 104
 spanning-tree portfast
!
interface TenGigabitEthernet1/0/10
!
interface TenGigabitEthernet1/0/11
 description NH-JBOD-02_10G
 switchport access vlan 104
 spanning-tree portfast
!
interface TenGigabitEthernet1/0/12
!
interface TenGigabitEthernet1/0/13
!
interface TenGigabitEthernet1/0/14
!
interface TenGigabitEthernet1/0/15
!
interface TenGigabitEthernet1/0/16
!
interface TenGigabitEthernet1/0/17
!
interface TenGigabitEthernet1/0/18
!
interface TenGigabitEthernet1/0/19
!
interface TenGigabitEthernet1/0/20
!
interface TenGigabitEthernet1/0/21
 description WK_ToR TO WK-East-01
 switchport trunk allowed vlan 11,102,106,108,110,112
 switchport mode trunk
!
interface TenGigabitEthernet1/0/22
 description WK_ToR TO WK-Copy-01
 switchport trunk allowed vlan 10,11,102,106,108,110,112,114
 switchport mode trunk
!
interface TenGigabitEthernet1/0/23
 description WK-ToR TO WK-SVR-02
 switchport trunk allowed vlan 11,102,104,106,108,110,112,114,116
 switchport mode trunk
!
interface TenGigabitEthernet1/0/24
 description WK-ToR TO WK-SVR-01
 switchport trunk allowed vlan 11,102,104,106,108,110,112,114,116
 switchport mode trunk
!
#####Remove ghost interfaces.
!
interface Vlan1
 no ip address
 shutdown
!
interface Vlan11
 no ip address
 ip helper-address 10.11.0.2
!
interface Vlan102
 description DHCP_Clients
 ip address 10.1.2.1 255.255.254.0
 ip helper-address 10.1.4.2
!
interface Vlan104
 description Servers
 ip address 10.1.4.1 255.255.254.0
!
interface Vlan106
 description Network-Printers
 ip address 10.1.6.1 255.255.254.0
 ip helper-address 10.1.4.2
!
interface Vlan108
 description PME_WiFi
 ip address 10.1.8.1 255.255.254.0
 ip helper-address 10.1.4.2
!
interface Vlan110
 description PME_Guest_WiFi
 ip address 10.1.10.1 255.255.254.0
 ip helper-address 10.1.4.2
!
interface Vlan112
 description IP_Phones
 ip address 10.1.12.1 255.255.254.0
!
interface Vlan114
 description MISC-Machinery
 ip address 10.1.14.1 255.255.254.0
 ip helper-address 10.1.4.2
!
interface Vlan116
 description MGMT-VMware
 ip address 10.1.16.1 255.255.254.0
!
interface Vlan118
 description DMZ
 no ip address
!
ip forward-protocol nd
ip http server
ip http authentication local
ip http secure-server
ip route 0.0.0.0 0.0.0.0 10.1.0.1
ip route vrf Mgmt-vrf 0.0.0.0 0.0.0.0 10.10.1.1
ip ssh logging events
ip ssh version 2
!
!
!
!
!
control-plane
 service-policy input system-cpp-policy
!
!
line con 0
 exec-timeout 120 0
 logging synchronous
 stopbits 1
line aux 0
 stopbits 1
line vty 0 4
 session-timeout 120
 exec-timeout 960 0
 logging synchronous
 login local
 transport input ssh
line vty 5 15
 session-timeout 120
 exec-timeout 960 0
 logging synchronous
 login local
 transport input ssh
!
!
mac address-table notification mac-move
wsma agent exec
!
wsma agent config
!
wsma agent filesys
!
wsma agent notify
!
!
end
11-06-2018 12:55 PM - edited 11-06-2018 01:01 PM
Hi,
Have you checked the CPU?
What is the output of
sh processes cpu | exclude 0.00
Also, see if "sh processes cpu history" can correlate with when the ping loss happens.
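To make that correlation easier, it helps to have an exact timestamp for every lost ping. Below is a minimal Python sketch, not a tested tool: the function names (ping_once, find_outages, monitor) are made up for illustration, and the ping flags assume a Linux workstation with iputils ping. Keep in mind that the switch clock and the workstation clock may be in different time zones.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=1):
    """Return True if one ICMP echo to host is answered.

    Assumes Linux iputils ping flags (-c count, -W timeout in seconds);
    other platforms use different flags.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_outages(samples, min_losses=3):
    """Group consecutive failed probes into (first_loss, last_loss, count).

    samples is an iterable of (timestamp, ok) pairs in time order.
    """
    outages, run = [], []
    for ts, ok in samples:
        if not ok:
            run.append(ts)
            continue
        if len(run) >= min_losses:
            outages.append((run[0], run[-1], len(run)))
        run = []
    if len(run) >= min_losses:  # outage still in progress at end of data
        outages.append((run[0], run[-1], len(run)))
    return outages


def monitor(host, interval_s=1.0):
    """Probe host forever, printing a timestamp for every missed reply."""
    while True:
        now = datetime.now()
        if not ping_once(host):
            print(f"{now.isoformat(timespec='milliseconds')} LOSS {host}")
        time.sleep(interval_s)
```

For example, calling monitor("10.1.4.1") against one of the SVIs prints a line per missed reply. find_outages can then be run over recorded (timestamp, ok) pairs to list each burst of 3 or more losses for comparison against "sh processes cpu history".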
HTH
11-06-2018 01:02 PM
WK_ToR#sh processes cpu | exclude 0.00
CPU utilization for five seconds: 3%/0%; one minute: 2%; five minutes: 2%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
   9    14793048     1302971      11353  0.87%  0.30%  0.23%   0 Check heaps
  79     4833389    26968325        179  0.07%  0.07%  0.07%   0 IOSD ipc task
 123     1710901    26309162         65  0.07%  0.07%  0.07%   0 IOSXE-RP Punt Se
 151      408508      750995        543  0.15%  0.03%  0.02%   0 NGWC L2M
 197     4822089   100625837         47  0.07%  0.07%  0.07%   0 VRRS Main thread
 217     6272981   200137135         31  0.07%  0.08%  0.07%   0 IPAM Manager
 220     5704075   200137127         28  0.07%  0.07%  0.07%   0 IP ARP Retry Age
 221     4256349    19530621        217  0.23%  0.26%  0.26%   0 IP Input
 239    12878601    44498518        289  0.23%  0.21%  0.23%   0 Spanning Tree
 340     4792139   100625671         47  0.07%  0.08%  0.08%   0 MMA DB TIMER
 530     2679650    32178097         83  0.07%  0.04%  0.02%   0 ONEP Network Ele
11-06-2018 01:06 PM
Funny. It happened as I was typing 'sh proc cpu history', so the 60-second chart should show it.
WK_ToR#sh proc cpu history

       11111
   333300000222222222244444222225555566666555554444455555222223
100
 90
 80
 70
 60
 50
 40
 30
 20
 10    *****                    ***************     *****
   0....5....1....1....2....2....3....3....4....4....5....5....6
             0    5    0    5    0    5    0    5    0    5    0
               CPU% per second (last 60 seconds)

                              1               11           1
   664445644443754444864443486044674434474444411444558744441444
100
 90
 80
 70
 60
 50
 40
 30
 20
 10**   **     **    **     ***  **     *     **   ****    *
   0....5....1....1....2....2....3....3....4....4....5....5....6
             0    5    0    5    0    5    0    5    0    5    0
               CPU% per minute (last 60 minutes)
              * = maximum CPU%   # = average CPU%
11-06-2018 01:11 PM
CPU looks very good, with very little utilization. I did not find any specific bug that could be associated with this, but you may want to open a ticket with TAC, as they may have a history on this.
HTH
11-06-2018 01:56 PM
While looking through the debug commands, I enabled 'debug ethernet cfm error'.
Not 30 seconds later (lucky me!), the switch went through another timeout cycle. It was not long enough to completely disconnect my SSH session, only enough to make it unresponsive for a few seconds, and it left me with these messages:
*Nov 6 17:47:13.580: CFM-ERR: UTIL: Failed to find port for HWIDB Vlan106
*Nov 6 17:47:13.603: CFM-ERR: UTIL: Failed to find port for HWIDB Vlan112
*Nov 6 17:47:13.630: CFM-ERR: UTIL: Failed to find port for HWIDB Vlan102
*Nov 6 17:47:13.634: CFM-ERR: UTIL: Failed to find port for HWIDB Vlan11
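To line these debug messages up against ping-loss timestamps, the log lines can be parsed into (time, interface) pairs. This is only a rough Python sketch: parse_cfm_errors is a made-up helper name, the regular expression is written against the four lines above and nothing else, and the year has to be supplied because the log lines do not carry one. Whether these CFM errors are related to the outage at all is still an open question.

```python
import re
from datetime import datetime

# Pattern written against the debug lines quoted in this thread only.
LOG_RE = re.compile(
    r"^\*?(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}):\s+"
    r"CFM-ERR: UTIL: Failed to find port for HWIDB (?P<intf>\S+)$"
)


def parse_cfm_errors(lines, year):
    """Return a list of (datetime, interface) for matching CFM-ERR lines.

    The year is passed in because the log timestamps omit it.
    Lines that do not match the pattern are skipped.
    """
    events = []
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m:
            continue
        stamp = datetime.strptime(
            f"{year} {m['mon']} {m['day']} {m['time']}",
            "%Y %b %d %H:%M:%S.%f",
        )
        events.append((stamp, m["intf"]))
    return events
```

Feeding it the four lines above with year=2018 gives one entry each for Vlan106, Vlan112, Vlan102 and Vlan11, all within about 54 ms of each other.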