WS-C3750X stack losing management access once SNMP is enabled

HeliosX
Level 1

Hi,

I have a stack of 5 WS-C3750X switches which work fine, but when I enable SNMP and then set up polling in Observium I immediately lose remote access to the stack. The only way to fix it is to disable polling and do a cold reboot.

Does anyone have a solution so that I can get SNMP working on this stack?

 

This is the setup:
Switch Ports Model           SW Version   SW Image
------ ----- -----           ----------   --------
     1 18    WS-C3750X-12S   15.2(1)E1    C3750E-UNIVERSALK9-M
*    2 30    WS-C3750X-24    15.2(1)E1    C3750E-UNIVERSALK9-M
     3 18    WS-C3750X-12S   15.2(1)E1    C3750E-UNIVERSALK9-M
     4 18    WS-C3750X-12S   15.2(1)E1    C3750E-UNIVERSALK9-M
     5 18    WS-C3750X-12S   15.2(1)E1    C3750E-UNIVERSALK9-M

 

 

Accepted Solutions

There is a bug listed regarding high CPU with snmpwalk (CSCum72168) so I wonder if the SNMP server is walking through all MIBs to discover the device?

Does the 3750X support control-plane policing?  Might be worth looking into that?
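If a full MIB walk does turn out to be the trigger, one commonly documented mitigation for SNMP-related high CPU on IOS is to bind the community to a view that excludes tables that are expensive to walk. The lines below are a minimal sketch, not a confirmed fix for CSCum72168. The view name CUTDOWN, the <community> placeholder, ACL number 10 and the x.x.x.x poller address are all illustrative, and nothing in this thread establishes which OID is responsible on this stack.

```
! Hypothetical names: CUTDOWN, <community>, ACL 10; x.x.x.x is the poller address
access-list 10 permit x.x.x.x
! Start from the whole tree, then carve out tables that are costly to walk
snmp-server view CUTDOWN iso included
! ipRouteTable (1.3.6.1.2.1.4.21)
snmp-server view CUTDOWN ipRouteTable excluded
! ipNetToMediaTable (1.3.6.1.2.1.4.22)
snmp-server view CUTDOWN ipNetToMediaTable excluded
! at (1.3.6.1.2.1.3)
snmp-server view CUTDOWN at excluded
! Read-only community limited to the view and to the poller in ACL 10
snmp-server community <community> view CUTDOWN RO 10
```

If the CPU still spikes with this view in place, the problem OID is probably elsewhere, and further subtrees could be excluded one at a time to narrow it down.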


Deepak Kumar
VIP Alumni

Hi,

Please share your configuration and CPU usage (before and after enabling SNMP).

 

Regards,

Deepak Kumar


Hi,

One thing worth mentioning is that when I set up Observium and then lose remote access, no data is being polled by Observium.

CPU utilization for five seconds: 22%/2%; one minute: 25%; five minutes: 25%

Unfortunately I cannot show you the CPU usage after enabling SNMP, as this is a working corporate environment and if I do it I will have to reboot the stack to regain access.

 

Current configuration : 26761 bytes
!
! Last configuration change at 14:26:01 BST Wed Aug 29 2018 by *****
! NVRAM config last updated at 14:13:44 BST Wed Aug 29 2018 by *****
!
version 15.2
no service pad
service tcp-keepalives-in
service tcp-keepalives-out
service timestamps debug datetime msec localtime show-timezone
service timestamps log datetime msec localtime show-timezone
service password-encryption
service sequence-numbers
!
hostname uk-***-sw11
!
boot-start-marker
boot-end-marker
!
logging buffered 12000 informational
no logging monitor
enable secret 5 *************
!
username admin privilege 15 secret 5 **************
aaa new-model
!
aaa authentication login default group radius local
aaa authentication enable default group radius enable
aaa accounting system default start-stop group radius
!
aaa session-id common
clock timezone BST 0 0
clock summer-time BST recurring last Sun Mar 1:00 last Sun Oct 2:00
no boot auto-copy-sw
switch 1 provision ws-c3750x-12s
switch 2 provision ws-c3750x-24
switch 3 provision ws-c3750x-12s
switch 4 provision ws-c3750x-12s
switch 5 provision ws-c3750x-12s
system mtu routing 1500
no ip source-route
no ip gratuitous-arps
!
!
ip domain-name ****
ip name-server 192.****
ip name-server 192.****
ip device tracking
!
crypto pki trustpoint TP-self-signed-4137962496
enrollment selfsigned
subject-name cn=IOS-Self-Signed-Certificate-************
revocation-check none
rsakeypair TP-self-signed-**************
!
crypto pki certificate chain TP-self-signed-*************
certificate self-signed 01
*************************************
*************************************
*************************************
*************************************
quit
!
spanning-tree mode pvst
spanning-tree logging
spanning-tree portfast default
spanning-tree portfast bpduguard default
spanning-tree extend system-id
!
spanning-tree mst configuration
instance 88 vlan 88
!
errdisable recovery cause bpduguard
!
vlan internal allocation policy ascending
!
ip ssh version 2
!
policy-map 10Mbps_Upload
class class-default
police 10000000 62500 exceed-action drop
!
interface Port-channel1
switchport trunk encapsulation dot1q
switchport trunk pruning vlan none
switchport mode trunk
storm-control broadcast level 35.00 30.00
storm-control multicast level 2.00 1.00
!
interface FastEthernet0
no ip address
no ip route-cache
!
interface range Gi1/0/1-1/0/12
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,88,264-267,750,760-770,773,800
switchport mode trunk
storm-control broadcast level 35.00 30.00
storm-control multicast level 2.00 1.00
!
interface GigabitEthernet1/1/1
description Unused
!
interface GigabitEthernet1/1/2
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 88,264-267
switchport mode trunk
!
interface GigabitEthernet1/1/3
description Unused
!
interface GigabitEthernet1/1/4
description Unused
!
interface TenGigabitEthernet1/1/1
description Unused
!
interface TenGigabitEthernet1/1/2
description Unused
!
interface range GigabitEthernet2/0/1-2/0/23
switchport access vlan 265
switchport mode access
speed auto 100
storm-control broadcast level 35.00 30.00
storm-control multicast level 2.00 1.00
no cdp enable
!
interface GigabitEthernet2/0/24
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,88,264-267,750,761-770,773
switchport mode trunk
!
interface GigabitEthernet2/1/1
description Unused
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,88,264-266,750,761-770
switchport mode trunk
!
interface GigabitEthernet2/1/2
description Unused
!
interface GigabitEthernet2/1/3
description Unused
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,88,264-266,750,761-770
switchport mode trunk
!
interface GigabitEthernet2/1/4
description Unused
!
interface TenGigabitEthernet2/1/1
switchport trunk encapsulation dot1q
switchport trunk pruning vlan none
switchport mode trunk
priority-queue out
mls qos trust dscp
storm-control broadcast level 35.00 30.00
storm-control multicast level 2.00 1.00
channel-group 1 mode active
!
interface TenGigabitEthernet2/1/2
switchport trunk encapsulation dot1q
switchport trunk pruning vlan none
switchport mode trunk
priority-queue out
mls qos trust dscp
storm-control broadcast level 35.00 30.00
storm-control multicast level 2.00 1.00
channel-group 1 mode active
!
interface range GigabitEthernet3/0/1-3/0/12
description UNIT 14
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,88,264-266,750,761-770,773
switchport mode trunk
storm-control broadcast level 35.00 30.00
storm-control multicast level 2.00 1.00
!
!
interface GigabitEthernet3/1/1
description Unused
!
interface GigabitEthernet3/1/2
description Unused
!
interface GigabitEthernet3/1/3
description Unused
!
interface GigabitEthernet3/1/4
description Unused
!
interface TenGigabitEthernet3/1/1
description Unused
!
interface TenGigabitEthernet3/1/2
description Unused
!
interface range GigabitEthernet4/0/1-4/0/12
description UNIT 22
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,88,264-266,750,761-770,773
switchport mode trunk
storm-control broadcast level 35.00 30.00
storm-control multicast level 2.00 1.00
!
interface GigabitEthernet4/1/1
description Unused
!
interface GigabitEthernet4/1/2
description Unused
!
interface GigabitEthernet4/1/3
description Unused
!
interface GigabitEthernet4/1/4
description Unused
!
interface TenGigabitEthernet4/1/1
description Unused
!
interface TenGigabitEthernet4/1/2
description Unused
!
interface range GigabitEthernet5/0/1-5/0/12
description UNIT 29
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,88,264-266,750,761-770,773
switchport mode trunk
storm-control broadcast level 35.00 30.00
storm-control multicast level 2.00 1.00
!
interface GigabitEthernet5/1/1
description Unused
!
interface GigabitEthernet5/1/2
description Unused
!
interface GigabitEthernet5/1/3
description Unused
!
interface GigabitEthernet5/1/4
description Unused
!
interface TenGigabitEthernet5/1/1
description Unused
!
interface TenGigabitEthernet5/1/2
description Unused
!
interface Vlan1
no ip address
no ip route-cache
shutdown
!
interface Vlan88
ip address 10.*.*.* 255.255.255.0
no ip route-cache
!
ip default-gateway 10.*.*.*
ip forward-protocol nd
no ip http server
no ip http secure-server
!
access-list 1 deny any log
access-list 1 permit *.*.*.* 0.0.0.255
access-list 1 permit *.*.*.* 0.0.0.255
access-list 100 remark Deny ALL
access-list 100 deny ip any any
snmp-server community *********** RO
snmp-server community *********** RO
snmp-server location ***************
snmp-server contact ************
snmp-server host *.*.*.* ************
!

radius server RADIUS-SERVER
address ipv4 *.*.*.* auth-port 1812 acct-port 1813
key 7 *********************
!
!
banner motd ^C
*********************************************************
* *
* WARNING: Use of this system is Restricted/Monitored *
* *
*********************************************************
^C
!
line con 0
exec-timeout 15 0
privilege level 15
logging synchronous
line vty 0 4
session-timeout 15
exec-timeout 15 0
timeout login response 60
privilege level 15
length 0
transport input ssh
line vty 5 15
session-timeout 15
exec-timeout 15 0
transport input none
transport output none
!
ntp server *.*.*.*
end

 

I do not see any issue with the config, but it is good to have an ACL in place so that only the IP address of your SNMP server can poll the switch.

As an example (x.x.x.x is the address of your SNMP server):

access-list 10 permit x.x.x.x
snmp-server community string ro 10

 

What kind of SNMP server is it, SolarWinds? What is the polling interval?

 

BB

***** Rate All Helpful Responses *****

How to Ask The Cisco Community for Help

Hi,

 

The switch stack is on the management VLAN, so no one but admins can access the management plane, including SNMP. An ACL is therefore not strictly necessary, although it is recommended.

 

We were using Observium and the polling interval is 5 min (www.observium.org)

 

FYI, these switches cost us six thousand pounds each when we bought them, so it is such a shame that they behave like this.

 

Before coming to a conclusion, I suggest you open a TAC case and include the show tech output.

BB


Hello

Regarding your configuration: you have no management L3 SVI, you are process switching instead of using fast switching/CEF (which can be CPU intensive), your storm-control values for broadcast and multicast differ widely, IPDT and policing are enabled, and you have two access lists that deny everything by default.

Can you confirm what error or logging messages you receive when you lose connection?


I suggest removing IPDT and the policing first. Either amend the ACLs to allow some traffic or remove them as well, then test again. Lastly, give the stack an L3 management address (if applicable).

Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

Hi Paul,

Those access lists do nothing there, I have L3 MGT interface and CEF is enabled by default.

Why do you suggest I should disable IPDT ?

Policy-map is there but it does not do anything either as it is not applied to any interface.

 

interface Vlan88 is our MGT interface.

Hello

 


@HeliosX wrote:

Hi Paul,

Those access lists do nothing there, I have L3 MGT interface and CEF is enabled by default.
Yes you do; I got you mixed up with another post on that point. However, checking that SVI will confirm which switching path is applied: sh ip int vlan 88 | in IP

Why do you suggest I should disable IPDT ?
If it is indeed enabled, disabling it while you troubleshoot the SNMP issue rules out the switch's tracking of connected hosts as a factor.

 


Lastly, I forgot to ask: which SNMP version and IOS release are you running?

 



Kind Regards
Paul

Hi,

Your configuration looks OK, but I suggest attaching to the console of the stack and recording all output and error messages during SNMP configuration and polling.

I suspect the CPU utilization climbs high after some time.

 

Regards,

Deepak Kumar


So I have tried re-enabling SNMP and Observium after disabling IP device tracking.

 

A few minutes passed and Observium could not poll any data from the switch. This time I could still access it via SSH, but the CPU was at 98%. Disabling SNMP sorted the issue.

 

CPU utilization for five seconds: 98%/1%; one minute: 98%; five minutes: 98%
 PID Runtime(ms)   Invoked  uSecs   5Sec   1Min   5Min TTY Process
 394     1236824     37444  33031 70.79% 72.78% 72.10%   0 SNMP ENGINE
   4    33720555   1458830  23114  3.82%  1.75%  2.26%   0 Check heaps
 257    28272125  32383592    873  3.34%  2.64%  2.52%   0 Spanning Tree
  89    25352141   5741048   4415  2.07%  1.85%  1.84%   0 RedEarth Tx Mana
 266    19478292  47057216    413  1.59%  2.13%  1.94%   0 HULC DAI Process
 180     8727452  36786994    237  1.43%  1.27%  1.27%   0 Hulc LED Process
  88     8355711   7874134   1061  1.27%  1.33%  1.32%   0 RedEarth I2C dri
 132    12783709   1476806   8656  1.11%  1.02%  0.99%   0 hpm counter proc
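Output like the above can be collected again while SNMP is enabled, together with the SNMP packet counters, to see whether the load lines up with incoming requests from the poller. A minimal sketch of standard show commands, assuming an SSH or console session is still usable at that point; the exclude pattern only hides processes that show 0.00% in all three columns.

```
! Busiest processes first, hiding idle ones
show processes cpu sorted | exclude 0.00%__0.00%__0.00%
! ASCII graph of CPU load over the last 60 seconds, 60 minutes and 72 hours
show processes cpu history
! SNMP input/output counters, including the number of get-next PDUs received
show snmp
```

Running show snmp twice a minute or so apart and comparing the get-next counters gives a rough idea of how hard the poller is walking the device.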

 

Any idea why this is happening?

 

Not sure if this previous post helps?

 

https://community.cisco.com/t5/network-architecture-documents/snmp-engine-over-80-cpu-100/ta-p/3117261

 

When you disable SNMP, are you removing all the SNMP configuration? That post mentions disabling the SNMP engine. I have never configured an SNMP engine ID manually. I think it gets generated automatically when you enable SNMP and doesn't appear in the configuration unless you set it manually. Not sure though...

 

By disabling SNMP I mean I do "no snmp-server" which removes snmp config and then CPU usage goes back to normal.

Probably not the answer you want, but I'd upgrade the IOS on the stack in a maintenance window, as 15.2(1)E1 turned five years old yesterday. I'd suggest moving to 15.2(4)E6 and then retesting.

Sorry I can't be of more help :(

 

I too was going to suggest using different code, because early releases often have issues. At a minimum, you might review the release notes for later updates to that train and perhaps move to at least 15.2(1)E3. Or seriously consider Andrew's suggestion, as 15.2(4)E6 is what Cisco's web page recommends (it is also an MD release, whereas the 15.2(1) train is ED). (BTW, the release prior to yours, 15.2(1)E, is a deferred release; again, early releases often have issues.) Lastly, if you don't need 15.2 features, you might consider downgrading to 15.0(2)SE12 or 12.2(55)SE13; both are much deeper into their life cycle, which often means they are more solid.
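Before changing release in a maintenance window, it may help to record the current state of the stack so it can be compared afterwards. A minimal sketch using standard show commands; it makes no assumption about which target release is chosen, and it does not cover the upgrade procedure itself.

```
! Running IOS release and image on the stack
show version | include IOS
! Stack members, roles and state (all members should be Ready)
show switch
! Free space and existing images on the stack master's flash
dir flash:
! Image the stack is configured to boot
show boot
```

Repeating the same commands after the change confirms that every member came back on the intended release before SNMP polling is re-enabled.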