
SG300 LACP failing

Yordi Mahieu
Level 1

Hello, I work in a school where I am responsible for everything computer related (network, servers, workstations, ...).

I have 31 SG300 series switches (L2) and one Catalyst 2960X L3 switch (2 in a stack) acting as the core switch.

One link of each trunk goes to an interface on one switch of the stack, and the second link goes to the second switch of the same stack. (For ease of use I take the same interface number on both stack members.) If one switch of the stack dies, everything keeps running. For load balancing I use IP/MAC.

The counter on the Core switch confirms that both links are used.
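As a rough illustration of why both links see traffic with address-based load balancing, here is a minimal sketch of a toy hash: XOR the source and destination IP and take the result modulo the number of links. This is a simplified model only. The real hash on the 2960X and the SG300 is platform-specific, and the function name and addresses below are hypothetical.

```python
import ipaddress

def pick_link(src_ip: str, dst_ip: str, n_links: int = 2) -> int:
    """Return the index of the member link a flow would use in this toy model."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    # XOR is symmetric, so in this model both directions of a flow
    # map to the same value before it is reduced to a link index.
    return (src ^ dst) % n_links

# Hypothetical addresses, for illustration only.
flows = [
    ("10.0.10.5", "10.0.10.1"),
    ("10.0.10.6", "10.0.10.1"),
    ("10.0.10.7", "10.0.10.1"),
]
for src, dst in flows:
    print(src, "->", dst, "uses link", pick_link(src, dst))
```

The point of the sketch is that a single flow always lands on the same link, while different address pairs spread over both, which is consistent with the counters rising on both members.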

Example: port-channel 24 = one link on 1/0/52 (GBIC fiber) and one link on 2/0/52 (GBIC fiber).

On the core switch I have 15 LACP connections (every LACP connection = 2 x 1 Gb), with VLAN trunks on every LACP aggregation, and STP (RSTP) is configured on every switch.

For more than a year everything went fine. All the switches have the same default VLAN (changed to 10), which is also the untagged VLAN on the trunks, so all the management interfaces of my network are in the separate VLAN 10.

I use many of the security features available on the SG300 (STP, DHCP snooping, port security, loopback control, IP source guard, ARP protection).

LLDP is off, and CDP is enabled only on the trunks.

The uplinks (trunks) are ARP and DHCP trusted. On the access ports for the workstations, STP has guarding and port-fast enabled.

All the switches have almost the same config: the last two interfaces are trunks, and the rest are access ports with an untagged VLAN that depends on the connected device (domain member, workstation, IP phone, copier/printer, AP, ...).

But this week strange things have started happening on 2 or 3 random switches.

Normally on my core switch I see both interfaces of the port-channel (EtherChannel) in P mode (bundled in the port-channel).

But suddenly both links show up in I mode (independent). My guess is that the core is no longer receiving LACP PDUs!
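To catch the moment a bundle drops from P to I without watching the core by hand, something like the following sketch could scan saved `show etherchannel summary` output for member ports whose flag is not `P`. The sample text is illustrative only, shaped like Catalyst output and not captured from the switches in this thread. The interface names, the regular expressions, and the function name are assumptions.

```python
import re

# Illustrative text only, shaped like Catalyst "show etherchannel summary"
# output. It is NOT captured from the switches discussed here.
SAMPLE = """\
Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------
23     Po23(SU)        LACP      Gi1/0/51(P) Gi2/0/51(P)
24     Po24(SD)        LACP      Gi1/0/52(I) Gi2/0/52(I)
"""

# One table row: group number, port-channel with flags, protocol, member ports.
ROW = re.compile(r"^\s*(\d+)\s+(Po\d+)\((\w+)\)\s+(\S+)\s+(.*)$")
# One member port with its flag, e.g. "Gi1/0/52(I)".
PORT = re.compile(r"(\S+?)\((\w+)\)")

def unbundled_members(text):
    """Map each port-channel to its member ports whose flag is not 'P'."""
    problems = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header, separator, or unrelated line
        po, ports = m.group(2), m.group(5)
        bad = [name for name, flag in PORT.findall(ports) if flag != "P"]
        if bad:
            problems[po] = bad
    return problems

print(unbundled_members(SAMPLE))
```

Run against output collected on a schedule, this would at least give a timestamp for when a bundle fell apart, which can then be compared with the SG300 syslog and SNMP polling times.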

My workaround for the moment: shut down one link on the core (the other stays in I mode), then reboot the SG300 switch (manually, because I cannot always connect with SSH or SSL). After the reboot, the remaining link comes up in P mode again. Then I re-enable the other link on the core and the second port joins the port-channel again. But after some time (a couple of days) the same thing happens again.

So for some reason the SG300 switch STOPS sending LACP until after a reboot.

Other things I have noticed:

Sometimes I can access the switch with SSH but not with SSL (HTTPS). I use a publicly signed wildcard certificate. (I notice that with this certificate the web interface is very slow to load!)

When I am able to connect with SSH, I try to connect with SSL. I see a couple of packets of the SSL connection in Wireshark, but then it fails (web browser TLS error ...).

Changing the certificate back to the default generated one, or setting ip http secure-server on or off, does not help either. If I enable HTTP, then I can access the switch web interface over HTTP. The only solution that works is to reboot the switch.

Second strange thing: when I disable the interface of the failing LACP port, the interface STAYS UP, even after a new login on the web interface!

Another strange thing: sometimes I cannot access the switch with SSH or SSL (HTTPS), but SNMP queries still work! I can pull the status of all the interfaces, and the user VLANs keep working.

All the switches have the SAME latest firmware (1.4.1.3), and on the core switch the configuration of the LACP port-channels is the same for each of them.

For management I use SNMP v3 with the free NetXMS, plus traps, syslog, ...

I have been using the SG300 series switches for about 6 years now.

This strange problem has kept me busy troubleshooting for days now, and I have not found any similar report.

Nothing found in the logs of the SG300.

 

Accepted Solutions

Michal Bruncko
Level 4

Hi

> Second strange thing: when I disable the interface of the failing LACP port, the interface STAYS UP, even after a new login on the web interface!

I observed the same thing quite a long time ago and have now decided to report it to Cisco. After a small fight, this issue now has an official bug report, so there is a good chance it will be fixed in an upcoming firmware release for the SG300.

CSCuy09680: SG300-28: “shutdown” does not affect port-channel member ports

> When I am able to connect with SSH, I try to connect with SSL. I see a couple of packets of the SSL connection in Wireshark, but then it fails (web browser TLS error ...).

This could be related to SNMP monitoring being enabled, with frequent fetching of some MIB data. I have heard of issues related to SNMP, specifically memory leaks that can make management access unavailable (in the worst cases resulting in a crash and reboot of the unit). The best approach would be to update the switch to the latest firmware and possibly stop SNMP data gathering on a test switch, to see whether the situation improves. Another feature could also be the cause, since you mention that you are using plenty of switch features.

> Sometimes I cannot access the switch with SSH or SSL (HTTPS), but SNMP queries still work!

Again, possibly the same root cause: a memory leak that makes it impossible to fork a process for an SSH/HTTPS server instance or to perform any other memory-intensive operation. As you said, only a reboot helped.

Unfortunately I can't help you with the rest, as I have not reproduced the issue.
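To test the memory-leak theory with SNMP polling on versus off, it may help to log exactly when SSH and HTTPS stop answering on a test switch. The following is a minimal sketch using only the Python standard library. The function names and the port list are assumptions. Note that a successful TCP connect only shows the port accepts connections; it does not prove that the TLS or SSH handshake completes.

```python
import socket
from datetime import datetime

def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def check_management(host: str, ports=(22, 443)) -> dict:
    """Probe each management port once and return {port: reachable}."""
    return {port: tcp_port_open(host, port) for port in ports}

def log_line(host: str, status: dict) -> str:
    """Format one timestamped line, suitable for appending to a log file."""
    stamp = datetime.now().isoformat(timespec="seconds")
    state = " ".join(
        f"{port}={'up' if ok else 'DOWN'}" for port, ok in sorted(status.items())
    )
    return f"{stamp} {host} {state}"
```

Calling `log_line(host, check_management(host))` every few minutes for one switch with SNMP polling enabled and one with it disabled would show whether management access degrades only on the polled unit.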
