
LACP and etherchannel misconfig guard

Hey all,


I was hoping someone would be able to shed some light on a failure we experienced. We're confident that it was user error, so we're not looking to diagnose the root cause of the failure, but the state our Cisco Catalyst switches ended up in was odd.


We have a number of Cisco Catalyst stacks cabled to a third-party switch pair using fiber transceivers or copper DACs, all configured using LACP mode active. After maintenance on one of the third-party switches we found that every Catalyst switch on our network had its uplink ports in an err-disabled state, and the logs showed that misconfig-guard had kicked in and disabled them with the reason given as "STP".
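
For reference, the Catalyst side of each uplink looks roughly like the sketch below. The interface names, the channel-group number and the trunk mode are placeholders for illustration rather than our exact config, and the global guard command is shown explicitly even though, as far as I know, it is enabled by default on these platforms.

```
! Global: EtherChannel misconfiguration guard
spanning-tree etherchannel guard misconfig
!
! Logical bundle towards the third-party switch pair (number is a placeholder)
interface Port-channel1
 switchport mode trunk
!
! One member link to each third-party switch (interface names are placeholders)
interface range TenGigabitEthernet1/0/1 , TenGigabitEthernet2/0/1
 switchport mode trunk
 channel-group 1 mode active
```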


I believe this happened because, during maintenance of the third-party switch pair, BPDU frames were sent with differing system-ids to each of the ports on the Cisco end. In other words, our third-party switch pair failed to identify that it was part of a group, and the two switches operated as independent units for a period of time, which the Catalysts detected.


That's all fine and, if it's correct, explains the cause. The issue we then had was that this state on the Catalysts was fatal: the only resolution was to reload each of the stacks. If we disconnected one of the uplink ports and shut/no shut the other, the error would recur immediately. Restarting the third-party switches made no difference. A reload immediately resolved the problem.
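
To make the recovery attempt concrete, it was along the lines of the sketch below, with the second uplink physically disconnected. The interface name is a placeholder, and the two show commands are just the standard ones for checking err-disabled ports and bundle state, not a record of our exact session.

```
! Check which ports are err-disabled, and the state of the bundle
show interfaces status err-disabled
show etherchannel summary
!
! Bounce the one remaining member link (interface name is a placeholder)
configure terminal
 interface TenGigabitEthernet1/0/1
  shutdown
  no shutdown
 end
```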


What made this stranger was that these were all different Catalyst models on different IOS versions (some were C2960X units, some were C3850 units), while other non-Cisco equipment using the same LACP active configuration to the same third-party switches had no issues whatsoever and continued to run throughout.


So my question is: is there some undocumented (or hard-to-find) feature in IOS that would cause this to happen? My only thought is that the switch might keep track of the last system-id detected via BPDU on a port even after that port is disconnected or down, and that this continues to trigger the misconfig guard when one of the members comes back up, but that seems needlessly punitive.


Any ideas appreciated!





