03-27-2017 12:36 PM - edited 03-08-2019 09:56 AM
I had a bad experience where I thought my stack redundancy configuration would save my bacon in an industrial automation situation. I have a stack of 2 WS-C3850-24S-S switches. I have EtherChannels set up such that Giga1/0/2 and Giga2/0/2 form a port channel and are connected to the two gigabit uplinks on an industrial switch (Allen Bradley Stratix 8000 Series). The same goes for the next pairs (1/0/3 & 2/0/3, etc.), so I have a total of 5 port channels going to 5 different Stratix switches.
It looks like the standby switch died momentarily on me, and this disrupted the whole operation. Devices came back online in a matter of seconds and everything began to communicate again - the standby woke back up from whatever nightmare it was having and we were able to start the place up. The problem is the short disruption: even about 50 milliseconds is enough to lose the industrial automation devices long enough to shut down equipment. If anyone in this forum has worked with PLCs and Ethernet I/O scanning, they will relate to how fussy the stuff is.
Is there something I am generally misunderstanding about how this redundancy is supposed to work? Could it be some feature relating to port-fast/spanning tree where I am not allowing the system to recover as fast as it should because I have a foolish setting? I had the supposed Allen-Bradley "gurus" help me set this up, give me the blessing and a false sense of security - but it obviously didn't do what I wanted.
Attached is the config file - if that does any good.
03-27-2017 06:07 PM
First, let's start with software. Some of the early 3850 software (like 3.3) was not good. Make sure you are using a gold star release like 3.6.6E.
Next, I see you are using MST. If all your kit can support rapid spanning tree, I would use that. It converges so much faster. Looking at your config, with all the trunk ports, I would probably make this my number 1 priority.
spanning-tree mode rapid-pvst
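As a rough sketch of how that change might be applied and checked on the 3850 stack (the same mode would also need to be set on the Stratix side). Changing the spanning tree mode forces a reconvergence, so this assumes you can do it in a maintenance window:

```
! Sketch only: run during a maintenance window, the mode change reconverges STP
configure terminal
 spanning-tree mode rapid-pvst
 end
! Confirm which spanning tree mode the switch is now running
show spanning-tree summary
```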
If you can use LACP and your kit supports "fast rate", then use this. It allows substantially faster EtherChannel failover. I would make this my number 2 priority.
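A minimal sketch of what this could look like on the 3850 side of one uplink pair. The channel-group number 2 is an assumption (use whatever number your existing port channel has), and "lacp rate fast" is only there if your software release supports it. At the fast rate LACPDUs are sent every second and a silent partner times out after 3 seconds, compared with every 30 seconds and a 90 second timeout at the default slow rate:

```
! Sketch only: channel-group number 2 is assumed, match it to your config
interface GigabitEthernet1/0/2
 channel-group 2 mode active
 lacp rate fast
!
interface GigabitEthernet2/0/2
 channel-group 2 mode active
 lacp rate fast
```

Keep in mind this timer matters when a link stays up but the far end stops responding. A hard link-down is detected without waiting for it.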
I doubt you'll have this option because of the Allen Bradley switches, but Cisco Resilient Ethernet Protocol (known as Cisco REP) allows the use of rings with very fast failover time. This is especially popular in industrial environments. This would be a fundamental change, but you should probably be aware of this option.
http://www.cisco.com/c/en/us/support/docs/lan-switching/ethernet/116384-technote-rep-00.html
03-29-2017 03:06 PM
You were absolutely right on the spanning tree setting. All devices were set to MST and must not have been able to reconverge fast enough after a single point of failure. I have everything set to rapid-pvst in my test system and can do pretty much whatever I want without losing the I/O.
As for the #2 priority, I am limited by the Stratix switches which don't have a fast-rate setting.
As for REP, I don't really want to head down that path right now - that would be a major reconfigure.
I think I'm out of the woods though due to priority 1. Thanks for the help.
06-18-2019 11:14 AM
As a side note, the AB switch is actually a rebranded Cisco industrial switch. You can wipe it and manage it like any Cisco switch without the Allen Bradley GUI.