Cisco 3850 Stack Redundancy & Industrial Automation

echo3whiskey
Beginner

I had a bad experience where I thought my stack redundancy configuration would save my bacon in an industrial automation situation. I have a stack of 2 WS-C3850-24S-S switches. I have EtherChannels set up such that Gi1/0/2 and Gi2/0/2 form a port channel connected to the two gigabit uplinks on an industrial switch (Allen-Bradley Stratix 8000 Series). The same goes for the next pairs (1/0/3 & 2/0/3, etc.), so I have a total of 5 port channels going to 5 different Stratix switches.

It looks like the standby switch died momentarily on me, and this disrupted the whole operation. Devices came back online in a matter of seconds and everything began to communicate again; the standby woke back up from whatever nightmare it was having and we were able to start the place up. The problem is the short disruption: even an outage of about 50 milliseconds is enough to lose the industrial automation devices long enough to shut down equipment. Anyone in this forum who has worked with PLCs and Ethernet I/O scanning will relate to how fussy the stuff is.

Is there something I am generally misunderstanding about how this redundancy is supposed to work? Could some PortFast or spanning-tree setting be keeping the system from recovering as fast as it should? I had the supposed Allen-Bradley "gurus" help me set this up; they gave it their blessing and gave me a false sense of security, but it obviously didn't do what I wanted.

Attached is the config file - if that does any good.

1 ACCEPTED SOLUTION

Philip D'Ath
Advisor

First, let's start with software. Some of the early 3850 software (like 3.3) was not good. Make sure you are using a gold star release like 3.6.6E.

Next, I see you are using MST. If all your kit can support rapid spanning tree (Rapid PVST+), I would use that. It converges so much faster. Looking at your config, with all the trunk ports, I would probably make this my number 1 priority.

spanning-tree mode rapid-pvst
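
For reference, here is a minimal sketch of applying and checking that mode change on the 3850 stack. Treat the sequence as an illustration rather than a tested procedure. Changing the spanning-tree mode stops and restarts all spanning-tree instances, so it will itself disrupt traffic briefly and belongs in a maintenance window.

```
! Global configuration: switch the stack from MST to Rapid PVST+
configure terminal
 spanning-tree mode rapid-pvst
 end
! Confirm the running mode, then check port roles and states per VLAN
show spanning-tree summary
show spanning-tree
```

The Stratix switches would need the matching mode set on their side as well; how that is done on the Stratix is not covered here.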

If you can use LACP and your kit supports "fast rate", then use it. It allows substantially faster EtherChannel failover. I would make this my number 2 priority.

http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/cether/command/ce-xe-3se-3850-cr-book/ce-xe-3se-3850-cr-book_chapter_00.html#wp2006781800
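
As a rough sketch of what that could look like on the 3850 side, assuming the bundle is (or can be) negotiated with LACP and the running release accepts the command: the interface names come from the original post, while channel-group number 2 is an assumption made purely for illustration.

```
! Member links of one uplink bundle to a Stratix switch
interface range GigabitEthernet1/0/2 , GigabitEthernet2/0/2
 ! "active" selects LACP; group number 2 is hypothetical
 channel-group 2 mode active
 ! Ask the partner to send LACPDUs every 1 second instead of every 30
 lacp rate fast
```

The setting only helps if the device at the other end honors the request. With the fast rate, an unresponsive partner times out after roughly 3 seconds instead of 90, which is still much slower than failover triggered by a physical link-down event.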

I doubt you'll have this option because of the Allen Bradley switches, but Cisco Resilient Ethernet Protocol (known as Cisco REP) allows the use of rings with very fast failover time.  This is especially popular in industrial environments.  This would be a fundamental change, but you should probably be aware of this option.

http://www.cisco.com/c/en/us/support/docs/lan-switching/ethernet/116384-technote-rep-00.html


3 REPLIES

You were absolutely right on the spanning tree setting. All devices were set to MST and must not have been able to reconverge fast enough after a single point of failure. I have everything set to rapid-pvst in my test system and can do pretty much whatever I want without losing the I/O.

As for the #2 priority, I am limited by the Stratix switches, which don't have a fast-rate setting.

As for REP, I don't really want to head down that path right now - that would be a major reconfigure.

I think I'm out of the woods though due to priority 1.  Thanks for the help.