I have the following topology:
3850 (SW1) <-> Media converter <-> Media converter <-> 3850 (SW2)
Each port from SW1 towards SW2 is in a trunked etherchannel (up to 8 ports) and they are connected via point-to-point media converters. So for an etherchannel of 8 ports you'd have 16 converters.
They have a single shared VLAN (VLAN 400) between them. All switches are 3850. The media converters are necessary and cannot be replaced by a switch.
The issue is that even though each interface from SW1 is connected point-to-point to SW2, the etherchannel won't be able to detect a failure between two media converters. If the link between a pair of them fails then the etherchannel will still see the link as up since the connection between each switch and the adjacent MC is up.
3850 (SW1) <-> Media converter <-XXX-> Media converter <-> 3850 (SW2)
Is it possible to keep this topology at L2 and use BFD (via an SVI, for example) or some other mechanism to detect when the path between SW1 and SW2 is unavailable on any pair of interfaces? The detection should result in the interface either going down or being suspended from the etherchannel until it recovers.
Does it have to be layer 2? I've never done it and I'm not sure it works, but can you make the port-channel layer 3 and use BFD with static routes? The problem with MCs is that even though one side of the connection is down, the other switch thinks it is up and running, because there is nothing wrong with its own MC/link.
Hi, it has to be L2 and the MCs have to be connected as described. The issue you bring up is the exact reason I need to use either BFD or some other mechanism for end-to-end fault detection.
As for static routes, both sides see each other as directly connected via L2 so static routes are irrelevant.
Something that comes to mind: have you confirmed that your media converters have no option to indicate to other hardware that they've lost a link? (The reason I ask is that, in the past, I had some L2 connections provided by an optical network, and discovered the equipment had to be configured to "signal" to the end points that the link was down/broken. [The first time I encountered this architecture, I wondered how one of our "fiber" links used SX optics on one end and LX optics on the other.])
Configure the etherchannel protocol to be LACP (channel-group x mode active). This should detect a far-end failure within 90 seconds at the default slow rate, where LACPDUs are sent every 30 seconds and the partner is timed out after three missed LACPDUs. You can reduce this to 3 seconds with the LACP fast rate interface command (lacp rate fast), which sends LACPDUs every second. You can also enable UDLD (udld port aggressive), which should also detect L1/L2 issues, and sooner than LACP at its default slow rate.
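As a rough sketch of what that could look like on each 3850, assuming the member ports are GigabitEthernet1/0/1 through 1/0/8 and the bundle is channel-group 1 (both are placeholders, not taken from the original post), and assuming your IOS XE release supports lacp rate on these ports:

```
! Sketch only: the interface range and channel-group number are assumed.
! Apply the equivalent configuration on both SW1 and SW2.
interface range GigabitEthernet1/0/1 - 8
 switchport mode trunk
 switchport trunk allowed vlan 400
 ! LACP active mode so both ends exchange LACPDUs
 channel-group 1 mode active
 ! Ask the partner to send LACPDUs every second (3-second timeout)
 lacp rate fast
 ! UDLD aggressive mode on each member port
 udld port aggressive
```

If it behaves as hoped, breaking the link between a pair of converters should cause that member to drop out of the bundle while the rest keep forwarding. show etherchannel summary, show lacp neighbor and show udld neighbors should let you confirm the state of each member. This also relies on the media converters passing LACP and UDLD frames through transparently, which is worth testing on one link first.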