Re: REP shared alternate port key process ?

LaurentRA · ‎03-03-2020

Hello community,

I had an issue on a customer site with a ring topology using REP where one port was in Fail state.

Physically (layer2) the port was fine.

But when I looked at the REP status on the switch with the failed port, the link was fine (Two Way state), but the key identifying the current Alternate port on Gi1/1 was different from the key on Gi1/2 and on all other REP ports of the segment.

Doing a shut / no shut on the failed port did nothing.

Removing the REP configuration (no rep segment 1) and then re-entering it only set the key to all zeros, as in the capture below.

In order to solve the issue I had to reload the switch.

The access switches in the ring are IE2000 running 15.2(6)E1, and two IE5000 running 15.2(6)E1 act as core/distribution. Both IE5000 are REP Edge (Primary and Secondary respectively), with one Portchannel (no REP) between them.

I will ask the customer to install 15.2(7)E1a, the latest version to date.

But because I have not been able to reproduce the fault, I would like to find more details about the neighboring process of REP (I do not know if that is the right term), and about what could prevent the shared key from being learned, apart from an obvious physical failure of a port.

At the moment I have not found any document to help me troubleshoot that part.

Thanks

Cristian Matei · ‎03-03-2020

Hi,

This looks like expected behaviour. Due to the inherent design of REP, those keys are used for loop prevention. Initially the port was in the Failed (blocking) state, so no key existed (000000). When the port moves from the Fail state to Open or Alternate, it will generate a unique key based on the port number and a random number.

Regards,

Cristian Matei.

LaurentRA · ‎03-03-2020

Hello, thanks for the reply

According to the site operators, the failed port stayed in that state (Failed, with the wrong key) for days. I do not know what the cause was.

Shutting the port down and bringing it back up did not clear the previously learned wrong key.

Reconfiguring the port for REP only changed the key to 00000... But even after minutes of waiting, the port in Fail never changed state and never received the key.

Only after I forced a reload did the switch get the correct key and place its port in Open state.

From what I have found and understood, that key is composed of a random number followed by the PortID of the currently elected Alternate port.

A PortID is composed of a port number (4 digits) followed by the switch's MAC address.

All REP ports of a Segment must have the same Key to identify the current Alternate port.
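
To make the layout described above concrete, here is a minimal Python sketch. It relies only on what is stated in this post: a key is a random number followed by the PortID of the Alternate port, and a PortID is a 4-digit port number followed by the switch MAC address. The hex encoding, the width of the random part, and the names `port_id`, `make_key` and `ports_with_odd_key` are illustrative assumptions, not the actual REP encoding.

```python
import secrets
from collections import Counter
from typing import Dict, List, Optional


def port_id(port_number: int, mac: str) -> str:
    """PortID as described in the post: 4-digit port number + switch MAC.

    Assumption: both parts are rendered as upper-case hex digits.
    """
    mac_hex = "".join(ch for ch in mac if ch not in ".:-").upper()
    if len(mac_hex) != 12 or any(ch not in "0123456789ABCDEF" for ch in mac_hex):
        raise ValueError("expected a 48-bit MAC address")
    if not 0 <= port_number <= 0xFFFF:
        raise ValueError("port number must fit in 4 hex digits")
    return f"{port_number:04X}{mac_hex}"


def make_key(alt_port_id: str, random_part: Optional[str] = None) -> str:
    """Key = random number followed by the Alternate port's PortID.

    Assumption: the random part is 8 hex digits when not supplied.
    """
    if random_part is None:
        random_part = secrets.token_hex(4).upper()
    return random_part + alt_port_id


def ports_with_odd_key(keys_by_port: Dict[str, str]) -> List[str]:
    """Return the ports whose key differs from the segment's most common key."""
    if not keys_by_port:
        return []
    majority_key, _count = Counter(keys_by_port.values()).most_common(1)[0]
    return sorted(p for p, k in keys_by_port.items() if k != majority_key)


if __name__ == "__main__":
    # Hypothetical segment: one port stuck with an all-zero key.
    key = make_key(port_id(2, "0011.2233.4455"), random_part="A1B2C3D4")
    segment = {
        "sw1 Gi1/1": "0" * len(key),
        "sw1 Gi1/2": key,
        "sw2 Gi1/1": key,
        "sw2 Gi1/2": key,
    }
    print(ports_with_odd_key(segment))  # ['sw1 Gi1/1']
```

A check like `ports_with_odd_key` mirrors what I did by hand: compare the key shown on every REP port of the segment and look for the one that disagrees.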

When one of the switches within the ring detects a failure, it sends a notification message (destination MAC address 0100.0ccc.ccce) to all the segment members with that key in the message, so that the current Alternate port recognizes itself, validates the message with the random number and reactivates itself (moves from Alternate to Open) to recover communication.
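
The notification step can be pictured with the toy model below. It is only a sketch of my understanding as written above, not the real REP state machine: the `RepPort` class, the `handle_failure_notification` function and the string keys are hypothetical, and the only value taken from the protocol description is the destination MAC address.

```python
from dataclasses import dataclass

# Destination MAC address of the notification, as quoted in the post.
REP_NOTIFY_DST_MAC = "0100.0ccc.ccce"


@dataclass
class RepPort:
    name: str
    state: str  # "Open", "Alternate" or "Fail" (simplified set of states)
    key: str    # key this port holds for the segment's Alternate port


def handle_failure_notification(port: RepPort, dst_mac: str, notified_key: str) -> bool:
    """Unblock the port only if it is the Alternate port named by the key.

    Returns True when the port moved from Alternate to Open.
    """
    if dst_mac.lower() != REP_NOTIFY_DST_MAC:
        return False  # not a notification for this toy model
    if port.state != "Alternate":
        return False  # only the Alternate port reacts
    if notified_key != port.key:
        return False  # key mismatch: the message is ignored
    port.state = "Open"
    return True


if __name__ == "__main__":
    alt = RepPort(name="Gi1/2", state="Alternate", key="A1B2C3D40002001122334455")
    # A notification carrying a stale key leaves the port blocked.
    print(handle_failure_notification(alt, REP_NOTIFY_DST_MAC, "0" * 24), alt.state)
    # A notification carrying the shared key unblocks it.
    print(handle_failure_notification(alt, REP_NOTIFY_DST_MAC, alt.key), alt.state)
```

In this model a switch holding a wrong key can never wake up the Alternate port, which is why I care about how the key is shared.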

But I cannot find the complete process (key sharing), or what could cause a REP port not to have the right key when everything seems fine at Layer 2.

LaurentRA · ‎03-03-2020

The best details I have found so far are in the following technical support document, even if the drawings are difficult to read.

Resilient Ethernet Protocol Overview

randreetta · ‎03-24-2023

They are still there, but to be honest I don't find those explanations enlightening. Why does introducing keys solve the previously explained race that creates temporary loops in the network?

Cristian Matei · ‎03-03-2020

Hi,

I see what you mean now. I would upgrade.

Regards,

Cristian Matei.

LaurentRA · ‎03-04-2020

Yes, for the moment that's what I will do: upgrade.