
Channel outage caused by DCM toggling between Main and Backup services


Introduction

A service provider experienced intermittent outages while the backup path was down for fibre maintenance. An SD channel on MFP kept switching to the backup path, which was unavailable because of a fibre cut. In total there were 10 switches, resulting in approximately five minutes of outage.

The service was switching back and forth between the main channel and the backup channel, and each switch caused an outage.

Prerequisites

Service backup has been configured for the channel on the DCM, and the backup service is unavailable or degraded. The default service backup settings for the I/O card that hosts the output service are shown below.

picture.jpg

Background information

The DCM configuration was controlled by ROSA VSM.

Troubleshooting information

Logs were collected from the DCM and analysed.

The DCM logs contained several messages like the following for the affected board:

Oct 17 13:38:54 board1 DCM_IO[2236]: ** TRA-INF: Service loss Board=1, Port=0, Ident=59, SID=7 triggered by scrambled at input

Oct 17 13:38:54 board1 DCM_IO[2236]: ** TRA-INF: TS loss Board=1, Port=0, Ident=59 triggered by service loss

Oct 17 13:38:55 board1 DCM_IO[2236]: ** TRA-INF: Main TS: Board=1, Port=0, Ident=59: Switch to backup source Board=1, Port=2, Ident=31

The service then reverts to the main source:

Oct 17 13:38:56 board1 DCM_IO[2236]: ** TRA-INF: Service loss Board=1, Port=0, Ident=59, SID=7 cleared.

Oct 17 13:39:55 board1 DCM_IO[2236]: ** TRA-INF: Main TS: Board=1, Port=0, Ident=59: Revert to higher prio source Board=1, Port=0, Ident=59

The messages above show that the input service was intermittently becoming scrambled. Each time this happened, the service loss trigger was activated and the service switched to backup. Once the main service returned to clear, the DCM reverted to the main service. This back-and-forth switching caused the outages.
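When many such messages are spread through a log, it can help to extract the switch and revert events automatically. The following is a minimal sketch, not a Cisco tool: the regular expression is modelled only on the sample lines quoted above, the function names are hypothetical, and the year is an assumption because the log timestamps do not carry one.

```python
import re
from datetime import datetime

# Matches only the "Main TS" switch/revert lines in the format quoted above.
# Other DCM releases may format these messages differently (assumption).
EVENT_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+DCM_IO\[\d+\]:"
    r"\s+\*\* TRA-INF: Main TS: Board=(?P<board>\d+), Port=(?P<port>\d+), "
    r"Ident=(?P<ident>\d+): (?P<action>Switch to backup source|Revert to higher prio source)"
)

# The log lines carry no year; any fixed year works for same-day differences.
ASSUMED_YEAR = 2000


def parse_events(lines):
    """Return (timestamp, (board, port, ident), kind) for each switch/revert line."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m is None:
            continue  # not a switch/revert message
        ts = datetime.strptime(f"{ASSUMED_YEAR} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (int(m["board"]), int(m["port"]), int(m["ident"]))
        kind = "switch" if m["action"].startswith("Switch") else "revert"
        events.append((ts, key, kind))
    return events


def backup_intervals(events):
    """Pair each switch with the next revert on the same main TS.

    Returns a list of (main_ts_key, seconds_spent_on_backup).
    """
    open_switch = {}
    intervals = []
    for ts, key, kind in events:
        if kind == "switch":
            open_switch[key] = ts
        elif key in open_switch:
            started = open_switch.pop(key)
            intervals.append((key, (ts - started).total_seconds()))
    return intervals


SAMPLE = [
    "Oct 17 13:38:54 board1 DCM_IO[2236]: ** TRA-INF: Service loss Board=1, Port=0, Ident=59, SID=7 triggered by scrambled at input",
    "Oct 17 13:38:54 board1 DCM_IO[2236]: ** TRA-INF: TS loss Board=1, Port=0, Ident=59 triggered by service loss",
    "Oct 17 13:38:55 board1 DCM_IO[2236]: ** TRA-INF: Main TS: Board=1, Port=0, Ident=59: Switch to backup source Board=1, Port=2, Ident=31",
    "Oct 17 13:38:56 board1 DCM_IO[2236]: ** TRA-INF: Service loss Board=1, Port=0, Ident=59, SID=7 cleared.",
    "Oct 17 13:39:55 board1 DCM_IO[2236]: ** TRA-INF: Main TS: Board=1, Port=0, Ident=59: Revert to higher prio source Board=1, Port=0, Ident=59",
]

print(backup_intervals(parse_events(SAMPLE)))  # → [((1, 0, 59), 60.0)]
```

Run against a full log export, the length of the returned list gives the number of switches, and the sum of the intervals gives the total time spent on the backup source.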

Cause

The cause of the issue was that an unused DPI PID was being periodically scrambled. Since the service backup setting on DCM had “PID list” triggers indicating a switch to backup if scrambling occurred, the switch was occurring even though the scrambled PID was not in use.

Although the backup service was not available, the DCM still switched to backup because the option “Don’t Switch to Backup Source in Service Loss” was not ticked in the Default Service Backup Setting for the card. Because the backup mode was set to Revertive, the service reverted to Main once the service loss alarm cleared on the Main service.

Solution implemented

To correct the issue, in VSM, the same service loss template was applied to the backup streams as the main streams. This caused the backup service to alarm in the same way and thus prevented the DCM from switching to backup during the scrambling. Also, it allowed the “scrambling” trigger to remain, in the event that a used PID, e.g. video, became scrambled on the main and not on the backup.

Alternative Solution

The issue could also be prevented by simply ticking the option “Don’t Switch to Backup Source in Service Loss”. Thus, the service would not switch to backup if the backup service is not available.
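The effect of this option can be illustrated with a toy model of the switching decision. This is a simplified sketch for reasoning only, not the DCM's actual logic: the function and parameter names are hypothetical, and `backup_loss` stands for "the DCM currently sees a service loss alarm on the backup source".

```python
def next_source(current, main_loss, backup_loss,
                revertive=True, dont_switch_if_backup_lost=False):
    """Toy model: decide which source is active after one evaluation step."""
    if current == "main":
        # Switch on main service loss, unless the option blocks a switch
        # to a backup that is itself in service loss.
        blocked = dont_switch_if_backup_lost and backup_loss
        if main_loss and not blocked:
            return "backup"
        return "main"
    # Currently on backup: in revertive mode, go back once main is clear.
    if revertive and not main_loss:
        return "main"
    return "backup"


def count_switches_to_backup(main_loss_timeline, backup_loss, **options):
    """Count main-to-backup transitions over a sequence of main_loss states."""
    current = "main"
    switches = 0
    for main_loss in main_loss_timeline:
        new = next_source(current, main_loss, backup_loss, **options)
        if current == "main" and new == "backup":
            switches += 1
        current = new
    return switches


# Main briefly reports service loss twice while the backup is down.
timeline = [False, True, False, True, False]

print(count_switches_to_backup(timeline, backup_loss=True))  # → 2
print(count_switches_to_backup(timeline, backup_loss=True,
                               dont_switch_if_backup_lost=True))  # → 0
```

In this model the option only helps when the backup source is actually reported as being in service loss, which is consistent with the implemented solution of applying the same service loss template to the backup streams.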

 
