
SDL Link Out of Service

TexasBrandon
Level 1

Since Friday I have been seeing this error. I used to get it occasionally (once a month or so), but it is now recurring at almost the same time every day, 5:52 PM CDT.

I am currently running CUCM 8.6.1.20000-1 in a virtualized environment with one other CUCM node, which is a subscriber. Below is the alert detail:

Current outstanding SDLLinkOOS events:

LocalNodeId : 1

LocalApplicationID : 100

RemoteIPAddress : 192.168.123.126

RemoteNodeID : 2

RemoteApplicationID : 100

LinkID : 1:100:2:100

AppID : Cisco CallManager

ClusterID : KIHQVMCM-Cluster

NodeID : KIHQVMCMP01

TimeStamp : Sun Apr 21 17:52:10 CDT 2013

The alert is generated on Sun Apr 21 17:52:45 CDT 2013 on node 192.168.233.22.

I have been poring over the SDL and SDI logs trying to make heads or tails of what they are telling me, but all I can really determine is that the event did in fact happen. I see no network issues, yet my phones and MTP resources are affected when the SDL link goes out of service. I watched the network at the time it happened, and I am sure there are no Power over Ethernet issues, as these phones are distributed across about five different switches. I have run pings to the phones, CUCM, and the switches and verified Power over Ethernet, and I am now at the mercy of everyone here. I might try a packet capture with Wireshark tonight and hope for the best, but I think I have exhausted all of my other options. Below is an excerpt from the SDI logs.

17:52:10.751   |GenAlarm: Push_back offset 108 seq 108|*^*^*
17:52:10.751   |-->RISCMAccess::RemoteCMOutOfService(...)|*^*^*
17:52:10.751 |RemoteCMOutOfService: Ip   address: 192.168.123.126 remoteClusterId KIHQVMCM-Cluster|*^*^*
17:52:10.751 |SDLLinkOOS - SDL link to   remote application is out of service Local Node ID:1 Local Application ID:100   Remote Application IP Address:192.168.123.126 Remote Node ID:2 Remote   Application ID:100 Unique Link ID:1:100:2:100 App ID:Cisco CallManager   Cluster ID:KIHQVMCM-Cluster Node ID:KIHQVMCMP01|Alarm^*^*
17:52:10.751   |<--RISCMAccess::RemoteCMOutOfService(...)|*^*^*
17:52:10.750   |SipMcuControlInit::star_SdlLinkOOS - Cleaning up UCB Resources|0,0,0,0.0^*^*
17:52:10.750 |PstnFallBackD::   star_SdlLinkOOS SdllinkOOS nodeId = 2, appId = 100|0,0,0,0.0^*^*
17:52:10.750 |RoutePlanManager-   handleSdlLinkOOS|0,0,0,0.0^*^*
17:52:10.750 |ReplacesManager -   wait_SdlLinkISV, nodeId= 2|0,0,0,0.0^*^*
17:52:10.750 |SdllinkOOS nodeId = 2,   appId = 100|0,0,0,0.0^*^*
17:52:10.750 |star_SdlLinkOOS:   nodeId=2|0,0,0,0.0^*^*
17:52:10.750 |EpaMap for node 2   removed.|0,0,0,0.0^*^*
17:52:10.750   |ParkingLotD::star_SdlLinkOOS|0,0,0,0.0^*^*
17:52:10.750 |PRManager -   star_SdlLinkOOS|0,0,0,0.0^*^*
17:52:10.750 |PRManager - removing node 2   from the table|0,0,0,0.0^*^*
17:52:10.750 |MonitorManager   -::star_SdlLinkOOS nodeId = 2, appId = 100|0,0,0,0.0^*^*
17:52:10.750 |MRM::waiting_SdlLinkOOS -   Cleaning up MTP Resources|0,0,0,0.0^*^*
17:52:10.750 |MRM::waiting_SdlLinkOOS -   Cleaning up MOH Resources|0,0,0,0.0^*^*
17:52:10.750 |MRM::waiting_SdlLinkOOS -   Cleaning up UCB Resources|0,0,0,0.0^*^*
17:52:10.750 |MRM::waiting_SdlLinkOOS -   Cleaning up ANN Resources|0,0,0,0.0^*^*
17:52:10.750 |MRM::waiting_SdlLinkOOS -   Cleaning up RSVP Resources|0,0,0,0.0^*^*
17:52:10.750 |LocatorService -   star_SdlLinkOOS, nodeId= 2|0,0,0,0.0^*^*
17:52:10.750 |SdllinkOOS nodeId = 2,   appId = 100|0,0,0,0.0^*^*
17:52:10.750 |Delete entries from   FeatActTable, now this table has 41 entries|0,0,0,0.0^*^*
17:52:10.750 |CallBackManager - SDL Link   OOS|0,0,0,0.0^*^*
17:52:10.751 |AgentGreetingManager   -::star_SdlLinkOOS nodeId = 2, appId = 100|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(7)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(7)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(7)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(7)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751 |DMMSStationD-SNRD:   (0000008) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |DMMSStationD-SNRD:   (0000009) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000039) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000040) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000041) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000042) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000043) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000044) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000046) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000047) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000048) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000049) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000051) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000054) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000056) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000058) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000059) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000061) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SNRD:   (0000067) wait_SdlLinkOOS() from   nodeID=2, myPriority=0|0,0,0,0.0^*^*
17:52:10.751 |SIPStationD(1,100,65,97),   ATA24B65745CE69, 192.168.123.245:1052, primaryDN=2539043, LinkState:   SdlLinkOOS (node 2)|0,0,0,0.0^*^*
17:52:10.751 |SIPStationD(1,100,65,97),   ATA24B65745CE69, 192.168.123.245:1052, primaryDN=2539043, wait_SdlLinkOOS:   mLineRegisterReqsOutstanding=0|0,0,0,0.0^*^*
17:52:10.751 |SIPStationD(1,100,65,98),   ATA24B65745CF22, 192.168.123.171:1048, primaryDN=2539045, LinkState:   SdlLinkOOS (node 2)|0,0,0,0.0^*^*
17:52:10.751 |SIPStationD(1,100,65,98),   ATA24B65745CF22, 192.168.123.171:1048, primaryDN=2539045, wait_SdlLinkOOS:   mLineRegisterReqsOutstanding=0|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(264)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(264)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(264)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(264)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(265)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(265)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(265)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(265)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(266)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(266)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(266)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(266)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(289)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(289)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(289)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(289)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(290)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(290)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(290)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(290)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(291)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(291)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(291)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(291)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(292)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(292)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(292)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(292)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(293)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(293)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(293)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(293)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(294)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(294)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(294)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(294)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(295)::waiting_SdlLinkOOS -- Link OOS   received|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(295)::sendMediaFailureDetection|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(295)::clearRSVPSessions|0,0,0,0.0^*^*
17:52:10.751   |MediaTerminationPointControl(295)::clearDscpInfoTable|0,0,0,0.0^*^*
17:52:10.751 |StationD - adding   linestruct at index 1|*^*^*
17:52:10.751 |StationD:   (0004229) Device SEP0024C4FC3816 of   line=1, UnRegisters with SDL Link to monitor NodeID= 2.|0,0,0,0.0^*^*
17:52:10.751 |StationD:   (0004229) Device SEP0024C4FC3816   restart0_SdlLinkOOS has lineReregisterCounter=1.|0,0,0,0.0^*^*
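For anyone who wants to list every occurrence across several days of traces rather than reading them by eye, here is a minimal Python sketch that pulls the SDLLinkOOS alarm lines out of SDI trace text. The regular expression is only an assumption based on the alarm line pasted above; other CUCM versions may word the alarm differently, and `parse_sdl_link_oos` is a made-up helper name, not a Cisco tool.

```python
import re

# Pattern inferred from the single SDLLinkOOS alarm line in the excerpt above.
# Treat it as an assumption: other releases may format the alarm differently.
ALARM_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s*\|SDLLinkOOS\b.*?"
    r"Remote\s+Application\s+IP\s+Address:(?P<remote_ip>[\d.]+)\s+"
    r"Remote\s+Node\s+ID:(?P<remote_node>\d+).*?"
    r"Unique\s+Link\s+ID:(?P<link_id>[\d:]+)"
)


def parse_sdl_link_oos(lines):
    """Return one dict per SDLLinkOOS alarm line found in the given trace lines."""
    events = []
    for line in lines:
        match = ALARM_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events
```

Feeding it the lines of a collected trace file (for example `parse_sdl_link_oos(open(path))`, where `path` is wherever you saved the trace) gives a list of times, remote IPs, and link IDs that can be compared day to day.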

Accepted Solutions

Ayodeji Okanlawon
VIP Alumni

Brandon,

The SDL link is the TCP connection that carries intra-cluster communication between CUCM servers. When that TCP link is broken, this alert is generated. This is definitely a network issue: something is unstable somewhere in the path between the two servers.
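One rough way to watch that path around the time the alert fires is to attempt a TCP connection at a fixed interval from a host on the same segment as the publisher and record when it fails or slows down. The sketch below is only an illustration: it has to run from a separate machine (not on the CUCM appliance itself), the function names are made up, and the port you probe is your choice. Cisco's port usage guide lists TCP 8002 for intra-cluster SDL, but verify that for your release. A successful connect only proves reachability; it says nothing about the health of the SDL session itself.

```python
import socket
import time
from datetime import datetime


def probe_tcp(host, port, timeout=2.0):
    """Return the TCP connect time in seconds, or None if the connect fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return None


def monitor(host, port, interval=1.0, count=10, timeout=2.0):
    """Probe repeatedly and return a list of (ISO timestamp, latency or None)."""
    results = []
    for _ in range(count):
        latency = probe_tcp(host, port, timeout=timeout)
        results.append((datetime.now().isoformat(timespec="seconds"), latency))
        time.sleep(interval)
    return results
```

For example, `monitor("192.168.123.126", 8002, interval=1.0, count=600)` started a few minutes before 5:52 PM would cover the window from the original post; any `None` entries or latency spikes give you timestamps to line up against the packet capture.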

Please rate all useful posts

"opportunity is a haughty goddess who waste no time with those who are unprepared"


9 Replies


I may need to set up a packet capture between the two ports then, as nothing in my SolarWinds or logs indicates an issue. I appreciate the feedback and will continue working on this issue. Hopefully I can find a solution and post it for anyone else with this problem.

I wouldn't be surprised if you don't find anything strange on the ports you capture.

I have the same problem and was able to correlate this error with the backup running on a CUCM 9 server. When the backup starts, either the network ports get a little congested or the CPU can't handle that many requests, and the server reports the SDL link error. It is gigabit Ethernet, and I don't find any network errors on the switches in between, so I am presuming it's a server issue.
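If you want to check whether the alerts line up with a scheduled job such as a backup, a small Python sketch like the one below can test alert timestamps against a time-of-day window. The timestamp format is taken from the alert detail in the original post; the window boundaries are placeholders you would replace with your own schedule, and both function names are made up for illustration.

```python
from datetime import datetime, time


def parse_alert_timestamp(text):
    """Parse a timestamp such as 'Sun Apr 21 17:52:10 CDT 2013'.

    The time zone abbreviation is dropped instead of parsed, so the result
    is a naive datetime in whatever zone the alert was written in.
    """
    parts = text.split()
    without_zone = " ".join(parts[:4] + parts[5:])
    return datetime.strptime(without_zone, "%a %b %d %H:%M:%S %Y")


def events_in_window(timestamps, start, end):
    """Return the timestamps whose time of day falls within [start, end]."""
    return [ts for ts in timestamps if start <= ts.time() <= end]


# The window below is a hypothetical schedule, not taken from this thread.
alerts = [parse_alert_timestamp("Sun Apr 21 17:52:10 CDT 2013")]
suspects = events_in_window(alerts, time(17, 45), time(18, 15))
```

If most of the alerts land inside the window of a scheduled task, that points toward load on the server during the job rather than a fault in the switches.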

Hi Brandon,

Did you ever get to the bottom of this? I am building a pre-production CUCM 9.1(2) cluster running on UCS C240 M3s and ESXi 5.1SU3. I am receiving constant SDL link out-of-service alerts, and all the CUCM hosts are connected to a brand-new Cisco switched LAN environment with no clustering over the WAN. I have checked the interfaces for the vNICs and used Cisco Unified Reporting to run a Cluster Overview, which reports no errors at all.

I can't figure out what network issue could be causing these problems, any guidance you could give me would be much appreciated. 

TexasBrandon
Level 1

Just an update in case someone else has this same problem now or in the future. It is definitely a network issue, as you noted. While I still cannot find anything wrong, my NOC is now reporting that their Agent Desktop, which interfaces with Contact Center Express, is also dropping at the same time.

Brandon, this is usually a tough one. If there is a router between the servers, look at its interfaces. Any resets? You may also want to use Wireshark to see whether you have bad TCP segments or packet loss in the network or along the path that connects the servers.

No interface flaps, router/switch reloads, or any noticeable packet loss. I am going to watch the switch tonight during that time period, and if nothing shows up I will move on to a packet capture. Our network is fairly simple: just a "DMZ" Catalyst switch, ASA, Fortigate, and router for that portion of the network. No changes were made to the network, so that threw that idea out the window, and there were no software upgrades on either the CUCM or the IOS equipment. The last entry in the log on my switch, which is the one closest to the CUCM cluster, is from Friday; there has been nothing since, and it isn't related to the current situation. This will most likely take me some time to figure out.

Any update regarding the issue above?

We have 3 clusters on 2 ESXi platforms located in 2 different locations.

One cluster has the same issue: SDL link out of service errors, nearly 5 times per day. The version is 9.1.2.

The other 2 clusters are not having this issue, so I think it's not a network problem?

The problem started after an SVC node went down on the ESXi VM in one location. Any ideas for fixing the problem?

Is there a way to tell whether the file system is corrupted?
