
dlsw - circuit problem

johnyoon75
Level 1

Hi all,

I have several routers running DLSw.

The routers worked properly yesterday, but not today.

Someone complained to me that SNA communication is delayed today.

So I checked the DLSw peers and circuits.

I found that the circuits are unstable.

The circuit uptime keeps getting reset...

How can I troubleshoot this symptom?

Regards,

John

8 Replies

ehirsel
Level 6

Check the log messages of the routers which have an interface in the dlsw circuit that is unstable.

Also do a show int on that interface to see if the interface was reset, or if there are any line errors.

Let me know what you find.

Did you do any hardware or software updates prior to the outage? If not, you may need to work with your provider and have them check to see what they can find.

Another item to check would be loose cabling. For example, did someone move any equipment near the router?

mbinzer
Cisco Employee

Hi,

When you say you have seen circuit uptimes being reset, did the DLSw peer stay up? You can check the uptime with show dlsw peer. If the peer stayed up and only a couple of circuits went down, then you need to provide more information: How many circuits? Are they all across the same circuit? What is the topology?

If a DLSw peer went down as well, then all circuits over this peer got reset as a consequence. Then you need to start investigating why the TCP session went away, i.e. check with the provider, check for maintenance work in the data center, and so on.

It also depends on your IOS version. If it is a recent one (12.2 or 12.3), the router keeps a history log of the DLSw circuits. By default only the last 32 circuits are cached, so it depends a bit on how many circuits you have.

show dlsw circuit history detail

gives you a list of the last 32 circuits with some additional information, e.g. timestamps for when the circuit was connected and disconnected, state machine (FSM) events, etc.

Please check whether a DLSw circuit history entry is available for one of the circuits that went down. This will give you some basic starting points on where to look, e.g. a local issue, a WAN issue, etc. If you need help interpreting the output, post it here and I will have a look at it.
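As a rough illustration of what the timestamps in a history entry can tell you, the Python sketch below computes how long a circuit stayed connected from its "Connected at" and "Destroyed at" lines. The function names are made up for this example. The timestamp layout (time, zone label, weekday, month, day, year) is assumed from the history output posted further down in this thread, and the sample values are taken from the first entry there; other IOS versions or clock settings may print it differently.

```python
from datetime import datetime

def parse_ios_time(text):
    """Parse e.g. '10:59:11.230 KST Mon Dec 13 2004' (layout assumed)."""
    clock, _zone, _weekday, month, day, year = text.split()
    # The zone label is ignored, so both stamps must be in the same zone.
    return datetime.strptime(f"{clock} {month} {day} {year}",
                             "%H:%M:%S.%f %b %d %Y")

def circuit_lifetime(entry_text):
    """Return 'Destroyed at' minus 'Connected at' for one history entry."""
    stamps = {}
    for line in entry_text.splitlines():
        label, sep, value = line.partition(":")
        label = label.strip()
        if sep and label in ("Connected at", "Destroyed at"):
            stamps[label] = parse_ios_time(value.strip())
    return stamps["Destroyed at"] - stamps["Connected at"]

ENTRY = """\
Created at : 10:59:11.230 KST Mon Dec 13 2004
Connected at : 10:59:11.314 KST Mon Dec 13 2004
Destroyed at : 11:46:50.699 KST Mon Dec 13 2004
"""
print(circuit_lifetime(ENTRY))  # 0:47:39.385000
```

Comparing this lifetime, and the "Destroyed at" time itself, against the peer uptime is one way to see whether a circuit drop lines up with a peer drop.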

thanks...

Matthias

Thank you for replying.

The Topology is below.

  HOST       HOST
    |          |
    ------------
         |
    4006 Switch
         |
    ------------
    |          |
 Router1    Router2
    |          |
   ATM        ATM
    |          |
     \        /
      \      /
       \    /
      Router3
         |
     L2 Switch
         |
      SNA G/W

There are DLSw peers between Router1 and Router3, and between Router2 and Router3.

Router3 is configured with load balancing (circuit-count) for SNA traffic.

Actually, I found some abnormal history using "show dlsw circuit history detail".

469762913 0060.9436.d70a(04) 4848.2424.1212(04) 172.20.52.231
Created at   : 10:59:11.230 KST Mon Dec 13 2004
Connected at : 10:59:11.314 KST Mon Dec 13 2004
Destroyed at : 11:46:50.699 KST Mon Dec 13 2004
Local Corr : 469762913 Remote Corr: 704643190
Bytes: 86335174/88147104 Info-frames: 106541/108372
XID-frames: 5/4 UInfo-frames: 0/0
Flags: Local created, Remote connected
Last events:
Current State    Event            Add. Info  Next State
-------------------------------------------------------------------
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        WAN ifcm         0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        WAN halt-dl      0x0        HALT_PENDING
HALT_PENDING     DLC DiscCnf      0x0        CLOSE_PEND
CLOSE_PEND       DLC CloseStnCnf  0x0        DISCONNECTED

Hi,

I am not sure I understand the topology from your last post.

Router3 has a peer to Router1, and

Router3 has a peer to Router2.

Are Router1 and Router2 connected to a common Ethernet, since you mention a 4006 switch? If those two routers are on a common Ethernet, this can cause problems, depending on the type of configuration you are using.

I would advise opening a case with the TAC so that this can be worked on in more detail. If you want, you can send me the case number and I can help the engineer who picks it up by providing the information we have already collected.

Regarding the show dlsw circuit history detail output:

WAN halt-dl means that the other router, the DLSw peer, told us to close down this circuit, and the router complied successfully. So in this case the error is somewhere on the other end of the DLSw peer.
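To make that reading of the event list concrete, here is a small Python sketch that scans the "Last events" rows for the row where the circuit leaves CONNECTED and reports which side triggered it. The helper name and the assumed row layout (state, event source, event, additional info, next state, separated by whitespace) are illustrative guesses based on the output in this thread, and the hint text only restates the explanation above.

```python
def teardown_row(event_rows):
    """Return (source, event, next_state) for the first row leaving CONNECTED."""
    for row in event_rows:
        fields = row.split()
        # Assumed layout: state, source, event, add. info, next state.
        if len(fields) != 5:
            continue  # skips the header and separator lines
        state, source, event, _info, next_state = fields
        if state == "CONNECTED" and next_state != "CONNECTED":
            return source, event, next_state
    return None

# Only the case explained above is covered; other events need their own reading.
HINTS = {
    ("WAN", "halt-dl"): "remote DLSw peer asked to close the circuit; check the far end",
}

rows = [
    "CONNECTED WAN infoframe 0x0 CONNECTED",
    "CONNECTED WAN halt-dl 0x0 HALT_PENDING",
    "HALT_PENDING DLC DiscCnf 0x0 CLOSE_PEND",
    "CLOSE_PEND DLC CloseStnCnf 0x0 DISCONNECTED",
]
source, event, next_state = teardown_row(rows)
print(source, event, "->", next_state)
print(HINTS.get((source, event), "no hint recorded for this event"))
```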

I will respond separately to your second post.

Matthias

603980660 0060.9437.033d(04) 4848.2424.12e2(04) 172.20.51.231
Created at   : 09:15:23.070 KST Wed Dec 15 2004
Connected at : 09:15:23.250 KST Wed Dec 15 2004
Destroyed at : 20:42:27.593 KST Wed Dec 15 2004
Local Corr : 603980660 Remote Corr: 1358954590
Bytes: 770812317/781196859 Info-frames: 968574/980744
XID-frames: 5/4 UInfo-frames: 0/0
Flags: Local created, Remote connected
Last events:
Current State    Event            Add. Info  Next State
-------------------------------------------------------------------
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        WAN infoframe    0x0        CONNECTED
CONNECTED        DLC DataInd      0x0        CONNECTED
CONNECTED        ADM WanFailure   0x0        HALT_NOACK_PEND
HALT_NOACK_PEND  DLC DiscCnf      0x0        CLOSE_PEND
CLOSE_PEND       DLC CloseStnCnf  0x0        DISCONNECTED

I'm curious about the meaning of "HALT_NOACK_PEND".

An IBM engineer told me that there was a sequence number mismatch, so the host didn't accept the SNA traffic. But I'm not sure about that.

Because the DLSw peer status is stable...

Anyway, I removed the "dlsw load balancing" configuration yesterday.

But the symptom appeared again today...

What should I check to solve the problem?

Hi,

The event is ADM WanFailure, and the next state we go to is HALT_NOACK_PEND. This means the DLSw peer went away and the circuit was dropped because we are no longer able to talk to the remote peer. When we close a circuit under regular circumstances, we tell the DLSw peer to close its end of the connection and wait for an ack from the peer. A WanFailure, however, is so severe that we cannot communicate with the other end anymore. As a consequence, the router simply drops its end of the circuit and relies on a timeout, or on a WanFailure event on the remote peer, to clean up the remote end. We do not send any notification to the other peer, since the WAN is disrupted, and we do not wait for any ack from the other end. We simply clean up.
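The two teardown paths seen in this thread can be written down as a small transition table. The Python sketch below is only a model built from the rows that appear in the posted "Last events" output, not the router's actual state machine; the names TRANSITIONS and walk are invented for the illustration.

```python
# (current state, event) -> next state, copied from the posted history rows.
TRANSITIONS = {
    ("CONNECTED", "DLC DataInd"): "CONNECTED",
    ("CONNECTED", "WAN infoframe"): "CONNECTED",
    ("CONNECTED", "WAN ifcm"): "CONNECTED",
    ("CONNECTED", "WAN halt-dl"): "HALT_PENDING",        # peer asked us to close
    ("CONNECTED", "ADM WanFailure"): "HALT_NOACK_PEND",  # peer unreachable
    ("HALT_PENDING", "DLC DiscCnf"): "CLOSE_PEND",
    ("HALT_NOACK_PEND", "DLC DiscCnf"): "CLOSE_PEND",
    ("CLOSE_PEND", "DLC CloseStnCnf"): "DISCONNECTED",
}

def walk(events, state="CONNECTED"):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError for rows not seen here
        visited.append(state)
    return visited

print(walk(["ADM WanFailure", "DLC DiscCnf", "DLC CloseStnCnf"]))
```

Both paths end in DISCONNECTED through the same local DLC cleanup. The only difference is the first step, which records whether the close was requested by the peer (halt-dl) or forced by losing the peer (WanFailure).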

Again, I would advise opening a case with the TAC and working the issue in detail.

We need to see a show dlsw peer to check the uptime on all peers. We also need to see a show tech from the routers involved to understand the configuration details.

In addition, we need a real topology drawing, as detailed as possible, to understand how your physical connections are laid out.

You can also do a show dlsw circuit history detail on the remote peer router to see whether you can find the circuit there and to check what that router thinks happened.

The first thing is to understand your topology, so that we can be sure there is no looping condition that could cause problems.

Also, based on this show output, I would double-check the uptimes of the DLSw peers.

thanks...

Matthias

Thank you for helping.

I'm curious about the ADM WanFailure status.

Could you please tell me the reason for it?

And the peer never went away...

Hi,

As far as I can tell, the software is under the impression that the DLSw peer/TCP session is not available. Sorry, there is not much more information to go on. More troubleshooting is needed to understand what is going on, as I already tried to explain earlier.

thanks...

Matthias