TLS CLOSEWAIT problem - can not connect to router

Hi,

We have a onePK application that usually works fine, but we are now unable to connect to the router. It seems the router does not fully clean up its connection state. As shown below, one connection attempt failed with an error, and there are now many TCP connections (port 15002, used for TLS) stuck in the CLOSEWAIT state.

It seems a reboot of the router is needed to get back to a normal state. Is this a known problem?

R2#show onep session all

R2#show onep statistics

Active sessions: 0

Established sessions: 18

Total session disconnects: 18

  Admin initiated disconnects: 0

  Remote disconnects: 0

  Error disconnects: 18

Total errors: 1

  Authentication errors: 0

  Duplicate application name error 1

  Memory errors 0

  Internal errors 0

Rate limiting:

  Total TCP connects: 37

  Rejected connects: 0

  Accepted connects: 0

  Unaffected connects: 37

Most recent failed connection attempts:

Connection #1 attempted Sun Sep 21 08:48:49 2014

  Remote host: 20.5.2.242

  Reason: Internal system error, API Channel failed to transition to Connecting state for session test.app-UCS-E-R2-9454

  Reason code: 0

  Connection sequence number: 37

R2#

R2#show tcp brief

TCB       Local Address               Foreign Address             (state)

21DD9EC8  20.5.2.241.15002           20.5.2.242.45802            CLOSEWAIT

C195FFDC  20.5.2.241.23              20.5.2.242.58036            ESTAB

3DD524E8  20.5.2.241.15002           20.5.2.242.45803            CLOSEWAIT

21E3D0E4  20.5.2.241.15002           20.5.2.242.45805            CLOSEWAIT

41158A64  20.5.2.241.15002           20.5.2.242.45804            CLOSEWAIT

40CD3424  20.5.2.241.15002           20.5.2.242.45800            CLOSEWAIT

C01E14A8  20.5.2.241.15002           20.5.2.242.45806            CLOSEWAIT

R2#

R2#show onep status

Status: enabled by: Config

Version: 1.2.0

Transport: tls; Status: running; Port: 15002; localcert: TP-self-signed-3937507470; client cert validation disabled

Certificate Fingerprint SHA1: 90F9692E 942D0DD4 274D7632 EDAC0467 5AE43F70

Transport: tipc; Status: disabled

Session Max Limit: 10

CPU Interval: 0 seconds

CPU Falling Threshold: 0%

CPU Rising Threshold: 0%

History Buffer: Enabled

History Buffer Purge: Oldest

History Buffer Size: 32768 bytes

History Syslog: Disabled

History Archived Session: 16

History Max Archive: 16

Trace buffer debugging level is info

Service Set: Base               State: Enabled     Version 1.2.0

Service Set: Vty                State: Disabled    Version 0.1.0

Service Set: Mediatrace         State: Disabled    Version 1.0.0

R2#

R2#show version

Cisco IOS Software, C2900 Software (C2900-UNIVERSALK9-M), Version 15.4(2)T, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2014 by Cisco Systems, Inc.

Compiled Wed 26-Mar-14 14:14 by prod_rel_team

ROM: System Bootstrap, Version 15.0(1r)M16, RELEASE SOFTWARE (fc1)

R2 uptime is 2 weeks, 4 days, 18 hours, 14 minutes
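
For anyone who wants to track this over time, the `show tcp brief` output above can be tallied with a few lines of Python. This is only a sketch: the function name `count_states` is made up for illustration, and it assumes the four-column layout shown in this post (TCB, local address, foreign address, state) with the port as the last dot-separated part of the local address.

```python
from collections import Counter

def count_states(show_tcp_brief: str, local_port: int) -> Counter:
    """Count TCP states for connections whose local port matches local_port.

    Assumes each data line has exactly four whitespace-separated fields,
    as in the output pasted in this thread. Other lines (prompts, blank
    lines, the column header) are skipped.
    """
    counts = Counter()
    for line in show_tcp_brief.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # header, prompt, or blank line
        _tcb, local, _foreign, state = fields
        # IOS prints address and port joined by a dot, e.g. 20.5.2.241.15002
        port = local.rsplit(".", 1)[-1]
        if port == str(local_port):
            counts[state] += 1
    return counts
```

Fed the output from this post, `count_states(text, 15002)` would report six CLOSEWAIT entries, and the Telnet session on port 23 is ignored.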

Hall of Fame Cisco Employee

Re: TLS CLOSEWAIT problem - can not connect to router

You mention that this is only one application.  Do you have other applications not exhibiting this problem?  How do you close the onePK application session in your code?

Re: TLS CLOSEWAIT problem - can not connect to router

Hi Joseph,

The router has not been rebooted yet, and in the current state no application can connect to the router, including tutorial applications. When an application is now started, the TCP connection is established (by TLS) as I can see both on the PC (netstat) and the router (show tcp), but nothing more happens. When the application is then terminated, the connection is cleaned up on the PC as it should, while on the router the TCP connection enters the CLOSEWAIT state, where it remains.

The onePK application session is terminated in different ways, including externally with Ctrl-C. It might be that some edge case has been triggered somehow. However, one would expect that no matter how the session is terminated, the router would clean up the resources and get back into normal operation?

Best regards

Viktor
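
On the application side, one way to make sure the session is always torn down, even on Ctrl-C, is to wrap it in a context manager. The sketch below is generic and does not use the onePK SDK: `connect` and `disconnect` are hypothetical callables standing in for whatever the application uses to open and close its session. It does not address the router-side cleanup discussed above; it only narrows down which termination paths skip a clean disconnect.

```python
from contextlib import contextmanager

@contextmanager
def managed_session(connect, disconnect):
    """Open a session and guarantee disconnect() runs on every exit path.

    connect:    hypothetical callable that returns a session object
    disconnect: hypothetical callable that takes that session and closes it
    """
    session = connect()
    try:
        yield session
    finally:
        # Runs on normal exit, on exceptions, and on KeyboardInterrupt
        # (Ctrl-C). It cannot run if the process is killed with SIGKILL.
        disconnect(session)
```

Usage would look like `with managed_session(open_fn, close_fn) as s: ...`. A Ctrl-C inside the `with` body still propagates as `KeyboardInterrupt`, but only after `disconnect` has been called.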

Hall of Fame Cisco Employee

Re: TLS CLOSEWAIT problem - can not connect to router

Yes, I agree.  The router should not be seeing stale TCP sessions.  However, if you can determine what type of disconnect causes this, it will be easier to reproduce and file a bug.

Re: TLS CLOSEWAIT problem - can not connect to router

Hi Joseph,

I think it would be reasonable for someone to look at the code running on the router that is relevant to the error message in the first post, that is:

Most recent failed connection attempts:

Connection #1 attempted Sun Sep 21 08:48:49 2014

  Remote host: 20.5.2.242

  Reason: Internal system error, API Channel failed to transition to Connecting state for session test.app-UCS-E-R2-9454

  Reason code: 0

  Connection sequence number: 37

More specifically, it would help to see what the code does when the API Channel tries to transition to the Connecting state but fails, in other words, whether the required cleanup is performed in that case.

The error message seems to indicate a failure between TCP connection establishment and higher-level (TLS / internal) connection establishment. That failure might be what subsequently causes the router to stop accepting connections from onePK applications.

Best regards

Viktor

Hall of Fame Cisco Employee

Re: TLS CLOSEWAIT problem - can not connect to router

It does appear that the same cleanup may not be done in all cases when this error occurs.  Can you confirm that the stale sessions always come about with this error?

Re: TLS CLOSEWAIT problem - can not connect to router

I am not sure whether the stale TCP sessions always come about with this error or not. Based on the time of the error, it seems plausible that they are related. We will follow up if and when something similar shows up again. Please let us know if this is somehow identified in the code and fixed in an upcoming version.

Best regards

Viktor

Beginner

Re: TLS CLOSEWAIT problem - can not connect to router

Please also collect 'show onep trace/error' and/or 'debug onep server session level debug' log for more clues.

It seems the client socket is closed abruptly, leaving sockets in CLOSEWAIT on the server. The number of outstanding CLOSEWAIT sockets looks small, so it is a little surprising that a limit has been exhausted and the system can no longer spawn ephemeral ports. We need to check whether shutdown is used instead of close when session I/O fails. Please raise a bug to track this.
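
As background for the shutdown/close question: a TCP socket sits in CLOSE_WAIT when the peer has sent its FIN but the local application has not yet closed its own end. The sketch below illustrates the general pattern with ordinary Python sockets. It is not the IOS implementation, and the names `close_quietly` and `serve_session` are made up for illustration. The idea is that every exit path from the session loop, including I/O errors, ends in a close.

```python
import socket

def close_quietly(conn):
    """Shut down both directions if possible, then always release the socket."""
    try:
        conn.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # already disconnected; close() below still frees the socket
    finally:
        conn.close()

def serve_session(conn):
    """Echo data until the peer closes or I/O fails; return bytes echoed."""
    handled = 0
    try:
        while True:
            data = conn.recv(4096)
            if not data:
                # Peer sent FIN. If we returned here without closing,
                # this end would stay in CLOSE_WAIT.
                break
            conn.sendall(data)
            handled += len(data)
    except OSError:
        pass  # session I/O failed; fall through to cleanup
    finally:
        close_quietly(conn)
    return handled
```

Calling shutdown alone signals the peer but does not release the descriptor, which is why the sketch follows it with close in all cases.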

Hall of Fame Cisco Employee

Re: TLS CLOSEWAIT problem - can not connect to router

I filed CSCur07539 to track this pending the additional information from Viktor.

Re: TLS CLOSEWAIT problem - can not connect to router

OK, fine. Thanks Joseph.

Best regards

Viktor

Hall of Fame Cisco Employee

Re: TLS CLOSEWAIT problem - can not connect to router

Viktor, can you post the show and debug output Atul requested when you see this problem?  Thanks.

Beginner

Re: TLS CLOSEWAIT problem - can not connect to router

My hunch is that CLOSEWAIT might not be the real issue here. From the error message, it seems to be complaining about an error at the onePK state machine level.

I would like to understand in more detail how this problem state was reached, and whether the scenario is repeatable.

What is intriguing is why no other app is able to connect.

Debug logs will surely help provide more clues, please.

By the way, you might want to try "conf t; no onep" instead of rebooting the device to get out of the problem state.

Re: TLS CLOSEWAIT problem - can not connect to router

Yes, we will try to collect more information. We have not seen this error since we reported it. The router was rebooted, which as expected resolved the problem. Next time we will try "conf t; no onep" as suggested.

To speculate, it might be possible to provoke the failure by terminating the app/session after TCP connection establishment, but before TLS reaches the connected state, but we have not tried that so far.

Best regards

Viktor
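
A minimal way to try the speculated scenario is to open a plain TCP connection to the onePK TLS port and close it before any TLS handshake starts. This is an untested idea sketched with standard Python sockets. The function name `connect_and_abort` and its parameters are made up for illustration, and there is no evidence yet that this reproduces the failure.

```python
import socket
import time

def connect_and_abort(host, port, hold_seconds=0.0, timeout=5.0):
    """Complete the TCP handshake, send nothing, then close.

    No TLS ClientHello is ever sent, so the server sees a connection
    that ends between TCP establishment and TLS establishment.
    Returns the local (ephemeral) port, which can be matched against
    the Foreign Address column of 'show tcp brief' on the router.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        local_port = sock.getsockname()[1]
        time.sleep(hold_seconds)  # optionally keep the idle connection open
    finally:
        sock.close()
    return local_port
```

For the router in this thread the call would be `connect_and_abort("20.5.2.241", 15002)`, followed by `show tcp brief` and `show onep statistics` on the router to see whether the connection is cleaned up.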

Beginner

Re: TLS CLOSEWAIT problem - can not connect to router

Thanks Viktor. Could you please tell me which image version and SDK version you are using?

Regards,

Atul.

Re: TLS CLOSEWAIT problem - can not connect to router

The SDK for development was sdk-c64-1.2.1.194. The router image was listed in the initial post for this thread:

R2#show version

Cisco IOS Software, C2900 Software (C2900-UNIVERSALK9-M), Version 15.4(2)T, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2014 by Cisco Systems, Inc.

Compiled Wed 26-Mar-14 14:14 by prod_rel_team

ROM: System Bootstrap, Version 15.0(1r)M16, RELEASE SOFTWARE (fc1)

R2 uptime is 2 weeks, 4 days, 18 hours, 14 minutes

Best regards

Viktor
