
FTP 3 way handshake not completing for just a single host!

DAZN_Network
Level 1

Hi,

I have a problem with our ACE load balancers. We run a public FTP server farm which is load balanced using the ACEs. I have come across a problem which is very peculiar and is only affecting a single host, one of our offices in Poland.

Basically, the servers support Passive and Active FTP. We have clients that access the FTP service every other second so we know the existing configuration works just fine. However, our office in Poland, which sits behind a public NAT on a firewall, is unable to access our FTP service.

I have taken good and bad traces and noticed that the final TCP ACK from the client, which should complete the handshake, never makes it past the ACE. The ACE shows me the connection in SYNSEEN and SYNACK, but it never makes it to the ESTABLISHED state.

PERSLOACE1/FRONT_END# show conn address x.x.x.x netmask 255.255.255.255 detail | i :21

1770432    1  in  TCP   201  x.x.x.x:30951   87.83.27.52:21        SYNSEEN

934881     1  out TCP   200  y.y.y.y:21        x.x.x.x:30951          SYNACK
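For anyone scanning a longer `show conn` dump for the same symptom, here is a minimal Python sketch that picks out entries still stuck in a half-open state. It assumes the eight-column layout seen in the two lines above (conn-id, np, direction, protocol, vlan, source, destination, state). The function name and the sample addresses (documentation ranges) are made up for illustration.

```python
HALF_OPEN_STATES = {"SYNSEEN", "SYNACK"}


def half_open_connections(show_conn_output):
    """Return (source, destination, state) for entries that never completed the handshake."""
    stuck = []
    for line in show_conn_output.splitlines():
        fields = line.split()
        # Assumed layout: conn-id, np, dir, proto, vlan, source, destination, state
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        source, destination, state = fields[5], fields[6], fields[7]
        if state in HALF_OPEN_STATES:
            stuck.append((source, destination, state))
    return stuck


# Made-up sample in the same shape as the output above.
sample = """
1770432    1  in  TCP   201  192.0.2.10:30951   87.83.27.52:21        SYNSEEN
934881     1  out TCP   200  198.51.100.5:21    192.0.2.10:30951      SYNACK
"""

for entry in half_open_connections(sample):
    print(entry)
```

Running it repeatedly against fresh output would show whether the same client ports stay half-open or whether new attempts keep appearing.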

The ACE is carved into contexts and the same is seen on all of them.

The office in Poland can access other FTP sites on the Internet (e.g. Mozilla), so we know the problem is not localised there. Also, various customers access our FTP service, so we know there is nothing wrong there either.

I have spent hours trying to find related issues on the Internet but haven't found any!

In looking at my traces, the only difference I can see is that healthy packets have the DF bit set on the IP header, whereas packets from Poland do not. Could it be related? Something to do with fragmentation and/or normalisation?
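To double-check the DF observation outside of a packet viewer, the flag can be read straight from the raw IPv4 header: bytes 6 and 7 carry three flag bits followed by the 13-bit fragment offset. The sketch below is only a way to confirm the difference between good and bad packets; it does not show that DF is the cause. The helper names and the sample header are hypothetical.

```python
import struct

DF_MASK = 0x4000      # "Don't Fragment" flag
MF_MASK = 0x2000      # "More Fragments" flag
OFFSET_MASK = 0x1FFF  # 13-bit fragment offset, counted in 8-byte units


def ipv4_fragment_info(ip_header):
    """Decode DF, MF and the fragment offset from a raw IPv4 header (bytes)."""
    if len(ip_header) < 20 or ip_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    (flags_and_offset,) = struct.unpack("!H", ip_header[6:8])
    return {
        "df": bool(flags_and_offset & DF_MASK),
        "mf": bool(flags_and_offset & MF_MASK),
        "fragment_offset_bytes": (flags_and_offset & OFFSET_MASK) * 8,
    }


def make_header(flags_and_offset):
    """Build a 20-byte IPv4 header for illustration only (checksum left at zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40, 1, flags_and_offset, 64, 6, 0,
        bytes([192, 0, 2, 10]), bytes([87, 83, 27, 52]),
    )


print(ipv4_fragment_info(make_header(0x4000)))  # header with DF set
print(ipv4_fragment_info(make_header(0x0000)))  # header with DF clear
```

If the packets from Poland also showed MF set or a non-zero offset, that would point at actual fragmentation rather than just a cleared DF bit.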

When issuing the show serverfarm command, I can see the failure counters incrementing, but I am unable to tie them to a cause. The 'show np' outputs are not very clear.

PERSLOACE1/FRONT_END# show serverfarm Download_FTP detail

serverfarm     : Download_FTP, type: HOST

total rservers : 3

active rservers: 3

description    : -

state          : ACTIVE

predictor      : ROUNDROBIN

failaction     : -

back-inservice    : 0

partial-threshold : 0

num times failover       : 0

num times back inservice : 0

total conn-dropcount : 0

Probe(s) :

    FTP_DL,  type = FTP

---------------------------------

                                                ----------connections-----------

       real                  weight state        current    total      failures

   ---+---------------------+------+------------+----------+----------+---------

   rserver: FTP Server 1

       x.x.x.x:0         8      OPERATIONAL  13         83477      15

         description          : -

         max-conns            : -         , out-of-rotation count : -

         min-conns            : -

         conn-rate-limit      : -         , out-of-rotation count : -

         bandwidth-rate-limit : -         , out-of-rotation count : -

         retcode out-of-rotation count : -

         load value           : 0

   rserver: FTP Server 2

       x.x.x.x:0         8      OPERATIONAL  17         1269       10

         description          : -

         max-conns            : -         , out-of-rotation count : -

         min-conns            : -

         conn-rate-limit      : -         , out-of-rotation count : -

         bandwidth-rate-limit : -         , out-of-rotation count : -

         retcode out-of-rotation count : -

         load value           : 0

   rserver: FTP Server 3

       x.x.x.x:0         8      OPERATIONAL  21         2378       23

         description          : -

         max-conns            : -         , out-of-rotation count : -

         min-conns            : -

         conn-rate-limit      : -         , out-of-rotation count : -

         bandwidth-rate-limit : -         , out-of-rotation count : -

         retcode out-of-rotation count : -

         load value           : 0
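To tell whether the failures counters move while the Poland office is retrying, two snapshots of this output can be compared. The following is a rough Python sketch that assumes the layout shown above: an `rserver:` line followed by a counter row of address, weight, state, current, total and failures. The function names and sample values are made up.

```python
import re

RSERVER_RE = re.compile(r"^\s*rserver:\s*(.+?)\s*$")
# Assumed counter row: <ip:port> <weight> <state> <current> <total> <failures>
COUNTERS_RE = re.compile(r"^\s*(\S+:\d+)\s+(\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def failure_counts(show_serverfarm_output):
    """Map each rserver name to its 'failures' counter."""
    counts = {}
    current_name = None
    for line in show_serverfarm_output.splitlines():
        name_match = RSERVER_RE.match(line)
        if name_match:
            current_name = name_match.group(1)
            continue
        row = COUNTERS_RE.match(line)
        if row and current_name is not None:
            counts[current_name] = int(row.group(6))
            current_name = None
    return counts


def failure_deltas(before, after):
    """Failures added between two snapshots, for rservers present in both."""
    old, new = failure_counts(before), failure_counts(after)
    return {name: new[name] - old[name] for name in new if name in old}


# Made-up snapshots in the same shape as the output above.
before = """
   rserver: FTP Server 1
       10.0.0.1:0         8      OPERATIONAL  13         83477      15
"""
after = """
   rserver: FTP Server 1
       10.0.0.1:0         8      OPERATIONAL  14         83490      18
"""
print(failure_deltas(before, after))
```

A delta that tracks the number of attempts from the affected client would support the idea that those attempts are what is being counted as failures.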

Here is the config below:

probe ftp FTP_DL

  description FTP Probe

  interval 60

  passdetect interval 60

  expect status 220 220

rserver host FTP Server 1

  ip address x.x.x.x

  inservice

rserver host FTP Server 2

  ip address x.x.x.x

  inservice

rserver host FTP Server 3

  ip address x.x.x.x

  inservice

serverfarm host Download_FTP

  probe FTP_DL

  rserver FTP Server 1

    inservice

  rserver FTP Server 2

    inservice

  rserver FTP Server 3

    inservice

sticky ip-netmask 255.255.255.255 address both FTP_DL

  timeout 20

  replicate sticky

  serverfarm Download_FTP

class-map match-any FTP_DL

  2 match virtual-address 87.83.27.52 tcp eq ftp

  3 match virtual-address 87.83.27.52 tcp eq ftp-data

  4 match virtual-address 87.83.27.52 any

policy-map type loadbalance first-match FTP_DL

  class class-default

    sticky-serverfarm FTP_DL

policy-map type loadbalance first-match FTP_DL_Active

  class class-default

    sticky-serverfarm FTP_DL

policy-map multi-match FTP_Download

  class FTP_DL

    loadbalance vip inservice

    loadbalance policy FTP_DL

  class FTP_DL_Active

    loadbalance vip inservice

    loadbalance policy FTP_DL_Active

    inspect ftp

interface vlan 201

  description ACE-FW Vlan for FWLB

  ip address "gw ip' x.x.x.x /y'

  alias x.x.x.x /y

  peer ip address x.x.x.x /y

  mac-sticky enable

  access-group input ANY

  access-group output ANY

   service-policy input FTP_Download

What is happening??!  Please help. Happy to provide more show outputs

Many thanks

5 Replies

Kanwaljeet Singh
Cisco Employee

Hi,

So it seems the client's ACK is not making it to the ACE or to the servers. Do you have pcaps showing that the ACK reached the ACE and that the ACE did not forward it to the server?

If the ACE is forwarding the SYN, I am not sure why it would have a problem with the ACK.

Normalization could have caused a problem if the ACE had seen the ACK without seeing the corresponding SYN, so this doesn't look like a normalization issue.

Can you take simultaneous front-end and back-end pcaps that establish that the ACK was delivered to the ACE but the ACE did not forward it to the server?
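Once both captures exist, the comparison itself is simple. Here is a minimal Python sketch, assuming each capture has already been reduced to a list of client-side packets as `(client_ip, client_port, tcp_flags)` tuples. That representation, the function name and the sample data are all assumptions for illustration. The VIP and real-server addresses are left out on purpose, since the load balancer rewrites the destination.

```python
from collections import Counter


def missing_on_backend(front_packets, back_packets):
    """Return client packets seen on the front end that never appear on the back end."""
    remaining = Counter(back_packets)
    missing = []
    for packet in front_packets:
        if remaining[packet] > 0:
            remaining[packet] -= 1  # matched one back-end copy of this packet
        else:
            missing.append(packet)
    return missing


client = "192.0.2.10"  # made-up client address
front = [(client, 30951, "S"), (client, 30951, "A")]
back = [(client, 30951, "S")]

print(missing_on_backend(front, back))
```

A result that contains only the final ACK would match what Simon describes: the SYN crosses the ACE, the ACK does not.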

Regards,

Kanwal

Hi Kanwal

Thanks for your input. Yes, I have taken traces in both the incoming vlan (inbound) and outgoing vlan (inbound) and I can see the client ACK being lost in the ACE somewhere. The packet needs to switch contexts to make it from the client (perimeter side) to the server (back end side). Could this be related?

As you can see from the show serverfarm output, the failures count is incrementing so I am convinced the ACE is dropping the ACK for some unknown reason.

Regards,

Simon

Hi Simon,

Inter-context traffic could be a problem, but it is working for every other host and failing only for this one (I assume you have proper routing in place for inter-context communication), correct?

I would suggest opening a TAC case for detailed investigation.

Regards,

Kanwal

Hi Kanwal,

Yes, routing is correct, as the initial SYN and SYN-ACK packets make it correctly to the end hosts.

I am waiting for our support contracts to be mapped onto our account so I can't open a TAC case yet, but if there are any more suggestions, especially with show command outputs, they'd be welcome.

Thanks for your input

Regards,

Simon

Hi Simon,

The Connections Failures counter for a real server in a server farm may increment for one of the following reasons:

  • SYN timeout (the three-way handshake fails to complete)
  • RST received (a client sends an RST to the server)
  • Internal exception (internal software issue)
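The first two causes can be told apart from a trace; the third cannot. As a rough illustration (this is not how the ACE itself counts failures), a connection attempt can be classified from the ordered TCP flags seen in a capture. The function name and the input format are hypothetical.

```python
def classify_handshake(segments):
    """Classify a connection attempt from an ordered list of (sender, flags).

    sender is "client" or "server"; flags is a set such as {"SYN", "ACK"}.
    """
    syn_seen = synack_seen = False
    for sender, flags in segments:
        if "RST" in flags:
            return "reset by " + sender
        if sender == "client" and flags == {"SYN"}:
            syn_seen = True
        elif sender == "server" and flags == {"SYN", "ACK"} and syn_seen:
            synack_seen = True
        elif (sender == "client" and "ACK" in flags
              and "SYN" not in flags and synack_seen):
            return "established"
    if synack_seen:
        return "incomplete: final ACK never seen"
    if syn_seen:
        return "incomplete: no SYN-ACK seen"
    return "no SYN seen"


# The pattern reported in this thread, as seen behind the ACE.
attempt = [("client", {"SYN"}), ("server", {"SYN", "ACK"})]
print(classify_handshake(attempt))
```

An attempt that ends as "incomplete: final ACK never seen" corresponds to the SYN timeout case in the list above.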

Now, the strange thing is that this is happening for only one host, and you have a pcap which shows that the client ACK was sent to the ACE.

Can you send me the front end as well as backend pcap which shows that SYN from ACE made it to the backend but client ACK didn't?

Is it possible for you to take an ACE backplane capture while the issue is occurring and send it to me for analysis?

Please also grab two instances of show tech from the affected context, 5 minutes apart, while the issue is occurring.

Please indicate which IPs I need to look at, such as the client IP, the VIP, etc.

I will have a look and see what comes out. I have a feeling that this might need to go to development, and without a TAC case that won't be possible. If we do see the ACK reaching the ACE and getting lost there, we will need all of the information requested above so that we can open a TAC case and go to development for analysis.

But let's see.

You can also try taking a capture on the ACE itself. One for a working client and one for the non-working client would be very helpful.

Regards,

Kanwal
