
Datacenter troubleshooting guide - day 8

Gilles Dufour
Cisco Employee

"Datacenter troubleshooting guide” – a blog by  Gilles Dufour.

Day 8 -  Understanding me-stats (continued)

Today we continue our progression through the list of me-stats counters.

We have already covered the following MEs (microengines):

  • RX - packet buffering
  • FP - Fastpath - packet forwarding for existing connections
  • ICM - Inbound Connection Manager - L3 rule matching and inbound ACL

Today, let's start with the TCP microengine.

When traffic needs to be terminated on the ACE device (for inspection of content/data), ICM forwards the traffic to the TCP module to complete the 3-way handshake and handle all TCP options and requirements.

switch/Admin# show np 1 me-stats "-stcp -v"
TCP Statistics: (Current)
TCP RX messages received:                     18957             0
TCP RX unknown messages:                          0             0
TCP RX racing messages (fin):                     0             0
TCP RX racing messages (forward):                 0             0
TCP RX racing messages (conn create):             0             0
TCP TX messages received:                     90864             2
TCP TX Hi Priority messages received:             1             0
TCP TX unknown messages:                          0             0
TCP TX racing messages (connect):                 0             0
TCP TX racing messages (data):                    0             0
TCP TX racing messages (proxy):                   0             0
Reproxy message received:                         0             0
Data messages received:                           0             0
TCP connect message received:                     0             0
Ack trigger message received:                     0             0
Unproxy req. message received:                    0             0
Unproxy rsp. message received:                    0             0
TCP accepted msgs sent:                        6513             0
TCP connected msgs sent:                          0             0
Conn_ctrl msgs sent:                              1             0
Buffer alloc failed:                              0             0
Invalid msg ring id:                              0             0
Start retrans timer:                           6513             0
Start ackdelay timer:                             0             0
Start persist timer:                              0             0
Start timewait timer:                             0             0
Delete act timer:                                 0             0
Delete rtp timer:                              6513             0
Connections unproxying:                           0             0
Connections unproxying canceled by TCP:           0             0
Connections unproxying canceled by app:           0             0
Connections unproxying immediate reproxy          0             0
Connections unproxying flush retransq:            0             0
Connections unproxying flush inputq:              0             0
Connections unproxied:                            0             0
Connections reproxied:                            0             0
Drop reproxy msg queue full:                      0             0
Drop control msg:                                 0             0
Drops due to FastTX queue full:                   0             0
Drops due to Fastpath queue full:                 0             0
Drops due to HTTP queue full:                     0             0
Drops due to SSL queue full:                      0             0
Drops due to AI queue full:                       0             0
Drops due to Fixup queue full:                    0             0
Drops due to packet size exceed MSS:              0             0

Unproxy rsp post failed:                          0             0
Drops due to invalid proxy id:                    0             0
Drops due to UDP buffer share limit:              0             0
(Context ALL Statistics)
Handshakes completed:                          6513             0
Handshakes failed:                                0             0
Packets received to app:                       5931             0
Packets sent to network:                      12443             0
Segs outside window:                              0             0
ACK past SEQ:                                     0             0
Dup ACKs received:                                0             0
Dup ACK limit met:                                0             0
Malformed TCP options:                            0             0
Reassemble segs:                                  0             0
Nagled data segs:                                 0             0
Retransmitted data segs:                          0             0
Round-trip timeouts:                              0             0
Round-trip timeout limit met:                     0             0
Persist timeouts:                                 0             0
Persist timeout limit met:                        0             0
Ack delay timeouts:                               0             0
Timewait timeouts:                                0             0
Reassembly timeouts:                              0             0
Connection shutdown FIN:                          0             0
Connection shutdown RST:                          0             0
SYNs received:                                 6513             0
FINs received:                                    0             0
ACKs received:                                12443             0
RSTs received:                                    1             0
PSHes received:                                5929             0
SYNs transmitted:                              6513             0
FINs transmitted:                                 0             0
ACKs transmitted:                             12442             0
RSTs transmitted:                                 0             0
PSHes transmitted:                                0             0


The interesting counters here are the drop statistics: they tell you which microengine further up the path is slow to respond.
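If you collect this output regularly, a small script can pick out the drop counters for you. Here is a minimal sketch in Python, assuming you have saved the me-stats output as text. The function names are mine, not part of any Cisco tool, and since this post does not explain the second numeric column, the script just keeps both columns as they are.

```python
import re

# A counter line is a name, an optional colon, then two integer columns,
# e.g. "Drops due to HTTP queue full:      0      0".
COUNTER_RE = re.compile(r"^(?P<name>.+?):?\s+(?P<first>\d+)\s+(?P<second>\d+)\s*$")

def parse_me_stats(text):
    """Return {counter name: (first column, second column)} for every counter line."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line.strip())
        if match:
            counters[match.group("name").strip()] = (
                int(match.group("first")),
                int(match.group("second")),
            )
    return counters

def nonzero_drops(counters):
    """Keep only counters whose name mentions a drop and that are not all zero."""
    return {
        name: values
        for name, values in counters.items()
        if "drop" in name.lower() and any(values)
    }

# A few lines copied from the output above.
SAMPLE = """\
TCP RX messages received:                     18957             0
Drops due to HTTP queue full:                     0             0
Drops due to SSL queue full:                      0             0
Drops due to UDP buffer share limit:              0             0
"""

counters = parse_me_stats(SAMPLE)
print(nonzero_drops(counters))  # {} here: no drop counter is non-zero
```

An empty result is what you want to see. Any entry that shows up names the queue that is filling up.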

It is also important to note that when you terminate a connection on the ACE device for content inspection, each data packet will be kept in memory/buffers until we have all the data that we need to perform the required action.  To avoid the problem of one connection holding all the buffers there is a limit to the quantity of information one connection can hold.  This is the buffer-share limit or UDP buffer share limit.

The same buffers are used for TCP or UDP, so the term UDP here does not mean this is specific to UDP.  It should actually be IP buffers.

The default value is 32 KB.

That does not mean your application can't send more than 32 KB or that ACE will buffer 32 KB before forwarding your data.

When inspection is required, for example to find a cookie inside an HTTP request, ACE starts buffering until it detects the cookie.  If we reach the limit of 32 KB on a single connection, we start dropping packets for that connection.
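To make the behaviour concrete, here is a toy model in Python. It is only a sketch of the idea described above, not how the ACE code is written: the function name, the return values and the exact point at which the drop happens (when the next segment would push the connection over its share) are my assumptions.

```python
BUFFER_SHARE_LIMIT = 32 * 1024  # default buffer-share quoted above, in bytes

def buffer_until_found(segments, needle=b"Cookie:", limit=BUFFER_SHARE_LIMIT):
    """Hold segments until `needle` is seen or the per-connection limit is hit.

    Returns ("forward", held_bytes) once the needle is found,
    ("drop", held_bytes) when the next segment would exceed the limit,
    and ("waiting", held_bytes) if the data ran out before either happened.
    """
    held = b""
    for segment in segments:
        if len(held) + len(segment) > limit:
            return ("drop", len(held))
        held += segment
        if needle in held:
            return ("forward", len(held))
    return ("waiting", len(held))

# Short request: the cookie shows up early, so very little is buffered.
print(buffer_until_found([b"GET / HTTP/1.1\r\n", b"Cookie: a=b\r\n"]))

# Very long headers without the cookie: the connection hits its share.
print(buffer_until_found([b"X" * 1460] * 30))
```

The first call forwards after holding only a few bytes, which is why the 32 KB limit does not mean 32 KB of delay. The second call is the case where packets get dropped.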

If you have very long HTTP headers, it might be necessary to increase this limit.

This can be done with a parameter-map.

TCP Buffer Share configuration
switch/Admin# conf t
Enter configuration commands, one per line.  End with CNTL/Z.
switch/Admin(config)# parameter-map type connection buffers
switch/Admin(config-parammap-conn)# set tcp buffer-share ?
  <8192-262143>  Enter buffer-share size
switch/Admin(config-parammap-conn)# set tcp buffer-share

In a previous blog post, I talked about FP and ICM drops that occur when the buffer thresholds are reached.

When the system is running out of buffers, it starts dropping new connections to avoid congesting the box completely.

So, if you increase the number of buffers per connection with the above parameter-map, you could potentially hit the buffer threshold limit faster than normal.

It is therefore recommended to be extra cautious when playing with this command.

Unfortunately, there is no best value that would magically fit all networks.

If you haven't run into any buffer issues, simply do not change the default value.
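A bit of arithmetic shows why a larger buffer-share eats into the shared pool faster. The pool size below is purely hypothetical (this post does not give the real figure, and the real accounting may be in fixed-size buffers rather than bytes), so look at the ratio between the results and not at the absolute numbers.

```python
HYPOTHETICAL_POOL_BYTES = 64 * 1024 * 1024  # made-up pool size, for illustration only

def connections_to_exhaust(pool_bytes, buffer_share_bytes):
    """Worst case: how many connections, each holding its full share, fill the pool."""
    return pool_bytes // buffer_share_bytes

print(connections_to_exhaust(HYPOTHETICAL_POOL_BYTES, 32 * 1024))  # 2048
print(connections_to_exhaust(HYPOTHETICAL_POOL_BYTES, 64 * 1024))  # 1024
```

Doubling the buffer-share halves the number of slow connections needed to exhaust the same pool.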

After the TCP module comes the HTTP or SSL microengine.  I'll skip them for now and jump to the LB module.

The LB module makes the load-balancing decision.

Initially, LB was a module like the other MEs.  But the code size of the LB engine grew so large that we had to move it to the XScale processor that is also present on the IXP.

The good news is even if LB is not a module anymore, you can still collect stats using the same me-stats command.

switch/Admin# show np 1 me-stats "-slb -v"
LB Perf stats at address 0x82e05000
LB Perf stats at address 0x82e05000
LB Statistics
(Context ALL Statistics)
L4 LB Decisions:                                  0             0
L4 Rejected Conns:                                0             0
L7 LB Decisions:                               6561             0
L7 Rejected Conns:                                1             0

No Real Server:                                   0             0
No Policy:                                        0             0
No Policy Match:                                  0             0
Config Version Mismatch:                          0             0
ACL denied:                                       0             0
FT Invalid Id:                                    0             0
FT Idmap Lookup Failures:                         0             0
Proxy Close Drops:                                0             0
Misc Drops:                                       0             0
L4 Close Before Process:                          0             0
L7 Close Before Parse:                            0             0
Close For Valid Real:                             0             0
Close For Invalid Real:                        6560             0
Max Parse Len Rejects:                            0             0
L7 Parser Error Rejects:                          0             0

Out of Memory Rejects:                            0             0
Config Mismatch Rejects:                          0             0
HA Send Failure:                                  0             0
HA Packets Sent:                                  0             0
HA Entries Shared:                                0             0
HA Receive Chk Failures:                          0             0
HA Packets Received:                              0             0
HA Entries Dropped:                               0             0

Num Stolen For Reuse:                             0             0
Num Active Sticky Entry:                          2             0
Num Active Reverse Sticky Entry:                  0             0
Active Conn Count:                                0             0
Free Sticky Entry Count:                    1794109             0
Num Grp or Timeout Nodes:                         4             0
Static Entry List Count:                          2             0
Num Entry Configured:                       1794111             0
Prev Resources Req:                         1794111             0
Drop Max Remote Stky:                             0             0

RTSP sessions allocated:                          0             0
RTSP sessions failed:                             0             0
RTSP sticky entries added:                        0             0
SIP sessions allocated:                           1             0
SIP sessions failed:                              0             0
SIP sticky entries added:                         0             0

Free Proxy Mapping:                           32768             0
Alloc Proxy Mapping:                              1             0
Alloc Proxy Mapping Failed:                       0             0
Release Proxy Mapping:                            1             0

The first 4 counters are very informative.  They tell you what part of your traffic is L7 vs L4.
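If you want that split as a percentage, the calculation is simple. Here is a small sketch in Python using the two decision counters from the output above; the function name is mine.

```python
def lb_decision_share(l4_decisions, l7_decisions):
    """Return (L4 %, L7 %) of all LB decisions, or None if there were none."""
    total = l4_decisions + l7_decisions
    if total == 0:
        return None
    return (100 * l4_decisions / total, 100 * l7_decisions / total)

# "L4 LB Decisions" and "L7 LB Decisions" from the output above.
print(lb_decision_share(0, 6561))  # (0.0, 100.0)
```

On this box all the load-balanced traffic is L7.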

You can also see if connections were rejected.

The most common rejection reasons are also listed: "No Real Server" or "No Policy".

Further down you will find 2 more rejection reasons :

  • Max Parse Len Rejects   --> header/body parse length reached
  • L7 Parser Error Rejects --> content does not comply with the RFC

The Max Parse Len problem can be solved by increasing the header/body parse length with a parameter-map.

(Do not confuse this with the buffer-share: the buffer-share is a limit at the TCP level, while the max parse length is a limit at the application level.)

Changing MAX Parse Length
switch/Admin#  conf t
Enter configuration commands, one per line.  End with CNTL/Z.
switch/Admin(config)# parameter-map type http PARSE-LEN
switch/Admin(config-parammap-http)# set ?
  content-maxparse-length      Configure content maxparse length
  header-maxparse-length       Configure header maxparse length
switch/Admin(config-parammap-http)# set header-maxparse-length ?
  <1-65535>  Enter max-parse length for header

Again, there is no magic value here, and if you do not have parse length errors, there is no need to change the default settings.

Should you need to increase the value, increase it slowly (in 1k steps) to avoid running into memory issues.

The last interesting counter is "Num Stolen For Reuse", which tracks the number of times we ran out of free sticky entries.

When you don't have enough sticky resources, ACE will "reuse" old entries to save the new sticky information.

This can go unnoticed, or it can cause users to lose their stickiness to the same server.

I covered this in "DAY 5" of my blog, together with the max remote sticky counter.
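Here is a toy model of what "stolen for reuse" means, written in Python. It is a sketch only: the class and method names are mine, and taking the oldest entry first is my assumption (the real selection logic is not described in this post).

```python
from collections import OrderedDict

class StickyTable:
    """Fixed-size sticky table that reuses the oldest entry when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # sticky key -> real server, oldest first
        self.stolen_for_reuse = 0     # plays the role of "Num Stolen For Reuse"

    def lookup(self, key):
        """Return the server stuck to this key, or None if there is no entry."""
        return self.entries.get(key)

    def insert(self, key, server):
        """Store a sticky entry, stealing the oldest one if no free entry is left."""
        if key in self.entries:
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # the oldest entry is reused
            self.stolen_for_reuse += 1
        self.entries[key] = server

table = StickyTable(capacity=2)
table.insert("client-a", "server1")
table.insert("client-b", "server2")
table.insert("client-c", "server1")  # table is full: client-a's entry is stolen
print(table.lookup("client-a"))      # None
print(table.stolen_for_reuse)        # 1
```

If client-a was already gone, nobody notices. If client-a comes back, it is load balanced again and may land on a different server.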

Thank you all for following this blog.

Gilles Dufour.
