Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Philip,

It has to be done on the server itself; changing the server's default gateway to the ACE IP address should work here.

Regards,
Siva

Beginner

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Is the ACE load balancer capable, or will it become capable, of handling the WebSocket protocol as defined in RFC 6455?

How should stickiness be handled in this case? My own lab experiments show that IP-based stickiness works with ACE software version A4(1.0), but session-ID-based stickiness is not possible.

Cisco Employee

Re: Ask the Expert: Understanding and Troubleshooting ACE Loadba

Hi,

Thanks for your question.

There are no immediate plans to support WebSocket on ACE, and no roadmap is available yet. From previously documented cases and from cases I have handled myself, one requirement stands out as very important for WebSocket traffic: stickiness.

WebSocket requires stickiness so that all connections from a single user stick to one server. This is particularly effective (and sometimes strictly necessary) when the application requires user authentication, as otherwise traffic would bounce between two or more servers.

The type of stickiness that you would implement depends entirely on your network requirements.

Since ACE has no specific knowledge of the WebSocket protocol, it cannot do deeper protocol inspection, but it seems to work for generic connection-based Layer 3 and Layer 4 load balancing, which I believe you have already tested in your lab.
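For reference, source-IP stickiness of the kind that worked in your lab can be sketched roughly as below. This is a minimal, untested sketch: the names WS-FARM, WS-STICKY and WS-LB-POLICY and the 60-minute timeout are placeholders made up for illustration, not taken from any real configuration.

```
! Hypothetical names: WS-FARM (serverfarm), WS-STICKY (sticky group),
! WS-LB-POLICY (load-balance policy). Adjust to your own context.

! Stick on the full client source IPv4 address
sticky ip-netmask 255.255.255.255 address source WS-STICKY
  ! Sticky entry lifetime in minutes (illustrative value)
  timeout 60
  serverfarm WS-FARM

! Reference the sticky group instead of the serverfarm in the L7 policy
policy-map type loadbalance first-match WS-LB-POLICY
  class class-default
    sticky-serverfarm WS-STICKY
```

Because this only looks at the client source address, it does not depend on ACE understanding anything inside the WebSocket session.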

You can also get in touch with your internal Cisco contact and share the use case and further details so they can assist with your requirement.

Regards,

Siva

Beginner

Re: Ask the Expert: Understanding and Troubleshooting ACE Loadba

Hi Siva,

Good to see the ACE discussion in the Experts Corner. My query is whether there is any permanent fix for CSCsz65679, which causes the ACE-20 to crash a couple of times a year. I have noticed that neither an RMA nor an image upgrade fixes the problem. One of our customers had tens of ACE-20s, and neither RMAs nor upgrades fixed the 'NP Control Store Parity Error'; so far they have observed around 10 ACE-20 crashes in total, on different modules, over 3 years. The upgrades only reduce the crash frequency, probably because of the explicit reload during an upgrade, which refreshes all the buffers.

I believe this might be an issue with the ACE-20 architecture? Similar issues have not been observed on the ACE-30.

Regards,

Akhtar

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Akhtar,

Thanks for your question.

Sorry that your customer has had to experience so many crashes due to the parity issue.

First, let me explain a few things about SRAM. The SRAM parity error reported in the core file is not due to a software issue. It is the result of a "bit-flip" within the SRAM itself, which can occur as a result of environmental conditions. This "bit-flip" is cleared by a simple reboot of the system, which occurs when the core file is generated. Our testing has shown that these types of issues occur with very low frequency. If a particular module experiences a significantly higher failure rate and you are running a version that has all the available workarounds for CSCsz65679, then a proactive RMA could be in order.

The ACE20 is susceptible to this because it uses SRAM to store control information and packet data rather than as scratch-pad storage. Almost any 1-bit flip will be detected as a parity error.

Unfortunately, SRAMs are very sensitive to light, dust, radiation, shock, temperature, and so on, so it is possible to get an SRAM parity error on a healthy ACE.

You are right about the ACE30: neither the ACE4710 nor the ACE30 is affected by these issues, as their design does not use SRAM or Nitrox.

Also note that we have an EOL notice for ACE10/20:

http://www.cisco.com/en/US/prod/collateral/modules/ps2706/end_of_life_c51-674430.html

Regards,

Siva

Beginner

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva

I have two ACE modules, and the standby one reloaded suddenly. How can I find out the cause?

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi,

Thanks for your question.

I understand the standby ACE had an unexpected reload. Do you see any crash information under "dir core:" after the reload? If so, please send me those files so I can determine the reason for the reload.

Otherwise, you can raise a TAC case and attach the following information so we can determine the root cause.

1- 'show tech' on the switch

2- 'show tech' on the Admin context on the ACE

3- Logs on the switch covering the period when the reload happened.

4- Crash files from the ACE located under "dir core:"
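As a rough sketch, the items above map to commands along these lines. This assumes the module sits in a Catalyst 6500 chassis; capture the output to a file and attach it to the case.

```
! On the ACE, in the Admin context
dir core:
show tech-support

! On the switch (supervisor)
show tech-support
show logging
```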

Let me know if you have any questions.

Regards,

Siva

Beginner

Re: Ask the Expert: Understanding and Troubleshooting ACE Loadba

Hi Siva

I attached the required files, but regarding the crash info, I didn't find any crash info for the reload date (18 Aug 2012, 11:41 PM).

Thanks Siva

Best Regards

Mohamed Abd EL Razik

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi,

Thanks for providing the data.

This looks like a silent reboot in which the SUP initiated the reload.

However, the information doesn't really explain why it happened. Silent reboots are tricky, as they don't leave much data to work with.

Here is the defect that we logged to track the silent reboot. Most likely a SW upgrade will be necessary, as a few bugs related to silent reloads have been fixed in A2(3.3) (the current version is A2(3.5)); after upgrading, monitor the device.

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCsy91540

There is an action plan to determine whether this was related to L7 traffic or to management traffic (ANM, XML, SNMP, ...) that may have been exhausting resources on the ACE and caused the reload.

I can send you the detailed action plan via PM if required.

Let me know if you have any questions.

Regards,
Siva

Beginner

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva,

Just a note on the versions available. We recently appear to have run into the following bug and had to downgrade to version A2(3.3), as removing our HTTP health probes did not seem like a workable solution for us.

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtz47825

Once downgraded, the paired modules stabilised (they no longer reloaded continuously). Both modules had been in this state.

Just thought I would provide some input.

Thanks.

Paul

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Paul,

Thanks for your question.

It's good to know that the devices are stable now after the downgrade to A2(3.3), and I was able to track down the TAC case you recently opened for this issue.

Looking into the bug, this issue has mainly been reported on version A2(3.5), and we are working on reproducing it on different code versions to find the reason for the memory corruption.

We will have the fix once we successfully reproduce the problem; the bug has been updated with A2(3.7) as the fixed version.

Let me know if you have any questions.

Regards,
Siva

Beginner

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva

Thanks for the information.

Kindly send me the detailed action plan to determine whether this was related to L7 traffic or to management traffic (ANM, XML, SNMP).

Regards,

Mohamed

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Mohamed,

Sent you the information via PM. Please check.

Regards,
Siva

Beginner

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Thanks Siva for your support

Regards

Mohamed

Beginner

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva,

I am attaching the running-config of the ACE which is currently under test in the lab.

As you can see, VLAN 20 is configured on the client side and VLAN 30 on the server side.

I am not able to ping the ACE interface IP address 2092:dead:beef:cafe::3 from the Cisco switch (7k) whose interface is connected to the ACE on VLAN 20.

Any idea whether this is normal behavior, or is there a configuration mistake?

Thanks !!

hostname ACE-4710

interface gigabitEthernet 1/1

  description *** Interface connecting to the UUT-Switch-7k (WS-C7206X) ***

  switchport access vlan 20

  no shutdown

interface gigabitEthernet 1/2

  description *** Interface connecting to the serverfarm ***

  switchport access vlan 30

  no shutdown

interface gigabitEthernet 1/3

  description *** UNUSED***

  no shutdown

interface gigabitEthernet 1/4

  description *** UNUSED***

  no shutdown

access-list everyone  extended permit ip any any

access-list everyone  extended permit pim any any

access-list everyone  extended permit icmp any any

rserver host CNR

ip address 2092:dead:beef:cafe::90

inservice

rserver host CNR-IPv4

ip address 172.27.167.13

inservice

rserver host NMS

ip address 2092:dead:beef:cafe::999

inservice

serverfarm host LABSERVERS

rserver CNR

inservice

rserver CNR-IPv4

inservice

rserver NMS

inservice

! Layer-3 Traffic

class-map type management match-any MGMT

match protocol telnet any

match protocol https any

match protocol http any

match protocol xml-https any

match protocol ssh any

match protocol icmp any

! Layer-4 Traffic

class-map match-all slb-vip-LABSERVERS

match virtual-address 2092:dead:beef:cafe::1 any

! Layer-3 Class-Map defining source traffic. This traffic matches server initiated

policy-map type management first-match MGMT_POLICY

class MGMT

permit

policy-map type loadbalance first-match LB_POLICY_LABSERVERS

class class-default

serverfarm LABSERVERS

policy-map multi-match CLIENT-VIPS_LABSERVERS

class slb-vip-LABSERVERS

loadbalance vip inservice

loadbalance policy LB_POLICY_LABSERVERS

loadbalance vip icmp-reply active

loadbalance vip advertise active

interface vlan 20

  description "Client Interface"

  bridge-group 1

  access-group input everyone

  service-policy input CLIENT-VIPS_LABSERVERS

  service-policy input MGMT_POLICY

  no shutdown

interface vlan 30

  description "Server Farm"

  bridge-group 1

  service-policy input CLIENT-VIPS_LABSERVERS

  service-policy input MGMT_POLICY

  no shutdown

interface bvi 1

  ipv6 enable

  ip address 2092:dead:beef:cafe::3/64

  description "Client-Server Bridge Group"

  no shutdown

ip route ::/0 2092:dead:beef:cafe::2

username admin password 5 $1$Hh4K/EuN$J9mu8qUJbebWixnC5Wxpo1  role Admin domain default-domain

username www password 5 $1$9yHPLof8$RZrtAsMV26WtOp/q8Ou8L.  role Admin domain default-domain

*******************************************************************

On the 7200 switch which is connecting to the ACE :

!

interface GigabitEthernet0/3

description Connected to ACE-E1

no ip address

ip pim sparse-mode

ip igmp version 3

ip ospf 1 area 0

shutdown

duplex auto

speed auto

media-type rj45

negotiation auto

ipv6 enable

ipv6 ospf 1 area 0

!

interface GigabitEthernet0/3.20

encapsulation dot1Q 20 native

ipv6 address 2093:DEAD:BEEF:CAFE::2/64

!

ipv6 route 2092:DEAD:BEEF:CAFE::/64 2092:DEAD:BEEF:CAFE::1

*********************************************************************************************************

I am setting this up for basic management first and will later progress to enabling more functionality on the ACE.

Please let me know if there are any mistakes or corrections I need to make in the configuration.

Thanks !
