CSM Real server failover delay

Dear All

We are facing an issue with a Cisco CSM load balancer module running version 3.1(4).

The scenario is that we have two IBM WebSphere Application Servers that we are trying to load balance using the CSM. During performance testing, the systems team simulated 50 simultaneous users connecting to the load-balanced VIP (10.10.10.1) on port 443. When both real servers were up and running, the connections were distributed equally between the two real servers (10.10.10.2 and .3).

However, when they shut down the NIC of one of the real servers, the CSM marks that server as operationally down (displaying PROBE_FAILED status). But as soon as that real server fails, the CSM stops all requests for the virtual server, and the client application shows that no requests reach either real server for about 10 seconds. After this period, normal connections are established with the one active real server. We expected that if one real server goes down, at least half of the connections would continue to work.

Please see attached graph for more information. What could be the reason for this issue? Below is the relevant configuration:

probe ICMP icmp
 interval 2
 retries 3
 failed 3
 receive 10
!
serverfarm Server-Real
 nat server
 no nat client
 real 10.10.10.2
  inservice
 real 10.10.10.3
  inservice
 probe ICMP
!
vserver Server-LB
 virtual 10.10.10.1 tcp https
 serverfarm Server-Real
 no persistent rebalance
 inservice
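
To relate the probe timers in this configuration to how long failure detection could take, here is a rough Python sketch. It is only a back-of-envelope model. The meaning given to each timer and the failure model (a real is marked failed after `retries` consecutive failed probes, each probe failing either at once or after the full `receive` timeout) are assumptions for illustration, not behavior confirmed in this thread. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeTimers:
    interval: int  # assumed: seconds between probes to a healthy real
    retries: int   # assumed: failed probes tolerated before marking the real failed
    failed: int    # assumed: seconds between probes to an already-failed real
    receive: int   # assumed: seconds to wait for a probe reply


def detection_window(timers: ProbeTimers) -> tuple:
    """Return (fastest, slowest) seconds to mark a real as failed.

    Simplified model (an assumption): every failed probe either fails
    immediately or only after the full receive timeout, and probes are
    spaced `interval` seconds apart.
    """
    fastest = timers.retries * timers.interval
    slowest = timers.retries * (timers.interval + timers.receive)
    return fastest, slowest


# Values taken from the probe configuration posted above.
timers = ProbeTimers(interval=2, retries=3, failed=3, receive=10)
fastest, slowest = detection_window(timers)
print(f"estimated detection window: {fastest}s to {slowest}s")
```

Under this model the window is wide mainly because of `receive 10`. Whether that has anything to do with the observed stall cannot be concluded from the arithmetic alone; a sniffer trace is still needed.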

Regards

Anser Khan

Gilles Dufour
Cisco Employee

That's not normal CSM behavior.

It should be able to work with a single real without having to wait for any time.

Get a sniffer trace.

Verify that this is indeed the CSM stopping the connections and not a problem with your test tool.

Also, we are now at CSM version 4.2.12.

Why test 3.1.x, which is 5 years old?

Gilles.

What is the method for capturing the trace, and on which device should we collect it?

Which CSM version do you recommend upgrading to from CSM 3.1(4)? We are afraid that CSM version 4.2.12 will be an early deployment release with a lot of bugs.

We are using 6513 with (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(18)SXF7, RELEASE SOFTWARE

Anser

You have release -4- and 4.2.12 is release -12-.

So 4 <> 12! Release 12 is more mature than release 4.
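
The comparison above rests on reading the last field of the version string. As a small illustration, the following Python sketch splits the two notations used in this thread, `3.1(4)` and `4.2.12`, into numeric fields. The function name and the field labels (major, minor, maintenance) are assumptions made for the example, not official Cisco terminology.

```python
import re

# Accepts both "3.1(4)" and "4.2.12" style strings, as written in this thread.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)(?:\((\d+)\)|\.(\d+))")


def parse_csm_version(text: str) -> tuple:
    """Return (major, minor, maintenance) as integers."""
    match = _VERSION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Exactly one of groups 3 and 4 is set, depending on the notation used.
    maintenance = int(match.group(3) or match.group(4))
    return major, minor, maintenance


old = parse_csm_version("3.1(4)")
new = parse_csm_version("4.2.12")
print(old, new)
print("maintenance release:", old[2], "vs", new[2])
```

Comparing the tuples as a whole orders versions numerically, so `4.2.12` sorts after `3.1(4)` where a plain string comparison could mislead.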

To capture a sniffer trace, you should first identify the CSM port-channel number, which is 256 + the slot number.

Then you can capture traffic to/from that portchannel.

This will be all traffic going in and out of the CSM.

You can then see if the CSM is not forwarding some traffic.
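
As a quick worked example of the "256 + slot number" rule, here is a minimal Python sketch. The function name is hypothetical, and the slot range check assumes a 13-slot chassis such as the 6513 mentioned in this thread.

```python
CSM_PORTCHANNEL_BASE = 256  # offset stated in the reply above
MAX_SLOT = 13               # assumption: 13-slot chassis (6513)


def csm_portchannel(slot: int) -> int:
    """Return the internal port-channel number for a CSM in the given slot."""
    if not 1 <= slot <= MAX_SLOT:
        raise ValueError(f"slot must be between 1 and {MAX_SLOT}, got {slot}")
    return CSM_PORTCHANNEL_BASE + slot


# The CSM in this thread sits in slot 8.
print(csm_portchannel(8))
```

For the module in slot 8 this gives port-channel 264, which would be the source to monitor when capturing traffic in and out of the CSM.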

But first, go to 4.2.12.

Gilles.

Dear Gilles,

We have upgraded the CSM software from 3.1(4) to 4.2(12). After the upgrade and a reset, the CSM module is in an unknown state, as shown in the following output:

Switch#show module 8
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  8     4 SLB Application Processor Complex      WS-X6066-SLB-APC   SAD……..

Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  8 0015.2b3d.cbbb to 0015.2b3d.cbbb   1.7    Unknown      Unknown      Other

After some time, the Other state changes to Power Down.

Switch#show module 8
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  8     4 SLB Application Processor Complex      WS-X6066-SLB-APC   SAD……

Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  8 0015.2b3d.cbbb to 0015.2b3d.cbbb   1.7    Unknown      Unknown      PwrDown

Mod  Online Diag Status
---- -------------------
   8 Not Applicable

Following are the logs we are getting:

Sep 6 14:32:15.061 : %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Module Failed SCP dnld)

Sep 6 14:33:57.222 : %ONLINE-SP-6-TIMER: Module 8, Proc. 0. Failed to bring online because of timer event

Sep 6 14:33:57.222 : %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Module Failed SCP dnld)

Sep 6 14:34:14.445 : %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Reset)
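
When collecting many such messages, it can help to tally the power-off reasons. The following Python sketch parses the four log lines above. The regular expression is an assumption based only on the layout of these lines and is not a general IOS syslog parser. The function names are hypothetical.

```python
import re
from collections import Counter

# Layout assumed from the lines in this thread:
#   "<Mon> <day> <time> : %<FACILITY>-<severity>-<MNEMONIC>: <text>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+[\d:.]+)\s*:\s*"
    r"%(?P<facility>[A-Z0-9_-]+?)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+):\s*"
    r"(?P<text>.*)$"
)

LOG_LINES = [
    "Sep 6 14:32:15.061 : %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Module Failed SCP dnld)",
    "Sep 6 14:33:57.222 : %ONLINE-SP-6-TIMER: Module 8, Proc. 0. Failed to bring online because of timer event",
    "Sep 6 14:33:57.222 : %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Module Failed SCP dnld)",
    "Sep 6 14:34:14.445 : %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Reset)",
]


def parse_line(line: str):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def power_off_reasons(lines) -> Counter:
    """Count the parenthesized reason at the end of each DISABLED message."""
    reasons = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["mnemonic"] == "DISABLED":
            reason = re.search(r"\(([^)]*)\)\s*$", record["text"])
            if reason:
                reasons[reason.group(1)] += 1
    return reasons


for reason, count in power_off_reasons(LOG_LINES).items():
    print(f"{count} x {reason}")
```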

We also physically removed the CSM module and inserted it again, but the issue is still the same.

We are also unable to access the csm module by the following command:

Switch#session slot 8 processor 0

The default escape character is Ctrl-^, then x.

You can also type 'exit' at the remote prompt to end the session

Trying 127.0.0.80 ...

We have also enabled the power several times with the following command:

Switch(config)#power enable module 8

How can I roll back (downgrade) the CSM software to the previous version?

We are using IOS version "IOS (tm) s72033_rp Software (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(18)SXF7, RELEASE SOFTWARE (fc1)".

Reply soon please

You probably uploaded an image that was corrupted, so the CSM can't boot anymore.

You need to unscrew the front panel cover to get access to the console port.

Connect to this port during boot and force access to rommon.

There is an option there to select previous image.

You should probably open a service request to get guidance on the exact procedure.

Here are the exact steps:

1 - Stop the boot process by pressing a key at "Press any key to stop auto-boot..."

(I noticed that the customer has already tried this out)

2 - At the "[vxWorks Boot]" prompt, type "R"

3 - When the prompt returns, type "@" to boot the board

Please note that once step (1) is done, steps (2) and (3) must be done right away. Waiting too long will cause the supervisor to cut power from the CSM.

This process will allow the previous image to boot.

Gilles.

Dear Gilles

There is an update: when we remove this CSM module (slot 8) and insert it into another slot (say slot 3), the module boots properly, without any issues. But of course there is no configuration for this module in the running configuration. The 'show module' output on the core switch displays the new version, 4.2(12), correctly. I can also do a 'session slot 3 proc 0' successfully. I have also checked the file we downloaded with an MD5 checker, and the file is not corrupted.

Could this be an issue with the CSM's config file? How can we clear the configuration? I tried 'module clear-config' but it did not work.

Also, we unscrewed the faceplate and tried consoling into the CSM, but the console does not work. Does it require any specific non-standard setting (like baud rate)?

Regards

Farrukh

Dear Gilles

There is an update, now the module won't boot in any slot. Same error as before for slot 8.

Attached is the output of "diagnostic start module 3 test complete"

Regards

Farrukh

What IOS version do you have?

4.2.x requires a minimum IOS version.

Check the release notes.

Also, did you try the procedure I gave you to go back to the previous CSM version?

Gilles.

We are running version 12.2(18)SXF7, which is OK as per the release notes. Please see the output below.

http://www.cisco.com/en/US/docs/interfaces_modules/services_modules/csm/4.2.x/release/notes/ol_6897.html#wp153414

As mentioned in the previous post, once we connect the console to the CSM module's front panel, no output appears! We tried reboot, boot, power-on, and physical re-insertion. Does the CSM console require any special settings (baud rate etc.), or a special cable besides the regular light-blue rollover console cable?

We also tried to console into another chassis's working CSM, the console port does not display any output.

show module 7
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  7     2 Supervisor Engine 720 (Active)         WS-SUP720-3B       SAL.

Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  7 0013.c42e.bb50 to 0013.c42e.bb53   4.4    8.1(3)       12.2(18)SXF7 Ok

Mod  Sub-Module                  Model              Serial      Hw      Status
---- --------------------------- ------------------ ----------- ------- -------
   7 Policy Feature Card 3       WS-F6K-PFC3B       SAL.        2.1     Ok
   7 MSFC3 Daughterboard         WS-SUP720          SAL.        2.3     Ok

Regards

Farrukh

You need a straight-through Ethernet cable, not the regular Cisco console cable.

G.

Dear Gilles,

We are planning to upgrade the IOS on the WS-C6513.

We are currently using IOS (tm) s72033_rp Software (s72033_rp-ADVENTERPRISEK9_WAN-M), Version 12.2(18)SXF7, RELEASE SOFTWARE (fc1).

I have attached the "show module" command output, in which you can find the current hardware and software versions.

Please suggest which IOS release we should move to in order to fix the CSM upgrade issue. Also, after upgrading IOS, will we need to upgrade the software of other modules (for example FWSM, IDSM) for compatibility?

Regards,

Anser

You actually do not need to upgrade the IOS version. This is what I also run in my lab.

Gilles.

Dear Gilles,

We have rolled back the CSM software to 3.1(4) by changing the slot and resetting the module several times.

We still want to upgrade the software, but we do not want to use the same method as before. We checked the CSM software image with MD5 before upgrading.

How can we upgrade the CSM software to 4.2.x?

Kindly advise

There is no other method.

The 'upgrade' command should work.

I never had any issue with it in 10 years.

G.
