CSM Active-Active State on 6513

admin_2
Level 3

I have an issue with CSMs going into an active-active state on the distribution-layer boxes. The CSMs are configured in bridged mode and serve 66 real servers. STP and HSRP are configured to provide link-level as well as box-level redundancy for access switches and end users. Whenever there is a link failure, the STP recalculation causes the CSMs to go into active-active mode, and the secondary CSM then sends a gratuitous ARP. This causes the primary MSFC to replace its ARP entries with the MAC address of the vserver on the secondary CSM, so end users lose their connection to the vserver and an ARP flush is needed to restore connectivity. The gratuitous ARP works fine when the CSMs are actually in active-standby mode, even during an STP or HSRP recalculation. In this case, any kind of Layer 2 or HSRP recalculation causes the CSMs to go into active-active mode. Please advise.

7 Replies

Gilles Dufour
Cisco Employee

You should have portfast configured on the FT VLAN ports so that spanning-tree convergence does not affect the transmission of heartbeats.
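
As a rough illustration of the portfast suggestion, a port that carries only the FT VLAN might be configured along these lines. The interface name and VLAN 30 are placeholders rather than values from this thread, and exact syntax can differ between IOS releases.

```
! Hypothetical values: Gi1/1 is an inter-chassis port, VLAN 30 is the FT VLAN
interface GigabitEthernet1/1
 switchport
 switchport mode access
 switchport access vlan 30
 ! Skip the listening/learning delay so heartbeats pass as soon as the port is up
 spanning-tree portfast
```

If the FT VLAN rides on a trunk instead of an access port, `spanning-tree portfast trunk` is the usual equivalent.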

Regards,

Gilles.

Hi,

Well, the FT VLANs are running portfast, with uplinkfast in the access layer and backbonefast across access and distribution. Still, any STP or HSRP recalculation puts the CSMs into active-active mode. Some of the hellos on the FT VLAN are getting lost, which causes this problem. We tried pruning the FT VLAN across the EtherChannel between the two distribution switches, which is something of a workaround.

But we need a more robust solution, and an explanation of why the CSMs go into the active-active state in the first place, which is abnormal. The distribution switches are the STP root, and the HSRP active router and the active CSM are in the same chassis.

Regards

Arun

Hi,

We have our FT VLAN on a totally separate VLAN, and we run a 2-port gig-channel between our distribution switches. The only VLAN carried on that two-port gig-channel is our FT VLAN. It may sound like a bit of overkill, but we were trying to avoid the scenario you describe, where both CSMs become active at the same time. We run one port of the EtherChannel on the supervisor in slot 1 and the other port on the supervisor in slot 2. So far (***fingers crossed***) we've not seen the problem you describe with our setup.

We have another set of trunks between our distribution switches that carry 'normal' (i.e. non-FT) traffic.

I *think* you could get away with using 100Mb links if you wanted to do this a little cheaper. I believe the Cisco recommendation is for a minimum of 400Mb of bandwidth (especially if you are replicating sessions with the 'csrp replicate' command).
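
A minimal sketch of what such a dedicated FT channel could look like on each distribution switch, assuming the setup described above. VLAN 30, channel-group 10, the supervisor port names, and Port-channel1 (standing in for the 'normal' trunk) are all made-up placeholders.

```
! Hypothetical values throughout: VLAN 30 = FT VLAN, Port-channel10 = FT-only channel
vlan 30
 name CSM-FT
!
! One member port on each supervisor (slot 1 and slot 2)
interface range GigabitEthernet1/1 , GigabitEthernet2/1
 switchport
 switchport mode access
 switchport access vlan 30
 channel-group 10 mode on
!
! Keep the FT VLAN off the trunk that carries the non-FT traffic
interface Port-channel1
 switchport trunk allowed vlan remove 30
```

With the FT VLAN confined to its own channel, heartbeats no longer share a path with user traffic or with the VLANs whose topology changes during an STP recalculation.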

Hi,

With regard to FT detection, you could play with the FT timers (heartbeat and failover). I would start with the failover timer, setting it to 15 seconds, and adjust the heartbeat to 5 seconds. By default a heartbeat is sent every second and failover occurs after 3 seconds (three missed hellos).

This would make detection of a failed CSM slower but would not cause an active-active state when STP kicks in. I think the timers have to be tuned properly, but I would try it that way. The more expensive option is a direct redundant link between the two CSMs, but be aware that the more FT information you exchange, the more bandwidth is needed.
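
For illustration, the timer values suggested above could be applied roughly as follows. The slot number (4), FT group number (1), and FT VLAN (30) are assumptions for the sketch, and the same values would presumably need to be set on both CSMs.

```
! Hypothetical values: CSM in slot 4, FT group 1, FT VLAN 30
module ContentSwitchingModule 4
 ft group 1 vlan 30
  ! Send a heartbeat every 5 seconds instead of the default 1
  heartbeat-time 5
  ! Wait 15 seconds without heartbeats before the standby takes over (default 3)
  failover 15
```

After a change like this, `show module csm 4 ft` should show the FT state and the timers in effect. The trade-off is the one described above: a genuinely failed CSM takes up to 15 seconds to be detected.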

Kind Regards,

Joerg

Robert, that's the way to go.

Cisco recommends using a dedicated link for the FT VLAN. This link should connect the two chassis directly, with no L2 device in the middle.

Regarding the link bandwidth, if you are doing stateful redundancy [CSRP replicate] and expect a lot of connections to be replicated to the standby CSM, then we recommend 1 Gig.

Otherwise, FastEthernet is more than enough.
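
As a hedged example of the stateful redundancy mentioned above: to my knowledge, connection replication on the CSM is enabled per virtual server with the `replicate csrp` subcommand. The slot number and the vserver name WEB below are placeholders, not taken from this thread.

```
! Hypothetical values: CSM in slot 4, virtual server named WEB
module ContentSwitchingModule 4
 vserver WEB
  ! Replicate connection state for this vserver to the standby CSM over the FT VLAN
  replicate csrp connection
```

Each vserver that replicates connections adds traffic on the FT link, which is why the bandwidth recommendation depends on whether replication is in use.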

Regards,

Gilles.

Assume that you have two CSMs (router mode) with a direct FT connection between them and separate links to distribution switches which are themselves trunked together.

csm1 ---- csm2
  |         |
sw1  ====  sw2

If the distribution switch that the active CSM is connected to fails, the standby continues to receive the heartbeats and does not become active.

CSM1 and CSM2 sit inside switches, which should have a trunk between them on one or more links, plus a separate link for FT.

If this switch loses connectivity with the switch below, it still has a trunk to the other main switch.

....+-------+...........+---------+
....|...S1..|---trunk---|....S2...|
....|..CSM1.|---FT_Vlan-|...CSM2..|
....+-------+...........+---------+
........|....................|
........|....................|
.......SW1------------------SW2

The reason you need a separate link is that if there is too much traffic on the trunk, you may lose heartbeats and cause the CSM to fail over.

Also, if you do connection replication, a lot of traffic will be generated on the FT link, and once again you want to avoid congestion.

Gilles.
