VIP flapping/fluctuation between 2 load balancers

gnizamuddin
Level 1

Hi

I have 2 load balancers that are supposed to work in active-standby fashion. They had been working fine for a couple of years until 2 days ago. On Saturday, however, I had an outage caused by continuous VIP flapping between them. This VIP is the default gateway for my servers, so while it was flapping the load-balanced services were inaccessible to these servers. The only messages I could see in the logs were:

LPKPK11# sh log sys.log

  sys.log                         OCT 13 19:49:38        4200058

  sys.log.prev                    NOV 12 21:07:38        9999963

LPKPK11# sh log sys.log

OLD LOG MESSAGES HAVE BEEN SAVED in associated .prev file.

NOV 12 21:07:34 1/1 1425645 IPV4-4: Duplicate IP address detected: 172.16.13.5 68-ef-bd-a0-ad-ba

NOV 12 21:07:34 1/1 1425646 IPV4-4: Duplicate IP address detected: 172.16.13.5 68-ef-bd-a0-ad-ba

NOV 12 21:07:34 1/1 1425647 IPV4-4: Duplicate IP address detected: 172.16.13.5 68-ef-bd-a0-ad-ba

NOV 12 21:07:34 1/1 1425648 IPV4-4: Duplicate IP address detected: 172.16.13.5 68-ef-bd-a0-ad-ba

NOV 12 21:07:34 1/1 1425649 IPV4-4: Duplicate IP address detected: 172.16.13.5 68-ef-bd-a0-ad-ba

NOV 12 21:07:34 1/1 1425650 IPV4-4: Duplicate IP address detected: 172.16.13.5 68-ef-bd-a0-ad-ba
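
For anyone triaging a similar log, a minimal Python sketch for summarising these messages is shown here. It assumes the line format quoted above; the function name `count_duplicate_ip_events` and the regular expression are illustrative and not part of any Cisco tool.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this post (an assumption,
# not an official log grammar).
DUP_RE = re.compile(
    r"Duplicate IP address detected: "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+) "
    r"(?P<mac>[0-9a-f]{2}(?:-[0-9a-f]{2}){5})",
    re.IGNORECASE,
)

def count_duplicate_ip_events(log_text):
    """Return a Counter keyed by (ip, mac) with one count per matching log line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DUP_RE.search(line)
        if match:
            counts[(match.group("ip"), match.group("mac").lower())] += 1
    return counts

sample = """\
NOV 12 21:07:34 1/1 1425645 IPV4-4: Duplicate IP address detected: 172.16.13.5 68-ef-bd-a0-ad-ba
NOV 12 21:07:34 1/1 1425646 IPV4-4: Duplicate IP address detected: 172.16.13.5 68-ef-bd-a0-ad-ba
"""

for (ip, mac), n in count_duplicate_ip_events(sample).items():
    print(f"{ip} also claimed by {mac}: {n} events")
```

Grouping by MAC address makes it easier to see whether a single device (for example the peer load balancer) is the one claiming the address.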

Both load balancers showed the state "No Virtual Router":

LPKPK11# sh redundant-vips

Redundant-Vips:

Interface Address: 172.16.13.5      VRID: 31

  Redundant Address: 172.16.13.27       Range:       1

  State:             No Virtual Router  Master IP:   172.16.13.5

  State Changes:     3,562              Last Change: 10/13/2012 19:45:35

Interface Address: 172.16.197.122   VRID: 10

  Redundant Address: 172.16.197.81      Range:       1

  State:             No Virtual Router  Master IP:   172.16.197.122

  State Changes:     3,563              Last Change: 10/13/2012 19:45:35

Interface Address: 172.16.197.122   VRID: 10

  Redundant Address: 172.16.197.82      Range:       1

  State:             No Virtual Router  Master IP:   172.16.197.122

  State Changes:     3,563              Last Change: 10/13/2012 19:45:35

Interface Address: 172.16.197.122   VRID: 10

  Redundant Address: 172.16.197.84      Range:       1

  State:             No Virtual Router  Master IP:   172.16.197.122

  State Changes:     3,563              Last Change: 10/13/2012 19:45:35

Interface Address: 172.16.197.122   VRID: 10

  Redundant Address: 172.16.197.85      Range:       1

  State:             No Virtual Router  Master IP:   172.16.197.122

  State Changes:     3,563              Last Change: 10/13/2012 19:45:35
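
The state-change counters above are the clearest sign of flapping. As a rough sketch, assuming the `sh redundant-vips` layout shown in this post, the output could be parsed and filtered in Python as follows. The function names and the threshold of 100 are hypothetical choices for illustration.

```python
import re

def parse_redundant_vips(text):
    """Parse 'sh redundant-vips' style output into one dict per VIP entry."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Interface Address:\s+(\S+)\s+VRID:\s+(\d+)", line)
        if m:
            current = {"interface": m.group(1), "vrid": int(m.group(2))}
            entries.append(current)
            continue
        if current is None:
            continue
        m = re.match(r"Redundant Address:\s+(\S+)", line)
        if m:
            current["vip"] = m.group(1)
        m = re.match(r"State:\s+(.+?)\s+Master IP:\s+(\S+)", line)
        if m:
            current["state"] = m.group(1)
            current["master_ip"] = m.group(2)
        m = re.match(r"State Changes:\s+([\d,]+)", line)
        if m:
            current["state_changes"] = int(m.group(1).replace(",", ""))
    return entries

def flapping_vips(entries, threshold=100):
    """Return entries whose state-change count exceeds a (hypothetical) threshold."""
    return [e for e in entries if e.get("state_changes", 0) > threshold]

sample = """\
Interface Address: 172.16.13.5      VRID: 31
  Redundant Address: 172.16.13.27       Range:       1
  State:             No Virtual Router  Master IP:   172.16.13.5
  State Changes:     3,562              Last Change: 10/13/2012 19:45:35
"""

for entry in flapping_vips(parse_redundant_vips(sample)):
    print(entry["vip"], entry["state"], entry["state_changes"])
```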

CPU and memory usage were normal:

# sh system-resources

System Resources for CSS501-SCM-INT:

Installed Memory:   268,435,456 (256 MB)

Free Memory:        137,464,384 (131 MB)

CPU:                0% (5Sec)     5% (1Min)     3% (5Min)

Buffer Statistics:

Buffer Pool: 0

   Size:2048  Total:3072  Available:2776  Failures:  0  Low Buffer Count: 2084

Buffer Pool: 1

   Size:2048  Total:3072  Available:2832  Failures:  0  Low Buffer Count: 2832

Buffer Pool: 2

   Size:2048  Total:2048  Available:1944  Failures:  0  Low Buffer Count: 1856

I noticed that preempt was configured (with different priorities) on the circuit VIPs of both LBs, so I turned it off on the secondary LB. That did not resolve the problem.

I turned off the critical services. That also didn't help.

I rebooted both LBs, but that was also in vain.

Eventually, I had to shut down one LB to restore the service. Here are the IP configs on both LBs:

LB1:

===

!************************* INTERFACE *************************

interface e1

  bridge vlan 31

interface e2

  description "Signaling uplink on LPKPK11 towards S9309-A1 Eth-1/0/33"

  bridge vlan 40

interface e5

  description "Inter-CCS Communication towards other LB Master-Slave Mode"

  isc-port-one

interface e6

  description "Inter-CCS Communication towards other LB Master-Slave Mode"

  isc-port-two

interface e7

  trunk

  vlan 2

  vlan 3

interface e8

  description "to SW11-Gi0/18"

  bridge vlan 3

!************************** CIRCUIT **************************

circuit VLAN31

  ip address 172.16.13.5 255.255.255.0

    ip virtual-router 31 priority 110 preempt

    ip redundant-interface 31 172.16.13.4

    ip redundant-vip 31 172.16.13.27

    ip critical-service 31 upstream_downstream

circuit VLAN3

  ip address 172.16.197.122 255.255.255.192

    ip virtual-router 10 priority 110 preempt

    ip redundant-interface 10 172.16.197.120

    ip redundant-vip 10 172.16.197.85

    ip redundant-vip 10 172.16.197.84

    ip redundant-vip 10 172.16.197.81

    ip redundant-vip 10 172.16.197.82

    ip critical-service 10 upstream_downstream

circuit VLAN40

  ip address 172.31.96.5 255.255.255.240

    ip virtual-router 40 priority 110 preempt

    ip redundant-interface 40 172.31.96.4

LB2:

===

!************************* INTERFACE *************************

interface e1

  bridge vlan 31

interface e2

  description "Signaling uplink on LPKPK12 towards S9309-A2 Eth-1/0/33"

  bridge vlan 40

interface e5

  description "Inter-CCS Communication towards other LB Master-Slave Mode"

  isc-port-one

interface e6

  description "Inter-CCS Communication towards other LB Master-Slave Mode"

  isc-port-two

interface e7

  trunk

  vlan 2

  vlan 3

interface e8

  bridge vlan 3

!************************** CIRCUIT **************************

circuit VLAN31

  ip address 172.16.13.7 255.255.255.0

    ip virtual-router 31 priority 90

    ip redundant-interface 31 172.16.13.4

    ip redundant-vip 31 172.16.13.27

    ip critical-service 31 upstream_downstream

circuit VLAN3

  ip address 172.16.197.121 255.255.255.192

    ip virtual-router 10 priority 90

    ip redundant-interface 10 172.16.197.120

    ip redundant-vip 10 172.16.197.85

    ip redundant-vip 10 172.16.197.84

    ip redundant-vip 10 172.16.197.81

    ip redundant-vip 10 172.16.197.82

    ip critical-service 10 upstream_downstream

circuit VLAN40

  ip address 172.31.96.6 255.255.255.240

    ip virtual-router 40 priority 90

    ip redundant-interface 40 172.31.96.4
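
One way to rule out configuration drift between the two peers is to compare the circuit stanzas mechanically. The Python sketch below is only an illustration under the assumption that both configs follow the layout posted above; `parse_circuits` and `compare_peers` are hypothetical helper names, and the checks (distinct interface addresses, matching VRIDs and redundant addresses, distinct priorities) are a reasonable guess at what matters rather than an official checklist.

```python
def parse_circuits(config_text):
    """Map circuit name -> settings parsed from CSS-style 'circuit' stanzas."""
    circuits = {}
    current = None
    for raw in config_text.splitlines():
        words = raw.split()
        if len(words) >= 2 and words[0] == "circuit":
            current = circuits.setdefault(words[1], {"shared": set()})
        elif current is None or len(words) < 3 or words[0] != "ip":
            continue
        elif words[1] == "address":
            current["address"] = words[2]
        elif words[1] == "virtual-router":
            current["vrid"] = words[2]
            if "priority" in words[:-1]:
                current["priority"] = int(words[words.index("priority") + 1])
        elif words[1] in ("redundant-interface", "redundant-vip") and len(words) >= 4:
            current["shared"].add((words[1], words[2], words[3]))
    return circuits

def compare_peers(a, b):
    """Return a list of human-readable inconsistencies between two peers."""
    problems = []
    for name in sorted(set(a) | set(b)):
        ca, cb = a.get(name), b.get(name)
        if ca is None or cb is None:
            problems.append(f"{name}: circuit missing on one peer")
            continue
        if ca.get("address") == cb.get("address"):
            problems.append(f"{name}: both peers use interface address {ca.get('address')}")
        if ca.get("vrid") != cb.get("vrid"):
            problems.append(f"{name}: VRID mismatch")
        if ca["shared"] != cb["shared"]:
            problems.append(f"{name}: redundant addresses differ")
        if ca.get("priority") == cb.get("priority"):
            problems.append(f"{name}: equal priorities")
    return problems

LB1 = """\
circuit VLAN31
  ip address 172.16.13.5 255.255.255.0
    ip virtual-router 31 priority 110 preempt
    ip redundant-interface 31 172.16.13.4
    ip redundant-vip 31 172.16.13.27
"""
LB2 = """\
circuit VLAN31
  ip address 172.16.13.7 255.255.255.0
    ip virtual-router 31 priority 90
    ip redundant-interface 31 172.16.13.4
    ip redundant-vip 31 172.16.13.27
"""

print(compare_peers(parse_circuits(LB1), parse_circuits(LB2)))  # prints []
```

An empty list for the VLAN31 stanzas suggests these particular settings are consistent between the peers, which points away from a simple addressing mistake in that circuit.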

I need urgent help from the Cisco CSS experts here. I look forward to your feedback.

Regards

Nizam

3 Replies

chrhiggi
Level 3

Nizam-

  It depends on what code you are running, but this sounds like a match for CSCek37489, where the standby CSS forgets it is supposed to be standby after 828 days of uptime.

Regards,

Chris Higgins

Cisco ANS TAC Escalation

Thanks Chris. Can you send me a reference for this bug's fix?

Also, I rebooted both load balancers but the problem persisted. If this bug were the cause, the reboot should have reset the uptime counter on both load balancers and fixed the problem.

BR

Nizam

Symptom:

The backup CSS may suddenly start sending VRRP packets, after which both CSSes go to Master status.

Conditions:

This symptom is observed on a CSS that has been running for about 828 days.

Workaround:

Rebooting the CSS.

Fixed-In:

7.50(3.3)

8.10(2.5)

8.20(0.1)

5.0(6.16)S

7.50(2.11)S

8.10(1.11)S

6.10(4.18)S

7.40(3.9)S
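
As a side note on the 828-day figure in the Conditions field: the bug text does not state the root cause, so the following is only an arithmetic observation. A 32-bit counter incremented 60 times per second would wrap after roughly 828.5 days. The 60 Hz tick rate is an assumption made here for illustration, not something confirmed for the CSS.

```python
SECONDS_PER_DAY = 86_400

def days_until_wrap(bits, ticks_per_second):
    """Days until an unsigned counter of the given width wraps at the given tick rate."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_DAY

# Hypothetical 32-bit counter at an assumed 60 ticks per second.
print(round(days_until_wrap(32, 60), 1))  # prints 828.5
```

If that is indeed the mechanism, a reboot resets the counter, which is consistent with the listed workaround; a problem that persists straight after a reboot would then suggest a different cause.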