HOSTFLAPPING / load on switch CPU

STUART KENDRICK
Level 1

I'm experimenting with a load-balancing mode on a Linux host -- two NICs, configured with Linux bonding mode 'balance-rr'

From

http://www.kernel.org/doc/Documentation/networking/bonding.txt

balance-rr or 0

          Round-robin policy: Transmit packets in sequential
          order from the first available slave through the
          last.  This mode provides load balancing and fault
          tolerance.

From /etc/sysconfig/network-scripts/ifcfg-bond0:

DEVICE=bond0
IPADDR=10.15.210.123
NETMASK=255.255.254.0
BROADCAST=10.15.211.255
GATEWAY=10.15.210.1
BONDING_OPTS='mode=balance-rr arp_interval=1000 arp_validate=all arp_ip_target=10.15.210.1'

When I do this, I see the following on my loghost ... almost a thousand such messages per hour:

Sep  1 04:13:24 cluster-5-esx 9457: 009452: Sep  1 04:13:23.116 pdt: %C4K_EBM-4-HOSTFLAPPING: Host 00:04:23:CD:53:E2 in vlan 210 is flapping between port Te1/50 and port Te1/49

I believe I understand what is happening: the host alternates NICs with each outbound transmission, but it uses the same source MAC address whenever it transmits. So the switch frantically updates its CAM table every time the host transmits a frame. I doubt the switch logs this HOSTFLAPPING message /every/ time it updates its CAM table ... there must be some dampening algorithm ... but still, it logs it frequently.
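The relearning described above can be sketched with a toy model (Python, purely illustrative -- a real switch learns MACs in its forwarding plane, not like this; the port names are taken from the log line):

```python
# Toy model of a switch CAM (MAC learning) table. Illustrative only:
# it just shows why one source MAC alternating between two ports
# forces a table update on nearly every frame.

def learn(cam, mac, port, moves):
    """Record mac -> port; count a 'move' when the MAC changes ports."""
    if cam.get(mac) not in (None, port):
        moves[0] += 1  # entry relearned on a different port
    cam[mac] = port

cam, moves = {}, [0]
host_mac = "00:04:23:CD:53:E2"

# balance-rr: the host alternates NICs but keeps one source MAC, so
# consecutive frames arrive on Te1/49, Te1/50, Te1/49, ...
for i in range(1000):
    learn(cam, host_mac, "Te1/49" if i % 2 == 0 else "Te1/50", moves)

print(moves[0])  # 999: every frame after the first moves the entry
```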

What price am I paying on the switch for this behavior?  Is the CAM update process performed in hardware, in which case, perhaps I don't care that I'm "working" the switch like this.  Or is it performed in software, in which case, at some level of throughput (and I would like to roll this out to lots of hosts on lots of switches), I'll peg the CPU on the switch?

Mostly, I use Catalyst 45xx w/Sup V and Catalyst 65xx w/Sup720

--sk

Stuart Kendrick

FHCRC

6 Replies

IAN WHITMORE
Level 4

Well, every time the switch has to learn a new MAC address it broadcasts it on the network. So if you plan to use this design with many servers, you are going to have broadcasts all over your network, and this could affect the performance of the switches. This will have an impact on network performance, traffic flows, CPUs, etc.

I don't think this is a very good design in an active-active server configuration unless you can configure a port-channel between the server and the switch.

Ian

Hi Andras,

I'm skeptical about your statement: "every time the switch has to learn a new MAC address it broadcasts it on the network".

If you had written, "every time the switch has to learn a new MAC address, it updates its CAM table", I would buy that. But I don't, myself, see the broadcast behavior you mention, as I sit here with Wireshark running on one host while plugging a new host into the switch. What evidence would you offer to support the behavior you describe?

--sk

Hi Stuart,

That statement was actually written by Ian. I think he meant that if a packet is sent to an unknown MAC address, the switch has to flood the packet, as a best-effort behavior, out all ports in the same VLAN; this process is called Unknown Unicast Flooding.

This in turn can cause high CPU usage if packets require continuous flooding due to unknown MAC addresses. In addition, MAC learning itself requires CPU cycles on Catalyst 4500 switches, and frequent MAC learning can cause high CPU usage.
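The flooding decision described here can be sketched in the same toy style (Python, illustrative only; the port names besides Te1/49-50 are hypothetical):

```python
# Minimal model of unknown unicast flooding. Illustrative only.

def forward(cam, dst_mac, ingress_port, vlan_ports):
    """Return the set of egress ports for a frame to dst_mac."""
    if dst_mac in cam:
        return {cam[dst_mac]}                  # known: one learned port
    return set(vlan_ports) - {ingress_port}    # unknown: flood the VLAN

ports = ["Te1/1", "Te1/2", "Te1/49", "Te1/50"]  # hypothetical VLAN 210 ports
cam = {"00:04:23:CD:53:E2": "Te1/49"}

# Known destination: forwarded out the single learned port.
print(forward(cam, "00:04:23:CD:53:E2", "Te1/1", ports))
# Unknown destination: flooded to every port except the ingress.
print(forward(cam, "AA:BB:CC:00:11:22", "Te1/1", ports))
```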

Best regards,

Andras

andtoth
Level 4

Hi Stuart,

If you do link bonding on the server side, you will need to use EtherChannel (Port-channel) on the Cisco side in order to avoid host flapping. Otherwise, host flapping will indeed cause high CPU usage due to frequent MAC learning.

Please refer to the following link for more details and configuration for EtherChannel on 4500 switches:

http://www.cisco.com/en/US/docs/switches/lan/catalyst4500/12.2/54sg/configuration/guide/channel.html
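For illustration only (the interface and channel-group numbers here are placeholders; the linked guide is the authoritative reference), the switch side of such a pairing might look like a static EtherChannel for balance-rr, or an LACP channel matched with mode=802.3ad on the host:

```
! Hypothetical Catalyst config sketch -- verify against the guide above
interface range TenGigabitEthernet1/49 - 50
 channel-group 1 mode on        ! static EtherChannel (pairs with balance-rr)
 ! channel-group 1 mode active  ! or LACP (pairs with bonding mode=802.3ad)
!
interface Port-channel1
 switchport
 switchport access vlan 210
```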

Best regards,

Andras

Hi Ian, hi Andras, thank you for the responses.

Got it: if I wanted this to work, I could combine 'balance-rr' with EtherChannel on the switch side.

Or employ a different 'mode' on the host side:  one which uses an ARP mechanism rather than LACP, e.g.:

DEVICE=bond0
IPADDR=10.15.210.123
NETMASK=255.255.254.0
BROADCAST=10.15.211.255
GATEWAY=10.15.210.1
BONDING_OPTS='mode=balance-alb arp_interval=1000 arp_validate=all arp_ip_target=10.15.210.1'
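For contrast, the same toy CAM model (Python, illustrative; the second MAC is hypothetical) suggests why a mode such as balance-alb avoids the flapping: each slave transmits with its own source MAC, so the switch sees two stable entries instead of one entry bouncing between ports:

```python
# Toy CAM model revisited: with a distinct source MAC per slave
# (as balance-alb arranges), no entry ever changes ports. Illustrative only.

def learn(cam, mac, port, moves):
    """Record mac -> port; count a 'move' when the MAC changes ports."""
    if cam.get(mac) not in (None, port):
        moves[0] += 1
    cam[mac] = port

cam, moves = {}, [0]
slaves = [("00:04:23:CD:53:E2", "Te1/49"),  # eth0's own MAC
          ("00:04:23:CD:53:E3", "Te1/50")]  # eth1's own MAC (hypothetical)

for i in range(1000):
    mac, port = slaves[i % 2]
    learn(cam, mac, port, moves)

print(moves[0])  # 0: two stable CAM entries, no relearning
```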

But I'm hoping to focus your attention on the 'what price am I paying' question:

"What price am I paying on the switch for this behavior?  Is the CAM update process performed in hardware, in which case, perhaps I don't care that I'm "working" the switch like this.  Or is it performed in software, in which case, at some level of throughput (and I would like to roll this out to lots of hosts on lots of switches), I'll peg the CPU on the switch?"

In the real world, I have sys admins configuring hosts in ways that instigate lots of HOSTFLAPPING messages from the local switch.  Are these hosts functional?  Probably not.  But I'm focused on the cost to the switch: what price am I paying (if any) on the switch side if it is updating its CAM table rapidly?

--sk

Hi Stuart,

If you are planning to use link bonding (link aggregation) on the host, consider configuring EtherChannel on the switch side as well; otherwise MAC flapping might occur, leading to high CPU usage, unknown unicast flooding, or even a Layer 2 loop that can melt down your network. Link bonding (EtherChannel) configured on only one side of a link is considered a misconfiguration, so enable it on both sides.

Best regards,

Andras
