Can LLDP frames disrupt FCoE in converged switches?

fwrusso
Level 1

Hello,

I had an interesting - and terrible - experience with DELL-Force10 S5000 switches: the hosts' FCoE paths were going down simultaneously, disrupting storage connectivity. This happened on a VMware cluster of 10 DELL R620 hosts fitted with QLogic QLE8262 converged network adapters (CNAs).

These CNAs have 2 ports, each connected to an S5000 switch. Each switch also connects to a number of storage devices with native FC ports.

What happened is that, at random, hosts were losing storage paths on both FCoE adapters for at least a few minutes. After a while, the issue could disappear on one host and start on another, in practice bringing the whole cluster down.

DELL technical support traced the cause to LLDP frames that their switches received from the VMware host ports. These frames conflicted with the DCBx frames that DCB uses to negotiate FCoE attributes, and so brought down DCB and the FCoE paths on the port.

The diagnosis was mainly driven by messages like the following in the switch logs:

%STKUNIT0-M:CP %LLDP-5-LLDP_MULTIPLE_PEER_DETECTED: DCBX operationally disabled due to more than one PEER being present on interface Te 0/52
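For anyone who wants to check which ports are affected, the switch log can be scanned for this message. Below is a minimal Python sketch, assuming the log lines look exactly like the one quoted above (other firmware versions may format them differently). The function name `count_multi_peer_events` and the sample list are hypothetical, used only for illustration.

```python
import re
from collections import Counter

# Pattern derived only from the single log line quoted in this post;
# it is an assumption that other occurrences follow the same layout.
MULTI_PEER = re.compile(
    r"%LLDP-5-LLDP_MULTIPLE_PEER_DETECTED: .* on interface (?P<iface>\S+ \d+/\d+)"
)

def count_multi_peer_events(lines):
    """Return a Counter mapping interface name -> number of multi-peer events."""
    hits = Counter()
    for line in lines:
        match = MULTI_PEER.search(line)
        if match:
            hits[match.group("iface")] += 1
    return hits

# Sample input: the message from this post plus one unrelated line.
sample = [
    "%STKUNIT0-M:CP %LLDP-5-LLDP_MULTIPLE_PEER_DETECTED: DCBX operationally "
    "disabled due to more than one PEER being present on interface Te 0/52",
    "some unrelated log line",
]

for iface, count in count_multi_peer_events(sample).items():
    print(f"{iface}: {count} event(s)")
```

Ports that show up repeatedly here would be the first candidates to check for a second LLDP sender behind the CNA.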

According to DELL, such LLDP frames could come from the VMware dVS, where we had enabled LLDP, but also from VMs: they mentioned that Windows 2012 and later enable LLDP on network interfaces by default.

I was really astonished! I could not imagine that enabling LLDP on a dVS could bring a virtual infrastructure to its knees, and even less that a VM could cause this, which would break all the assumptions about VM isolation in a virtual environment. On the other hand, DELL says that this is DCBx behavior by design: LLDP frames from more than one source would block it.
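To make the behavior DELL described concrete, here is a toy Python model of a single switch port. It is only a sketch of the rule as it was explained to us (more than one LLDP peer on a port disables DCBX), not actual switch firmware. The class `SwitchPort`, its methods, and the chassis/port identifiers are all hypothetical; identifying a neighbor by its chassis ID and port ID pair is how LLDP distinguishes senders.

```python
from dataclasses import dataclass, field

@dataclass
class SwitchPort:
    """Toy model of one switch port; NOT real Force10 behavior or code."""
    name: str
    # Distinct LLDP senders seen on this port, as (chassis ID, port ID) pairs.
    peers: set = field(default_factory=set)

    def receive_lldp(self, chassis_id: str, port_id: str) -> None:
        # Repeated frames from the same sender do not create a new peer.
        self.peers.add((chassis_id, port_id))

    @property
    def multiple_peers_detected(self) -> bool:
        # The rule as DELL described it: more than one peer blocks DCBX.
        return len(self.peers) > 1

port = SwitchPort("Te 0/52")

# First sender: the CNA itself (hypothetical identifiers).
port.receive_lldp("cna-chassis", "cna-port-1")
print(port.multiple_peers_detected)

# Second sender behind the same physical port, e.g. a dVS uplink or a VM.
port.receive_lldp("dvs-chassis", "vmnic0")
print(port.multiple_peers_detected)
```

In this model the port is fine as long as only the CNA speaks LLDP, and the flag flips as soon as any second sender appears on the same link, which matches the log message above.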

I therefore submit this important question to the community. Is anyone aware of this issue? How is DCBx implemented on Nexus switches, and is it also potentially vulnerable to such problems? Could it ever react to regular LLDP frames from a dVS or a VM (if LLDP frames generated by VMs can pass through a virtual switch at all - I actually wonder about that...)?

Thank you very much for your replies and insight!

Francesco Russo

2 Replies

denjolras
Level 1

Hello.

Have you solved this?

I'm still trying to find how to ...

AliA8
Level 1

This is correct. We hit the same issue with Dell servers attached to Force10 switches after we enabled LLDP on one of our distributed switches, to which both HP and Dell servers were connected. The Dell servers were impacted, while the HP blade servers were fine.

After we disabled LLDP, the issue was resolved.

Fun fact: it seems to be caused by the Advertise and Both modes and not by Listen, or maybe it was pure coincidence that Listen was fine, for us at least.

If LLDP is really required, perhaps follow Dell's recommendations on how to enable it without impacting the whole production environment.

However, I am not sure whether a VM can cause that issue. That's scary!
