06-14-2013 11:27 PM - edited 03-07-2019 01:54 PM
I have a few ESXi servers that are each plugged into two Nexus 5548 ports in a port-channel configuration.
The port channel for the management NICs is configured as an access port on VLAN 51. I have a PXE/DHCP server on VLAN 51, but the server gets no response when PXE booting. However, after the host boots up into ESXi, it can get a DHCP address from that same server.
Any idea why I cannot PXE boot the host?
The port-channel config is:
interface port-channel20
  description ESX Management
  switchport access vlan 51
  spanning-tree port type edge
interface Ethernet103/1/1
  description **vmnic0**
  switchport access vlan 51
  channel-group 20
interface Ethernet104/1/1
  description **vmnic4**
  switchport access vlan 51
  channel-group 20
06-14-2013 11:43 PM
Hi,
I think I'm missing your exact setup here so please excuse the questions.
- Is the port-channel between the ESXi server and the Nexus 5548?
- Is it the ESXi server you're trying to PXE boot and build, or PXE boot a VM on an already operational ESXi server?
If you have the port-channel between the switch and the ESXi server, and you're trying to PXE boot to build the ESXi server, then this is likely to fail.
PXE doesn't support link aggregation, so only one of the two NICs will send/receive traffic. If the links of both NICs are up and you don't have LACP configured on the port-channel, then the switch will see both links as operational within the port-channel.
In that case it's entirely possible the switch is sending the return traffic to the NIC that the server isn't using for the PXE boot.
Regards
06-14-2013 11:47 PM
Yes, the port channel is between the ESXi server and the 5548. I believe this is enhanced vPC, where both links should be active. I'm not sure if I have LACP configured. How do I check?
thanks
06-14-2013 11:54 PM
I checked the config and it contains
feature lacp
If the ESXi host is booted up, I can assign a VM to the same port group as the management vSwitch and then PXE boot that VM, so the DHCP/PXE server works in that subnet. The problem is PXE booting the host itself. I have already enabled PXE booting on the NICs in the BIOS. No VLAN tagging is needed on the host since a native VLAN was already set on the switch ports.
Any idea?
06-15-2013 12:09 AM
Hi,
If you run the show port-channel summary command you'll find that the Protocol column shows None. To make LACP operate you need to configure the member interfaces with the channel-group 20 mode [active|passive] command. The problem is that ESXi doesn't support LACP unless you have the Nexus 1000V or ESXi 5.1 with the Distributed Virtual Switch.
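As a sketch, enabling LACP on the Nexus side would look something like this (interface and channel-group numbers taken from the config posted above; the ESXi side must also speak LACP or the ports will sit in the Individual state):

feature lacp
interface Ethernet103/1/1
  channel-group 20 mode active
interface Ethernet104/1/1
  channel-group 20 mode active

Note that if the port-channel was originally built without a mode (i.e. mode on), the members may need to be removed and re-added, as the mode can't be changed in place.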
I think the problem you're seeing is as I described above i.e., the switch sees both links operational when you're trying to PXE boot the ESXi server. This is exactly the reason we moved away from using link aggregation (route based on IP hash) on our ESXi servers as it meant rebuilding them via PXE boot was painful.
On the switch, try taking one of the links out of the port-channel while PXE booting the server and see if that works.
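For example, something along these lines, using the second interface from the config posted above:

interface Ethernet104/1/1
  no channel-group 20

and then re-add it with channel-group 20 once the build is complete.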
Regards
06-15-2013 12:00 PM
Hi,
Yes, after taking one of the links out of the port channel, PXE started to work.
I am using a dvSwitch in ESXi, so if I enable LACP, can I add the link back into the port channel and have PXE work again?
thanks
06-15-2013 12:05 PM
When I try to do
channel-group 20 mode active
I get the error
Cannot add active-mode port to on-mode port-channel20
06-15-2013 02:46 PM
OK, I was able to remove both ports from the port channel and add them back in with
"mode active"
but PXE still failed to work.
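Roughly the commands, for anyone hitting the same "Cannot add active-mode port" error (a sketch; both members have to come out before either can rejoin in active mode):

interface Ethernet103/1/1
  no channel-group 20
interface Ethernet104/1/1
  no channel-group 20
interface Ethernet103/1/1
  channel-group 20 mode active
interface Ethernet104/1/1
  channel-group 20 mode active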
sh port-channel sum
shows
20 Po20(SD) Eth LACP Eth103/1/1(I) Eth104/1/1(I)
06-16-2013 12:41 AM
Hi Tony,
From the show port-channel summary output we can see the port states carry the I flag, for Individual. This means the ports are not receiving LACPDUs and hence are not operating as part of the port-channel. This is expected when trying to run a LAG to a server that is PXE booting, as PXE doesn't support link aggregation.
Do you see the server MAC address associated with one of the two switch interfaces? If you run the CLI commands show mac address-table interface ethernet 103/1/1 and show mac address-table interface ethernet 104/1/1, is the MAC address of the server NIC being used for PXE booting seen on the correct switch interface? Can you also add the command logging event link-status to the two interfaces please, so we can see the interface transitions during the boot process?
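Something like this on the switch, as a sketch reusing the interface names from your config:

show mac address-table interface ethernet 103/1/1
show mac address-table interface ethernet 104/1/1
interface Ethernet103/1/1
  logging event link-status
interface Ethernet104/1/1
  logging event link-status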
Regards
06-16-2013 08:47 AM
I tried PXE booting again, and
sh mac address-table interface ethernet 103/1/1
shows
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link
VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 51 782b.cb3c.cd06 dynamic 70 F F Eth103/1/1
And it did work; I was able to PXE boot. After booting into ESXi,
I still see both ports shown as (I) and sh vpc shows down.
06-16-2013 03:22 PM
Hi Steve,
In ESXi, I keep losing connectivity to the host when I move my management vmknic into the vSphere 5.1 dvs.
I did enable LACP on the 5.1 vSphere switch and set the load balancing policy to IP hash. I moved one NIC into the dvs and proceeded to move the vmk into the dvs, but I keep losing connectivity.
Any idea about this?
thanks
06-16-2013 10:53 PM
Hi Tony,
For the links that are still showing as Individual, this is because they are not receiving LACPDUs from the VMware switch. You can use the show lacp counters command to see how many LACPDUs have been received, and I believe you'll see this value is not incrementing.
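For example, a sketch of what to run and what to look for:

show lacp counters
show port-channel summary

If the server side is actually transmitting LACP, the received-LACPDU counters for Eth103/1/1 and Eth104/1/1 should climb steadily; if they stay at zero, the ports will remain Individual.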
As for losing connectivity to your management vmknic, I've never configured the distributed switch in VMware, so unfortunately I can't help there.
Can I take a step back and ask why you are trying to use LAG between the ESXi hosts and the Nexus? As you can see, this can be problematic, and the only real advantage of a LAG vs. active/active NIC teaming using route based on originating port ID is that a single VM can use more than the capacity of a single physical link. This of course assumes that the XOR of the IP addresses is such that it would balance multiple sessions across the two links of the LAG.
Regards
06-17-2013 05:27 PM
I have taken the management network links out of the port channel,
but the problem now is that whenever I move the management vmknic to the 5.1 dvSwitch, the host loses connectivity.
06-18-2013 11:09 PM
Hi Tony,
As I mentioned above, I've not configured the VMware vNetwork Distributed Switch so I can't really provide any useful help on the issues you're now seeing.
I think if you wish to progress with using the vDS then you'd probably be better served opening a new post specifically for that, perhaps in the Data Centre Server Networking forum.
I would still like to understand the rationale for using link aggregation to the ESXi server. I think link aggregation is a great mechanism, but in this particular use case I believe the benefit gained is minimal. When you then consider the potential issues and complexity it adds I really believe a simpler load balancing option is a better approach.
Regards