5493 Views, 0 Helpful, 13 Replies

PXE boot not working on Nexus 5548 ports

Dragomir
Level 1

I have a few ESXi servers that are plugged into two Nexus 5548 ports in a port-channel configuration.

The port-channel for the management interface NICs is configured on VLAN 51. I have a PXE/DHCP server on VLAN 51, but when PXE booting the server I get no response. However, after the host boots into ESXi, it can get a DHCP address from that same server.

Any idea why I cannot PXE boot the host?

The port-channel config is:

interface port-channel20
  description ESX Management
  switchport access vlan 51
  spanning-tree port type edge

interface Ethernet103/1/1
  description **vmnic0**
  switchport access vlan 51
  channel-group 20

interface Ethernet104/1/1
  description **vmnic4**
  switchport access vlan 51
  channel-group 20

13 Replies

Steve Fuller
Level 9

Hi,

I think I'm missing your exact setup here so please excuse the questions.

- Is the port-channel between the ESXi server and the Nexus 5548?

- Is it the ESXi server you're trying to PXE boot and build, or PXE boot a VM on an already operational ESXi server?

If you have the port-channel between the switch and the ESXi server, and you're trying to PXE boot to build the ESXi server, then this is likely to fail.

PXE doesn't support link aggregation so only one of the two NICs will send/receive traffic. If the links of both NICs are up and you don't have LACP configured on the port-channel, then the switch will see both links operational within the port-channel.

In that case it's entirely possible the switch is sending the return traffic to the server to the NIC that isn't being used by the server for PXE boot.

Regards

Yes, the port-channel is between the ESXi server and the 5548. I believe this is enhanced vPC, where both links should be active. I am not sure if I have LACP configured. How do I check?

Thanks

I checked the config and it does include:

feature lacp

If the ESXi host is booted up, I can assign a VM to the same port group as the management vSwitch and then PXE boot that VM, so the DHCP/PXE server works in that subnet. The problem is PXE booting the host itself. I have already enabled PXE booting on the NICs in the BIOS. There is no VLAN tagging needed on the host, since a native VLAN was already set on the switch ports.

Any idea?

Hi,

If you run the show port-channel summary command you'll find that the Protocol column will show as None. To make LACP operate you need to configure the interface with the channel-group 20 mode [active|passive] command. The problem is that ESXi doesn't support LACP unless you have the Nexus 1000V or ESXi 5.1 with the Distributed Virtual Switch.
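As a sketch using the interface and group numbers from this thread: on NX-OS, LACP mode is set on the member interfaces rather than on the port-channel interface itself.

interface Ethernet103/1/1
  channel-group 20 mode active
interface Ethernet104/1/1
  channel-group 20 mode active

With plain channel-group 20 (no mode keyword) the port-channel runs in on mode and show port-channel summary reports the Protocol as NONE; with mode active it reports LACP.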

I think the problem you're seeing is as I described above i.e., the switch sees both links operational when you're trying to PXE boot the ESXi server. This is exactly the reason we moved away from using link aggregation (route based on IP hash) on our ESXi servers as it meant rebuilding them via PXE boot was painful.

On the switch try taking one of the links out of the port-channel while PXE booting the server and see if that works OK.
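A minimal way to test this, assuming Eth104/1/1 is the member you pull out (either would do):

interface Ethernet104/1/1
  no channel-group 20

The freed interface then forwards as a standalone access port on VLAN 51, so the PXE DHCP exchange is no longer subject to the port-channel's link selection.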

Regards

Hi,

Yes, taking one of the links out of the port-channel made PXE start working.

I am using a dvSwitch in ESXi, so if I enable LACP, can I add the link back into the port-channel and PXE should work again?

Thanks

When I try to do

channel-group 20 mode active

I get the error:

Cannot add active-mode port to on-mode port-channel20
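The error indicates Po20 was originally created in on mode, so active-mode members can't join it. One way to clear this (a sketch using this thread's interface numbers; verify against your own config first) is to remove both members, delete the on-mode port-channel, and let the mode active command recreate it as an LACP port-channel:

interface Ethernet103/1/1
  no channel-group 20
interface Ethernet104/1/1
  no channel-group 20
no interface port-channel20
interface Ethernet103/1/1
  channel-group 20 mode active
interface Ethernet104/1/1
  channel-group 20 mode active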

OK, I was able to remove both ports from the port-channel and add them back in with

mode active

but PXE still failed to work.

sh port-channel sum shows:

20    Po20(SD)    Eth      LACP      Eth103/1/1(I)  Eth104/1/1(I)

Hi Tony,

From the show port-channel summary we can see the port states show with an I flag, for Individual. This means the ports are not receiving LACPDU and hence not operating as part of the port-channel. This is expected when trying to run LAG to a server that is PXE booting as PXE doesn't support link aggregation.

Do you see the server MAC address associated with one of the two switch interfaces? If you run the CLI commands show mac address-table interface eth103/1/1 and show mac address-table interface eth104/1/1, is the MAC address of the server NIC that's being used for PXE booting seen on the correct switch interface? Can you also add the command logging event link-status to the two interfaces, please, so we can see the interface transitions during the boot process?
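As a sketch with this thread's interface numbers (note that on NX-OS the interface-level form of the logging command is typically logging event port link-status):

show mac address-table interface ethernet 103/1/1
show mac address-table interface ethernet 104/1/1

interface Ethernet103/1/1
  logging event port link-status
interface Ethernet104/1/1
  logging event port link-status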

Regards

I tried PXE booting again, and sh mac address-table interface ethernet 103/1/1 shows:

Legend:

        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC

        age - seconds since last seen,+ - primary entry using vPC Peer-Link

   VLAN     MAC Address      Type      age     Secure NTFY   Ports/SWID.SSID.LID

---------+-----------------+--------+---------+------+----+------------------

* 51       782b.cb3c.cd06    dynamic   70         F    F  Eth103/1/1

But it did work; I was able to PXE boot. After booting into ESXi, I still see both ports shown as (I), and sh vpc shows the vPC down.

Hi Steve,

In ESXi, I keep losing connectivity to the host when I move my management vmknic into the vSphere 5.1 dvSwitch.

I did enable LACP on the 5.1 vSphere switch and set the load balancing policy to IP hash. I moved one NIC into the dvSwitch and proceeded to move the vmk into the dvSwitch, but I keep losing connectivity.

Any idea about this?

Thanks

Hi Tony,

For the links that are still showing as Individual, this is due to the fact they are not receiving LACPDU from the VMware switch. You can use the show lacp counters command to see how many LACPDUs have been received, and I believe you'll see this value is not incrementing.
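For example (a sketch; the exact column layout varies by NX-OS release):

show lacp counters interface port-channel 20

If the LACPDUs received counter for each member stays at zero while the sent counter increments, the ESXi side is not speaking LACP on those links.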

As for losing connectivity to your management vmknic, I've never configured the distributed switch in VMware, so unfortunately I can't help there.

Can I take a step back and ask why you are trying to use LAG between the ESXi hosts and the Nexus? As you can see, this can be problematic, and the only real advantage of a LAG vs. active/active NIC teaming using route based on originating port ID is that a single VM can use more than the capacity of a single physical link. This of course assumes that the XOR of the IP addresses is such that it would balance multiple sessions across the two links of the LAG.
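If you drop the LAG, the switch side reduces to two independent access ports (a sketch reusing this thread's config; the ESXi side would then use the default route based on originating virtual port ID teaming):

no interface port-channel20
interface Ethernet103/1/1
  description **vmnic0**
  switchport access vlan 51
  spanning-tree port type edge
interface Ethernet104/1/1
  description **vmnic4**
  switchport access vlan 51
  spanning-tree port type edge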

Regards

I have taken the management network links out of the port-channel,

but the problem now is that whenever I move the management network vmknic to the 5.1 dvSwitch, the host loses connectivity.

Hi Tony,

As I mentioned above, I've not configured the VMware vNetwork Distributed Switch so I can't really provide any useful help on the issues you're now seeing.

I think if you wish to progress with using the vDS then you'd probably be better served opening a new post specifically for that, perhaps in the Data Centre Server Networking forum.

I would still like to understand the rationale for using link aggregation to the ESXi server. I think link aggregation is a great mechanism, but in this particular use case I believe the benefit gained is minimal. When you then consider the potential issues and complexity it adds I really believe a simpler load balancing option is a better approach.

Regards
