08-18-2015 08:40 PM - edited 03-01-2019 12:19 PM
Hello All,
I am new to this technology and am facing a peculiar issue with UCS blade networking.
Two UCS B200 M3 blades are configured as ESXi hosts. Static IPs have been configured.
Both blades can reach the external network, but they cannot reach each other.
Because of this, I can add them to vCenter but cannot run the HA agent.
Can anybody help me resolve the issue?
Warm Regards,
Jitendra
08-18-2015 11:36 PM
Hi Jitendra
Are the two ESX hosts in the same IP subnet and VLAN?
I assume you can ping both ESX hosts from the outside, as well as ping from the ESX hosts to the outside?
How is the UCS system connected to the outside?
Walter.
08-18-2015 11:45 PM
Hi Walter,
Yes, they are in the same VLAN and ping from ESX to outside is OK as well as ping from outside to ESX. UCS is connected through FI to server farm switch.
-Jitendra
08-18-2015 11:59 PM
- Are you using a vSwitch, DVS, or N1k?
- Is there one server farm switch that connects both FIs, or two?
- If you have two northbound server farm switches, is this VLAN configured between the two?
08-19-2015 12:07 AM
- external network is physical switch
- there are 2 farm switches
- yes VLAN is configured between the two
The question is why the two servers can't talk to each other when they can talk to the external world.
08-19-2015 12:59 AM
I assume/hope you are in Ethernet End Host mode?
In this case, the two FIs are separate fabrics; no data traffic flows between the FIs.
If you have a vSwitch with two interfaces connecting to fabric A and B, it can happen that, even within the same VLAN, server 1 sends traffic on fabric A and server 2 receives traffic on fabric B.
This means the traffic has to go northbound, exit UCS, and return into UCS again. Therefore, if you have no proper VLAN connectivity outside your UCS domain, it will not work.
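To make the traffic path concrete, here is a minimal Python sketch of the reasoning above. It is only a toy model and does not query UCS or ESXi in any way; the `Host` class, the `path` function, and the `vlan_trunked_upstream` flag are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Host:
    """Hypothetical model of a blade: a name plus the fabric of its active vNIC."""
    name: str
    active_fabric: str  # "A" or "B"


def path(src: Host, dst: Host, vlan_trunked_upstream: bool) -> Optional[List[str]]:
    """Return the hops for a same-VLAN frame in End Host mode, or None if unreachable.

    Assumption: the two FIs never switch data traffic between each other,
    so cross-fabric traffic must leave UCS and come back in.
    """
    if src.active_fabric == dst.active_fabric:
        # Same fabric: the FI switches the frame locally.
        return [src.name, f"FI-{src.active_fabric}", dst.name]
    if not vlan_trunked_upstream:
        # Different fabrics and no VLAN path between the upstream switches.
        return None
    return [
        src.name,
        f"FI-{src.active_fabric}",
        "upstream",
        f"FI-{dst.active_fabric}",
        dst.name,
    ]


if __name__ == "__main__":
    esx1 = Host("esx1", "A")
    esx2 = Host("esx2", "B")
    print(path(esx1, esx2, vlan_trunked_upstream=True))
    print(path(esx1, esx2, vlan_trunked_upstream=False))
```

In this model, two hosts on the same fabric always reach each other, while hosts on opposite fabrics depend entirely on the VLAN being carried correctly outside the UCS domain.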
08-19-2015 01:56 AM
Actually, I have two other servers in the same chassis and they are able to ping each other. The issue is only with these new servers. Thus, I believe this has something to do with the UCS side rather than the external network.
08-19-2015 02:19 AM
I would do the following test (I hope you have set up HA):
Disconnect the connectivity on one FI to the northbound switch, and test again.
Is ESX installed on the local disk, or does it boot over the network?
08-19-2015 03:56 AM
ESX is installed on the local disk.
08-19-2015 04:11 AM
Are there any ESX servers in the same chassis that work OK, or do all the servers in a particular chassis have this same issue?
08-19-2015 04:31 AM
There are two more servers. But they are Windows hosts and are working fine.
Actually, there are two chassis, and the ESX hosts in question are located in different chassis. The two chassis are connected to the primary and subordinate FIs.
08-19-2015 04:34 AM
Interesting. We started having an issue with this too in the past few weeks. The management interfaces of all our B200 M3 ESXi blades, which sit on a VLAN isolated from the production servers, can't talk to each other if their uplink ports are on opposite FIs. The vCenter VM sits on the same VLAN. When hosts are on the opposite fabric from the vCenter VM, vCenter isn't able to talk to them (unable to ping, though we verified ARP was correct). However, we can ping all of them from outside the VLAN without issue.
We used evac mode to test whether everything works when all traffic goes to fabric A, and it does. But as soon as we split traffic across the fabrics, we lose connection to whatever sits on the B fabric (vCenter is on A for now).
Our northbound switches are N5Ks that are layer 2 only. There's a vPC set up and we have verified the config is right. I've been up and down the UCS config and can't find anything. My last hope is a modification we made months ago to the NIC policy (though the problem didn't surface right after the change), which was to publish the MAC address to all VLANs (in hopes of making the ESXi dvSwitch health check work).
We run our management over a pair of vNICs that use a standard vSwitch. No modifications have been made to the vSwitch.
It's really strange. I'm in the process of ordering a network tap so I can see whether the UCS is even sending traffic northbound. If it's not, then I'm really stumped. If it is, then I think it's a 5K problem.
But it sounds like your issue is very similar. Stuff on A can't talk to B. In my case, though, it appears to affect only certain NICs or certain VLANs. I tried placing a VM on another VLAN on fabric A and a physical UCS B200 M3 Windows server on the B side, same VLAN, and they never lost connectivity. So in my case it appears to be a specific VLAN. I really don't get it, though. It's the same VLAN. All layer 2 traffic stays within the FI or within the 5K. It never hits a router or firewall.
I'm going to plug away at it more today, but if I can't find a solution I'm opening a TAC case.
08-19-2015 04:55 AM
Try shutting the uplinks on one FI, and see whether the ESX hosts on fabric A and B communicate properly.
Any explanation for why the Windows servers work OK? Might the problem be ESX-specific?