Cisco UCS network uplink on aggregation layer

Danny Sandner
Level 1

Hello Cisco Community,

 

We are using our UCS (version 2.2(1d)) for ESXi hosts. Each host has three vNICs, as follows:

  • vNIC 0 = VLAN 10 --> Fabric A, failover to Fabric B
  • vNIC 1 = VLAN 20 --> Fabric B, failover to Fabric A
  • vNIC 2 = VLAN 100 --> Fabric A, failover to Fabric B
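
For reference, a vNIC with this kind of hardware failover setting can be created in the UCS Manager CLI roughly as follows (a sketch for vNIC 0 only; the service-profile name ESX-Host-1 and VLAN name VLAN10 are placeholders, and "fabric a-b" means primary on Fabric A with failover to Fabric B):

  UCS-A# scope org /
  UCS-A /org # scope service-profile ESX-Host-1
  UCS-A /org/service-profile # create vnic vnic0 fabric a-b
  UCS-A /org/service-profile/vnic # create eth-if VLAN10
  UCS-A /org/service-profile/vnic/eth-if # set default-net yes
  UCS-A /org/service-profile/vnic/eth-if # commit-buffer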

Currently, the UCS is connected to the access layer (Catalyst 6509), and we are migrating to Nexus (vPC). As you know, Cisco UCS Fabric Interconnects can handle Layer 2 traffic themselves, so we are planning to connect our Fabric Interconnects directly to our new L3 Nexus switch.

Has anyone connected a UCS directly to L3? Is there anything we have to pay attention to? Are there any recommendations?

 

thanks in advance

best regards

/Danny

18 Replies

Marcin Latosiewicz
Cisco Employee

Certain designs might require an L3-capable platform; see figure 1:

http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/esxi51_n7k_metrocluster.pdf

What are you trying to solve?

Thanks Marcin,

 

I just want to know if there is anything to pay attention to, vPC and so on. But if it is simple and I just have to connect my uplinks to the L3 Nexus, that's fine.

Just collecting some information and experiences from other UCS users about connecting UCS to L2 or directly to L3 :)

 

Ha! For most typical designs (FlexPod, Vblock) I would say: don't do it. L2 is typically faster.

It really depends on your business drivers. Talk to your SE; maybe it's better to sit down and divide and conquer.

Walter Dey
VIP Alumni

Hi Danny !

Which soft switch are you using on ESXi: vSwitch, VMware DVS, and/or N1k?

my 2c

It is not best practice to use hardware fabric failover. Instead, use pairs of vNICs, one connected to Fabric A and one to Fabric B, and then let the ESXi vSwitch do the load balancing and failover.

Also, vPC is best practice for connecting your FIs to the Nexus (with L2, of course).
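
A minimal sketch of the Nexus side of such a vPC uplink (NX-OS; the domain ID, port-channel numbers, VLANs, and peer-keepalive addresses are examples only), applied on each switch of the pair:

  feature lacp
  feature vpc
  vpc domain 10
    peer-keepalive destination 192.168.1.2 source 192.168.1.1
  interface port-channel 100
    switchport mode trunk
    vpc peer-link
  interface port-channel 101
    switchport mode trunk
    switchport trunk allowed vlan 10,20,100
    vpc 101
  interface Ethernet1/1
    channel-group 101 mode active

On the UCS side, the matching FI uplink ports are bundled into an uplink Ethernet port channel (LACP) in UCS Manager.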

Walter.

Hi wdey, thanks for your answer.

We are using ESXi 5.5 with a dvSwitch (distributed vSwitch). In our Cisco UCS power workshop we discussed the pros and cons of hardware and software failover, and we decided on hardware failover. It is very fast, and we currently have no problems.

Yesterday I updated UCS and restarted the FIs. No interruption of the VMs ;)

 

But about L3:

We only use VLAN 10 / VLAN 20 in our VMware environment (UCS). VLAN 100 is not important; it's the management VLAN. So if traffic has to leave the VM environment, it has to be routed into another VLAN. So why is it not recommended to attach the UCS directly to the aggregation switch?

We are using ESXi 5.5 with a dvSwitch (distributed vSwitch). In our Cisco UCS power workshop we discussed the pros and cons of hardware and software failover, and we decided on hardware failover. It is very fast, and we currently have no problems.

This is a never-ending and misunderstood story: your design should provide load balancing AND failover. Hardware failover only gives you the latter. In your design you use just one fabric per VLAN; what a waste!

And think about a failover on an ESXi host with 200 VMs: it has to send out at least 200 gARP messages, and this is load on the FI CPU. Most likely dozens or more ESXi servers would be impacted...

Cisco best practice: if you use a soft switch, let it do the load balancing and failover; don't use hardware failover.

See the attachment (the paper is not up to date):

For ESX Server running vSwitch/DVS/Nexus 1000v and using Cisco UCS Manager Version 1.3 and below, it is recommended that fabric failover not be enabled, as that will require a chatty server for predictable failover. Instead, create regular vNICs and let the soft switch send gARPs for VMs. vNICs should be assigned in pairs (Fabric A and B) so that both fabrics are utilized.

Cisco UCS version 1.4 has introduced the Fabric Sync feature, which enhances the fabric failover functionality for hypervisors as gARPs for VMs are sent out by the standby FI on failover. It does not necessarily reduce the number of vNICs as load sharing among the fabric is highly recommended. Also recommended is to keep the vNICs with fabric failover disabled, avoiding the use of the Fabric Sync feature in 1.4 for ESX based soft switches for quicker failover.
 

Just to be clear:

Our environment consists of two server VLANs, only for VMs: 50% of the VMs on VLAN 10, 50% on VLAN 20 (historically). If we use NIC teaming for the server VLANs, our east/west traffic within the same VLAN may go out of one fabric interconnect to the L2 switch and back into the other fabric interconnect. Very uncomfortable, no?

So we decided to map VLAN 10 to Fabric A and VLAN 20 to Fabric B, both with hardware failover. VLAN 10 will be switched inside Fabric Interconnect A, and VLAN 20 will be switched inside Fabric Interconnect B.

If we had just one server VLAN, I would understand the use of load sharing and NIC teaming. But in our case I think fabric failover is the only option.

For vMotion and management (VLAN 100) it's clear: one VLAN, only east/west traffic = a single vNIC with fabric failover.

So now we are at the point where the east/west traffic of our VM servers is switched inside their fabric interconnects. In that case, it is not necessary to connect the UCS to an L2 switch, is it?

Please correct me if I am wrong. I am not a network administrator ;) Just starting to understand networking :)

OK, I understand your design. However, the same result could be achieved with DVS and the proper design, e.g. with different port groups using active A / standby B, and vice versa. In my opinion, this software version is designed and managed by server folks; the hardware solution (UCS failover) is more network oriented, with some caveats that I mentioned above.
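
For illustration, the same active/standby idea looks like this on a standard vSwitch with esxcli (a sketch; the port-group and vmnic names are examples, and on a DVS the equivalent teaming policy is set per distributed port group in vCenter rather than on the host):

  esxcli network vswitch standard portgroup policy failover set \
      --portgroup-name "VLAN10-PG" --active-uplinks vmnic0 --standby-uplinks vmnic1
  esxcli network vswitch standard portgroup policy failover set \
      --portgroup-name "VLAN20-PG" --active-uplinks vmnic1 --standby-uplinks vmnic0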

So now we are at the point where the east/west traffic of our VM servers is switched inside their fabric interconnects. In that case, it is not necessary to connect the UCS to an L2 switch, is it?

You need L2 connectivity northbound of the FIs anyway (preferably vPC). You are correct; however, even with your design you might have failures where some ESXi hosts switch from A to B, and therefore your traffic has to leave the UCS domain and enter it again on the peer FI, e.g. when an IOM-to-FI link of one chassis fails, or when an FI port connecting a chassis fails. It is important that this external L2 connectivity is 10G and adds minimal additional hop count (vPC!); I had customers with external 1G connectivity suddenly having performance problems! I hope this is clear by now?

Of course, we are using 2x10G per fabric for the uplink in a port channel (currently LACP to a Catalyst 6509E). With Nexus we will use vPC.

however, even with your design you might have failures where some ESXi hosts switch from A to B, and therefore your traffic has to leave the UCS domain and enter it again on the peer FI, e.g. when an IOM-to-FI link of one chassis fails, or when an FI port connecting a chassis fails

Yes, I understand. But the traffic that leaves the UCS domain has to be routed, because it leaves the VLAN / IP range, right? So the L2 switch will forward the packets to L3, and L3 sends them "back" to L2 and on to the destination.

L3 will be a Nexus 5600 vPC pair; L2 will be two Nexus 5600 vPC pairs, one per datacenter.
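
On that L3 pair, the routed hop per server VLAN would typically be an SVI with a first-hop redundancy protocol such as HSRP, e.g. (a sketch; the addresses are examples, and the Nexus 5600 needs its Layer 3 feature set enabled for this):

  feature interface-vlan
  feature hsrp
  interface Vlan10
    no shutdown
    ip address 10.0.10.2/24
    hsrp 10
      ip 10.0.10.1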


Hi Hugo

If source and destination vNICs are in the same VLAN but on different fabrics, then the Ethernet frame has to exit the UCS domain, be switched by the upstream L2 switches, and enter the peer FI! See the attachment!

L2 switching on the local FI only happens if source and destination vNICs are in the same VLAN AND both interfaces are connected to the same fabric (A or B).

Hope this helps!

Walter.

Hi wdey,

In the scenario you attached, that is OK. I have drawn the scenario we are using; see the attached file.

VLAN 10 / VLAN 20

all vNIC 0s pinned to VLAN 10 / Fabric A (failover to Fabric B)

all vNIC 1s pinned to VLAN 20 / Fabric B (failover to Fabric A)

Every ESX host has only one vNIC connected to each VLAN.

VM1 and VM4 are in VLAN 10, VM2 and VM3 in VLAN 20.

Traffic VM1 to VM4 --> inside Fabric A

Traffic VM2 to VM3 --> inside Fabric B

Traffic VM1 to VM2 --> outside over Layer 3 routing ('cause it leaves the VLAN)

So normally, the only traffic we have outside the UCS is Layer 3 traffic that has to be routed. And if there is a failure, the aggregation switch can do the L2 too, can't it?

 

/Danny

 

So normally, the only traffic we have outside the UCS is Layer 3 traffic that has to be routed.

No, this is not the whole truth! As my previous picture shows, L2 traffic has to leave UCS in the case of, e.g., VM1 to VM4 where VM4 has failed over to Fabric B.

In your setup, VLAN 10 and 20 are configured on both Fabric A and B; otherwise the failover would not work.

 

Yeah, because in your picture only VLAN 10 exists, on both fabrics, across all vNICs.

Did you understand my scenario?

Two VLANs, separated onto Fabric A and Fabric B. For each VLAN, one vNIC per host. If something fails, the vNICs of one fabric switch over to the other fabric, and the whole VLAN is then handled by the other fabric.

The VMs are pinned to a virtual port on their ESX host, which is connected via the ESX host's vNIC to the corresponding fabric.

VM1 = VLAN 10, connected to vNIC 0 of Host 1 --> Fabric A

VM2 = VLAN 20, connected to vNIC 1 of Host 1 --> Fabric B

VM3 = VLAN 20, connected to vNIC 1 of Host 2 --> Fabric B

VM4 = VLAN 10, connected to vNIC 0 of Host 2 --> Fabric A

 

If Fabric A fails, all vNIC 0s will fail over to Fabric B, and vice versa.

 

Do I have a major flaw in my concept or knowledge, or is this just a big misunderstanding?

 

Greets

/Danny

Two VLANs, separated onto Fabric A and Fabric B.

VLAN 10 and 20 have to be present on Fabric A and B; otherwise failover will not work.

If Fabric A fails, all vNIC 0s will fail over to Fabric B, and vice versa.

Not true! And this is the misunderstanding!

E.g., one chassis, two uplinks between IOM and FI, no port channel: odd slots are mapped to uplink no. 1, even slots to uplink no. 2. If uplink 1 fails (IOM or FI port issue), all the odd slots fail!

If you have more than one chassis, there are even more such examples.
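
One common mitigation for this static pinning behavior is to let UCS build a port channel on the IOM-to-FI links via the chassis discovery policy, so a single link failure re-hashes traffic onto the remaining links instead of taking down the pinned slots. A sketch of the UCS Manager CLI (assuming IOMs and FIs that support fabric port channels, e.g. 2200-series IOMs with 6200-series FIs):

  UCS-A# scope org /
  UCS-A /org # scope chassis-disc-policy
  UCS-A /org/chassis-disc-policy # set link-aggregation-pref port-channel
  UCS-A /org/chassis-disc-policy # commit-buffer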
