06-29-2017 08:50 PM - edited 03-01-2019 01:13 PM
Scenario:
One UCS 5108 AC2 chassis with two B200 M4 blades installed, running ESXi 6. Each server has an identical service profile: LAN connectivity policy for the vNICs, MAC pools for each FI, etc.
Two FIs, with two SFP+ ports on each configured as uplinks. Port one on each FI carries the management VLAN for ESXi. ESXi on each server has two vNICs for mgmt and vMotion, over VLANs configured globally at the LAN cloud level. These are solid; no problem with both service profiles running.
Port three on each FI carries the iSCSI VLANs: FI-A has the iscsi-a VLAN, FI-B has the iscsi-b VLAN. These are defined at the FI level, not globally. Each server's ESXi has two vNICs for iSCSI, each assigned to a different VLAN for failover and multipathing.
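For reference, on the ESXi side I end up with one vmkernel port per iSCSI VLAN, each bound to the software iSCSI adapter for multipathing. Roughly like this (just a sketch; the portgroup, vmk, and vmhba names and VLAN IDs here are placeholders, not my exact config):

    # tag each iSCSI portgroup with its VLAN (one VLAN per fabric; IDs are placeholders)
    esxcli network vswitch standard portgroup set -p iSCSI-A -v 101
    esxcli network vswitch standard portgroup set -p iSCSI-B -v 102
    # bind each vmk to the software iSCSI adapter so both paths show up
    esxcli iscsi networkportal add -A vmhba64 -n vmk2
    esxcli iscsi networkportal add -A vmhba64 -n vmk3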
When I boot one server, all is as expected: iSCSI connects two paths, one on each iSCSI VLAN. If I then boot the second service profile, its mgmt vNIC is fine, but iSCSI will not connect; no traffic is visible on the FI port. I can flip-flop the service profiles and each works as expected on its own, but the second one to boot never connects.
What is happening here? I don't have MAC or IP conflicts.
Extra info: the upstream switch is an HPE 5406/J99990, because some dummy bought it instead of a Nexus. I don't see anything weird in the switch logs, but it deserves mentioning; the VLAN config looks good.
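For what it's worth, these are roughly the checks I've been running on the ProCurve (the port name is a placeholder for mine):

    show logging -r
    show vlans
    show vlans ports A1 detail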
Thanks for any insight you can offer.
07-05-2017 09:04 PM
So you have a shared iSCSI LUN, and it seems that the sharing doesn't work! Are you sure this is not an ESXi and/or iSCSI storage issue? I can hardly believe this is a UCS one.
07-05-2017 09:16 PM
I would tend to agree with you, as the UCS config is not an unusual one. However, I had similar problems early on when trying iSCSI boot, and those were not shared LUNs.
I am going to try to borrow a FEX in the meantime.
The really weird part is that the management vmk interfaces are configured much the same way: two vmnics per host, one to each FI, and they have no trouble.
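For comparison, this is roughly how I'm eyeballing the teaming on both from the ESXi shell (the portgroup names are placeholders for mine):

    # mgmt portgroup: both vmnics in the team
    esxcli network vswitch standard portgroup policy failover get -p "Management Network"
    # iSCSI portgroups: one active uplink each, nothing standby
    esxcli network vswitch standard portgroup policy failover get -p iSCSI-A
    esxcli network vswitch standard portgroup policy failover get -p iSCSI-B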
07-05-2017 09:37 PM
When you booted with iSCSI, you had two separate LUNs (one per blade); how was the LUN masking/mapping done? By specifying the different originating IP addresses?
Can a LUN be accessed at the same time on controller A AND B? Or is access locked to the controller that sees the first request?
07-05-2017 09:55 PM
Correct. The iSCSI vNICs are defined in the LAN connectivity policy; each service profile has two, one in each VLAN and pinned to the appropriate FI. Once the server boots, I can copy the IQN (assigned from a pool) into the ACL on the storage for the boot volume. This worked fine, with redundant paths to the boot LUN, but only for one server: the second one to boot would fail to make a connection.
On Nimble storage systems you don't access both controllers at the same time; only one is active at a time. In the Cisco design (and mine), the storage has one interface in each VLAN, with a discovery IP and a data IP in each subnet. The IPs are clustered across the two controllers for failover.
I have used a PC to verify that I can communicate with and log into iSCSI LUNs in both subnets at the same time. Based on the pcap, it looks like the switch is dropping those ARP frames: I am SPANning the downstream interface (on the ProCurve) and see the ARP broadcasts come out of the FI, but I don't see them when capturing on an access port in the storage VLAN. BUT, this behavior only happens when a second server comes online.
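For anyone following along, the ESXi-side equivalent of that capture would be something like this (uplink/vmk names, IPs, and the output path are placeholders):

    # watch ARP (ethertype 0x0806) on the iSCSI uplink of the affected host
    pktcap-uw --uplink vmnic3 --ethtype 0x0806 -o /tmp/iscsi-arp.pcap
    # sanity-check reachability of each discovery IP from its bound vmk
    vmkping -I vmk2 10.10.1.10
    vmkping -I vmk3 10.10.2.10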
07-06-2017 05:04 AM
Check
http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/smartstack_cs300_mini_deploy.html#_Toc436119300
Something trivial must be wrong! I've done this many times (not with Nimble, however).
08-12-2017 11:41 PM
Okay, this one is done.
I borrowed a couple of SG500 switches from a friend to remove the HP from the equation. Using the SGx switches I was also able to take advantage of a bunch of new copper and SFP+ 10Gb interfaces.
But here is the big reveal:
When I was unplugging stuff from the HP, I happened to glance over at the laptop that was running a bunch of ICMP threads aimed at the VMware vmnics. As I unplugged what I knew to be the interface I had assigned to the management VLAN for that FI, the ping replies kept coming. They kept coming until I unplugged the last trunk, and then it slapped me in the face: I had no idea which SFP+ uplink the FI was using for any particular connection, because UCS just picks one. I was trying to treat those uplink ports like NIC interfaces and control them with access/trunk ports on the switch, but that's just not how this works. When the uplinks are simply trunked and the VLANs are handled on the vNICs, it all works like it should. Jacking around with native VLANs and trying to assign physical ports on the FI was just noise and unproductive headbanging.
Make a port channel with all the SFP+ interfaces, configure your vNICs, and get on with life. Maybe you can control those ports with pinning or something, but why would you want to? Let the FI do its job, and maybe read the fine manual before deciding that your HP gear hates your Cisco gear and blaming your own incompetence on that.
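If anyone lands here later: the switch side ends up being nothing fancier than a port channel trunking every UCS VLAN toward each FI, with a matching uplink port channel per fabric under LAN > LAN Cloud in UCSM. Roughly this on the SG500 side (interface numbers and VLAN IDs are placeholders for mine):

    interface port-channel 1
     switchport mode trunk
     switchport trunk allowed vlan add 100,101,102,200
    interface range gi1/1/1-2
     channel-group 1 mode auto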
Thanks for the fun ping-pong, Walter.