B200 M4 with VIC 1340 Crashing with any Windows OS

Samuel Amosah
Level 1

Since we started ordering UCS B200 M4 blades with the new VIC 1340 adapter, we have noticed that the blades intermittently crash with any Windows Server OS (2008 R2 and 2012). The blades display a BSOD stop error as attached. I suspect this is caused by the new Windows driver for the VIC adapter, but because we use iSCSI boot LUNs in our environment, no crash logs are generated, so I can't be sure. Before a blade crashes, the Windows event log shows error messages with the following text: "Storage system does not have a reachable IP address [list of storage controller IP addresses]". We do not have these problems with our M3 blades. Our environment is as follows:

 

Storage protocol: iSCSI (NetApp)

UCS version: 2.2(3d)

CIMC controller: 2.2(3d)

Adapter firmware: 4.0(1d)

Windows Driver: 2.4.0.15

Any suggestions or help would be much appreciated.

 

Thanks

(Newbie struggler)

15 Replies

Walter Dey
VIP Alumni

According to the support matrix

http://www.cisco.com/web/techdoc/ucs/interoperability/matrix/matrix.html

Windows Server 2008 R2 SP1, 2012 and 2012 R2 are supported.

I assume your ENIC driver is at the version listed below?

Adapter Driver   = 2.1.0.27 (FNIC) / 2.4.0.19 (ENIC) / 2.1.0.12 (NIC Teaming Driver)
Adapter Firmware = 4.0(1)
Boot Code / BIOS =

Note 12: iSCSI Boot is supported. Both Appliance and Uplink ports are supported unless otherwise noted

Note 25: NIC Teaming driver is not currently supported with Hyper-V

Note 38: 6100 Series Fabric Interconnects and 2100 Series Fabric Extender is not supported

Walter.

Thanks Walter. After installing the VIO drivers and utilities (2.2.3f) and running enictool -i, I can confirm the ENIC driver version is 2.4.0.19 and the driver date is 2-3-2015. I've also installed the VIC iSCSI dump tool, so I'm hoping that will help me find a solution for this issue.

 

Any other suggestions or help would be much appreciated.

 

Thanks

Sam

Further to my last message, this definitely seems to be an issue with the VIC adapter driver. With the iSCSI dump utility installed, the enic6x64.sys file is referenced in the crash dump.

 

Has anyone else had reliability issues with M4 blades with VIC 1340 and Windows operating systems?

 

Thanks

Sam

Hi Samuel

- Is this happening with multiple blades?

- Is the iSCSI array directly connected to the UCS FI?

- How long does it typically work before the crash happens?

I would nevertheless open a TAC case.

Walter.

I'm still waiting for Cisco to confirm, but it looks like this issue is being caused by bug CSCuh83588.

The details of this bug are below. Since disabling TCP segmentation offload, we have not had any further unexpected crashes. Cisco TAC have not confirmed that the bug below is the cause, but that is only because we are having problems generating a full kernel dump using the iSCSI dump driver.

The conditions for us are:

B200 M4 with VIC 1340
UCSM version 2.2(3d)
Windows 2012 (R2)/Windows Server 2008 R2 

Title: TCP segmentation offload on MLOM on windows 2012 causing vnic disconnect

Description:

Symptom:
B200 m3 server with VIC 1240 running windows 2012 OS has intermittent network loss.
The vnic disconnects and recovers automatically.

Conditions:
B200 M3 with VIC 1240
UCSM version 2.1(1a)
Windows 2012 OS

Workaround:
Disable "TCP segmentation offload" feature in the adapter policies of the vnic. You can find this under Server -> Adapter Policies. Once you have a policy with this disabled apply it to all your vNICs

Further Problem Description:

Status: Fixed

Severity: 3 Moderate

Last Modified: 07-MAY-2015

Known Affected Releases: 2.1(1a)CS9

Known Fixed Releases:

Disabling TCP segmentation offload in the adapter policies didn't work for me. I recently purchased a B200 M4 with a VIC 1340 and am trying to load Windows 2012 R2. After the OS is installed (at times the install doesn't even complete), the blade randomly reboots, goes into Repair mode, reboots again, and ultimately ends up with "The Boot Configuration Data file is missing some required information" File:\Boot\BCD\ Error Code: 0xc0000034.

We have 6 x B200 M3 VIC 1240 running Windows 2012 R2 without any issues.

UCSM - 2.2(5b)

VIC 1340 - firmware 4.0(5bS2)

Anyone else having this same issue or know of a fix?

Hi mmcgovern,

 

Can you confirm which storage protocol you are using? We had the above issue because we use iSCSI boot LUNs. Additionally please confirm which Windows driver version is installed. Cisco have resolved bug CSCuh83588 in adapter driver 3.0.0.22 (ENIC) which is part of the UCS B-Series Blade Server Software 3.0(2a) bundle.

Have you installed the iSCSI dump driver which is part of the VIO installer tools? Doing this may help you diagnose the issue.
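
When checking an installed ENIC driver against the version quoted above, note that comparing dotted versions as plain text can misorder them (for example, "2.4.0.9" sorts after "2.4.0.19" as a string). Here is a minimal sketch that compares them numerically instead. The function names are hypothetical, and the installed versions are simply the ones reported in this thread; how you obtain the installed version (for example from enictool -i output) is left out.

```python
def parse_version(text):
    """Turn a dotted version such as '2.4.0.19' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_least(installed, required):
    """Return True if the installed version is numerically >= required."""
    return parse_version(installed) >= parse_version(required)


# ENIC version quoted in this thread as containing the fix for CSCuh83588.
FIXED_ENIC = "3.0.0.22"

# Driver versions reported by posters in this thread.
for installed in ("2.4.0.15", "2.4.0.19", "3.0.0.22"):
    status = "at or above" if is_at_least(installed, FIXED_ENIC) else "below"
    print(f"ENIC {installed} is {status} {FIXED_ENIC}")
```

This only compares version numbers; it does not by itself confirm that a given build contains the fix.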

Thanks for the response Samuel.

We're using iSCSI and trying to boot from SAN. I'm on the latest stable version of the Cisco firmware and drivers. I believe the issue is temporary, intermittent connectivity loss during the boot process, which can happen either during an install or after the OS is installed. Once this occurs, the OS is left in an unstable state and becomes corrupted. For our storage we're using Nimble Storage.

Hi Michael,

How many initiators do you have in the igroup when you are building your servers? I know from experience and from other engineers you can run into strange OS corruption issues if you have 2 initiators listed and logged in during the OS build (best to add the secondary initiator after the build is complete). As a test you could try installing VMware ESXi 5.x as that does not seem to have the same OS corruption issues as Windows operating systems.

Sam

During installation, I only have one iSCSI path and one igroup. The same issue happens when I try to install ESXi, so that rules out OS/driver issues. I performed multiple OS installs on a newly created LUN without any luck, so I don't believe it is a LUN corruption issue. I even moved the blade to another slot with the same outcome, which leads me to believe it's a bad blade. Cisco is shipping me a replacement blade, so let's see what happens with that. I'll keep you posted.

Hi Sam,

 

Did Cisco ever confirm this was your issue?  I am experiencing exactly the same behavior and my environment is identical to yours.  I have applied the workaround to half my servers and am now waiting for data.  Did you upgrade the driver?

-Ian

Hi Ian,

Cisco confirmed it was related to the bug mentioned above - please see the case closure summary below for full details. Hope this helps you.

Problem Description

Windows crashes pointing to Enic driver.

*** STOP : 0x000000D1

*** enic6x64.sys - Address FFFFF88001DCF39F base at FFFFF88001DC4000, datestamp 54d1623f
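
For anyone reading a similar stop screen: bug check 0x000000D1 is DRIVER_IRQL_NOT_LESS_OR_EQUAL, and the module line above can be decoded a little further. The sketch below is illustrative only. It assumes the line follows the layout shown above and that the datestamp is the driver image's link timestamp in seconds since the Unix epoch; the function name is hypothetical.

```python
import re
from datetime import datetime, timezone

# The module line exactly as quoted in the case summary above.
STOP_LINE = ("*** enic6x64.sys - Address FFFFF88001DCF39F "
             "base at FFFFF88001DC4000, datestamp 54d1623f")

PATTERN = re.compile(
    r"\*\*\*\s+(?P<module>\S+)\s+-\s+Address\s+(?P<addr>[0-9A-Fa-f]+)\s+"
    r"base at\s+(?P<base>[0-9A-Fa-f]+),\s+datestamp\s+(?P<stamp>[0-9A-Fa-f]+)"
)


def decode_stop_line(line):
    """Extract module name, fault offset, and build time from a STOP line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognised STOP module line")
    addr = int(match["addr"], 16)
    base = int(match["base"], 16)
    stamp = int(match["stamp"], 16)  # assumed: seconds since the Unix epoch
    return {
        "module": match["module"],
        "offset": addr - base,  # faulting address relative to the image base
        "built_utc": datetime.fromtimestamp(stamp, tz=timezone.utc),
    }


info = decode_stop_line(STOP_LINE)
print(info["module"], hex(info["offset"]), info["built_utc"].isoformat())
# prints: enic6x64.sys 0xb39f 2015-02-04T00:05:19+00:00
```

Under that assumption, the build time falls in early February 2015, which lines up with the driver date reported earlier in the thread for ENIC 2.4.0.19.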

 

Action Taken

 

 OS: Windows 2008 R2 SP1

B200 M4

UCS version: 2.2(3d)

CIMC controller: 2.2(3d)

Adapter firmware: 4.0(1d)

Adapter Driver   = 2.1.0.27 (FNIC) / 2.4.0.19 (ENIC) / 2.1.0.12 (NIC Teaming Driver)

Chassis SEL:

# B6 03 00 00 01 02 00 00 91 66 3A 55 41 00 04 20 00 00 00 00 6F 01 FF FF # 3b6 | 04/24/2015 16:51:45 | System Mgmt Software | OS stop/shutdown #0x00 | Run-time critical stop | Asserted

Chassis Mezz logs:

No aborts

150424-16:51:36.835584 mcp.int13 int13_initialize invoked on vnic 14

150424-16:51:36.835831 mcp.iscsi_boot ERROR: [vnic14]: Failed to get isnic

150424-16:51:36.837873 mcp.iscsi_boot ERROR: [vnic14]: Failed to get isnic

150424-16:51:36.838118 mcp.iscsi_boot ERROR: [vnic14]: Failed to get isnic

150424-16:51:36.838351 mcp.iscsi_boot ERROR: [vnic14]: Failed to get isnic

 

150424-16:51:39.113701 mcp.iscsi_boot iscsi_boot_target_login_cb New iSCSI node [tcp:[hw=,ip=,net_if=eth4,iscsi_if=eth4] 10.0.158.184,3260,-1 iqn.1992-08.com.netapp:sn.1873818464:vf.71d4c4ea-982a-11e1-ae7f-00a098215f93] added

150424-16:51:39.114200 mcp.iscsi_boot vnic[14] New iSCSI node [tcp:[hw=,ip=,net_if=eth4,iscsi_if=eth4] 10.0.158.184,3260,-1 iqn.1992-
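
If you need to sift through longer adapter logs like the excerpt above, a small script can tally the repeated errors. This is a rough sketch, assuming each line starts with a YYMMDD-HH:MM:SS.microseconds timestamp followed by a component name, as the excerpt suggests; the function names are hypothetical and the sample lines are copied from the log above.

```python
import re
from collections import Counter
from datetime import datetime

# Sample lines copied from the mezz log excerpt above.
LOG = """\
150424-16:51:36.835584 mcp.int13 int13_initialize invoked on vnic 14
150424-16:51:36.835831 mcp.iscsi_boot ERROR: [vnic14]: Failed to get isnic
150424-16:51:36.837873 mcp.iscsi_boot ERROR: [vnic14]: Failed to get isnic
150424-16:51:36.838118 mcp.iscsi_boot ERROR: [vnic14]: Failed to get isnic
150424-16:51:36.838351 mcp.iscsi_boot ERROR: [vnic14]: Failed to get isnic
"""

LINE = re.compile(
    r"^(?P<ts>\d{6}-\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<component>\S+)\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return (timestamp, component, message), or None if the line differs."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    # Assumed timestamp layout: two-digit year, month, day, then time of day.
    stamp = datetime.strptime(match["ts"], "%y%m%d-%H:%M:%S.%f")
    return stamp, match["component"], match["message"]


def count_errors(text):
    """Count identical ERROR messages per component."""
    counts = Counter()
    for raw in text.splitlines():
        parsed = parse_line(raw)
        if parsed and parsed[2].startswith("ERROR:"):
            counts[(parsed[1], parsed[2])] += 1
    return counts


for (component, message), total in count_errors(LOG).items():
    print(f"{total}x {component} {message}")
```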

Bug suspected: CSCuh83588

But we could not get the Kernel trace to confirm the Bug.

Involved the next level of escalation support.

Confirmed it is the same Bug. And Development team forwarded the driver with the fix.

It is version 2.4.0.21.

Tested it out on a test machine, and it was monitored for 3-5 weeks.

 

Resolution Summary

 

Provided the Driver with the Fixed version for the bug and the issue is fixed.

 

Proceeding to close the SR as we did not have any updates after multiple follow ups.

 

Just a note here to others who may be experiencing this issue: 

I applied the same workaround described in CSCuh83588 of disabling TCP segmentation offload on the iSCSI boot adapters.  This worked to resolve the issue for me, and I monitored the environment for 10 days. 

After consulting with Cisco, they instructed me to install the driver from package 2.2.3(h), which includes ENIC driver 3.0.0.22.

After updating the driver and removing the workaround, the issue returned for me.  I am re-implementing the workaround with the updated driver and will see if the issue goes away again.
