12-10-2013 11:34 AM - edited 03-01-2019 11:24 AM
We are running UCS with some blades running VMware. The VMs are installed on SAN storage on the backend. We are in the process of upgrading VMware from 5.0 to 5.5. The problem we ran into is that the hosts that have been upgraded to 5.5 cannot see all of the storage.
On the hosts running 5.5 we can see 5 of the datastores, but not the other 2. On the hosts running 5.0 we can see all 7 datastores.
If I select a host running 5.5 and go to Configuration -> Storage, the Datastores view shows only 5 of them, but if I click the Devices tab next to it I can see the other 2 datastores sitting there, with no way to mount them.
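For anyone hitting the same symptom (LUN visible under Devices but no datastore), one thing worth checking before blaming firmware is whether ESXi is treating the volumes as unresolved VMFS snapshot copies and refusing to auto-mount them. A hedged sketch of the check; it only runs on the ESXi host shell, and the datastore label below is a placeholder:

```shell
# List VMFS volumes that ESXi detected but did not auto-mount
# (it skips volumes it considers snapshot/replica copies)
esxcli storage vmfs snapshot list

# If the missing datastores appear above, mount one by its label,
# keeping the existing signature ("Datastore6" is a placeholder)
esxcli storage vmfs snapshot mount -l "Datastore6"
```

No guarantee this is the cause here, but it is a quick, non-destructive way to rule one possibility in or out.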
Does anyone have any clue as to why this happens or how to resolve it? TAC says we need to upgrade the firmware on the blades, but that seems really odd considering VMware can see some of the storage but not all of it.
12-11-2013 12:26 AM
The situation is dependent on multiple factors (sizes, switching/NPV, ...); we have seen things like this in the past.
I would be curious to see what pops up in vmkernel log when you rescan the storage devices.
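For reference, a minimal way to trigger that rescan and watch the log at the same time, using standard ESXi host-shell commands:

```shell
# Rescan all HBAs so the host re-probes the LUNs
esxcli storage core adapter rescan --all

# Watch the kernel log for SCSI sense data while the rescan runs
tail -f /var/log/vmkernel.log | grep -i "sense data"
```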
Firmware upgrade might be a good idea regardless ;-)
12-11-2013 07:53 AM
Thanks for the reply. I know upgrading the firmware would be a good idea, as we are about 5 releases behind. The problem is we have bare metal machines on the chassis running 24/7, so taking the downtime is very difficult.
The storage is carved up into 2 TB LUNs. They share the same VSAN through the Cisco switches as the other storage, and they are zoned the same way as well. As I stated, they show up just fine on the hosts running 5.0, just not on the upgraded 5.5 hosts. I also see the storage under the storage view.
Checking vmkernel.log after running a rescan, the screenshots below show the errors I saw pertaining to those 2 datastores. Sorry, I couldn't find a way to pull just the text from the KVM console, so I took screenshots. There were a few more screens with the same errors, but I figured this was enough.
12-11-2013 08:48 AM
Looking at the SCSI calls, the request was illegal (sense key 0x5), but the ASC/ASCQ qualifiers are not ones I'm aware of.
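As background for anyone reading along: the 0x5 is the SCSI sense key, and sense keys have fixed meanings in the SPC standard. A small sketch of the lookup in plain POSIX shell, nothing host-specific:

```shell
# Map a SCSI sense key (the first byte of "Valid sense data: 0x5 ...")
# to its standard meaning per the SPC specification
decode_sense_key() {
  case "$1" in
    0x0) echo "NO SENSE" ;;
    0x1) echo "RECOVERED ERROR" ;;
    0x2) echo "NOT READY" ;;
    0x3) echo "MEDIUM ERROR" ;;
    0x4) echo "HARDWARE ERROR" ;;
    0x5) echo "ILLEGAL REQUEST" ;;
    0x6) echo "UNIT ATTENTION" ;;
    0x7) echo "DATA PROTECT" ;;
    0xb|0xB) echo "ABORTED COMMAND" ;;
    *)   echo "UNKNOWN ($1)" ;;
  esac
}

decode_sense_key 0x5   # prints: ILLEGAL REQUEST
```

ILLEGAL REQUEST means the target rejected the command or one of its parameters, which fits a compatibility mismatch better than a link-level problem.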
I'd be curious what the TAC folks were looking into.
I'd typically check what's going on the other side ...
As to why you might be seeing them as disabled in ESX 5.5 only, it COULD possibly be due to:
Just my 2c.
12-11-2013 09:20 AM
Thanks for the reply and links. I looked through them. But it seems none of them really point to the cause or a resolution.
As a note, the TAC folks just asked what firmware my servers were on and said it was firmware; that was the extent of their digging. The fact that the 5.5 hosts can see the storage under Devices and the 5.0 hosts can mount it really has me doubting it's firmware. I could see it being firmware if they didn't see the storage at all.
12-12-2013 02:00 AM
Well, you can hardly expect a problem to be solved based on two screenshots; I wish it were that easy :-)
In this particular case I share your doubts about the firmware upgrade, but I'd like to have a look at the TAC case (can you share the number?).
The SCSI calls indicate a compatibility problem between two components. One of the possibilities is that ESX 5.5 is using a newer framework which is not yet implemented in your version of the firmware, hence the illegal operation code in SCSI.
12-12-2013 09:02 AM
I sent you a PM with the case #. I assume we can continue discussion here but wanted to keep the case private.
12-13-2013 04:03 AM
Got the message. I have to say, I'm not surprised about their reaction. The support matrices are there for a reason.
The problem is that even if they started digging into it, it would still be unsupported at the end of the day, but your question about a possible workaround is one I find fair.
I can't make any promises as far as workaround goes but I'd start by checking VMware side.
If you were hoping for a simple answer, I'm afraid I cannot provide one. Below are a few ideas for narrowing down the problem. Obviously, no guarantees.
I am not sure what you have tried so far, or whether it is safe to perform these actions in your environment at this stage.
We'd need to understand the difference/order of the datastores.
What datastores are there at the moment?
Are only the first 5 connecting, or is it always the same specific ones?
What type are those datastores, FC or iSCSI?
In terms of practical things to try, deleting/unmounting the datastores and re-adding them would be interesting.
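If it helps, that unmount/remount cycle can also be driven from the host shell rather than the vSphere client. A hedged sketch with a placeholder label; make sure nothing is running from the datastore first:

```shell
# Show current filesystems and their mount state
esxcli storage filesystem list

# Unmount a datastore by label, rescan, then mount it again
# ("Datastore6" is a placeholder for the real label)
esxcli storage filesystem unmount -l "Datastore6"
esxcli storage core adapter rescan --all
esxcli storage filesystem mount -l "Datastore6"
```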
12-13-2013 06:12 AM
Thanks for the reply again. All of our storage is FC. It is always the same 5 datastores that show up; the 2 that are missing are new storage that we added. Our storage provider said we should upgrade our MDS switches from 5.0.1 to 6.2.3 and see if that fixes the issue.
My question to the TAC engineers on our case was whether upgrading the MDS switches would have any impact on the UCS side; they were unsure and told me to ask the MDS TAC about that. I just want to make sure that if I update the code on the MDS switches it won't break anything on the UCS side, because the UCS is on older code as well.
12-14-2013 04:56 AM
Not my area of expertise :/
Let me see if I can find somebody from our SAN team to pitch in or at least comment offline.
12-16-2013 02:01 AM
Spoke with the folks in our SAN/SV team.
They would not say for sure, since there's not much to go on, but they would point to a UCS problem rather than FC.
Considering that we do get a SCSI reply, it SEEMS to imply the FC layer underneath is fine.
They would rather point to a firmware problem (when I hinted at that).
BTW, you can ask the UCS folks to open a collaboration with the MDS folks (technically speaking you should not have to; it CAN be done in the background). Get more people involved.
Again, this one IS tricky, since we're in an unsupported scenario, so TAC can only work on a best-effort basis, without much help from the business unit side.
12-16-2013 06:44 AM
Thanks for doing so much digging for me on this, much appreciated. We have been working with our SAN support vendor as well. They recommended we upgrade the MDS, and I opened a TAC case with the Cisco SAN team and got all the info I needed on that.
Upgrading the SAN firmware creates the least disruption to our data center, so I think we are going to go that route first and see if it solves the problem. If not, I think we are left with upgrading the entire UCS system.
Does upgrading the UCS firmware from 2.0.3 to 2.1.3a upgrade the server firmware as well? Or does it just do the FIs and chassis?
12-16-2013 07:25 AM
What were you planning to do exactly?
SOME cross-version support exists, see for example "Cross-Version Firmware Support".
But if you jump directly you will need to bring everything up to the same software version.
12-16-2013 07:43 AM
Well, according to the TAC technician on the case, in order for us to work properly with VMware 5.5 we need to upgrade the firmware to the latest. So I assume that means UCS from 2.0.3 to 2.1.3, or am I incorrect in that?
12-16-2013 09:24 AM
I guess I overcomplicated what I tried to say :-)
My digression about cross-versions support was not really adding much to our conversation.
Yes, to be back in the "supported" bracket you need to get to 2.1.3 from your 2.0.3. There is no way to avoid a service interruption, AFAIK.