I recently purchased a UCS C220 bundle, which includes a Cisco UCS RAID SAS 2008M-8i Mezzanine Card. I'm planning on deploying this as a standalone server running XenServer 6.2 in the near future.
I'm happy with the unit and testing has gone well, except for one aspect: the disk IO throughput seems to fall far short of what I expected. I have a desktop PC with an Intel DZ77GA-70K motherboard as a lab spare, and the disk IO I can achieve from that PC with the same disks consistently and repeatably exceeds what the C220 can achieve.
The testing I am doing is based around two benchmarks:
1. A 400G file copy between the two machines, over the network back-to-back to note the maximum sustained throughput, and
2. A series of 'sysbench' runs testing local IO, covering sequential and random reads and writes, with the command: "sysbench --test=fileio --file-total-size=150G --file-test-mode=seqrd --init-rng=on --max-time=300 --max-requests=0 run"
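For anyone wanting to repeat benchmark #2 across all four access patterns, here is a minimal sketch that only builds the sysbench command lines and does not execute anything. The "run" line reproduces the options quoted above. The "prepare" and "cleanup" stages, the mode list and the helper name sysbench_fileio_commands are assumptions for illustration and are not taken from the post.

```python
# Sketch only: builds sysbench fileio command lines, does not run them.
# Uses the same (older, --test=fileio style) syntax as the command in the post.

# Assumed set of modes: sequential/random reads and writes.
FILEIO_MODES = ["seqrd", "seqwr", "rndrd", "rndwr"]


def sysbench_fileio_commands(mode, total_size="150G", max_time=300):
    """Return prepare/run/cleanup argument lists for one fileio test mode.

    Hypothetical helper; the prepare and cleanup stages are an assumption
    about how the test files are created and removed around each run.
    """
    base = ["sysbench", "--test=fileio", f"--file-total-size={total_size}"]
    run_opts = [
        f"--file-test-mode={mode}",
        "--init-rng=on",
        f"--max-time={max_time}",
        "--max-requests=0",
    ]
    return {
        "prepare": base + ["prepare"],
        "run": base + run_opts + ["run"],
        "cleanup": base + ["cleanup"],
    }


if __name__ == "__main__":
    for mode in FILEIO_MODES:
        for stage, argv in sysbench_fileio_commands(mode).items():
            print(f"# {mode} / {stage}")
            print(" ".join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; the output can be pasted into a shell on the machine under test.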
For #1, I run a copy to/from an HP MicroServer Gen8, which has 12TB of disk space on it in a Linux RAID0 configuration (4x3TB Seagate drives). If I copy files to or from this HP server to the Intel DZ77GA-70K, I can easily saturate the 1G network, achieving a sustained 960MBit/s for an hour or more at a time. If I then take the exact same SATA disks from the DZ77GA-70K, connect them to the UCS box and do the exact same network copy with the exact same OS, I'm only able to get around 400-500MBit/sec of sustained throughput.
For #2, the results, which are entirely local to the C220, come in at around 105-110 MByte/sec on a sequential read, dropping to around 2 MByte/sec on a random read or write test. The enormous drop is no surprise, because random reads/writes are a pretty tough IO load, but I would expect sequential reads to be much better. I can get consistent sysbench seqrd results from the MicroServer of around 300 MByte/sec, for example.
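Since the network figures are quoted in MBit/s and the sysbench figures in MByte/s, a quick conversion helps put them side by side. This is a rough sketch that assumes decimal units (8 Mbit per MByte, 1000 MByte per GByte), treats the "400G" file as 400 GByte, and ignores protocol overhead; the function names are made up for illustration.

```python
# Rough unit conversion for the throughput figures quoted in this thread.
# Assumptions: decimal units, no allowance for protocol overhead.

def mbit_to_mbyte(mbit_per_s):
    """Convert a rate in Mbit/s to MByte/s."""
    return mbit_per_s / 8.0


def copy_seconds(size_gbyte, mbit_per_s):
    """Estimate seconds needed to copy size_gbyte at a given Mbit/s rate."""
    return size_gbyte * 1000.0 / mbit_to_mbyte(mbit_per_s)


if __name__ == "__main__":
    # Network copy rates reported in the post, in Mbit/s.
    for label, rate in [("desktop board", 960), ("C220 low", 400), ("C220 high", 500)]:
        mbyte = mbit_to_mbyte(rate)
        minutes = copy_seconds(400, rate) / 60.0
        print(f"{label}: {rate} Mbit/s = {mbyte:.1f} MByte/s, "
              f"400 GByte copy takes about {minutes:.0f} min")
```

Under these assumptions 960 MBit/s is 120 MByte/s and 400-500 MBit/s is 50-62.5 MByte/s, so the network copy on the C220 is running well below even its own local sequential read figure of 105-110 MByte/sec.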
I can consistently replicate this with Redhat 6.5 and Gentoo (running the latest Linux kernel), as well as from a Xen 6.2SP1 Hypervisor install on the C220 (tested from the Dom0 domain itself, as well as a Linux guest), all 64 bit. Jumbo frames are enabled end-to-end, and CPU is not bottlenecking. The latest firmware is installed on all components. The ucs-cxxx-drivers.1.5.4a.iso image states that for the Redhat and Xen systems the required drivers are included in the OS, so I don't need to worry about installing them separately. Presumably the Gentoo system has even newer drivers because it has a very new kernel, but alas the throughput is the same on all of those systems.
I have tried with SATA as well as a SAS drive, and the test results are also practically the same. All disks in all servers are Seagate 6.0 Gb/s units, and none of the servers are swapping to disk at any stage.
I am happy with network IO - I can completely saturate the 1G ports easily, and I'm convinced that's not a part of the problem here.
What could cause this sort of performance? Storage card logs in CIMC don't indicate anything is wrong and none of the OSes are indicating issues of any sort, but something certainly doesn't seem right: I'm getting significantly better performance from a desktop motherboard and the MicroServer than from an enterprise-grade server when testing with the exact same hard drives.
- Is the 2008M-8i card considered a low-end RAID card or should I be getting reasonable throughput from it? I was anticipating performance at least as good as a desktop motherboard, but this doesn't seem to be the case. The RAID card as a component is more expensive than an entire MicroServer or Intel Motherboard so it should run much better, yes?
- What sort of performance should I expect out of this card on a single sequential read or write?
- Can this RAID card run drives just as JBODs, or do all disks have to be initialised in an array (even if just a RAID0 array with 1 disk)? It seems that if they are added to the server they do not show up to any OS until they are initialised as part of an array, although I haven't delved into the BIOS settings of the card itself (only from CIMC so far).
- I recall seeing something about a best practice of having two virtual drives on these cards. What is the impact of running more, given the card certainly allows more to be created? (I currently have 4 while I am testing.)
- I noticed on Cacti graphs while rebuilding a RAID1 array that the CPU ran hotter while the array was being rebuilt, and cooled down once the rebuild had completed, which indicates the rebuild was using up CPU on the host hardware. Should this not have been entirely transparent to the system if the RAID activity is offloaded to this card, or is an increase in CPU to be expected?
I'm very keen to find out others' experiences of this card, what people have done to get good throughput out of it, or if I should go back to a whitebox server with an Intel board :-)
The 2008 Mezz RAID controller does not have an onboard cache and is a (relatively) low-end controller.
The following white paper provides details on benchmark results for LSI cards under different IO workloads.
Interesting. I realise the 2008M-8i isn't a high end card but it's a factor of 25 or 30 worse than the PCI-E cards in the whitepaper.
I've also just noticed that the Linux megaraid_sas driver (or maybe it is the card) is turning off the read-cache and write-cache on the actual drives themselves. I might re-enable those and try retesting again.
Turning on the read and write caches on the drives made only a slight positive difference. In a day or two I'll be able to test a RAID5 array and see how that goes also.
Padma - are there any benchmarks of the 2008M-8i card or even the built in RAID on the boards? The document above is focussed on the LSI-9266/85 cards. There seems to be almost no documentation on the 2008M-8i on CCO other than how to install it.
Thanks for taking the time to answer my questions. Much appreciated!
I can't now find the document which refers to 2 virtual drives being best practice; if I do, I'll respond.
As for the performance of the 2008M-8i: yes, I've been able to achieve 1Gbps (about 125 MByte/sec) on sequential read testing, so I guess the card is working within spec. The write speeds are much poorer than this, but I imagine this is because the card has no write cache.