Well, unfortunately I didn't run any more tests. I wish I had had a whole day to run tests, but I didn't.
I'm thinking, though, that if I lowered the block size to 512B and used a full 4Gb FC link between two machines, over 100,000 random I/Os per second wouldn't be out of the question, as long as everything fits in the cache. We have some other motherboard options for our standard box, too, with 16 DIMM slots, which would let us cheaply add 64GB of ECC RAM. There are also relatively inexpensive boxes out there (even 1U's) with 32 slots, allowing a cheap 128GB of ECC RAM. We could market these as disk-backed memory appliances or database-acceleration boxes if there were a way to keep everything in cache (though once the database starts up, everything important ends up in cache anyway, so that's not really a problem). With two of them running autofailover and cache-coherent DRBD, we'd effectively have a highly-available memory appliance for much less money than the less-reliable (no autofailover) proprietary memory appliances.
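As a rough sanity check on that 100,000 figure (the usable-bandwidth number below is my own assumption, not something I measured), the wire itself is nowhere near the limit at 512B blocks; the real constraint would be per-I/O latency and queue depth on the target:

# Back-of-the-envelope: is 100k random IOPS at 512B limited by a 4Gb FC link?
# The usable-payload figure is an assumption, not a measurement.
FC_4G_PAYLOAD_MBPS = 400          # ~400 MB/s usable payload on 4Gb FC (8b/10b encoding)
BLOCK_SIZE = 512                  # bytes per I/O
TARGET_IOPS = 100_000

bandwidth_needed = TARGET_IOPS * BLOCK_SIZE / 1e6          # MB/s
wire_limited_iops = FC_4G_PAYLOAD_MBPS * 1e6 / BLOCK_SIZE  # ceiling if the wire were the only limit
print(f"Bandwidth needed for {TARGET_IOPS:,} IOPS @ {BLOCK_SIZE}B: {bandwidth_needed:.0f} MB/s")
print(f"Wire-limited ceiling: {wire_limited_iops:,.0f} IOPS")
# -> ~51 MB/s needed vs a ~780,000 IOPS wire ceiling, so the link itself
#    isn't the bottleneck at that block size.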
Except for the fibre-channel autofailover, we could basically do this right now. Granted, latency would be much lower if DRBD worked over Infiniband (real RDMA, not just IPoIB). But for customers who don't care too much about high availability or about keeping track of every single write operation, this would work fine.
Great. If I were doing a strict cost/benefit analysis, the 1TB SATA drives would probably come out a little ahead (since they're so cheap), but when a little extra performance matters and you can't afford the ridiculous prices of 10k/15k drives, the SAS drives work very well. Besides, we've been working with Seagate for well over a decade.
Overall, the 1TB SAS drives work very well and are much cheaper than any other SAS drive. They also have 32MB of cache, which is more than other SAS drives.
Well, our distributor had already flashed the newer (fixed) firmware, so we didn't run into that problem at all. I had been prepared to flash it myself, but it was already done. We haven't suffered any drive failures from the firmware issue. Things like this happen to every manufacturer once in a while, and it isn't nearly as severe as, say, the problems we had with the 400GB Hitachi "Deathstar" drives. We still like Seagate a lot.
Alright, so I've been doing some testing on a smaller system using FC. Unfortunately, the initiator side still has only a 1Gb FC card on a 32-bit/33MHz PCI bus, but at least that gives me a way to compare performance against the DSS FC system we've already installed at our customer's site.
Basically, random read IOPS are about 14,000 from the cache. The DSS target has only 1GB of system memory, so I used a smaller test file of only 200MB. This DSS target is using software RAID 5 with 3 drives, so there's not much performance coming from the drives themselves.
Target side:
DSS b3278 32-bit
Dual-core Pentium 4 (NetBurst) at 3GHz (no Hyper-Threading). 1GB of system RAM. 3x 500GB 7200rpm Barracuda SATA drives (I think they have 32MB of cache each, otherwise it's 16MB). Dual-port 4Gb QLogic Fibre Channel card in a PCIe x4 slot. 10GB FC volume with 4kB block size, default options (i.e. write-back enabled, not write-through).
Initiator side:
Windows 2003 Server.
Pentium 4 at 2GHz (NetBurst crap). 1GB system RAM. Some really old single-port 1Gb QLogic Fibre Channel card plugged into a 32-bit/33MHz PCI slot.
Or something like that. I have to run the test for quite a while beforehand to make sure the whole test file is in the DSS's system cache.
Anyway, I consistently get about 13,980 IOPS or so, which is within 5% of the much higher-spec'd system I used before. The difference there was that it had four times as many cores (and they were the Core-based Xeons, which probably had a much faster FSB and DDR2 instead of DDR).
So it seems that ~14,000 random read IOPS is about what to expect with 1Gb FC, 4kB blocks, and a test file that fits entirely in the system cache. When I use a larger test file that doesn't fit in the cache, the random read IOPS on this system drop to around 150-300.
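Those two numbers hang together if you do the arithmetic (the usable-bandwidth and per-drive IOPS figures below are assumptions on my part, not measurements):

# Plausibility check on the cached vs. uncached numbers above.
CACHED_IOPS = 14_000
BLOCK = 4096                      # 4kB test block size
FC_1G_PAYLOAD_MBPS = 100          # roughly the usable payload on 1Gb FC (assumed)

cached_mbps = CACHED_IOPS * BLOCK / 1e6
print(f"Cached: {cached_mbps:.0f} MB/s of a ~{FC_1G_PAYLOAD_MBPS} MB/s link")
print(f"Average service time per I/O: {1e6 / CACHED_IOPS:.0f} us")

# Uncached: every read misses the cache and hits the 3-drive RAID 5.
DRIVES = 3
PER_DRIVE_RANDOM_IOPS = (75, 100)   # typical range for a 7200rpm SATA drive (assumed)
low, high = (DRIVES * x for x in PER_DRIVE_RANDOM_IOPS)
print(f"Expected uncached random reads: roughly {low}-{high} IOPS")
# -> same ballpark as the 150-300 IOPS I see once the test file no longer
#    fits in the DSS's cache.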
So, if I built a system with, say, 128GB of system memory in 1U, I could get a usable high-performance SAN without having to fill a rack with hundreds of short-stroked 15K SAS drives. The only bad part so far is that I'd have to wait for the cache to fill before I could expect high performance.
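For a sense of how long that wait could be (both rates below are made-up illustration figures, not measurements from this box):

# Rough estimate of cache warm-up time for a 128GB box.
CACHE_BYTES = 128 * 1024**3       # 128GB of system RAM used as cache
SEQ_FILL_MBPS = 500               # assumed sequential read rate of the backing array
RANDOM_MISS_IOPS = 1_000          # assumed rate if the cache warms via random demand misses
MISS_BLOCK = 4096

seq_minutes = CACHE_BYTES / (SEQ_FILL_MBPS * 1e6) / 60
rand_hours = CACHE_BYTES / MISS_BLOCK / RANDOM_MISS_IOPS / 3600
print(f"Sequential prefetch fill: ~{seq_minutes:.0f} min")
print(f"Demand-miss (random 4kB) fill: ~{rand_hours:.0f} h")
# Warming by random misses is far slower, which is why the "wait for the
# cache to fill" caveat matters for this kind of appliance.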
PS: we have another system coming in that's identical to the DSS target (i.e. dual-core Pentium 4, etc.), so I should be able to test how well this works with volume replication.
Stay tuned for write performance and performance using iSCSI!