Well, unfortunately I didn't run any more tests. I wish I had had a whole day to run tests, but I didn't.
I'm thinking, though, that if I lowered the block size to 512B and used a full 4Gb FC link between two machines, over 100,000 random I/Os per second wouldn't be out of the question, as long as everything fits in the cache. We have some other motherboard options for our standard box, too, with 16 DIMM slots, which would let us cheaply add 64GB of ECC RAM. There are also relatively inexpensive boxes out there (even 1U's) with 32 slots, allowing a cheap 128GB of ECC RAM. We could market these as disk-backed memory appliances or database-acceleration boxes if there were a way to keep everything in cache (though once the database starts up, everything important ends up in cache anyway, so that's not really a problem). With two of them running autofailover and cache-coherent DRBD, we'd effectively have a highly-available memory appliance for much less money than the less-reliable (no autofailover) proprietary memory appliances.
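As a rough sanity check on that 100,000 figure (the usable-bandwidth number below is my own assumption, not something I measured), the wire itself is nowhere near the limit at 512B blocks; the real constraint would be per-I/O latency and queue depth on the target:

# Back-of-the-envelope: is 100k random IOPS at 512B limited by a 4Gb FC link?
# The usable-payload figure is an assumption, not a measurement.
FC_4G_PAYLOAD_MBPS = 400          # ~400 MB/s usable payload on 4Gb FC (8b/10b encoding)
BLOCK_SIZE = 512                  # bytes per I/O
TARGET_IOPS = 100_000

bandwidth_needed = TARGET_IOPS * BLOCK_SIZE / 1e6          # MB/s
wire_limited_iops = FC_4G_PAYLOAD_MBPS * 1e6 / BLOCK_SIZE  # ceiling if the wire were the only limit
print(f"Bandwidth needed for {TARGET_IOPS:,} IOPS @ {BLOCK_SIZE}B: {bandwidth_needed:.0f} MB/s")
print(f"Wire-limited ceiling: {wire_limited_iops:,.0f} IOPS")
# -> ~51 MB/s needed vs a ~780,000 IOPS wire ceiling, so the link itself
#    isn't the bottleneck at that block size.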
Except for the fibre-channel autofailover, we could basically do this right now. Granted, latency would be much lower if DRBD worked over Infiniband (real RDMA, not just IPoIB). But for customers who don't care too much about high availability or about keeping track of every single write operation, this would work fine.
Great. If I were doing a strict cost/benefit analysis, the 1TB SATA drives would probably come out a little ahead (since they're so cheap), but when a little extra performance matters and you can't afford the ridiculous prices of 10k/15k drives, the SAS drives work very well. Besides, we've been working with Seagate for well over a decade.
Overall, the 1TB SAS drives work very well and are much cheaper than any other SAS drive. They also have 32MB of cache, which is more than other SAS drives.
Well, our distributor had already flashed the newer (fixed) firmware, so we didn't run into that problem at all. I had been prepared to flash it myself, but it was already done. We haven't suffered any drive failures from the firmware issue. Things like this happen to every manufacturer once in a while, and it isn't nearly as severe as, say, the problems we had with the 400GB Hitachi "Deathstar" drives. We still like Seagate a lot.
Alright, so I've been doing some testing on a smaller system using FC. Unfortunately, the initiator side still has only a 1Gb FC card on a 32-bit/33MHz PCI bus, but at least that gives me a way to compare performance against the DSS FC system we've already installed at our customer's site.
Basically, random read IOPS are about 14,000 from the cache. The DSS target has only 1GB of system memory, so I used a smaller test file of only 200MB. This DSS target is using software RAID 5 with 3 drives, so there's not much performance coming from the drives themselves.
Target side:
DSS b3278 32-bit
Dual-core Pentium 4 (NetBurst) at 3GHz (no Hyper-Threading). 1GB of system RAM. 3x 500GB 7200rpm Barracuda SATA drives (I think they have 32MB of cache each, otherwise it's 16MB). Dual-port 4Gb QLogic Fibre Channel card in a PCIe x4 slot. 10GB FC volume with 4kB block size, default options (i.e. write-back enabled, not write-through).
Initiator side:
Windows 2003 Server.
Pentium 4 at 2GHz (NetBurst crap). 1GB system RAM. Some really old single-port 1Gb QLogic Fibre Channel card plugged into a 32-bit/33MHz PCI slot.
Or something like that. I have to run the test for quite a while beforehand to make sure the whole test file is in the DSS's system cache.
Anyway, I consistently get about 13,980 IOPS or so, which is within 5% of the much higher-spec'd system I used before. The difference there was that it had four times as many cores (and they were the Core-based Xeons, which probably had a much faster FSB and DDR2 instead of DDR).
So it seems that ~14,000 random read IOPS is about what to expect with 1Gb FC, 4kB blocks, and a test file that fits entirely in the system cache. When I use a larger test file that doesn't fit in the cache, the random read IOPS on this system drop to around 150-300.
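Those two numbers hang together if you do the arithmetic (the usable-bandwidth and per-drive IOPS figures below are assumptions on my part, not measurements):

# Plausibility check on the cached vs. uncached numbers above.
CACHED_IOPS = 14_000
BLOCK = 4096                      # 4kB test block size
FC_1G_PAYLOAD_MBPS = 100          # roughly the usable payload on 1Gb FC (assumed)

cached_mbps = CACHED_IOPS * BLOCK / 1e6
print(f"Cached: {cached_mbps:.0f} MB/s of a ~{FC_1G_PAYLOAD_MBPS} MB/s link")
print(f"Average service time per I/O: {1e6 / CACHED_IOPS:.0f} us")

# Uncached: every read misses the cache and hits the 3-drive RAID 5.
DRIVES = 3
PER_DRIVE_RANDOM_IOPS = (75, 100)   # typical range for a 7200rpm SATA drive (assumed)
low, high = (DRIVES * x for x in PER_DRIVE_RANDOM_IOPS)
print(f"Expected uncached random reads: roughly {low}-{high} IOPS")
# -> same ballpark as the 150-300 IOPS I see once the test file no longer
#    fits in the DSS's cache.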
So, if I built a system with, say, 128GB of system memory in 1U, I could get a usable high-performance SAN without having to fill a rack with hundreds of short-stroked 15K SAS drives. The only bad part so far is that I'd have to wait for the cache to fill before I could expect high performance.
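For a sense of how long that wait could be (both rates below are made-up illustration figures, not measurements from this box):

# Rough estimate of cache warm-up time for a 128GB box.
CACHE_BYTES = 128 * 1024**3       # 128GB of system RAM used as cache
SEQ_FILL_MBPS = 500               # assumed sequential read rate of the backing array
RANDOM_MISS_IOPS = 1_000          # assumed rate if the cache warms via random demand misses
MISS_BLOCK = 4096

seq_minutes = CACHE_BYTES / (SEQ_FILL_MBPS * 1e6) / 60
rand_hours = CACHE_BYTES / MISS_BLOCK / RANDOM_MISS_IOPS / 3600
print(f"Sequential prefetch fill: ~{seq_minutes:.0f} min")
print(f"Demand-miss (random 4kB) fill: ~{rand_hours:.0f} h")
# Warming by random misses is far slower, which is why the "wait for the
# cache to fill" caveat matters for this kind of appliance.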
PS: we have another system coming in that's identical to the DSS target (i.e. dual-core Pentium 4, etc.), so I should be able to test how well this works with volume replication.
Stay tuned for write performance and performance using iSCSI!