
Thread: Another couple orders of magnitude performance: RDMA with Infiniband or 10Gbe

  1. #1

    Lightbulb

    I have been thinking about this (and experimenting on Debian Linux on our little network), and the perfect use for this sort of thing would be storing the swap partition (or file) for systems that already boot from the SAN. If the main server goes down, you are going to shut down the clients that booted from that server anyway, and you don't have to worry about preserving the data in the swap file, since it only works as an extension of the system's RAM. You could also enable auto-failover relatively easily, since the ramdisk is just a block device. Granted, you'd have to recreate the ramdisk every time you rebooted, but you could make that part of the startup procedure. Plus, in Linux, it's easy to set up multiple swap partitions!
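To make the "multiple swap partitions" point concrete, here is a minimal Python sketch that reads a swap table in the layout of /proc/swaps and orders the areas the way the kernel prefers them (highest priority first). The device names, sizes and priorities in the sample are made up for illustration; /dev/nbd0 stands in for a remote ramdisk exported as a network block device, which is an assumption and not the exact setup described above.

```python
# Hypothetical /proc/swaps contents: a local disk partition plus a remote
# ramdisk seen as a network block device. All values are illustrative.
SAMPLE_PROC_SWAPS = """\
Filename                                Type            Size    Used    Priority
/dev/sda2                               partition       2097148 0       -2
/dev/nbd0                               partition       4194300 0       10
"""


def parse_swaps(text):
    """Return swap areas as dicts, highest priority first."""
    areas = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or malformed rows
        name, kind, size_kib, used_kib, priority = fields
        areas.append({
            "name": name,
            "type": kind,
            "size_kib": int(size_kib),
            "used_kib": int(used_kib),
            "priority": int(priority),
        })
    # The kernel allocates swap pages from the highest-priority area first.
    return sorted(areas, key=lambda a: a["priority"], reverse=True)


areas = parse_swaps(SAMPLE_PROC_SWAPS)
for area in areas:
    print(area["name"], area["priority"])
```

With the remote ramdisk at a higher priority than the local partition, the disk only gets used once the faster area is full, which is roughly the arrangement you would want here.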

    Here's an interesting paper on using an RDMA-connected (infiniband) network block device to store the swap partition on a ramdisk on a remote system:

    http://www.cse.ohio-state.edu/~liang...-cluster05.pdf

    Apparently, their performance was so good that certain tasks they tested (like quicksort) took only 40% longer to execute than with local memory, while using a swap file on a disk takes 20 times longer. Here's another presentation they did:

    http://nowlab.cse.ohio-state.edu/pub...-cluster05.pdf

    I might try setting this up myself in debian (may possibly try to get this working with drbd failover, just to see if it works...). I'll let you guys at open-e know what it's like.

    Also, going through the TCP/IP software stack, as opposed to bypassing it with RDMA, means that you use about 3x the memory bandwidth (I think), which is fine at 100 MB/s, but you start running into problems at 1-2 GB/s.
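A quick back-of-the-envelope sketch of that memory-bandwidth point, in Python. The factor of 3 is just the rough guess from the post, and the factor of 1 for RDMA is an assumption (data lands in its final buffer without extra copies); neither is a measured value.

```python
def memory_traffic_mb_s(wire_rate_mb_s, passes_over_memory):
    """Memory bandwidth consumed to move data at a given wire rate."""
    return wire_rate_mb_s * passes_over_memory


TCP_PASSES = 3   # assumption: the "3x (I think)" estimate from the post
RDMA_PASSES = 1  # assumption: data crosses memory once with RDMA

for rate in (100, 1000, 2000):  # MB/s on the wire
    tcp = memory_traffic_mb_s(rate, TCP_PASSES)
    rdma = memory_traffic_mb_s(rate, RDMA_PASSES)
    print(f"{rate} MB/s on the wire: TCP ~{tcp} MB/s, RDMA ~{rdma} MB/s of memory traffic")
```

At 100 MB/s the extra copies cost a few hundred MB/s of memory traffic, which hardly matters. At 2 GB/s the same multiplier means about 6 GB/s, which is where it starts to hurt.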

  2. #2

    Lightbulb

    Have you guys heard of "Managed Flash Technology"? It's software (originally developed for Linux) that sits between a hardware flash device (or array of SSDs) and any filesystem. Basically, it turns random writes into sequential writes. It costs money, so Open-E would have to license it, but it could allow Open-E to compete with the big boys on random-write IOPS using just regular SSDs.

    Open-E DSS ssd-edition.

    http://managedflash.com/home/index.htm
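For anyone wondering how random writes can become sequential ones, here is a toy Python sketch of the general log-structured idea: every write is appended at the end of a log, and a table remembers where each logical block currently lives. This is not how Managed Flash Technology is actually implemented (I have no insight into that); the class and its names are hypothetical, and a real layer would also need garbage collection for the stale blocks.

```python
class LogStructuredRemapper:
    """Toy remapping layer: every write is appended to the end of a log."""

    def __init__(self):
        self.log = []      # physical blocks, in the order they were written
        self.mapping = {}  # logical block number -> index into self.log

    def write(self, logical_block, data):
        """Append data to the log and return the physical position used."""
        physical = len(self.log)
        self.log.append(data)
        self.mapping[logical_block] = physical  # older copy becomes stale
        return physical

    def read(self, logical_block):
        """Return the most recently written data for a logical block."""
        return self.log[self.mapping[logical_block]]


remapper = LogStructuredRemapper()
# Scattered logical addresses, including an overwrite of block 12.
writes = [(907, b"a"), (12, b"b"), (455, b"c"), (12, b"d")]
physical_positions = [remapper.write(lbn, data) for lbn, data in writes]
print(physical_positions)
print(remapper.read(12))
```

The logical addresses jump around, but the physical positions only ever increase, so the device sees one sequential stream of writes.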

  3. #3

    Default

    Thanks!! We are looking into some of these, but we found out there is a lot of development work involved, which is why you pay what NetApp charges. We are working on it, though, and then we will list it on our HCL. Give us some time and it will come!
    All the best,

    Todd Maxwell


    Follow the red "E"
    Facebook | Twitter | YouTube

  4. #4
    Join Date
    Aug 2008
    Posts
    236

    Default

    The managed flash software seems very interesting. I think that SSDs are the next evolution for storage. From a physics perspective, I'm not sure how much more can be done to improve mechanical disks. But right now, SSDs are hardly affordable, and because of the limited lifetime of the more affordable MLC variety, I have questions about reliability and MTBF.

    I think that RDMA would be awesome, but RDMA is going to be for IB and right now, there are some innovative things that can be done to improve iSCSI performance over native IP/Ethernet.

    http://www.ele.uri.edu/Research/hpcl/STICS/iCache.pdf

    If you have access to this white paper, it describes a different caching strategy that improves iSCSI performance by roughly 58 to 70 percent. I'd like to see some innovation along these lines...

  5. #5

    Default

    Thanks for this info - I passed it on to our development and management team as we are looking for ideas.
    All the best,

    Todd Maxwell



  6. #6
    Join Date
    Jan 2008
    Posts
    86

    Default

    G'day,
    Thought I would "bump" this topic and bring up tiered storage for SSD/SAS/SATA again (see my previous post).
    I agree with enealDC that SSDs are the next evolution. The performance we are seeing with databases on SLC in RAID 1 is outstanding. But let's face it, it will be a long time before 1.5 TB SATA drives are displaced by SSDs.
    But the amount of data that benefits from really high IOPS or transfer rates is actually quite small (in my experience). Most of the data on our clients' networks is images and email (PSTs).

    So the ability to seamlessly shift frequently accessed data to faster storage and older/archived data to cheaper storage would be fantastic.
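As a rough illustration of what such a tiering policy could look like, here is a small Python sketch that puts the most frequently accessed items on the fast tier and everything else on the cheap tier. The file names, the access counts and the "ssd"/"sata" labels are invented for the example, and a real implementation would work on blocks or extents and take recency into account as well.

```python
from collections import Counter


def assign_tiers(access_log, fast_capacity):
    """Place the most frequently accessed items on the fast tier."""
    counts = Counter(access_log)
    ranked = [item for item, _ in counts.most_common()]
    hot = set(ranked[:fast_capacity])  # only this many items fit on SSD
    return {item: ("ssd" if item in hot else "sata") for item in ranked}


# Hypothetical access log: one busy database file, rarely touched mail and images.
access_log = ["db.mdf"] * 50 + ["mail.pst"] * 3 + ["photo.jpg"]
tiers = assign_tiers(access_log, fast_capacity=1)
print(tiers)
```

Even with room for only one item on the fast tier, the database ends up on SSD while the bulk data stays on cheap SATA.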

    Cheers Ben

  7. #7
    Join Date
    Aug 2008
    Posts
    236

    Default

    Hey Beng. You are spot on! Understanding the IO requirements is a critical component of virtualization and storage management. I think not too many folks really spend time trying to understand the best blend of the various types of storage when approaching these two topics. With the economy the way it is, there is less money to spend (unless you are a bank with bailout funds). Knowing where to put the high-IO but high-cost storage, and exactly how much of it to deploy, is very important. Being able to leverage the different tiers and qualities of storage is going to be critical at this juncture while we wait for the paradigm to shift and costs to drop. And you are right, 1.5 TB SATA drives are not going to be displaced anytime soon. But there is a danger with drive sizes getting so large. We have to constantly remind ourselves (and our customers) that delivering IO is the name of the game when it comes to SATA volumes and that larger is not necessarily *better*.
