I just wanted to add my voice to all who are asking for RDMA with Infiniband support.
It would be greatly appreciated if updates on the ETA for this could be posted as they become available.
Generally, is there a roadmap or planned-features list available?
We are working on this for our partners, but we might not publish it for competitive reasons, as even some of our competitors don't do this. We have been thinking of providing this information, but we need more time to consider the implications of this type of announcement.
I have been thinking about this (and experimenting on Debian Linux and our little network), and the perfect use for this sort of thing would be storing the swap partition (or file) for systems that already boot from the SAN. I mean, if the main server goes down, you're going to shut down the clients that booted from that server anyway, and you don't have to worry about keeping the data in the swap file, since it's just working as an extension of the system's RAM. Also, you could relatively easily enable auto-failover for this situation, since the ramdisk is just a block device. Granted, you'd have to recreate the ramdisk every time you rebooted, but you could just make that part of the startup procedure. Plus, in Linux, it's easy to set up multiple swap partitions!
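To make the "multiple swap partitions" point concrete, here is a minimal sketch that builds /etc/fstab swap entries with the standard pri= option, so the kernel uses the higher-priority device first. The device names (/dev/nbd0 for the remote ramdisk, /dev/sda2 for a local fallback) and the priority values are assumptions for illustration only, not something tested on this setup.

```python
# Sketch: generate /etc/fstab swap entries with explicit priorities.
# Linux uses the swap area with the highest pri= value first.
# Device names and priorities below are hypothetical examples.

def swap_fstab_line(device: str, priority: int) -> str:
    """Return one fstab line for a swap device with the given priority."""
    if not 0 <= priority <= 32767:
        raise ValueError("swap priority must be between 0 and 32767")
    return f"{device} none swap sw,pri={priority} 0 0"

# Remote ramdisk first (fast), local disk partition as the fallback.
entries = [("/dev/nbd0", 100), ("/dev/sda2", 10)]
lines = [swap_fstab_line(device, priority) for device, priority in entries]

for line in lines:
    print(line)
```

With entries like these, the local partition would only be used once the higher-priority device is full or unavailable at swapon time.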
Here's an interesting paper on using an RDMA-connected (infiniband) network block device to store the swap partition on a ramdisk on a remote system:
Apparently, their performance was so good that certain tasks they tested (like quicksort) took only about 40% longer to execute than with local memory, whereas swapping to a file on disk took 20 times longer. Here's another presentation they did:
I might try setting this up myself in Debian (and may try to get it working with DRBD failover, just to see if it works...). I'll let you guys at Open-E know what it's like.
Also, having to go through the TCP/IP software stack rather than bypassing it (i.e. RDMA) means that you use about 3x the memory bandwidth (I think), which is fine at 100 MB/s, but you start running into problems at 1-2 GB/s.
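A quick back-of-the-envelope version of that estimate, as a sketch: it assumes each byte on the wire crosses the memory bus 3 times on the TCP/IP path (the rough guess above, not a measured figure) and once with RDMA. The function name and the sample rates are made up for illustration.

```python
# Sketch: memory-bus traffic implied by a given wire throughput.
# The factor of 3 for TCP/IP is the rough guess from the post, not a measurement.

def memory_traffic_mb_s(wire_mb_s: float, bus_crossings: int) -> float:
    """Memory bandwidth consumed if every byte crosses the bus bus_crossings times."""
    return wire_mb_s * bus_crossings

TCP_CROSSINGS = 3   # assumed: NIC buffer, kernel copy, user copy
RDMA_CROSSINGS = 1  # assumed: data placed directly in the target buffer

for wire_rate in (100, 1000, 2000):  # MB/s
    tcp = memory_traffic_mb_s(wire_rate, TCP_CROSSINGS)
    rdma = memory_traffic_mb_s(wire_rate, RDMA_CROSSINGS)
    print(f"{wire_rate} MB/s on the wire: TCP/IP {tcp:.0f} MB/s, RDMA {rdma:.0f} MB/s of memory traffic")
```

Under those assumptions, 100 MB/s costs about 300 MB/s of memory bandwidth, while 2 GB/s costs about 6 GB/s, which is where the copies start to hurt.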
Have you guys heard of "Managed Flash Technology"? It's software (originally developed for Linux) that sits between a hardware flash device (or array of SSDs) and any filesystem. Basically, it turns random writes into sequential writes. It costs money, so Open-E would have to license it, but it could allow Open-E to compete with the big boys when it comes to random-write IOPS using just regular SSDs.
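For anyone wondering how random writes can become sequential ones, here is a toy sketch of the general log-structured idea: every write is appended to the end of a log, and a map remembers where each logical block currently lives. This is only an illustration of the concept; it is not how Managed Flash Technology is actually implemented, the class and method names are invented, and it leaves out garbage collection of stale blocks.

```python
# Toy log-structured remapping layer (illustrative only, no garbage collection).
# Writes to arbitrary logical block addresses land at sequential log positions.

class LogStructuredStore:
    def __init__(self):
        self.log = []        # append-only "physical" storage
        self.block_map = {}  # logical block address -> position in the log

    def write(self, lba: int, data: bytes) -> int:
        """Append data to the log and point the logical block at it."""
        physical = len(self.log)
        self.log.append(data)
        self.block_map[lba] = physical  # any older copy becomes stale
        return physical

    def read(self, lba: int) -> bytes:
        """Return the most recently written data for a logical block."""
        return self.log[self.block_map[lba]]

store = LogStructuredStore()
writes = [(907, b"a"), (12, b"b"), (455, b"c"), (12, b"d")]
positions = [store.write(lba, data) for lba, data in writes]

print(positions)       # physical positions grow sequentially
print(store.read(12))  # latest data written to logical block 12
```

The logical addresses jump around, but the physical positions only ever increase, which is the access pattern flash handles best. A real product also has to reclaim the stale copies, which is where most of the hard work lies.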
Thanks!! We are looking into some of these, but we found that there is a lot of development work involved, which is why you pay NetApp prices. We are working on it, though. Then we will list it on our HCL, but give us some time. It will come!
The managed flash software seems very interesting. I think that SSDs are the next evolution for storage. From a physics perspective, I'm not sure how much more can be done to improve mechanical disks. But right now, SSDs are hardly affordable, and because of the limited lifetime of the more affordable MLC variety, I have questions about reliability and MTBF.
I think that RDMA would be awesome, but RDMA is going to be for IB and right now, there are some innovative things that can be done to improve iSCSI performance over native IP/Ethernet.
If you have access to this white paper, it talks about a different caching strategy to improve performance of iSCSI by 58 to 70ish percent. I'd like to see some innovation along these lines...