Hi everyone,
I currently have two DSS V6 boxes with different hardware, 4 TB each, in a volume replication setup where one of them is active and the other passive; both of them do close to 200 MB/s on the storage subsystem as measured with hdparm. Each has 4x 2Gb FC connections to two Brocade switches (two connections to each), and then 3x ESXi 4.1 hosts, each with one 2Gb FC link to each switch, using round-robin multipathing...
I have around 16 VMs for different applications and this seems to be working well, but writes seem maxed out at around 100 MB/s. Reading through this forum, I gather the bottleneck is probably the replication over the gigabit network, even though each storage server has a bond of two internal gigabit ports. I also learned here that the bond is only good for redundancy (replication will not gain any bandwidth from it), and that Open-E is not able to replicate over FC.
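For what it's worth, the arithmetic behind that gigabit ceiling seems to line up with what I'm seeing (the overhead percentage below is just an assumption, not a measurement):

```shell
# Raw ceiling of a single gigabit link, in decimal MB/s:
# 1 Gbit/s = 1,000,000,000 bits/s; divide by 8 to get bytes.
raw_MBps=$(( 1000000000 / 8 / 1000000 ))
echo "raw GbE ceiling: ${raw_MBps} MB/s"        # 125 MB/s before any overhead

# Assume roughly 15% lost to TCP/IP and replication protocol overhead
# (a guess), and you land right around the ~100 MB/s write ceiling observed.
usable_MBps=$(( raw_MBps * 85 / 100 ))
echo "usable estimate: ~${usable_MBps} MB/s"    # ~106 MB/s
```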
So, with a very limited budget in mind (I already spent more than what I had available on all the licensing and hardware), I was trying to figure out how to improve performance. One of the storage servers is a little old, so the only PCI slots available are 32-bit 33 MHz, which rules out 10GbE or InfiniBand cards for replication. I came up with this idea and wanted to run it by you guys to get some input:
Add a gigabit network card to each server to be used for management only (PCI bandwidth is not a problem there), then use crossover cables between the internal gigabit cards of the first storage server and the second one. Half of the storage space will be active on server 1, replicated over one of the gigabit links to server 2, and the other half configured the other way around, replicating over the second gigabit link. Then 8 VMs run off the active storage on server 1, and the other 8 run off the active storage on server 2... Benefits I see from this:
- Theoretically both volumes will be able to write at 100 MB/s each; replication should not hurt much, since the storage subsystem should be able to handle the combined 200 MB/s...
- If one of the servers goes down, only half of the VMs go down until manually failed over (automatic failover is not available for FC).
- Maximizing the utilization of the hardware resources (memory, CPU, FC bandwidth, etc.), since the load will be evenly distributed between the two servers.
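To keep the two replication streams from ever sharing a cable, each crossover link would need its own point-to-point subnet. A sketch of what I have in mind (the interface names and subnets here are made up, and DSS V6 would set this through its own console rather than a shell):

```shell
# Hypothetical addressing for the two crossover replication links:
#
#   server 1                        server 2
#   eth2  10.0.1.1/30  <-------->   eth2  10.0.1.2/30   (volumes active on server 1)
#   eth3  10.0.2.1/30  <-------->   eth3  10.0.2.2/30   (volumes active on server 2)
#
# On a plain Linux box the equivalent would be (on server 1):
#   ip addr add 10.0.1.1/30 dev eth2
#   ip addr add 10.0.2.1/30 dev eth3
```

With each link on its own /30, each replication task is pinned to its own cable, so the two directions get a full gigabit each instead of contending for one link.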
Cons:
- More complicated administration?
Any thoughts and ideas will be appreciated!
Thanks,
G