The engineers will add the capability to replicate beyond 4TB. There is no set time frame or target release yet, but they are working on it. They will need additional time, and we do recognize the need for this.
Wrong. DRBD needs 32MB of vmalloc per 1TB of storage. The formula is storage_in_GB/32, not storage_in_GB/32768, so 4096GB/32 = 128MB. I don't know whether some people have a rock-solid replication volume over 1TB, but going by the maker of DRBD, 10 to 20MB is clearly not correct.
Read this post about memory allocation and the 32MB per 1TB figure. The writer is the PROGRAMMER of DRBD, Dipl.-Ing. Philipp Reisner. I don't know if your engineers are reading this forum, or even the DRBD docs and mailing list, but I think they should. It is 32KB per 1GB of storage, so 1TB = 1000GB x 32KB = 32000KB, which is about 32MB, and x4 (for 4096GB or 4TB) = 128MB. Now I know why people are complaining about replication that stops working.
Just for information: Protocol C is probably the slowest one but the most secure for your data, so this is good. Control over the protocol could be an interesting option, like protocol A for an asynchronious remote site.
Here let me help Keven out as he wanted to place his trophy on our response (yes should have been better but without the blood).
Here is a better description, and using Protocol C is better in the end. If you don’t like it, build it yourself.
For each device, drbd will (try to) allocate X MB of bitmap, plus some constant amount (<1MB). X = storage_size_in_GB/32, so 1 TB storage -> 32 MB bitmap.
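As a quick check of the formula above, here is a minimal Python sketch that applies X = storage_size_in_GB/32 to a few volume sizes. The function name `bitmap_mb` is made up for illustration, 1 TB is taken as 1024 GB, and the small constant overhead (<1MB) is left out.

```python
def bitmap_mb(storage_gb: float) -> float:
    """Approximate DRBD bitmap size in MB: X = storage_size_in_GB / 32.

    The constant per-device overhead (<1 MB) is not included.
    """
    return storage_gb / 32


if __name__ == "__main__":
    for size_tb in (1, 2, 4, 16):
        storage_gb = size_tb * 1024  # assumption: 1 TB = 1024 GB
        print(f"{size_tb} TB -> {bitmap_mb(storage_gb):.0f} MB bitmap")
```

By this arithmetic a single 4 TB device already needs a 128 MB bitmap, and a 16 TB device would need 512 MB.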
By default Linux allocates 128MB to vmalloc. For systems using more than 4TB this may cause an issue, but give us time while we resolve this! It will happen, but we are not going to just push out an update without hard testing with our main customers, whatever those who want it right now would prefer.
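To see how the bitmap requirement compares with the vmalloc area, here is a rough Python sketch. It assumes the bitmap formula storage_size_in_GB/32 per device, uses 1 MB as an upper bound for the constant overhead, and reads the `VmallocTotal` line in the format found in `/proc/meminfo`. The function names are hypothetical, and the result is only a lower bound because other kernel users share the vmalloc area.

```python
def vmalloc_total_mb(meminfo_text: str) -> float:
    """Return VmallocTotal in MB from text in /proc/meminfo format (values in kB)."""
    for line in meminfo_text.splitlines():
        if line.startswith("VmallocTotal:"):
            return int(line.split()[1]) / 1024
    raise ValueError("VmallocTotal not found")


def fits_in_vmalloc(volume_sizes_gb, vmalloc_mb, overhead_mb=1.0):
    """Sum the bitmap need of all devices and compare it with the vmalloc area.

    overhead_mb is an assumed upper bound for the per-device constant (<1 MB).
    Returns (needed_mb, fits).
    """
    needed = sum(gb / 32 + overhead_mb for gb in volume_sizes_gb)
    return needed, needed <= vmalloc_mb


if __name__ == "__main__":
    # Sample line for a system with the 128 MB default mentioned above.
    sample = "MemTotal:  2059624 kB\nVmallocTotal:   131072 kB\n"
    total = vmalloc_total_mb(sample)
    for volumes in ([2048], [4096], [2048, 2048]):
        needed, ok = fits_in_vmalloc(volumes, total)
        print(volumes, f"need {needed:.0f} MB of {total:.0f} MB:", "ok" if ok else "too big")
```

On a live Linux system the input text could come from `open("/proc/meminfo").read()`. The vmalloc area can be enlarged with the `vmalloc=` kernel boot parameter; whether DSS exposes that setting is not something this sketch assumes.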
We are replicating with 2 x 4TB volumes and haven’t had any issues for over 68 days, though we are using DSS ver. 5.0_42 in block io mode.
Not sure if this helps others, but on a side note we would have more interest in what Heimic and SeanLeyne stated than in the 16TB replication option (DRBD+). Vkeven must be needing something other than what DSS currently provides, or wanting blood like Todd said.
Todd - any chance that we will see asynchronous soon (not asynchronious )?
We want to say the middle of Q2, though if there are issues or pending projects (especially from you guys with those huge orders), all of this gets delayed.
Now I did some research and found that protocol A is asynchronous: a write request is considered complete as soon as it has been completed locally and handed over to the local network stack. In case of a Primary node crash and fail-over to the other node, all WRITEs that have not yet reached the other node are lost.
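The difference can be shown with a toy Python model. This is not DRBD code, just a sketch under simple assumptions: the class `Primary` and its methods are made up, protocol A acknowledges a write once it is queued for the network, and protocol C acknowledges only after the peer has it.

```python
from collections import deque


class Primary:
    """Toy model of a primary node replicating writes to one peer."""

    def __init__(self, protocol: str):
        self.protocol = protocol   # "A" (asynchronous) or "C" (synchronous)
        self.in_flight = deque()   # handed to the network, not yet on the peer
        self.peer_disk = []        # writes that have reached the peer
        self.acked = []            # writes reported complete to the application

    def write(self, block: str) -> None:
        # The local write is assumed done; queue the block for the peer.
        self.in_flight.append(block)
        if self.protocol == "C":
            self.deliver_all()     # wait for the peer before acknowledging
        self.acked.append(block)

    def deliver_all(self) -> None:
        """Simulate the network delivering every queued block to the peer."""
        while self.in_flight:
            self.peer_disk.append(self.in_flight.popleft())

    def crash(self) -> list:
        """Return acknowledged writes the peer never received (lost on fail-over)."""
        return [b for b in self.acked if b not in self.peer_disk]


if __name__ == "__main__":
    for protocol in ("A", "C"):
        node = Primary(protocol)
        node.write("w1")
        node.write("w2")
        node.deliver_all()         # the network catches up
        node.write("w3")           # crash happens right after this write
        print(protocol, "lost on fail-over:", node.crash())
```

In this model the protocol A node loses the last write because it was acknowledged while still in flight, while the protocol C node loses nothing, at the cost of waiting for the peer on every write.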