We have many customers who run LSI with CacheCade, and everything is working normally. It could be the firmware build on your current controller causing this issue, or there may be an issue with one or more of the disks.
If the data gets corrupted on the primary box and the replication service keeps running, the mirror will get corrupted too. Having snapshots and backups can help in some of these cases.
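To make the replication point concrete, here is a toy Python sketch. It is only an illustration under stated assumptions: plain dicts stand in for volumes, and `replicate_write` is a made-up helper, not how DSS volume replication is actually implemented.

```python
import copy

# Toy model: dicts stand in for volumes. All names here are hypothetical.
primary = {"invoice.db": "good", "mail.db": "good"}
mirror = copy.deepcopy(primary)      # replication target starts in sync
snapshot = copy.deepcopy(primary)    # point-in-time copy taken before the fault

def replicate_write(name, data):
    """Apply a write on the primary and immediately forward it to the mirror."""
    primary[name] = data
    mirror[name] = data

# A corrupted block is written on the primary and replicated like any other write.
replicate_write("invoice.db", "CORRUPT")

print("mirror:  ", mirror["invoice.db"])    # the mirror copied the bad data
print("snapshot:", snapshot["invoice.db"])  # the snapshot still holds the earlier state
```

The mirror cannot tell a corrupted write from a valid one, so it copies both. The snapshot only helps for data that has not legitimately changed since it was taken.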
You may also find the following forum link and KB article useful:
In the meantime I found 3 more corrupted filesystems (1x HFS+, 2x SLES 11 SP2). Unfortunately, I only find this corruption after rebooting the VMs.
We have our production data on this box, so taking a snapshot and reverting to it for invoicing data, mail server data, etc. is probably not the best idea.
I added 8 SSD drives to the box and moved the performance-critical VMDKs to this SSD datastore; the performance is now okay.
Currently I see no way to perform more alpha tests with LSI CacheCade: it took too much time and effort to repair the data, and the risk of losing data is too high.
Reverting to a full backup after 3 weeks or more is usually not an option; restoring the data file by file and verifying what happened in between may be possible, but takes a lot of time.
Maybe somebody else can work with LSI to bring the software to a beta stage.
I just got the message that the issue was caused by an outdated MegaRAID driver, revision 6.12, in the latest version of OPEN-E V6, which does not support CacheCade.
I'm sure that if there is a new driver we can provide it in a small update (for Premium Support only), but are there release notes from LSI stating that this can happen if you don't have the latest driver?
My intention was only to benefit from the performance improvement of CacheCade, "just to use" it.
The hardware vendor wrote to me that the data corruption was caused by an outdated driver in OPEN-E (confirmed by OPEN-E). Driver 6.12 was released by LSI in mid-2011, LSI CacheCade in March 2012. Therefore CacheCade is not supported by that driver version. LSI told him that if disabling CacheCade stopped the data corruption, this is an indication of this kind of problem; both OPEN-E and LSI have confirmed the problem.
I just want to know whether you can confirm this too (from the OPEN-E point of view), and when an LSI CacheCade-compatible version will be released by OPEN-E.
Currently I am stuck on V6 because I have enabled the NFS HA feature. Is this available in V7 in the meantime (so that I can upgrade)?
Sorry, had more than enough trouble with this stuff.
We have thousands of V6 and V7 systems in production, and we would have updated the LSI MegaRAID SAS drivers based on the LSI release notes if this were true.
I have checked our support DB and have not found any record of this issue in any support ticket. In which ticket did Open-E state the same as LSI?
We don't make the driver, and CacheCade is an add-on feature for certain LSI controllers. Normally the firmware controls CacheCade, along with some parts of the driver, to run the algorithms it uses for hot data.
I would also like to know the LSI case in which they stated this; you can submit a support ticket to Open-E with the contact information.
We have several webinars about LSI CacheCade, and I did one with the LSI engineer on March 28, 2012; nothing was mentioned about an outdated driver that can cause corruption.
Again, we don't make the drivers; the vendors do, so DSS would not be involved here. If there was corruption, it could be that the SSD cache was a RAID 0 and not a RAID 1: if you lose an SSD in a RAID 0, then yes, you can have corruption for the writes in the cache at the time the SSD went bad, but not for the reads.
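The RAID 0 versus RAID 1 point can be sketched with a toy Python model. This is an illustration only, not how LSI CacheCade is actually implemented: the function name, the two-SSD layout, and the round-robin striping are all assumptions made for the example.

```python
# Toy model only: names and behaviour are illustrative assumptions,
# not how LSI CacheCade is actually implemented.

def surviving_dirty_blocks(dirty_blocks, raid_level, failed_ssd, num_ssds=2):
    """Return the dirty (not yet flushed) blocks still readable after one SSD fails."""
    if raid_level == 1:
        # RAID 1: every block is mirrored on all SSDs, so one failure loses nothing.
        return sorted(dirty_blocks)
    if raid_level == 0:
        # RAID 0: each block lives on exactly one SSD (simple round-robin striping).
        return sorted(b for b in dirty_blocks if b % num_ssds != failed_ssd)
    raise ValueError("only RAID 0 and RAID 1 are modelled")

# Block numbers of cached writes that have not reached the HDD array yet.
dirty = [10, 11, 12, 13]

raid0_left = surviving_dirty_blocks(dirty, raid_level=0, failed_ssd=0)
raid1_left = surviving_dirty_blocks(dirty, raid_level=1, failed_ssd=0)
lost_raid0 = sorted(set(dirty) - set(raid0_left))
lost_raid1 = sorted(set(dirty) - set(raid1_left))

print("RAID 0 lost writes:", lost_raid0)
print("RAID 1 lost writes:", lost_raid1)
```

In this model, the writes that only existed on the failed RAID 0 member are gone, while the mirrored cache loses nothing. Cached reads are copies of data that is already on the disk array, so losing them costs performance but not data.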
I moved the most important VMs to an additional SSD RAID 5 array and updated DSS V6 to "6.0up99.8101.8328 64bit"; even with the small update I ended up with an old LSI driver version in V6 (2011 or so).
Is it now safe to re-enable CacheCade after updating the LSI firmware to the most current level?
By the way, if you confirm that the S200 box from xtivate.de is not certified, I will ask the guy why he sold me an S200 and 200.
We have hundreds of different builds in testing, so I am not sure about the build you have ("8328"), as we try to stay with the official releases; currently build "7337" is still the latest build for V6. The V7 release that came out has the latest drivers; a small update for V6 would have to be created. So far I have not seen issues with the latest firmware from LSI on either V6 or V7, so I would assume they have resolved it. I did not see xtivate.de on our list of manufacturers that have done their certification with us: http://www.open-e.com/partners/certified-systems/