Hello,
Two weeks ago we had a totally corrupted DSS6 failover system.
Our Hardware:
2 identical storage servers with Open-E DSS6
1x Intel X5550
24 GB RAM
2 onboard 1 Gbit Intel NICs
1 dual-port 10GbE CX4 Intel NIC
16x 1 TB Western Digital RE3 HDDs
1 ICP5165BR RAID controller (Adaptec), latest BIOS build!
RAID config: RAID 6
We use 2 Citrix XenServer hosts to access the DSS6 over 10GbE CX4.
15 Windows Server 2003 VMs run on the XenServer hosts.
The DSS6 storage servers run in failover mode with multiple targets.
So far, so good...
One day an HDD failed in our primary DSS6.
So we replaced the HDD with a new one of the same type.
The swap was done by hot-plugging in the running system.
Everything was fine. The RAID controller started the RAID 6 rebuild.
Up to this point our data was NOT corrupted.
Suddenly we noticed that some VMs on the XenServer hosts ran into bluescreens.
At that moment the rebuild status showed approx. 30% and the Adaptec controller showed no error!
DSS showed no error either! Everything looked OK...
From then on, with every additional 1% of the rebuild, more VMs ran into serious problems.
At that point we realized that the rebuild was destroying our data.
We ran a manual failover and started restoring the backups to the DSS slave.
After restoring the backups we did not fail back, so that we could investigate the "error" on the primary system.
3 days later an HDD failed in the second DSS6. The Adaptec controller started an automatic rebuild (during the night, of course...). By the time we noticed, the rebuild status was at approx. 30%.
And again many VMs crashed and our data was corrupted again.
So we called Adaptec and sent them the log files.
We called Open-E and sent them the log files, too.
There was no error in the log files!
Adaptec has no idea, and I think they won't find anything...
My opinion is that there is a bug in the latest firmware build for the ICP5165BR.
The data only became corrupted during a rebuild, and the rebuild is done by the RAID controller alone...
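If someone wants to check this on their own system without relying on the controller or DSS logs: checksum a set of static test files on the volume before a rebuild and compare them again while the rebuild runs. A rough Python sketch, the function names and the mount path in the usage line are only my examples, nothing from Open-E or Adaptec:

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map every file below root (relative path) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(baseline, current):
    """Return files whose checksum differs from the baseline or that are gone."""
    return sorted(
        name for name, checksum in baseline.items()
        if current.get(name) != checksum
    )
```

Usage would be something like base = snapshot("/mnt/testvol") before swapping the disk, then changed_files(base, snapshot("/mnt/testvol")) every few percent of the rebuild. The test files must not be written to in between, and reads may be answered from cache, so the test set should probably be bigger than the RAM of the machine reading it.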
Has anyone else had a problem like ours?
And to those who have the same RAID controller: be aware!
Greetings,
rogerk