I have the same controller, an ICP5165BR (15753).
Since it is in my secondary node, I cannot tell whether the same issue occurs, but during the last failover (a quick update of the primary node) I had no issue with it, even after 2 hard disk failures in the RAID 10.
But my overall experience with Adaptec controllers is rather bad: random kernel panics with an older controller, crappy management software, and so on.
Ouch!
Your controller rebuild settings were probably set to High.
With RAID5/6, it is ESSENTIAL that you run scrub checks routinely. We run them twice a month.
We use Areca controllers.
@enealDC
There is an automated check once a month.
What do you mean by rebuild settings set to High?
There is no setting on the Adaptec that allows me to change something like this.
We changed the controller on both DSS machines.
For compatibility with the existing RAID data, we now use another Adaptec (51645) on the first DSS.
On the second we now use an Areca.
But after 3 weeks we still have no idea why this happened.
No answer from Open-E and no answer from Adaptec.
I got an answer from Open-E.
They say it is driver related. The driver for the ICP RAID controller in the current DSS6 release is buggy and old. The driver will be changed in the next DSS6 release.
So be careful with RAID 6 rebuilds and this controller!
1. Create a RAID 6 array.
2. Put some data on it.
3. Then run verify_fix.
Repeat this until RaidEvtA.log says that different sectors were found.
Then the filesystem is damaged and some data is gone, independently of the filesystem (xfs or ext3/ext4).
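For anyone who wants to check whether file contents really change across a verify_fix run, here is a minimal sketch of a checksum comparison. It is not what we ran, just one way to do it: build a manifest of SHA-256 digests before the run, build another one afterwards, and compare them. The function names (`hash_file`, `build_manifest`, `compare_manifests`) are made up for this example, and the whole thing assumes the test data is not modified by anything else in between.

```python
import hashlib
import os


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = hash_file(full)
    return manifest


def compare_manifests(before, after):
    """Return (changed, missing) lists of relative paths, both sorted."""
    changed = sorted(p for p in before if p in after and before[p] != after[p])
    missing = sorted(p for p in before if p not in after)
    return changed, missing
```

Call `build_manifest` on the mount point before step 3, keep the result, and compare it with a second manifest built afterwards. It is probably wise to remount the filesystem (or reboot) before the second pass so the reads come from the disks rather than from the page cache.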
We have had this every couple of weeks with the ICP5165BR.
There was no indication in the logs, but it was strange that verify_fix found 100,000s of different sectors while all disks were reporting fine, i.e. not a single media error etc.
According to our vendor, this controller has an error in the parity algorithm. There is a non-public newer firmware at Adaptec (they are aware of this problem; the vendor got this info from Adaptec), but this new firmware would not be compatible with our mainboard.
Thus, they sent us a new controller, an Adaptec 51645.
BTW, this cannot be a problem in the OS because the verify_fix runs entirely on the Controller OS (IMHO).
I was wondering why more people don't have this problem. We are running Debian Lenny though, because we wanted console access to the NAS.
And that kernel was a 2.6.26; now I am using 2.6.32 from lenny-backports and the problem still exists.