-
Hey jiassic and chkohlruss,
the errors that you two posted are different.
jiassic's may be drive related;
he is showing these:
2009/08/31 00:03:01|[<ffffffff8069c592>] ? __down_read+0x12/0xa0
2009/08/31 00:03:01|[<ffffffff804256c8>] ? __down_write_trylock+0x48/0x60
2009/08/31 00:03:01|[<ffffffff804256c8>] ? __down_write_trylock+0x48/0x60
2009/08/31 00:03:01|[<ffffffffa001d663>] ? di_read_unlock+0x73/0x130 [aufs]
2009/08/31 00:03:01|[<ffffffffa001c497>] ? h_d_revalidate+0x4c7/0x6b0 [aufs]
2009/08/31 00:03:01|[<ffffffff8069c592>] ? __down_read+0x12/0xa0
2009/08/31 00:03:01|[<ffffffff804256c8>] ? __down_write_trylock+0x48/0x60
2009/08/31 00:03:01|[<ffffffff804256c8>] ? __down_write_trylock+0x48/0x60
2009/08/31 00:03:01|[<ffffffff80425701>] ? __up_read+0x21/0xb0
2009/08/31 00:03:01|[<ffffffffa001d663>] ? di_read_unlock+0x73/0x130 [aufs]
while chkohlruss,
you have different errors:
2009/09/08 06:19:54 kernel:[<ffffffff80237390>] ? put_files_struct+0x70/0xc0
2009/09/08 06:19:54 kernel:[<ffffffff80237a9b>] ? do_exit+0x17b/0x8b0
2009/09/08 06:19:54 kernel:[<ffffffff80238244>] ? do_group_exit+0x34/0xa0
2009/09/08 06:19:54 kernel:[<ffffffff80228282>] ? ia32_sysret+0x0/0xa
The best thing is to send the entire logs to support to be sure.
Did you guys hear back from them?
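For anyone who wants to pick the traces above apart before sending them in, here is a rough sketch in Python. It assumes the usual kernel call-trace layout of [<address>] symbol+0xoffset/0xsize [module], where a "?" in front of the symbol means the kernel found that address on the stack but could not confirm it is part of the real call chain. The function name parse_trace_line is made up for this example.

```python
import re

# Matches one call-trace entry such as:
#   [<ffffffffa001d663>] ? di_read_unlock+0x73/0x130 [aufs]
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"      # address found on the stack
    r"(?P<unsure>\?\s+)?"                # '?' marks an entry that could not be verified
    r"(?P<symbol>[\w.]+)"                # function name
    r"\+0x(?P<offset>[0-9a-f]+)"         # offset into the function
    r"/0x(?P<size>[0-9a-f]+)"            # total size of the function
    r"(?:\s+\[(?P<module>[\w-]+)\])?"    # loadable module, absent for built-in code
)

def parse_trace_line(line):
    """Return the fields of one trace line as a dict, or None if it is not a trace line."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    return {
        "address": int(m.group("addr"), 16),
        "reliable": m.group("unsure") is None,
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module") or "kernel",
    }
```

Run over the lines above, this shows jiassic's trace touching the aufs module while chkohlruss's entries are all in the core kernel. That fits the two problems being different, though it does not prove a cause.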
-
Support sent me a patch file, which I believe is an update to SCST (version 1.0.1.1). I have not applied the patch yet because it requires shutting down all of my VMs to do the upgrade. That's a pretty hefty maintenance window to try to schedule.
-
Symm, thanks for the info. How do you find out what these error messages refer to?
I looked a little deeper and found that our RAID controller (3ware 9690SA) does a scheduled verify every night at 12:00 AM, and the event log in the RAID controller shows that the job kicks off at 12:03 - the same time as the error message timestamp above. It does this every night. Is it possible that the increased drive usage as it verifies the RAID5 parity information would cause the DSS error?
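One way to test that correlation over more than one night is sketched below in Python. It assumes every log line starts with a YYYY/MM/DD HH:MM:SS timestamp like the excerpts posted here. The function names and the 5-minute window are choices made up for this example, not anything from DSS or 3ware.

```python
from collections import Counter
from datetime import datetime

def error_minutes(log_lines):
    """Count log lines per (hour, minute) of day, ignoring the date."""
    counts = Counter()
    for line in log_lines:
        try:
            # The timestamp is the first 19 characters: YYYY/MM/DD HH:MM:SS
            ts = datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading timestamp
        counts[(ts.hour, ts.minute)] += 1
    return counts

def near_schedule(counts, hour, minute, window=5):
    """Fraction of timestamped lines logged within `window` minutes after hour:minute."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    start = hour * 60 + minute
    hits = sum(
        n for (h, m), n in counts.items()
        if (h * 60 + m - start) % 1440 <= window  # minutes past the scheduled time
    )
    return hits / total
```

If near_schedule(error_minutes(lines), 0, 0) stays close to 1.0 across several days of logs, the errors really do track the nightly verify. If they are spread through the day, the verify job is probably a coincidence.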
-
Support says it is the RAM.
We've run several memtests and found no errors.
We've swapped the RAM for new modules and still got the error messages.
-
jisaac
errors that I have seen in a past life (telecom)
we used Linux and RAID for storage
-
What build of DSS 6 are you running? We were running build 3535 and seeing similar issues where our SAN would act like it had just frozen up. Support had us upgrade to build 3537 and that solved the problem. The issue was with the system cache. A reboot would fix it for a while and then it would freak out again. High data transfer aggravates this issue.
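If you want to check a box against that build quickly, here is a small Python sketch. It assumes the build number is the last dotted field of the version string (for example 3535 in "6.0up06.8102.3535 64bit"). The helper name build_number is made up for this example.

```python
FIXED_BUILD = 3537  # the build that resolved the freeze for us

def build_number(version):
    """Return the trailing build number from a string like '6.0up06.8102.3535 64bit'."""
    dotted = version.split()[0]           # drop the '64bit' suffix
    return int(dotted.rsplit(".", 1)[1])  # last dotted field, assumed to be the build

def needs_update(version):
    """True if the reported build is older than FIXED_BUILD."""
    return build_number(version) < FIXED_BUILD
```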
-
6.0up06.8102.3535 64bit
Like I said earlier, support sent me a patch file which I haven't had the downtime to implement yet. I'll try to get that done this weekend.
-
We are running 6.0up04.8101.3530 64bit
Are there any updates I can download and install?
-
We have reinstalled and reconfigured the primary server and it seems that the errors are gone now.