-
Adaptec 6805
Hi,
Has anybody tried DSS with an Adaptec 6805 controller?
We have tried Open-E DSS V6 ver. 6.00 up75 build 5377, which should support the new controller.
We can see the RAID volume, but after creating a vg the system becomes extremely slow, and after a few minutes it stops responding altogether.
So, is anybody using the 6805 (or any of the 6xxx series)? Does DSS work fine with the 6805?
Regards
Babbnase
-
The 6805 should work fine. Are you able to access the Adaptec via the ASM app to get the logs?
What happens when you restart the system, and what do the critical error logs show?
-
Also, is the Adaptec on the latest firmware?
-
Hi,
thanks for the super fast reply :-)
The Adaptec is on the latest firmware.
We have also tried CentOS (we had to build our own aacraid.ko module) with configuration by hand. In that case the system works w/o any problems (we can create a vg, etc.).
Sadly, after creating a vg with DSS the system stops responding, and if we do a hard reset, the system boots and stops at 33% (I don't know whether it would finish booting if we waited several hours). After 30 min it still shows 33%.
Because the system no longer boots, we are also unable to look at any log files :-(
-
If it is hanging at 33%, it sounds like there might be some volume issues from the array. Try rebooting the system and, while it is booting, press the "Tab" key on the "Select kernel" screen.
Type "rescue_mode" and press the "Enter" key (to run the system without the splash screen),
or
type "rescue_mode=no_mount_lv" and press the "Enter" key (to run the system without mounting logical volumes).
Then see if you can access the controller from ASM, or from the CLI on the console screen by pressing CTRL+ALT+R to gain access to the controller.
-
Hi,
I had to reinstall DSS, because we had done some tests with CentOS.
I have now reinstalled DSS:
created a vg - w/o any problems (sorry, we never had a problem with the vg; my mistake, the problem was always with the next step)
created a lv, and the problems started
~10 min after creating it, the system becomes slower and slower
~30 min after creating it, the system no longer responds
I was able to connect with ASM.
Before creating the lv, ASM could see both controllers.
We are using two controllers:
Adaptec 2610A with 4 HDDs for the DSS system, also planned for images/ISOs...
Adaptec 6805 with 8 HDDs for FC
Now ASM only sees the 2610A controller, and no longer the 6805.
I was able to save the log file (while the system no longer responded to anything).
How do I attach a log file (txt) to this board?
As far as I can see from the log file, one HDD timed out:
dss Command timeout: controller 2, channel 0, SCSI device ID 7, LUN 0, cdb [2a 00 20 a1 f4 00 00 02 00 00 00 00]
followed by:
dss Sense data: Illegal request (PARAMETER NOT SUPPORTED). Controller 2, channel 0, SCSI device ID 7, LUN 0, cdb [2a 00 21 4f fe 00 00 02 00 00 00 00], data [70 00 05 00 00 00 00 00 00 00 00 00 26 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00]
Sense data: Unit attention (POWER ON, RESET, OR BUS DEVICE RESET OCCURRED). Controller 2, channel 0, SCSI device ID 7, LUN 0, cdb [2a 00 21 50 2a 00 00 02 00 00 00 00], data [70 00 06 00 00 00 00 00 00 00 00 00 29 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00]
An error occurred while accessing the logical device: controller 2, logical device 0
An error occurred while accessing the logical device: controller 2, logical device 252
The HDDs are all new, and I had no similar problems while running CentOS.
I have ordered 2 more HDDs to see if this solves the problem.
Nevertheless, a degraded RAID should not bring the whole server down (we used a RAID 10). A single failed HDD should not be a problem.
As soon as I have done some more tests, I will get back to you.
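The cdb and sense bytes in the log excerpt above can be decoded by hand. The following is a minimal Python sketch, not part of DSS or ASM: it assumes the 12 bytes printed per cdb are a 10-byte command padded with zeros, and that the data bytes are fixed-format SCSI sense data. The function names and the tiny lookup tables are made up for this illustration and only cover the values seen in the log.

```python
# Decode the cdb and sense bytes quoted in the ASM log excerpt.
# Assumption: each cdb is a 10-byte command padded to 12 bytes in the log.
# The lookup tables are deliberately tiny and cover only values from the log.

SENSE_KEYS = {0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION"}
OPCODES = {0x2A: "WRITE(10)"}


def parse_hex(text):
    """Turn a string like '2a 00 20' into a list of ints."""
    return [int(tok, 16) for tok in text.split()]


def decode_cdb10(cdb):
    """Decode a 10-byte CDB: opcode, big-endian LBA (bytes 2-5), length (bytes 7-8)."""
    return {
        "opcode": OPCODES.get(cdb[0], hex(cdb[0])),
        "lba": int.from_bytes(bytes(cdb[2:6]), "big"),
        "blocks": int.from_bytes(bytes(cdb[7:9]), "big"),
    }


def decode_fixed_sense(data):
    """Decode fixed-format sense data (response code 0x70 or 0x71)."""
    if data[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = data[2] & 0x0F
    return {
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "asc": data[12],   # additional sense code
        "ascq": data[13],  # additional sense code qualifier
    }


if __name__ == "__main__":
    print(decode_cdb10(parse_hex("2a 00 20 a1 f4 00 00 02 00 00 00 00")))
    print(decode_fixed_sense(parse_hex("70 00 05 00 00 00 00 00 00 00 00 00 26 01")))
    print(decode_fixed_sense(parse_hex("70 00 06 00 00 00 00 00 00 00 00 00 29 00")))
```

Read this way, all three commands are 512-block writes to the same device, and the sense keys 5 and 6 match the "Illegal request" and "Unit attention" texts that ASM prints next to them.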
-
It could be more than the drive; from what I see, it might be the controller as well. What did Adaptec say about the logs from their end? We only report minor details, such as IO journal errors or something along those lines.
-
Until now I haven't contacted Adaptec, because I had no problems before.
All the hardware is new, and as I had not seen the 6805 in your HCL, we tried to use CentOS as the storage server.
I was more or less lucky that I checked the latest demo of DSS and saw that the 6805 is now supported.
So we have only been testing everything for 2-3 days, not enough time to name the bad boy :-)
-
Can you make the log files (DSS) available for us to download?
I'm curious to see the boot logs, and some of the hardware info....
-
Download this small update and test: install it in the GUI from Maint. > Software Update, then restart and test again.
ftp://software:enterforupdate@ftp.op...ptec_series_6/
-
Hi,
I have swapped HDD 3 with HDD 7, and created a new RAID 10.
Afterwards, I reinstalled DSS completely from scratch.
Please find the first logs here:
https://noc.all-tld.net/downloads/DSS/
ASM_Events.txt is from the last installation, where HDD 7 was not responding (after creating a lv).
TRL00022_[5377]_logs_2011-05-10_21_49__clean_install.tar.gz is from the new installation (w/o any further vg/lv configuration).
I will now apply your patch and try to create a vg, and afterwards a lv.
-
Enable write caching on the controllers:
18.664228] sd 0:0:0:0: [sda] Write cache: disabled,
19.434642] sd 1:0:0:0: [sdb] Write cache: disabled,
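To check this across a whole boot log instead of by eye, something like the following could be used. It is only a sketch in Python: the regular expression is written against the two lines quoted above, and the function name is made up for this example.

```python
import re

# Matches kernel lines such as: "sd 0:0:0:0: [sda] Write cache: disabled,"
WRITE_CACHE_RE = re.compile(r"\[(sd[a-z]+)\] Write cache: (enabled|disabled)")


def write_cache_states(log_text):
    """Return {device: 'enabled' | 'disabled'} for every sd device in the log."""
    states = {}
    for line in log_text.splitlines():
        match = WRITE_CACHE_RE.search(line)
        if match:
            # A later line for the same device overrides an earlier one.
            states[match.group(1)] = match.group(2)
    return states


if __name__ == "__main__":
    sample = (
        "18.664228] sd 0:0:0:0: [sda] Write cache: disabled,\n"
        "19.434642] sd 1:0:0:0: [sdb] Write cache: disabled,\n"
    )
    print(write_cache_states(sample))
```

Running it over the log before and after changing the controller setting shows whether the kernel actually picked up the change.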
-
Also check the firmware on this controller:
18.661668] AAC0: kernel 4.2-1[7368] Mar 30 2005
That's SDA....
-
This message is confusing me :-(
I have enabled the write cache on both controllers (during boot, while creating the RAID volumes).
No clue why the message reports the write cache as disabled.
I will double-check it during the next reboot.
-
Hi,
Please find a new log here:
https://noc.all-tld.net/downloads/DS...D_switched.txt
As you can see, HDD 7 is again not responding.
As I wrote before, I had swapped HDD 7 with HDD 3.
So if the HDD were faulty, I would have expected ASM to complain about HDD 3 now, and not HDD 7.
Either the backplane, the cable between the backplane and the controller, the controller itself, or ...? seems to be faulty.
So, time to get some sleep. I will report tomorrow if there is any news.
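The reasoning behind the swap test can be written down as a small decision rule. This is only an illustrative Python sketch; the function name and the wording of the results are made up here, and it assumes exactly one slot is reported as failing before and after the swap.

```python
def interpret_swap(failing_before, failing_after, swapped):
    """Interpret a disk swap between two slots.

    failing_before / failing_after: slot number reported as failing
    swapped: the pair of slots whose disks were exchanged
    """
    a, b = swapped
    if failing_before not in (a, b):
        return "inconclusive: the failing slot was not part of the swap"
    other = b if failing_before == a else a
    if failing_after == failing_before:
        return "fault stays with the slot: suspect backplane, cable or controller"
    if failing_after == other:
        return "fault followed the disk: suspect the disk itself"
    return "inconclusive: a different slot is failing now"


if __name__ == "__main__":
    # Slot 7 failed, disks 3 and 7 were swapped, slot 7 fails again.
    print(interpret_swap(7, 7, (3, 7)))
```

With the values from this thread (slot 7 before, slot 7 after, disks 3 and 7 swapped) the rule points away from the disk and towards the slot path.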
-
Hi,
I have now rebooted with rescue_mode "enabled".
DSS hangs on "restoring_base_settings start".
BTW: the write cache was enabled for the 6805, but not for the 2610A. I have now also enabled the write cache for the 2610A.
No clue why sdb (= 6805) reports the write cache as disabled :-(
-
Hi,
I have now rebooted with rescue_mode=no_mount_lv.
The system booted completely.
Please find the new log files here: https://noc.all-tld.net/downloads/DSS/TRL00022_[5377]_logs_2011-05-11_15_51.tar.gz
-
The write cache is still being reported as disabled on both controllers.
Can you try booting from a USB thumb drive with the DSS image?
-
Hi,
I removed the 2610A, so currently only the 6805 is installed.
I have also moved the hard disks to other slots (in case the backplane or the cable is faulty).
Booted DSS from USB
Applied the patch from last night
Created a vg
Created a lv
~15 min later, the server is getting extremely slow
I will try to get some more log files, but currently it looks like the server will die again.
-
Remove the small update and reboot.
-
Just a short update.
We have created a case at Adaptec.
They are currently checking the log files.
I'm not sure, but currently I would rather name the controller as the root cause, or the combination mainboard <-> controller.
I will inform you as soon as I have any news.
-
Let us know. I kind of suspect the same.
-
Hi,
sorry for the delay, but it was really a hard road :-(
We replaced the Adaptec controller - w/o success
We replaced all hard disks, even with different models - w/o success
We tested with only a pair of disks at a time (disk 0&1, disk 2&3, disk 4&5...) and the last drive always failed
Of course, every test took between 4 and 20 hours.
Finally, we replaced the backplane (or rather, the whole server case).
And now everything works absolutely fine.
We have used the same server case with a different controller but the same hard disks - w/o any problems.
So it seems it was the combination of the backplane and the controller.
Nevertheless, now everything is working, and we can finally start with our real tests.
To all Open-E employees: thank you very much for your fast and amazing support.
Even though it was NOT the fault of your software, your support was really great.
Good to know that there is somebody there when something goes wrong.
THANKS !!!
Babbnase
-
Thanks for the update.
Anytime we can help, we will try... of course!