It's a 3ware 3DM2 9650SE-16ML card.
There are no errors at all from the RAID card.
The critical error log is empty.
The dmesg.2 file is too big to post here.
What does it say on screen? 'no system volume found'?
I have a couple of issues. First issue: the Open-E web client/Volume Manager is not seeing a volume that it was previously connected to.
Fix the first issue, and this will likely solve itself.
Second issue: there is a second volume group with a share set up that was previously working. Now none of those shares are accessible on the network.
scsi 1:0:0:0: Direct-Access AMCC 9650SE-16M DISK 4.06 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
sd 1:0:0:0: [sdb] 9765519360 512-byte hardware sectors (4999946 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 23 00 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: disabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
sd 1:0:0:0: [sdb] 9765519360 512-byte hardware sectors (4999946 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 23 00 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: disabled, doesn't support DPO or FUA
sdb: unknown partition table
sd 1:0:0:0: [sdb] Attached SCSI disk
scsi 1:0:1:0: Direct-Access AMCC 9650SE-16M DISK 4.06 PQ: 0 ANSI: 5
sd 1:0:1:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
sd 1:0:1:0: [sdc] 23437369344 512-byte hardware sectors (11999933 MB)
sd 1:0:1:0: [sdc] Write Protect is off
sd 1:0:1:0: [sdc] Mode Sense: 23 00 00 00
sd 1:0:1:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:1:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
sd 1:0:1:0: [sdc] 23437369344 512-byte hardware sectors (11999933 MB)
sd 1:0:1:0: [sdc] Write Protect is off
sd 1:0:1:0: [sdc] Mode Sense: 23 00 00 00
sd 1:0:1:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sdc: unknown partition table
sd 1:0:1:0: [sdc] Attached SCSI disk
Loading iSCSI transport class v2.0-869.
iscsi: registered transport (tcp)
Ethernet Channel Bonding Driver: v3.2.5 (March 21, 2008)
bonding: MII link monitoring set to 50 ms
bonding: bond0 is being deleted...
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
kjournald starting. Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on dm-10, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
drbd: initialised. Version: 8.0.8 (api:86/proto:86)
drbd: GIT-hash: bd3e2c922f95c4fa0dca57a4f8c24bf8b249cc02 build by root@compiler, 2008-07-31 14:13:57
drbd: registered as block device major 147
drbd: minor_table @ 0xf5f85c00
kjournald starting. Commit interval 5 seconds
EXT3 FS on ram15, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
kjournald starting. Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on dm-6, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
Filesystem "dm-9": Disabling barriers, not supported by the underlying device
XFS mounting filesystem dm-9
Starting XFS recovery on filesystem: dm-9 (logdev: internal)
Ending XFS recovery on filesystem: dm-9 (logdev: internal)
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-2, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
Filesystem "dm-5": Disabling barriers, not supported by the underlying device
XFS mounting filesystem dm-5
Ending clean XFS mount for filesystem: dm-5
Adding 4194296k swap on /dev/vg+MediaLibrary/nas_swap. Priority:-1 extents:1 across:4194296k
Adding 4194296k swap on /dev/vg+MediaLibrary2/nas_swap. Priority:-2 extents:1 across:4194296k
scst: Waiting for 0 active commands to complete... This might take few minutes for disks or few hours for tapes, if you use long executed commands, like REWIND or FORMAT. In case, if you have a hung user space device (i.e. made using scst_user module) not responding to any commands, if might take virtually forever until the corresponding user space program recovers and starts responding or gets killed.
scst: All active commands completed
scst: Attached SCSI target mid-level at scsi0, channel 0, id 0, lun 0, type 0
scst: Waiting for 0 active commands to complete... This might take few minutes for disks or few hours for tapes, if you use long executed commands, like REWIND or FORMAT. In case, if you have a hung user space device (i.e. made using scst_user module) not responding to any commands, if might take virtually forever until the corresponding user space program recovers and starts responding or gets killed.
scst: All active commands completed
scst: Attached SCSI target mid-level at scsi1, channel 0, id 0, lun 0, type 0
scst: Waiting for 0 active commands to complete... This might take few minutes for disks or few hours for tapes, if you use long executed commands, like REWIND or FORMAT. In case, if you have a hung user space device (i.e. made using scst_user module) not responding to any commands, if might take virtually forever until the corresponding user space program recovers and starts responding or gets killed.
scst: All active commands completed
scst: Attached SCSI target mid-level at scsi1, channel 0, id 1, lun 0, type 0
scst: Processing thread started, PID 10316
scst: Processing thread started, PID 10320
scst: Init thread started, PID 10322
scst: Task management thread started, PID 10323
scst: Management thread started, PID 10324
scst: SCST version 1.0.0-rc1 loaded successfully (max mem for commands 505MB, per device 202MB)
scst: Enabled features: TRACING
scst: Virtual device handler vdisk for type 0 registered successfully
scst: Virtual device handler vdisk_blk for type 0 registered successfully
scst: Virtual device handler vcdrom for type 5 registered successfully
iSCSI Enterprise Target Software - version 0.4.15
iotype_init(92) register fileio
iotype_init(92) register blockio
iotype_init(92) register nullio
3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=0.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x004B): Battery temperature is high:.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x004D): Battery temperature is too high:.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0055): Battery charging started:.
3dm2[12028] general protection ip:b7e71bf2 sp:bfb430a0 error:0 in libc-2.3.2.so[b7e00000+12a000]
3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=0.
3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=0.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0056): Battery charging completed:.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0057): Battery charging fault:.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0009): Drive timeout detected:port=7.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, port=7.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0009): Drive timeout detected:port=7.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0009): Drive timeout detected:port=8.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x000A): Drive error detected:unit=0, port=8.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0009): Drive timeout detected:port=8.
sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x88) timed out, resetting card.
3w-9xxx: scsi1: ERROR: (0x06:0x001F): Microcontroller not ready during reset sequence.
3w-9xxx: scsi1: ERROR: (0x06:0x001F): Microcontroller not ready during reset sequence.
3w-9xxx: scsi1: ERROR: (0x06:0x002B): Controller reset failed during scsi host reset.
sd 1:0:0:0: Device offlined - not ready after error recovery
sd 1:0:0:0: Device offlined - not ready after error recovery
sd 1:0:0:0: Device offlined - not ready after error recovery
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 163848
lost page write due to I/O error on dm-10
sd 1:0:0:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
end_request: I/O error, dev sdb, sector 88440
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 2817
lost page write due to I/O error on dm-10
Aborting journal on device dm-10.
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 521
lost page write due to I/O error on dm-10
EXT3-fs error (device dm-10) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device dm-10) in ext3_reserve_inode_write: Journal has aborted
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 0
lost page write due to I/O error on dm-10
EXT3-fs error (device dm-10) in ext3_dirty_inode: Journal has aborted
------------[ cut here ]------------
WARNING: at fs/buffer.c:1183 mark_buffer_dirty+0x96/0xa0()
Modules linked in: iscsi_trgt scst_vdisk scst drbd bonding iscsi_tcp libiscsi scsi_transport_iscsi 3w_9xxx e1000 button ftdi_sio usbserial
Pid: 30727, comm: smbd Not tainted 2.6.25.13-oe32-00000-g0c9122d #23
[<c012234e>] warn_on_slowpath+0x3e/0x60
[<c0433517>] dm_unplug_all+0x17/0x30
[<c01190c9>] task_rq_lock+0x29/0x50
[<c011bf77>] try_to_wake_up+0xc7/0xf0
[<c011dca6>] __wake_up_common+0x36/0x60
[<c011dcf7>] __wake_up+0x27/0x40
[<c012303c>] wake_up_klogd+0x2c/0x30
[<c0192277>] __wait_on_buffer+0x17/0x20
[<c0195a74>] sync_dirty_buffer+0x54/0xc0
[<c01932e6>] mark_buffer_dirty+0x96/0xa0
[<c01c2bca>] ext3_commit_super+0x3a/0x50
[<c01c012f>] ext3_handle_error+0x3f/0xa0
[<c01c0293>] __ext3_std_error+0x33/0x60
[<c01c007e>] __ext3_journal_stop+0x2e/0x40
[<c018e6c7>] __mark_inode_dirty+0xc7/0x150
[<c01867ec>] touch_atime+0x8c/0xe0
[<c0153093>] generic_file_mmap+0x43/0x50
[<c01643f0>] mmap_region+0x170/0x420
[<c0153050>] generic_file_mmap+0x0/0x50
[<c0164063>] do_mmap_pgoff+0x1d3/0x370
[<c0106e8c>] sys_mmap2+0x5c/0x80
[<c0103a42>] syscall_call+0x7/0xb
=======================
---[ end trace 0c9a357e48a06970 ]---
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 0
lost page write due to I/O error on dm-10
ext3_abort called.
EXT3-fs error (device dm-10): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
EXT3-fs error (device dm-10) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device dm-10) in ext3_reserve_inode_write: Journal has aborted
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 0
lost page write due to I/O error on dm-10
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 163845
lost page write due to I/O error on dm-10
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 163848
lost page write due to I/O error on dm-10
3w-9xxx: scsi1: WARNING: (0x06:0x0037): Character ioctl (0x108) timed out, resetting card.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x0039): Buffer ECC error corrected:address=0x17C820.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x0039): Buffer ECC error corrected:address=0x17C820.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x0039): Buffer ECC error corrected:address=0x17C820.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x005F): Cache synchronization failed; some data lost:unit=0.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x005F): Cache synchronization failed; some data lost:unit=1.
3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=0.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0005): Rebuild completed:unit=0.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x002F): Verify not started; unit never initialized:Raid5/6 subunit=0.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0029): Verify started:unit=0.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:port=3, LBA=0x15E0C8.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0029): Verify started:unit=0.
3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=0.
st: Version 20080221, fixed bufsize 32768, s/g segs 256
Driver 'st' needs updating - please use bus_type methods
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 1:0:0:0: Attached scsi generic sg1 type 0
sd 1:0:1:0: Attached scsi generic sg2 type 0
sd 1:0:0:0: rejecting I/O to offline device
xfs_force_shutdown(dm-9,0x1) called from line 420 of file fs/xfs/xfs_rw.c. Return address = 0xc02a2650
Filesystem "dm-9": I/O Error Detected. Shutting down filesystem: dm-9
Please umount the filesystem, and rectify the problem(s)
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 167938
EXT3-fs error (device dm-10): ext3_readdir: directory #81946 contains a hole at offset 0
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 167939
EXT3-fs error (device dm-10): ext3_readdir: directory #81947 contains a hole at offset 0
sd 1:0:0:0: rejecting I/O to offline device
EXT3-fs error (device dm-10): ext3_get_inode_loc: unable to read inode block - inode=81984, block=163845
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 182301
EXT3-fs error (device dm-10): ext3_readdir: directory #82013 contains a hole at offset 0
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 176154
EXT3-fs error (device dm-10): ext3_readdir: directory #82045 contains a hole at offset 0
sd 1:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-10, logical block 176155
EXT3-fs error (device dm-10): ext3_readdir: directory #82046 contains a hole at offset 0
sd 1:0:0:0: rejecting I/O to offline device
EXT3-fs error (device dm-10): ext3_get_inode_loc: unable to read inode block - inode=82064, block=163848
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:port=3, LBA=0x8AD396.
3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=0.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0055): Battery charging started:.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0056): Battery charging completed:.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0055): Battery charging started:.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0056): Battery charging completed:.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x004B): Battery temperature is high:.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0005): Rebuild completed:unit=0.
3dm2[15921] general protection ip:b7fe5bf2 sp:bfeb67e0 error:0 in libc-2.3.2.so[b7f74000+12a000]
xfs_force_shutdown(dm-9,0x1) called from line 420 of file fs/xfs/xfs_rw.c. Return address = 0xc02a2650
3w-9xxx: scsi1: AEN: ERROR (0x04:0x004D): Battery temperature is too high:.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x004B): Battery temperature is high:.
3w-9xxx: scsi1: AEN: INFO (0x04:0x0029): Verify started:unit=0.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x0023): Sector repair completed:port=3, LBA=0x8A90F3.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x004D): Battery temperature is too high:.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x004B): Battery temperature is high:.
3w-9xxx: scsi1: AEN: INFO (0x04:0x002B): Verify completed:unit=0.
xfs_force_shutdown(dm-9,0x1) called from line 420 of file fs/xfs/xfs_rw.c. Return address = 0xc02a2650
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
xfs_force_shutdown(dm-9,0x1) called from line 420 of file fs/xfs/xfs_rw.c. Return address = 0xc02a2650
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
sd 1:0:0:0: rejecting I/O to offline device
program scsiinfo is using a deprecated SCSI ioctl, please convert it to SG_IO
There is definitely something wrong with the 3ware controller: the log shows Buffer I/O errors and ECC errors (which are memory-related). If you are confident that the RAID is OK, you can try running Repair Filesystem from the Extended Tools section (you will need console access). I would strongly recommend getting 3ware involved to look at this output.
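Since dmesg.2 is too big to read through by hand, a quick tally of the controller's AEN messages can make the pattern easier to see before sending it on to 3ware. The following is only a rough sketch: the regular expression is inferred from the log lines posted in this thread, and `summarize_aens` and `sample` are hypothetical names, not part of any 3ware or Open-E tool.

```python
import re
from collections import Counter

# Pattern inferred from lines in this thread, for example:
#   3w-9xxx: scsi1: AEN: ERROR (0x04:0x0009): Drive timeout detected:port=7.
# This is an assumption based on the posted log, not an official format spec.
AEN_RE = re.compile(
    r"3w-9xxx: scsi\d+: AEN: (?P<severity>INFO|WARNING|ERROR) "
    r"\(0x[0-9A-Fa-f]+:(?P<code>0x[0-9A-Fa-f]+)\): (?P<message>[^:]*)"
)


def summarize_aens(log_text):
    """Count AEN events grouped by (severity, code, message)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = AEN_RE.search(line)
        if m:
            key = (m["severity"], m["code"], m["message"].strip())
            counts[key] += 1
    return counts


# A few lines copied from the log above, used as sample input.
sample = """\
3w-9xxx: scsi1: AEN: INFO (0x04:0x000B): Rebuild started:unit=0.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0009): Drive timeout detected:port=7.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, port=7.
3w-9xxx: scsi1: AEN: ERROR (0x04:0x0009): Drive timeout detected:port=8.
3w-9xxx: scsi1: AEN: WARNING (0x04:0x0039): Buffer ECC error corrected:address=0x17C820.
sd 1:0:0:0: rejecting I/O to offline device
"""

if __name__ == "__main__":
    # Print the most frequent events first.
    for (severity, code, message), n in summarize_aens(sample).most_common():
        print(f"{n:3d}  {severity:7s} {code}  {message}")
```

To run it against the real file, replace `sample` with the contents of dmesg.2 (for example `open("dmesg.2").read()`). Lines that are not 3w-9xxx AEN messages are simply skipped.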
The customer was hard-booting the NAS server, which caused those errors. The RAID rebuilt fine, but after the rebuild, NAS-R3 started having the problems described.