
Thread: cmnd_abort

  1. #1

    cmnd_abort

    Hello,

    for nearly an hour I have been receiving emails from DSS Lite:
    2008/01/21 15:43:17 cmnd_abort(1143) 7c000010 1 1000 42 4096 0 1

    Should I get nervous? :-)

    Cheers

    Matthias

  2. #2

    This message usually means that some commands from the iSCSI initiator were aborted, probably because of high load. It may repeat for a while.

    You can try changing the iSCSI daemon options: in the console, press CTRL + ALT + W, choose Tuning options, and set the following:


    MaxRecvDataSegmentLength 65536
    MaxXmitDataSegmentLength 65536
    All the best,

    Todd Maxwell



  3. #3

    OK,

    I'll try that, but this morning I've had 2 new entries at 03:43. No one is working here at this time :-)

    Regards

    Matthias

  4. #4

    Hello To-M,

    setting the options didn't help:
    2008/01/22 09:44:44 :n
    2008/01/22 09:44:44 kernel:cmnd_abort(1143) 11000090 1 1000 42 4096 0 1
    2008/01/22 09:43:34 kernel:cmnd_abort(1143) 6000090 1 1000 42 4096 0 1
    2008/01/22 09:42:14 kernel:cmnd_abort(1143) 4b000090 1 1000 42 4096 0 1
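
    To see how regularly these aborts occur, the emailed lines can be parsed and the time between consecutive events computed. The following is only a minimal sketch: the function names are made up for illustration, the sample lines are copied from this post, and the numeric fields after `cmnd_abort(1143)` are kept as opaque tokens because their meaning is not documented in this thread.

```python
import re
from datetime import datetime

# Sample lines copied from the emails quoted in this post.
LOG = """\
2008/01/22 09:44:44 kernel:cmnd_abort(1143) 11000090 1 1000 42 4096 0 1
2008/01/22 09:43:34 kernel:cmnd_abort(1143) 6000090 1 1000 42 4096 0 1
2008/01/22 09:42:14 kernel:cmnd_abort(1143) 4b000090 1 1000 42 4096 0 1
"""

# The "kernel:" prefix is optional because the first email in the thread lacks it.
PATTERN = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (?:kernel:)?cmnd_abort\((\d+)\) (.*)$"
)


def parse_aborts(text):
    """Return (timestamp, fields) tuples for every cmnd_abort line, oldest first."""
    events = []
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            # Fields are left as raw strings; their meaning is an open question.
            events.append((stamp, match.group(3).split()))
    return sorted(events, key=lambda event: event[0])


def gaps_in_seconds(events):
    """Seconds between each pair of consecutive abort events."""
    return [
        int((later[0] - earlier[0]).total_seconds())
        for earlier, later in zip(events, events[1:])
    ]


events = parse_aborts(LOG)
gaps = gaps_in_seconds(events)
print(len(events), "aborts, gaps in seconds:", gaps)
```

    Fairly constant gaps would hint at a periodic cause (a timer or scheduled job) rather than user load, which would fit the entries seen at 03:43 when nobody was working.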


    Snippet from dmesg (shortened because posts are limited to 10000 characters):

    Using IPI Shortcut mode
    Freeing unused kernel memory: 240k freed
    squashfs: version 3.2-r2 (2007/01/15) Phillip Lougher
    aufs 20070903
    attempt to access beyond end of device
    loop0: rw=0, want=66, limit=8
    isofs_fill_super: bread failed, dev=loop0, iso_blknum=16, block=32
    attempt to access beyond end of device
    loop0: rw=0, want=68, limit=8
    attempt to access beyond end of device
    loop0: rw=0, want=1252, limit=8
    attempt to access beyond end of device
    loop0: rw=0, want=1028, limit=8
    UDF-fs: No partition found (1)
    XFS: bad magic number
    XFS: SB validate failed
    Vendor: TTI-MSA Model: USB 2.0 MD Rev: PMAP
    Type: Direct-Access ANSI SCSI revision: 00
    SCSI device sda: 985088 512-byte hdwr sectors (504 MB)
    sda: Write Protect is off
    sda: Mode Sense: 23 00 00 00
    SCSI device sda: 985088 512-byte hdwr sectors (504 MB)
    sda: Write Protect is off
    sda: Mode Sense: 23 00 00 00
    sda: sda1
    sd 0:0:0:0: Attached scsi removable disk sda
    usb-storage: device scan complete
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    kjournald starting. Commit interval 5 seconds
    EXT3 FS on dm-0, internal journal
    EXT3-fs: recovery complete.
    EXT3-fs: mounted filesystem with ordered data mode.
    attempt to access beyond end of device
    dm-1: rw=0, want=66, limit=8
    isofs_fill_super: bread failed, dev=dm-1, iso_blknum=16, block=32
    attempt to access beyond end of device
    dm-1: rw=0, want=68, limit=8
    attempt to access beyond end of device
    dm-1: rw=0, want=1252, limit=8
    attempt to access beyond end of device
    dm-1: rw=0, want=1028, limit=8
    UDF-fs: No partition found (1)
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    attempt to access beyond end of device
    dm-6: rw=0, want=354, limit=352
    isofs_fill_super: bread failed, dev=dm-6, iso_blknum=88, block=176
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed

    I've created 3 volumes which will be accessed by two servers via RHCS/clvm, and formatted all 3 with GFS2. Currently both servers are connected, but only one is accessing one target.

    Cheers

    Matthias

  5. #5

    Hello again,

    I've checked my servers and found this:
    Jan 22 09:13:40 xen-2 iscsid: connect failed (111)
    Jan 22 09:13:41 xen-2 iscsid: Kernel reported iSCSI connection 3:0 error (1011) state (3)
    Jan 22 09:13:41 xen-2 iscsid: connection1:0 is operational after recovery (2 attempts)
    Jan 22 09:13:43 xen-2 iscsid: connection4:0 is operational after recovery (3 attempts)
    Jan 22 09:13:44 xen-2 iscsid: connection3:0 is operational after recovery (2 attempts)
    Jan 22 09:13:46 xen-2 kernel: connection2:0: iscsi: detected conn error (1011)
    Jan 22 09:13:46 xen-2 iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
    Jan 22 09:13:51 xen-2 kernel: connection1:0: iscsi: detected conn error (1011)
    Jan 22 09:13:51 xen-2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
    Jan 22 09:13:53 xen-2 kernel: connection4:0: iscsi: detected conn error (1011)
    Jan 22 09:13:54 xen-2 kernel: connection3:0: iscsi: detected conn error (1011)
    Jan 22 09:13:54 xen-2 iscsid: Kernel reported iSCSI connection 4:0 error (1011) state (3)
    Jan 22 09:13:54 xen-2 iscsid: connection1:0 is operational after recovery (2 attempts)
    Jan 22 09:13:54 xen-2 iscsid: Kernel reported iSCSI connection 3:0 error (1011) state (3)
    Jan 22 09:13:56 xen-2 iscsid: connection4:0 is operational after recovery (2 attempts)
    Jan 22 09:13:57 xen-2 iscsid: connection3:0 is operational after recovery (2 attempts)
    Jan 22 09:14:07 xen-2 kernel: connection4:0: iscsi: detected conn error (1011)
    Jan 22 09:14:07 xen-2 iscsid: Kernel reported iSCSI connection 4:0 error (1011) state (3)
    Jan 22 09:14:08 xen-2 kernel: connection3:0: iscsi: detected conn error (1011)
    Jan 22 09:14:08 xen-2 iscsid: Kernel reported iSCSI connection 3:0 error (1011) state (3)
    Jan 22 09:14:10 xen-2 iscsid: connect failed (111)
    Jan 22 09:14:11 xen-2 iscsid: connect failed (111)
    Jan 22 09:14:13 xen-2 kernel: connection1:0: iscsi: detected conn error (1011)
    Jan 22 09:14:13 xen-2 iscsid: connection4:0 is operational after recovery (3 attempts)
    Jan 22 09:14:14 xen-2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
    Jan 22 09:14:14 xen-2 iscsid: connection3:0 is operational after recovery (3 attempts)
    Jan 22 09:14:16 xen-2 iscsid: connection1:0 is operational after recovery (2 attempts)
    Jan 22 09:14:24 xen-2 kernel: connection4:0: iscsi: detected conn error (1011)
    Jan 22 09:14:24 xen-2 iscsid: Kernel reported iSCSI connection 4:0 error (1011) state (3)
    Jan 22 09:14:24 xen-2 kernel: connection3:0: iscsi: detected conn error (1011)
    Jan 22 09:14:25 xen-2 iscsid: Kernel reported iSCSI connection 3:0 error (1011) state (3)
    Jan 22 09:14:25 xen-2 iscsid: connection2:0 is operational after recovery (6 attempts)
    Jan 22 09:14:27 xen-2 kernel: connection1:0: iscsi: detected conn error (1011)
    Jan 22 09:14:27 xen-2 iscsid: connection4:0 is operational after recovery (2 attempts)
    Jan 22 09:14:27 xen-2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
    Jan 22 09:14:28 xen-2 iscsid: connection3:0 is operational after recovery (2 attempts)
    Jan 22 09:14:30 xen-2 iscsid: connection1:0 is operational after recovery (2 attempts)
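
    A log like this is easier to judge once it is summarised per connection. The sketch below is an illustration only: the function name `summarise` is hypothetical, the sample is a handful of lines copied from the log above, and the regular expressions assume exactly the message wording shown here. It counts kernel-detected connection errors and recoveries per connection, and how often a reconnect failed with error 111 (ECONNREFUSED on Linux).

```python
import re

# A few lines copied from the iscsid/kernel log above.
LOG = """\
Jan 22 09:13:40 xen-2 iscsid: connect failed (111)
Jan 22 09:13:41 xen-2 iscsid: Kernel reported iSCSI connection 3:0 error (1011) state (3)
Jan 22 09:13:41 xen-2 iscsid: connection1:0 is operational after recovery (2 attempts)
Jan 22 09:13:46 xen-2 kernel: connection2:0: iscsi: detected conn error (1011)
Jan 22 09:14:25 xen-2 iscsid: connection2:0 is operational after recovery (6 attempts)
"""

ERROR = re.compile(r"kernel: connection(\d+):\d+: iscsi: detected conn error")
RECOVERED = re.compile(
    r"iscsid: connection(\d+):\d+ is operational after recovery \((\d+) attempts\)"
)
REFUSED = re.compile(r"iscsid: connect failed \(111\)")


def summarise(text):
    """Count errors, recoveries and refused reconnects in an initiator log."""
    errors, recoveries, max_attempts = {}, {}, {}
    refused = 0
    for line in text.splitlines():
        match = ERROR.search(line)
        if match:
            conn = match.group(1)
            errors[conn] = errors.get(conn, 0) + 1
            continue
        match = RECOVERED.search(line)
        if match:
            conn, attempts = match.group(1), int(match.group(2))
            recoveries[conn] = recoveries.get(conn, 0) + 1
            # Track the worst case: how many attempts a recovery needed.
            max_attempts[conn] = max(max_attempts.get(conn, 0), attempts)
            continue
        if REFUSED.search(line):
            refused += 1
    return {
        "errors": errors,
        "recoveries": recoveries,
        "max_attempts": max_attempts,
        "refused": refused,
    }


summary = summarise(LOG)
print(summary)
```

    If every connection shows errors at a similar rate, the cause is more likely shared (target, switch or NIC) than a single session.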


    Initiator and target are connected to the same switch. Could the switch be the problem? :-)

    Cheers

    Matthias

  6. #6

    Try directly connecting to DSS LITE to diagnose if this is a switch issue.
    All the best,

    Todd Maxwell



  7. #7

    After looking into the "XFS: SB validate failed" and "XFS: bad magic number" errors, I think this could be a problem with your volume. I'm not sure whether you have any data on these volumes, but if you can, back up the data, reconfigure the unit (RAID set), and start over.
    I would recommend using the "Clear contents of units" function from the Extended Console Tools to delete the VG and LV configuration (this triggers a reboot). Then add the unit to the storage again in the WebGUI.
    All the best,

    Todd Maxwell



  8. #8

    I've recreated the unit as you suggested and plugged the remaining NIC into a separate switch where only the DSS and both servers are connected. The problem still persists :-|
    ATM I'm a little bit pissed, since recreating the setup, clvm volumes and so on took nearly three hours.

    Oh, when this timeout happens the Xen-DomU, which IS on the iSCSI-volume, stops responding...

    dmesg:
    SCSI device sda: 985088 512-byte hdwr sectors (504 MB)
    sda: Write Protect is off
    sda: Mode Sense: 23 00 00 00
    sda: sda1
    sd 0:0:0:0: Attached scsi removable disk sda
    usb-storage: device scan complete
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    kjournald starting. Commit interval 5 seconds
    EXT3 FS on dm-0, internal journal
    ext3_orphan_cleanup: deleting unreferenced inode 2526
    EXT3-fs: dm-0: 1 orphan inode deleted
    EXT3-fs: recovery complete.
    EXT3-fs: mounted filesystem with ordered data mode.
    attempt to access beyond end of device
    dm-1: rw=0, want=66, limit=8
    isofs_fill_super: bread failed, dev=dm-1, iso_blknum=16, block=32
    attempt to access beyond end of device
    dm-1: rw=0, want=68, limit=8
    attempt to access beyond end of device
    dm-1: rw=0, want=1252, limit=8
    attempt to access beyond end of device
    dm-1: rw=0, want=1028, limit=8
    UDF-fs: No partition found (1)
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed
    attempt to access beyond end of device
    dm-6: rw=0, want=354, limit=352
    isofs_fill_super: bread failed, dev=dm-6, iso_blknum=88, block=176
    UDF-fs: No VRS found
    XFS: bad magic number
    XFS: SB validate failed


    and again:
    2008/02/20 13:47:36 kernel:cmnd_abort(1143) 2a000000 1 1000 42 4096 0 1

    regards

    Matthias

  9. #9

    As far as I know, this [cmnd_abort (1143)] is related to the write process. If the disks are heavily loaded, iSCSI can sometimes hit its timeout before the write completes.
    The iSCSI initiator should then retry the write.

    The same message is also logged when a disk hardware problem slows down writes, not only under high load.

    What hardware are you using?

    Can you check your drives and the RAID?

  10. #10

    The server is an Opteron 185 or 2218 (dual-core) with 4 GB RAM and a 3ware 9550 4-port HBA.
    The disks are four WD3200 drives configured as RAID 5 on the 3ware. OK, I know that 3ware sucks on Linux, but we're using them on our other identical servers too.
    The RAID looks fine to me; the HBA reports no problems.
    One thing I've noticed is the load and CPU usage when writing to the array: CPU goes up to 50% when writing with one initiator, and the load goes up to 3 or 4.

    regards

    Matthias
