
Thread: Server hangs every few days

  1. #1

    DSS is currently running from an Intel 40GB SSD.

    The only errors I found in the logs are:

    Code:
    dev_vdisk: Registering virtual vdisk_blockio device S5iAcDpDBwN6nolN (WRITE_THROUGH, BLOCKIO)
    dev_vdisk: ***ERROR***: blkdev_issue_flush() failed: -95
    dev_vdisk: ***WARNING***: Device /dev/vg+backup1/lv+b+lvbackup101 doesn't support barriers, switching to NV_CACHE mode. Read README for more details.
    dev_vdisk: Attached SCSI target virtual disk S5iAcDpDBwN6nolN (file="/dev/vg+backup1/lv+b+lvbackup101", fs=768000MB, bs=512, nblocks=1572864000, cyln=768000)
    scst: Attached to virtual device S5iAcDpDBwN6nolN (id 1)
    and

    Code:
    iscsi-scst: ***WARNING***: CONFIG_TCP_ZERO_COPY_TRANSFER_COMPLETION_NOTIFICATION not enabled in your kernel. ISCSI-SCST will be working with not the best performance. Refer README file for details.
    scst: Target template iscsi registered successfully
    Nothing else I checked pointed to an error.

  2. #2
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    Look into the critical errors log and the dmesg logs, as mentioned.
    If you can provide them to me, I'll take a look.

    Also look at the iSCSI folder and show me the target settings.

  3. #3

    Hello,

    I have uploaded the logs here: logs. I couldn't find any errors in them, but maybe you can take a look.

    Thanks

  4. #4

    Hey,

    Open a ticket via http://www.open-e.com/service-and-support/ . Please attach the logs downloaded via the WebGUI.

    Ja-B

  5. #5
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    Looking at what you have provided, the following needs to be changed.
    You have:
    MaxBurstLength=1048576
    FirstBurstLength=262144

    Change to:
    MaxBurstLength=16776192
    FirstBurstLength=65536

    Also, make sure you change every target, not just the first one.

    Overall, these settings work well:
    maxRecvDataSegmentLen=262144
    MaxBurstLength=16776192
    Maxxmitdatasegment=262144
    FirstBurstLength=65536
    DataDigest=None
    maxoutstandingr2t=8
    InitialR2T=No
    ImmediateData=Yes
    headerDigest=None
    Wthreads=8

    I can't see the NIC settings in your upload, but you can also try jumbo frames for the NICs: http://kb.open-e.com/Does-Open-E-sup...Frames_28.html

    Make sure the initiators have matching settings, as a mismatch can make the machine seem locked or stalled.
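
A quick way to spot target/initiator differences is to put both sets of settings side by side and list the ones that differ. The following is only a minimal sketch: the `find_mismatches` helper and the `initiator` values are hypothetical (the initiator here simply still carries the two old burst values quoted above), and the settings have to be copied in by hand from the target and from each initiator.

```python
# Settings recommended in this post, keyed by the names used above.
recommended = {
    "maxRecvDataSegmentLen": "262144",
    "MaxBurstLength": "16776192",
    "Maxxmitdatasegment": "262144",
    "FirstBurstLength": "65536",
    "DataDigest": "None",
    "maxoutstandingr2t": "8",
    "InitialR2T": "No",
    "ImmediateData": "Yes",
    "headerDigest": "None",
    "Wthreads": "8",
}


def find_mismatches(target, initiator):
    """Return {name: (target_value, initiator_value)} for settings that differ.

    Names and values are compared case-insensitively; a setting missing on
    one side is reported with None for that side.
    """
    t = {k.lower(): str(v).lower() for k, v in target.items()}
    i = {k.lower(): str(v).lower() for k, v in initiator.items()}
    return {
        name: (t.get(name), i.get(name))
        for name in sorted(set(t) | set(i))
        if t.get(name) != i.get(name)
    }


# Hypothetical initiator that still uses the old burst lengths.
initiator = dict(recommended, MaxBurstLength="1048576", FirstBurstLength="262144")

mismatches = find_mismatches(recommended, initiator)
for name, (target_value, initiator_value) in mismatches.items():
    print(f"{name}: target={target_value} initiator={initiator_value}")
```

Run it once per target and per initiator, since every target has to be changed, not just the first one.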

    But complete logs would give a better picture as to whether or not there are other issues.
    I can only see a few files in your link, not the whole package.

  6. #6

    Update: I have disabled some CPU power saving features in the BIOS, and it is running stable now.

  7. #7

    Unfortunately, the box crashed again last night. But now I have some errors in the log:

    Code:
    	2011-04-12 01:59:10	scsi cmnd aborted, scsi_cmnd(0xffff88012c6d4700), cmnd[0x8a,0x 0,0x 0,0x 0,0x 0,0x 1,0xb0,0x9e,0xf8,0x 0,0x... (0/1) 	 	
    	2011-04-12 01:58:40	scsi cmnd aborted, scsi_cmnd(0xffff88012c6d4340), cmnd[0x8a,0x 0,0x 0,0x 0,0x 0,0x 1,0xb0,0x9e,0xf7,0x 0,0x... (1/1) 	 	
    	2011-04-12 01:58:10	scsi cmnd aborted, scsi_cmnd(0xffff88012c6d45c0), cmnd[0x8a,0x 0,0x 0,0x 0,0x 0,0x 1,0xb0,0x9e,0xf6,0x 0,0x... (1/1) 	 	
    	2011-04-12 01:57:40	scsi cmnd aborted, scsi_cmnd(0xffff88012c6d40c0), cmnd[0x8a,0x 0,0x 0,0x 0,0x 0,0x 1,0xb0,0x9e,0xf5,0x 0,0x... (0/1)
    This points to a hardware error. The RAID controller has been replaced, so it seems the backplane is not working properly.
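
For anyone reading these entries: the first byte in `cmnd[...]`, 0x8a, is the SCSI WRITE(16) opcode, and bytes 2 to 9 of that command hold the starting block address (LBA). The log cuts off after byte 9, so the transfer length is not visible, but the address can still be recovered. Below is a rough sketch of that decoding; the helper names `parse_cdb_prefix` and `write16_lba` are made up for illustration, and the parsing assumes the log format shown above.

```python
import re

WRITE_16 = 0x8A  # SCSI opcode for WRITE(16)


def parse_cdb_prefix(log_line):
    """Return the command bytes visible after 'cmnd[' in one log line."""
    match = re.search(r"cmnd\[(.*)", log_line)
    if match is None:
        raise ValueError("no cmnd[...] field in log line")
    # The log pads single digits as '0x 0'; the truncated '0x...' is skipped.
    return [int(h, 16) for h in re.findall(r"0x\s*([0-9a-fA-F]+)", match.group(1))]


def write16_lba(cdb):
    """Return the starting LBA (bytes 2..9, big-endian) of a WRITE(16) command."""
    if len(cdb) < 10 or cdb[0] != WRITE_16:
        raise ValueError("not a WRITE(16) command with a complete LBA field")
    return int.from_bytes(bytes(cdb[2:10]), "big")


# The four entries from the log above, newest first as posted.
lines = [
    "scsi cmnd aborted, scsi_cmnd(0xffff88012c6d4700), cmnd[0x8a,0x 0,0x 0,0x 0,0x 0,0x 1,0xb0,0x9e,0xf8,0x 0,0x... (0/1)",
    "scsi cmnd aborted, scsi_cmnd(0xffff88012c6d4340), cmnd[0x8a,0x 0,0x 0,0x 0,0x 0,0x 1,0xb0,0x9e,0xf7,0x 0,0x... (1/1)",
    "scsi cmnd aborted, scsi_cmnd(0xffff88012c6d45c0), cmnd[0x8a,0x 0,0x 0,0x 0,0x 0,0x 1,0xb0,0x9e,0xf6,0x 0,0x... (1/1)",
    "scsi cmnd aborted, scsi_cmnd(0xffff88012c6d40c0), cmnd[0x8a,0x 0,0x 0,0x 0,0x 0,0x 1,0xb0,0x9e,0xf5,0x 0,0x... (0/1)",
]

lbas = [write16_lba(parse_cdb_prefix(line)) for line in lines]
for lba in lbas:
    print(hex(lba))
```

The four aborted writes start at 0x1b09ef500 through 0x1b09ef800, each 0x100 (256) blocks after the previous one and 30 seconds apart, so they look like consecutive pieces of a single sequential write stream rather than unrelated requests.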
