Thread: Kernel error kernel:[360783.353677]

  1. #21

    Strange. Do you have any spare memory for that machine?

  2. #22
    Join Date
    Oct 2009
    Posts
    53

    I don't, but I could take out 1 module (4GB remaining), although I wonder: a full pass on memtest did not reveal any errors.

    There are only 3 things remaining that I can think of:

    1. Now that I recall: I disabled the APM and ACPI boot options, but two (?) others were still enabled. Would it be worth a try to turn them off?
    2. Could it be a problem with the USB DOM that hosts the OS? I could try running on another USB device.
    3. Would it be worth a try to run a previous version that ran stable for a long time, i.e. v6 update 90? Are there any known changes that could cause this random freeze?


    P.S. I just noticed, because my website was unreachable, that the machine froze again, this time after only a few hours. It's slowly getting hopeless...
    Last edited by Arcesilaus; 10-26-2012 at 01:37 PM.

  3. #23

    If we assume that the mainboard and the CPU are not faulty (such faults are very rare), that all drivers are tested, and that your system is on the Open-E HCL, there is only one thing left: the memory.
    Please note that memtest needs to run for at least 4 hours before you can tell whether everything is okay.
    Ideally, test for much longer!

  4. #24
    Join Date
    Oct 2009
    Posts
    53

    The motherboard and CPU were replaced recently. Besides those, the system only contains a Transcend USB DOM, an Areca 1220 controller and two Intel NICs. As far as I know, these are all on the HCL, and they've run without any issues for over 2 years.

    I will run a memtest during the night and see what that gives.

  5. #25
    Join Date
    Oct 2009
    Posts
    53

    Well, the memtest ran for 7 hours, without any errors.
    Nonetheless, I am getting more and more convinced that this is a hardware error rather than a software error.
    Since the freeze happens mostly under a certain (known) load, I suspect the NICs.
    I've changed the network configuration as a test, and will replace the NICs as soon as possible.

    To help shorten other users' troubleshooting in the future, I'll keep this thread updated...

  6. #26
    Join Date
    Oct 2009
    Posts
    53

    After a little short of a week, I'm back, unfortunately.
    Last week, I changed the NIC configuration and brought the machine back up again.

    It ran stable for a week, and I've seriously stressed the machine without issues.
    Tonight, however, just when really nothing special was going on (machine was almost idle), I saw it happening again:
    The SMB connections were cut off, followed by the iSCSI connections a few minutes later.
    At first, a ping was still possible, but soon the machine froze completely, including the console.

    So, again, I rebooted, downloaded the logs and went to see what happened. Here's what I found:

    [2012/11/02 21:47:01.820369, 1] smbd/service.c:1070(make_connection_snum)
    192.168.47.11 (192.168.47.11) connect to service Music initially as user DOMAIN+USERNAME1 (uid=102, gid=107) (pid 32727)
    [2012/11/02 21:47:13.160369, 1] smbd/service.c:1251(close_cnum)
    192.168.47.11 (192.168.47.11) closed connection to service Music
    [2012/11/02 21:57:01.840369, 0] lib/fault.c:46(fault_report)
    ===============================================================
    [2012/11/02 20:57:01.850369, 0] lib/fault.c:47(fault_report)
    INTERNAL ERROR: Signal 7 in pid 32727 (3.5.4)
    Please read the Trouble-Shooting section of the Samba3-HOWTO
    [2012/11/02 20:57:01.850369, 0] lib/fault.c:49(fault_report)

    From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
    [2012/11/02 20:57:01.850369, 0] lib/fault.c:50(fault_report)
    ===============================================================
    [2012/11/02 20:57:01.850369, 0] lib/util.c:1465(smb_panic)
    PANIC (pid 32727): internal error
    [2012/11/02 20:57:01.850369, 0] lib/util.c:1569(log_stack_trace)
    BACKTRACE: 0 stack frames:
    [2012/11/02 20:57:01.850369, 0] lib/fault.c:326(dump_core)
    dumping core in /usr/local/samba/var/cores/smbd

    After finding the connection was lost, I tried to reconnect a share:

    [2012/11/02 21:40:14.290369, 1] smbd/service.c:1070(make_connection_snum)
    192.168.47.117 (192.168.47.117) connect to service Pictures initially as user DOMAIN+USERNAME2 (uid=0, gid=107) (pid 16420)
    [2012/11/02 21:40:25.680369, 1] smbd/service.c:1251(close_cnum)
    192.168.47.117 (192.168.47.117) closed connection to service Music
    [2012/11/02 21:40:25.680369, 1] smbd/service.c:1251(close_cnum)
    192.168.47.117 (192.168.47.117) closed connection to service Pictures
    [2012/11/02 21:45:20.340369, 0] lib/fault.c:46(fault_report)
    ===============================================================
    [2012/11/02 21:45:20.340369, 0] lib/fault.c:47(fault_report)
    INTERNAL ERROR: Signal 7 in pid 16420 (3.5.4)
    Please read the Trouble-Shooting section of the Samba3-HOWTO
    [2012/11/02 21:45:20.340369, 0] lib/fault.c:49(fault_report)

    From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
    [2012/11/02 21:45:20.340369, 0] lib/fault.c:50(fault_report)
    ===============================================================
    [2012/11/02 21:45:20.340369, 0] lib/util.c:1465(smb_panic)
    PANIC (pid 16420): internal error
    [2012/11/02 21:45:20.340369, 0] lib/util.c:1569(log_stack_trace)
    BACKTRACE: 0 stack frames:
    [2012/11/02 21:45:20.350369, 0] lib/fault.c:326(dump_core)
    dumping core in /usr/local/samba/var/cores/smbd
    [2012/11/02 21:45:20.730369, 1] smbd/service.c:1070(make_connection_snum)
    192.168.47.117 (192.168.47.117) connect to service Movies initially as user DOMAIN+USERNAME2 (uid=0, gid=107) (pid 9889)
    [2012/11/02 21:45:45.880369, 0] lib/fault.c:46(fault_report)
    ===============================================================

    Not only do the timestamps in the first part look a little weird, but I also cannot relate the crash to any process on the client with IP 192.168.47.11, to any machine stress in general, or to any hardware issue.
    Please help!
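
For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch, not an Open-E or Samba tool. It assumes the header format shown in the excerpts above, and the function name `scan_smbd_log` is made up for illustration. It reports places where the log timestamps jump backwards and collects the "INTERNAL ERROR: Signal N" lines. As far as I know, signal 7 on Linux x86 is SIGBUS (bus error), but the sketch only reports the number.

```python
import re
from datetime import datetime

# Header format as it appears in the excerpts above, e.g.
# "[2012/11/02 21:47:01.820369, 1] smbd/service.c:1070(make_connection_snum)"
HEADER = re.compile(r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*(\d+)\]")
FAULT = re.compile(r"INTERNAL ERROR: Signal (\d+) in pid (\d+)")


def scan_smbd_log(lines):
    """Return (regressions, faults) for an iterable of smbd log lines.

    regressions: (previous, current) timestamp pairs where time goes backwards.
    faults: (timestamp of the preceding header, signal number, pid).
    """
    regressions = []
    faults = []
    previous = None
    for line in lines:
        header = HEADER.search(line)
        if header:
            current = datetime.strptime(header.group(1), "%Y/%m/%d %H:%M:%S.%f")
            if previous is not None and current < previous:
                regressions.append((previous, current))
            previous = current
            continue
        fault = FAULT.search(line)
        if fault:
            faults.append((previous, int(fault.group(1)), int(fault.group(2))))
    return regressions, faults


# A few lines copied from the first excerpt above.
sample = """\
[2012/11/02 21:47:13.160369, 1] smbd/service.c:1251(close_cnum)
192.168.47.11 (192.168.47.11) closed connection to service Music
[2012/11/02 21:57:01.840369, 0] lib/fault.c:46(fault_report)
[2012/11/02 20:57:01.850369, 0] lib/fault.c:47(fault_report)
INTERNAL ERROR: Signal 7 in pid 32727 (3.5.4)
""".splitlines()

regressions, faults = scan_smbd_log(sample)
for before, after in regressions:
    print("timestamp went backwards:", before, "->", after)
for when, signum, pid in faults:
    print("fault at", when, "signal", signum, "pid", pid)
```

On a full log you would pass the open file object instead of `sample`. A backwards jump only shows that the logged times are inconsistent; it does not say why.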
    Last edited by Arcesilaus; 11-02-2012 at 11:47 PM.

  7. #27
    Join Date
    Oct 2009
    Posts
    53

    There is one last thing I can think of: VMware's iSCSI initiator causes the Open-E network stack to freeze the system.
    Possible cause: http://vmtoday.com/2012/02/vsphere-5...oftware-iscsi/.

    I found this problem to be the case in my vSphere machine. With hindsight, the problem occurred first with ESXi 5.0...
