
Thread: Problems with NFS and Webinterface

  1. #1

    Problems with NFS and Webinterface

    Hello,

    A few days ago we set up a new Intel server (2x Xeon, 32GB RAM, LSI SAS 8708ELP, RAID6) running DSS 5.0.DB49000000.3278. A second server (same hardware) does heavy reading and writing (snapshotting/backup) against it over NFS. Writing a single file over NFS (dd from the second server to the first) runs at a stable 94MB/s across two switches and two buildings.
    If we run a backup with many small files (cp -alr && rsync --delete) of a nearly complete Linux server, the DSS fails. The backup runs in a single thread on the second server. The backup therefore covers more data than the amount (files/MB) that has actually changed. The NFS connection itself doesn't break, but the web interface crashes and comes back a short time after the backup is stopped. I also received the following errors by mail:

  2. #2

    Code:
    2009/06/22 16:25:42	kernel:Pid: 0, comm: swapper Not tainted 2.6.25.13-oe32 #61
    
    2009/06/22 16:25:42	kernel:[<c0156fef>] __alloc_pages+0x2df/0x330
    
    2009/06/22 16:25:42	kernel:[<c016f805>] kmem_getpages+0x45/0x110
    
    2009/06/22 16:25:42	kernel:[<c0170394>] cache_grow+0x104/0x140
    
    2009/06/22 16:25:42	kernel:[<c0170587>] cache_alloc_refill+0x1b7/0x1f0
    
    2009/06/22 16:25:42	kernel:[<c0170921>] __kmalloc+0xb1/0xc0
    
    2009/06/22 16:25:42	kernel:[<c0476873>] __alloc_skb+0x43/0x100
    
    2009/06/22 16:25:42	kernel:[<c0476944>] __netdev_alloc_skb+0x14/0x40
    
    2009/06/22 16:25:42	kernel:[<e4ac4239>] e1000_alloc_rx_buffers+0x149/0x320 [e1000]
    
    2009/06/22 16:25:42	kernel:[<e4ac3ba3>] e1000_clean_rx_irq+0x353/0x480 [e1000]
    
    2009/06/22 16:25:42	kernel:[<e4ac31e6>] e1000_intr+0xc6/0x250 [e1000]
    
    2009/06/22 16:25:42	kernel:[<c014f6a5>] handle_IRQ_event+0x25/0x50
    
    2009/06/22 16:25:42	kernel:[<c0150700>] handle_fasteoi_irq+0x60/0xc0
    
    2009/06/22 16:25:42	kernel:[<c0105b78>] do_IRQ+0x38/0x70
    
    2009/06/22 16:25:42	kernel:[<c01049c3>] common_interrupt+0x23/0x30
    
    2009/06/22 16:25:42	kernel:[<c01021b1>] mwait_idle_with_hints+0x31/0x40
    
    2009/06/22 16:25:42	kernel:[<c01021c0>] mwait_idle+0x0/0x10
    
    2009/06/22 16:25:42	kernel:[<c01020fb>] cpu_idle+0x4b/0xa0
    
    2009/06/22 16:25:42	kernel:=======================
    
    2009/06/22 16:25:42	kernel:DMA per-cpu:
    
    2009/06/22 16:25:42	kernel:CPU    0: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:25:42	kernel:CPU    1: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:25:42	kernel:CPU    2: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:25:42	kernel:CPU    3: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:25:42	kernel:CPU    4: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:25:42	kernel:CPU    5: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:25:42	kernel:CPU    6: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:25:42	kernel:CPU    7: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:25:42	kernel:Normal per-cpu:
    
    2009/06/22 16:25:42	kernel:CPU    0: hi:  186, btch:  31 usd:  57
    
    2009/06/22 16:25:42	kernel:CPU    1: hi:  186, btch:  31 usd: 109
    
    2009/06/22 16:25:42	kernel:CPU    2: hi:  186, btch:  31 usd: 172
    
    2009/06/22 16:25:42	kernel:CPU    3: hi:  186, btch:  31 usd:  86
    
    2009/06/22 16:25:42	kernel:CPU    4: hi:  186, btch:  31 usd: 167
    
    2009/06/22 16:25:42	kernel:CPU    5: hi:  186, btch:  31 usd: 181
    
    2009/06/22 16:25:42	kernel:CPU    6: hi:  186, btch:  31 usd:  64
    
    2009/06/22 16:25:42	kernel:CPU    7: hi:  186, btch:  31 usd: 126
    
    2009/06/22 16:25:42	kernel:HighMem per-cpu:
    
    2009/06/22 16:25:42	kernel:CPU    0: hi:  186, btch:  31 usd:  13
    
    2009/06/22 16:25:42	kernel:CPU    1: hi:  186, btch:  31 usd:  32
    
    2009/06/22 16:25:42	kernel:CPU    2: hi:  186, btch:  31 usd: 127
    
    2009/06/22 16:25:42	kernel:CPU    3: hi:  186, btch:  31 usd:  93
    
    2009/06/22 16:25:42	kernel:CPU    4: hi:  186, btch:  31 usd: 157
    
    2009/06/22 16:25:42	kernel:CPU    5: hi:  186, btch:  31 usd:  87
    
    2009/06/22 16:25:42	kernel:CPU    6: hi:  186, btch:  31 usd: 148
    
    2009/06/22 16:25:42	kernel:CPU    7: hi:  186, btch:  31 usd:  31
    
    2009/06/22 16:25:42	kernel:Active:28739 inactive:121038 dirty:927 writeback:0 unstable:0
    
    2009/06/22 16:25:42	kernel:free:8096891 slab:27275 mapped:6273 pagetables:450 bounce:0
    
    2009/06/22 16:25:42	kernel:DMA free:2224kB min:84kB low:104kB high:124kB active:0kB inactive:0kB present:16256kB pages_scanned:0 all_unreclaimable? yes
    
    2009/06/22 16:25:42	kernel:lowmem_reserve[]: 0 555 34020 34020
    
    2009/06/22 16:25:42	kernel:Normal free:1096kB min:2968kB low:3708kB high:4452kB active:27108kB inactive:23552kB present:568960kB pages_scanned:0 all_unreclaimable? no
    
    2009/06/22 16:25:42	kernel:lowmem_reserve[]: 0 0 267716 267716
    
    2009/06/22 16:25:42	kernel:HighMem free:32384244kB min:512kB low:45248kB high:89984kB active:87848kB inactive:460600kB present:34267648kB pages_scanned:0 all_unreclaimable? no
    
    2009/06/22 16:25:42	kernel:lowmem_reserve[]: 0 0 0 0
    
    2009/06/22 16:25:42	kernel:DMA: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2224kB
    
    2009/06/22 16:25:42	kernel:Normal: 1*4kB 1*8kB 0*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 940kB
    
    2009/06/22 16:25:42	kernel:HighMem: 44*4kB 32*8kB 15*16kB 3*32kB 32*64kB 556*128kB 344*256kB 196*512kB 103*1024kB 41*2048kB 7796*4096kB = 32384256kB
    
    2009/06/22 16:25:42	kernel:133969 total pagecache pages
    
    2009/06/22 16:25:42	kernel:Swap cache: add 0, delete 0, find 0/0
    
    2009/06/22 16:25:42	kernel:Free swap  = 4194296kB
    
    2009/06/22 16:25:42	kernel:Total swap = 4194296kB
    
    2009/06/22 16:26:04	kernel:Pid: 29454, comm: apache2 Not tainted 2.6.25.13-oe32 #61
    
    2009/06/22 16:26:04	kernel:[<c0156fef>] __alloc_pages+0x2df/0x330
    
    2009/06/22 16:26:04	kernel:[<c0138410>] notify_die+0x30/0x40
    
    2009/06/22 16:26:04	kernel:[<c016f805>] kmem_getpages+0x45/0x110
    
    2009/06/22 16:26:04	kernel:[<c0170394>] cache_grow+0x104/0x140
    
    2009/06/22 16:26:04	kernel:[<c0170587>] cache_alloc_refill+0x1b7/0x1f0
    
    2009/06/22 16:26:04	kernel:[<c0170921>] __kmalloc+0xb1/0xc0
    
    2009/06/22 16:26:04	kernel:[<c0476873>] __alloc_skb+0x43/0x100
    
    2009/06/22 16:26:04	kernel:[<c0476944>] __netdev_alloc_skb+0x14/0x40
    
    2009/06/22 16:26:04	kernel:[<e4ac4239>] e1000_alloc_rx_buffers+0x149/0x320 [e1000]
    
    2009/06/22 16:26:04	kernel:[<e4ac3ba3>] e1000_clean_rx_irq+0x353/0x480 [e1000]
    
    2009/06/22 16:26:04	kernel:[<e4ac31e6>] e1000_intr+0xc6/0x250 [e1000]
    
    2009/06/22 16:26:04	kernel:[<c014f6a5>] handle_IRQ_event+0x25/0x50
    
    2009/06/22 16:26:04	kernel:[<c0150700>] handle_fasteoi_irq+0x60/0xc0
    
    2009/06/22 16:26:04	kernel:[<c0105b78>] do_IRQ+0x38/0x70
    
    2009/06/22 16:26:04	kernel:[<c01049c3>] common_interrupt+0x23/0x30
    
    2009/06/22 16:26:04	kernel:=======================
    
    2009/06/22 16:26:04	kernel:DMA per-cpu:
    
    2009/06/22 16:26:04	kernel:CPU    0: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:26:04	kernel:CPU    1: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:26:04	kernel:CPU    2: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:26:04	kernel:CPU    3: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:26:04	kernel:CPU    4: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:26:04	kernel:CPU    5: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:26:04	kernel:CPU    6: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:26:04	kernel:CPU    7: hi:    0, btch:   1 usd:   0
    
    2009/06/22 16:26:04	kernel:Normal per-cpu:
    
    2009/06/22 16:26:04	kernel:CPU    0: hi:  186, btch:  31 usd:  30
    
    2009/06/22 16:26:04	kernel:CPU    1: hi:  186, btch:  31 usd: 164
    
    2009/06/22 16:26:04	kernel:CPU    2: hi:  186, btch:  31 usd:  81
    
    2009/06/22 16:26:04	kernel:CPU    3: hi:  186, btch:  31 usd:  17
    
    2009/06/22 16:26:04	kernel:CPU    4: hi:  186, btch:  31 usd:  94
    
    2009/06/22 16:26:04	kernel:CPU    5: hi:  186, btch:  31 usd: 165
    
    2009/06/22 16:26:04	kernel:CPU    6: hi:  186, btch:  31 usd: 121
    
    2009/06/22 16:26:04	kernel:CPU    7: hi:  186, btch:  31 usd: 164
    
    2009/06/22 16:26:04	kernel:HighMem per-cpu:
    
    2009/06/22 16:26:04	kernel:CPU    0: hi:  186, btch:  31 usd:  57
    
    2009/06/22 16:26:04	kernel:CPU    1: hi:  186, btch:  31 usd: 165
    
    2009/06/22 16:26:04	kernel:CPU    2: hi:  186, btch:  31 usd: 161
    
    2009/06/22 16:26:04	kernel:CPU    3: hi:  186, btch:  31 usd:  47
    
    2009/06/22 16:26:04	kernel:CPU    4: hi:  186, btch:  31 usd: 154
    
    2009/06/22 16:26:04	kernel:CPU    5: hi:  186, btch:  31 usd: 182
    
    2009/06/22 16:26:04	kernel:CPU    6: hi:  186, btch:  31 usd: 147
    
    2009/06/22 16:26:04	kernel:CPU    7: hi:  186, btch:  31 usd:  53
    
    2009/06/22 16:26:04	kernel:Active:28940 inactive:116726 dirty:843 writeback:0 unstable:0
    
    2009/06/22 16:26:04	kernel:free:8099870 slab:28331 mapped:6451 pagetables:492 bounce:0
    
    2009/06/22 16:26:04	kernel:DMA free:2256kB min:84kB low:104kB high:124kB active:0kB inactive:0kB present:16256kB pages_scanned:0 all_unreclaimable? no
    
    2009/06/22 16:26:04	kernel:lowmem_reserve[]: 0 555 34020 34020
    
    2009/06/22 16:26:04	kernel:Normal free:1096kB min:2968kB low:3708kB high:4452kB active:25236kB inactive:21768kB present:568960kB pages_scanned:0 all_unreclaimable? no
    
    2009/06/22 16:26:04	kernel:lowmem_reserve[]: 0 0 267716 267716
    
    2009/06/22 16:26:04	kernel:HighMem free:32396128kB min:512kB low:45248kB high:89984kB active:90524kB inactive:445136kB present:34267648kB pages_scanned:0 all_unreclaimable? no
    
    2009/06/22 16:26:04	kernel:lowmem_reserve[]: 0 0 0 0
    
    2009/06/22 16:26:04	kernel:DMA: 0*4kB 3*8kB 2*16kB 1*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2264kB
    
    2009/06/22 16:26:04	kernel:Normal: 0*4kB 1*8kB 0*16kB 0*32kB 3*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1352kB
    
    2009/06/22 16:26:04	kernel:HighMem: 133*4kB 54*8kB 11*16kB 8*32kB 510*64kB 333*128kB 366*256kB 198*512kB 106*1024kB 43*2048kB 7795*4096kB = 32396660kB
    
    2009/06/22 16:26:04	kernel:129550 total pagecache pages
    
    2009/06/22 16:26:04	kernel:Swap cache: add 0, delete 0, find 0/0
    
    2009/06/22 16:26:04	kernel:Free swap  = 4194296kB
    
    2009/06/22 16:26:04	kernel:Total swap = 4194296kB
    The errors happen in both 32-bit and 64-bit mode. I then logged in on the console and increased the NFS processes and vmalloc to the maximum. The errors happened again. If I throttle rsync with --bwlimit 10576, it runs smoothly.
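
For scale, the throttle can be compared with the 94MB/s dd figure from the first post. This is only a rough sketch: it assumes that a bare --bwlimit value is counted in units of 1024 bytes per second and that the 94MB/s figure uses decimal megabytes. The function name is made up for illustration.

```python
# Assumption: a bare --bwlimit value is in units of 1024 bytes per second,
# and the 94MB/s dd figure uses decimal megabytes (10**6 bytes).
BWLIMIT_KIB_PER_S = 10576
DD_RATE_MB_PER_S = 94


def bwlimit_to_mb_per_s(kib_per_s: int) -> float:
    """Convert an rsync --bwlimit value (KiB/s) to decimal MB/s."""
    return kib_per_s * 1024 / 1_000_000


throttled = bwlimit_to_mb_per_s(BWLIMIT_KIB_PER_S)
share = throttled / DD_RATE_MB_PER_S
print(f"{throttled:.1f} MB/s, {share:.0%} of the dd rate")
```

Under these assumptions the throttled backup runs at roughly 10.8 MB/s, a small fraction of what the link sustained for a single large file.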

    Any ideas?

    PS: I split the posting because the forum software limits postings to 10,000 characters.

  3. #3

    Please run memtest first (after restarting the system, press the ESC key, then run memtest from the basic menu for about 1-2 hours to check that there are no errors), then run "Repair filesystem" from the Extended Tools (CTRL + ALT + X) on the console.
    All the best,

    Todd Maxwell



  4. #4

    memtest passed 3 times without errors, and there are no logged errors in the BIOS either. I then rebooted, initiated the filesystem check, and rebooted again. I just started rsyncing again and got the same errors in the event viewer after a few minutes.

    By the way, here are the hardware specs:
    CPU: 2x Intel L5410
    MB: Intel S5000PAIL
    RAM: 32 GB Fully Buffered ECC
    Controller: LSI 8708E
    HD: 6x WD1002FBYS with RAID6

  5. #5

    It looks like it starts with the e1000 NIC.

    Have you tried moving your NIC to another slot?

    You may want to try modifying the driver settings for the NIC.
    I had some network issues before, and making some changes helped:
    http://kb.open-e.com/entry/116/
    You will need to read the linked Intel page and make the appropriate changes to match your setup.

    Check the logs to see what is going on with the NICs.

  6. #6

    I can't change the slot for the NIC, because all four NICs are onboard. I switched to another NIC and swapped the Cat 7 cables between the server and a second server. I won't change the settings, because everything runs fine and the Intel doc talks about tuning. I already get 94MB/s across two switches and two buildings.

    Now I tried the latest beta (dss6.0up03_beta.oe_i.tar.gz). The beta complained that the storage was in 32-bit format while 64-bit mode was selected. So I deleted the storage and created a new one. Then I tested NFS again and it ran fine. I booted back into the old system (DSS 5.0.DB49000000.3278) from the purchased USB stick. Now it runs fine.

    I'm absolutely sure that yesterday I selected 64bit SMP while booting, saved this on the console by selecting 64bit SMP for booting, and rebooted after that. I looked at the memory settings, and they showed 32GB RAM and memory use above 4GB. This convinced me that I was in 64-bit mode. How can I check whether the system runs in 32-bit or 64-bit mode?

    Rsync has been running stably for 15 minutes with no errors. I think the error was using 32-bit mode on an 8-core system with 32GB RAM.
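
The zone summaries in the logged dumps appear consistent with this. The Normal zone reports free:1096kB against min:2968kB while HighMem still has about 32GB free, and a HighMem zone only exists on 32-bit x86 kernels. A minimal Python sketch of that check follows, assuming the log lines are available as text. The sample lines are shortened copies of the first dump, and the function names are made up for illustration.

```python
import re

# Zone summary lines copied from the first dump in this thread,
# shortened to the fields that matter here.
LOG = """\
kernel:DMA free:2224kB min:84kB low:104kB high:124kB
kernel:Normal free:1096kB min:2968kB low:3708kB high:4452kB
kernel:HighMem free:32384244kB min:512kB low:45248kB high:89984kB
"""

# Matches "<zone> free:<n>kB min:<n>kB" after the "kernel:" prefix.
ZONE_RE = re.compile(r"kernel:(\w+) free:(\d+)kB min:(\d+)kB")


def parse_zones(log: str) -> dict[str, tuple[int, int]]:
    """Map zone name -> (free kB, min watermark kB)."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in ZONE_RE.finditer(log)
    }


def zones_below_min(zones: dict[str, tuple[int, int]]) -> list[str]:
    """Return the zones whose free memory is under the min watermark."""
    return [name for name, (free, minimum) in zones.items() if free < minimum]


zones = parse_zones(LOG)
print("below min watermark:", zones_below_min(zones))
# A HighMem zone is only present on 32-bit x86 kernels.
print("HighMem zone present:", "HighMem" in zones)
```

On these lines only the Normal zone falls below its watermark, which fits the picture of low memory running out on a 32-bit kernel while most of the 32GB sits unused in HighMem.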

  7. #7

    Yes, after you supplied the hardware specs I was going to recommend using the 64-bit option. You can verify the setting from the console tools (CTRL + ALT + T): select Boot options, then System Architecture.
    All the best,

    Todd Maxwell


