
Thread: Network problems with DSS

  1. #1

    Network problems with DSS

    Hello,

    A few months ago we purchased an Open-E DSS system based on:
    Supermicro X7DBN with Intel ESB2 controller
    2 E5430 CPUs
    4 GB RAM
    Areca 1280, 24 TB RAID with 24 Hitachi A7K1000 drives
    Kernel is 64-bit; 2 RAID 5 arrays combined in a software stripe (18 TB)
    DSS version: 5.0.DB44000000.3025

    The system is connected to a Nortel stack based on 4 5510 & 5520 switches; the switch ports are set to 1 Gb auto-negotiate. Both onboard Intel Ethernet ports are connected, each with a different IP address (example: 192.168.1.10 & 192.168.2.10).

    We installed a Fibre Channel QLA2344 card and the fibre setup works perfectly.
    3 servers are connected and use it as storage.

    However, when we use the NAS/SMB or iSCSI (block I/O mode) functionality of the Open-E DSS, strange things happen on the network.

    When we launch 2 or more robocopy jobs to a NAS or iSCSI Open-E volume, the whole network becomes unstable.

    We see timeouts on ping commands and disconnects all over the network. When we quit the copy action on the DSS, everything returns to normal.

    The copy itself between the host and the Open-E server is not affected by these problems.

    I have tried the same copy action against an iSCSI target on Windows (starport) and this works perfectly (no disconnects, stable network). Multiple robocopy jobs running at the same time work without a problem.

    When we copy to a NAS volume on the Open-E, the same thing happens (timeouts).
    Copying to a NAS volume on a Windows server works.

    We have tried resetting the DSS network settings and used several bonding methods, but the problem remains.

    Jumbo frames are off; IPfrag is at the standard values (high: 262144, low: 196608).

    Unfortunately the Nortel switches don't have good logging features, so I cannot determine the exact cause (at this moment). The next step will be sniffing with a network analyzer.

    Has anybody experienced the same problems?
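
    The ping timeouts described above can be logged from a bystander machine while the robocopy jobs run. The following is a minimal Python sketch, not something from the original setup: it times TCP connects as a stand-in for ICMP ping (which needs raw-socket privileges), and the function names, the default interval and the address and port in the usage note are assumptions for illustration.

```python
import socket
import time
from typing import List, Optional


def tcp_rtt_ms(host: str, port: int, timeout_s: float = 1.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None on timeout/error."""
    start = time.perf_counter()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def probe(host: str, port: int, count: int = 10,
          interval_s: float = 1.0) -> List[Optional[float]]:
    """Collect `count` samples; a None entry marks a timeout or refused connect."""
    samples: List[Optional[float]] = []
    for _ in range(count):
        samples.append(tcp_rtt_ms(host, port))
        time.sleep(interval_s)
    return samples
```

    Calling something like `probe("192.168.1.20", 445, count=600)` (a hypothetical host and port) during a copy would give roughly ten minutes of samples; runs of `None` entries should line up with the disconnects if the network is the cause.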

  2. #2
    Join Date
    Jan 2008
    Posts
    86

    G'day Hologic,
    Not that we run Nortel switches, but we also had a lot of grief with one client using an IBM BladeCenter, until they turned off autonegotiation.
    I can't recall if this was the exact post:

    https://www-304.ibm.com/systems/supp...andind=5000020

    Sorry I can't be more specific, but the onsite IT guys handled this, so it is just from memory.
    HTH.

    Rgds Ben

  3. #3

    Hello Beng,

    I was thinking in the same direction.

    I've set the switch ports to a fixed speed (1000 Mbit/full duplex) and it appears to be working.

    Latency is a little bit higher on some machines but the network is stable.

  4. #4

    The problem appears to be tied to one of the two network interfaces on the server.

    When the second network interface is connected, the network becomes unstable.

    I'm going to do some further testing.

  5. #5
    Join Date
    Jan 2008
    Posts
    86

    G'day Hologic,
    Originally posted by hologic:
    "Latency is a little bit higher on some machines but the network is stable."
    Good to hear. Is the latency higher just during pings? I.e. does it stay at <1 ms and then spike for a few pings and return?
    If so, I have seen that too, and I'm not quite sure what is going on. So far it does not seem to cause any issues for the clients...
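
    The pattern asked about here (a low baseline with occasional spikes) can be checked from a list of recorded ping times. A rough Python sketch follows; the `summarize` helper and the factor-of-10 spike threshold are assumptions made for illustration, not anything measured in this thread.

```python
import statistics
from typing import Optional, Sequence


def summarize(samples: Sequence[Optional[float]],
              spike_factor: float = 10.0) -> dict:
    """Summarize ping times in ms; None entries count as timeouts."""
    ok = [s for s in samples if s is not None]
    timeouts = len(samples) - len(ok)
    if not ok:
        return {"timeouts": timeouts, "median_ms": None,
                "max_ms": None, "spikes": 0}
    median = statistics.median(ok)
    # A spike is any sample far above the typical (median) round-trip time.
    spikes = sum(1 for s in ok if s > spike_factor * median)
    return {"timeouts": timeouts, "median_ms": median,
            "max_ms": max(ok), "spikes": spikes}
```

    A low median with a handful of spikes and no timeouts would match the "stays at <1 and then spikes" behaviour; a rising median or many timeouts would point to something more serious.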

    I've read through:
    http://www.vmware.com/files/pdf/ESX_...erformance.pdf
    and
    http://www.vmware.com/files/pdf/perf...devices_wp.pdf

    Which are interesting.

    But this is WAY off-topic, and Thom will probably shut us down soon.

    Cheers Ben

  6. #6

    It seems that the NAS part still has problems. When we start the copy with 2 concurrent robocopy jobs, the system works for about 10 minutes and then the network problems begin.

    We are also receiving errors that the specified file cannot be written to the share. On the network we are seeing ping timeouts and rising ping times.

    The number of files we are copying is big, very big: several million files in several thousand directories. Sizes range from 50 KB to 35 MB.

    It seems that after a certain number of files, the share/SMB connection disconnects.
    Somebody on the forum had the same problem after approx. 65000 files.
    Does anybody know what the limitations of the NAS/SMB part of Open-E are?

    The iSCSI part is running stable now, with the Nortel ports set to a fixed speed.
    No more network disconnects/timeouts.

    3 systems are now connected to the iSCSI target (1 system with 2 robocopy sessions copying several million files and 1 ESX 3.5 system).
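
    To check whether the SMB disconnect always hits after a similar number of files (such as the roughly 65000 mentioned above), the copy can be done file by file while counting. This is a minimal Python sketch under assumptions: `copy_counting` is a hypothetical helper, not a robocopy feature, and it stops at the first error instead of retrying.

```python
import os
import shutil
from typing import Optional, Tuple


def copy_counting(src_root: str, dst_root: str) -> Tuple[int, Optional[str]]:
    """Copy a tree file by file; return (files copied, first error or None)."""
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        try:
            os.makedirs(target_dir, exist_ok=True)
        except OSError as exc:
            return copied, f"{target_dir}: {exc}"
        for name in filenames:
            source = os.path.join(dirpath, name)
            try:
                shutil.copy2(source, os.path.join(target_dir, name))
            except OSError as exc:
                # Stop at the first failure so the count marks where it broke.
                return copied, f"{source}: {exc}"
            copied += 1
    return copied, None
```

    If repeated runs against the NAS share fail at about the same count, that would suggest a limit on the SMB side; failures at varying counts would fit the network instability better.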
