
Thread: Again: Bad replication Performance

  1. #1

    Default Again: Bad replication Performance

    Same problem again: bad replication performance. We are losing patience now.

    Support hasn't answered after 48 hours (ticket # 1014840).
    After we called support today by phone with some new information, we got:

    "Your case is being moved to the second / third level of support.
    We will require additional time to investigate the problem.
    Thank you for your patience - we will get back to you as soon as possible."

    We are bored now.

    Support never solved ticket # 1002902, which was started on 23.10.2008
    and was closed unsuccessfully on 16.11.2009 with the comment:

    "This request was sent to the developers' future task list, but there is no ETA yet."

    Support closed ticket # 1011213 (started 27.8.2009) as "successful" on 7.10.2009, although Open-E provided no solution.


    The current problem (ticket # 1014840) won't be solved by support in a reasonable time either,
    because first- and second-level support probably do not even understand what the problem is about.


    OK, maybe someone here can help:

    2 identical machines:

    Supermicro PDSME+ motherboard with 4 GB RAM, Intel Core2Quad Q6600 CPU @ 2.4 GHz

    Slot 1 (PCIe 8x)
    3ware 9650SE with 8x ST3500320NS SATA drives in Raid 5 (1 of them hot spare)

    Slot 2 (PCI-x)
    3C996B-T 1000Base-T

    Slot 3 (PCI-x)
    3C996B-T 1000Base-T

    Slot 4 (PCIe 4x)
    Intel Pro/1000 PT Dualport Server Adapter

    The Intel Pro/1000 dual-port is configured as an 802.3ad bond and connected via direct cable to the same bond on the other machine.
    The bond is up and running.


    Replication of 3 volumes goes over the bond.
    Maximum replication speed is 60 MB/s, no matter whether only one task is running
    or all three replication tasks share the bond's bandwidth (then 20 MB/s each).
    The replication tasks are configured with a 100 MB/s limit. As long as one of the three is consistent,
    this should do no harm.
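    That one task and three tasks both top out at the same aggregate is consistent with how the Linux bonding driver assigns traffic to slave NICs. A small sketch of the transmit-hash formulas from the kernel bonding documentation (the MAC addresses, IPs, and ports below are invented for illustration):

```python
# Sketch of how the Linux bonding driver picks a slave NIC
# (formulas per Documentation/networking/bonding.txt, xmit_hash_policy).
# MACs, IPs, and ports below are made up for illustration.

def layer2_hash(src_mac, dst_mac, n_slaves):
    """Default policy: (src MAC XOR dst MAC) modulo slave count (last byte)."""
    return (src_mac[-1] ^ dst_mac[-1]) % n_slaves

def layer3_4_hash(src_ip, dst_ip, src_port, dst_port, n_slaves):
    """layer3+4 policy: mixes ports and IPs, so separate flows can spread."""
    return ((src_port ^ dst_port) ^ ((src_ip ^ dst_ip) & 0xFFFF)) % n_slaves

A = bytes.fromhex("001122334401")   # host A MAC (invented)
B = bytes.fromhex("001122334402")   # host B MAC (invented)
SLAVES = 2                          # dual-port bond

# With the default layer2 policy, ALL frames between the same two hosts
# hash to the same slave -- every replication task shares one 1 GbE link.
print([layer2_hash(A, B, SLAVES) for _ in range(3)])          # -> [1, 1, 1]

# With layer3+4, different TCP connections can land on different slaves.
ip_a, ip_b = 0x0A000001, 0x0A000002                           # 10.0.0.1 / 10.0.0.2
print([layer3_4_hash(ip_a, ip_b, 7000, p, SLAVES) for p in (7100, 7101)])  # -> [1, 0]
```

    So whether the bond spreads the three replication streams at all depends on which hash policy is in effect; with the default layer2 policy on a direct point-to-point cable, it cannot.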


    Tried replication through one of the 3Com NICs: even worse, maximum replication speed 13 MB/s.
    Tried jumbo frames: no gain.

    Disk read speed taken from tests.log (hdparm) is about 224 MB/s;
    disk write speed, measured by timing volume initialization, is about 150 MB/s.
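    Since the disks are clearly faster than the observed replication rate, it would help to measure the raw TCP throughput of the link itself, independent of DSS replication. A minimal self-contained sketch (both ends run on loopback here so it can execute anywhere; to test the real link, run the receiver on one node and point the sender at its address - the 127.0.0.1 address and transfer size are illustrative only):

```python
# Minimal raw-TCP throughput probe, independent of DRBD/iSCSI.
# Both ends run on loopback here so the sketch is self-contained; on real
# hardware, run the receiver on one node and the sender on the other.
import socket
import threading
import time

PAYLOAD = b"\0" * 65536
TOTAL = 16 * 1024 * 1024  # 16 MiB, small so the demo finishes quickly

def receiver(srv, result):
    conn, _ = srv.accept()
    got = 0
    while True:
        chunk = conn.recv(1 << 20)
        if not chunk:
            break
        got += len(chunk)
    conn.close()
    result["rx"] = got

def run():
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))   # swap in the peer's address on real hardware
    srv.listen(1)
    result = {}
    t = threading.Thread(target=receiver, args=(srv, result))
    t.start()
    cli = socket.create_connection(srv.getsockname())
    start = time.time()
    sent = 0
    while sent < TOTAL:
        cli.sendall(PAYLOAD)
        sent += len(PAYLOAD)
    cli.close()
    t.join()
    srv.close()
    elapsed = time.time() - start
    return result["rx"], elapsed

if __name__ == "__main__":
    rx, secs = run()
    print(f"{rx / 2**20:.0f} MiB in {secs:.2f}s -> {rx / 2**20 / secs:.0f} MB/s")
```

    If this probe sustains wire speed on the same NICs while replication does not, the bottleneck is in the replication layer rather than the network hardware.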

    But the worst thing about the replication is that the 60 MB/s is NOT CONTINUOUS.

    Watching a single replication task run, the speed "flaps" down to 6 MB/s,
    rises again to 60, falls back to 30, goes to 45, reaches 60,
    falls down to 6, then to somewhere around 15, and so on...
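    The flapping costs real throughput, not just the dips. Taking the rates quoted above as equally weighted samples (our assumption - the post gives no time base), the effective rate is the plain mean:

```python
# Burst rates quoted in the post, in MB/s; equal time per sample is
# our assumption, for illustration only.
samples = [6, 60, 30, 45, 60, 6, 15]

effective = sum(samples) / len(samples)
print(f"effective ~ {effective:.1f} MB/s vs. {max(samples)} MB/s peak")
```

    Under that assumption the task averages roughly half its 60 MB/s peak, which matches the "about 45 MB/s" mean reported later in the thread only loosely - logging real per-second samples would pin it down.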

    This is a serious problem, since we have to use iSCSI failover,
    and as we saw before (and this is the only logical explanation inside these boxes),
    iSCSI speed = replication speed.

    Any hints, ANYBODY?

    Regards.

  2. #2

    Default

    Hello,

    This could be more than annoying. But to be fair to the Open-E guys and their software: in the past I monitored replication and didn't see this behavior. And frankly, I have never had an issue with their support!

  3. #3

    Default

    Dear Sir,

    A response with a small update that fixes this issue was sent to you.

    Please accept our apology for any problems!

    Regards,
    Shon

  4. #4

    Default

    First, please use the update that was provided to you today for the ticket you mentioned, use the latest version of DSS V6 (build 4221) with your replications, and provide the logs to our engineers, so we can assist you in getting the speeds you need.

    For ticket 1014840, the engineer did take your call to talk with you, and to help you we provided a small update. In many cases with FREE SUPPORT we may need more time to investigate your issues, and this can take time.

    Also, there are many forum notes and references pointing out that using a bond for replication does not help performance; please try a single 1 GbE NIC.

    Concerning the other tickets: I sent you an email, and I think you were unfair in not providing all the information; in each ticket our CTO personally contacted you. The forums are designed to help others with technical references, not as a place to post tickets, which were not fairly presented, for personal gain; we did and will address them.

    Let us help you get your speed issues resolved: provide the logs to our engineers from the update that was provided to you today, with the latest build 4221.
    All the best,

    Todd Maxwell



  5. #5

    Default

    Thanks a lot for the quick response.
    It's 21:11 here now and we've been working for 14 hours,
    so we will have to give it a try tomorrow morning.

    Regards
    Ralph

  6. #6

    Default

    Update:

    Installed the small update provided by support; it didn't help.
    The replication speed of a single task varies between 9.5 MB/s and 65 MB/s, on average about 45 MB/s;
    there is no constant data rate, it goes up and down as before.
    Sent logs of both machines to support.

    Regards
    Ralph

  7. #7

    Default

    So you are bonding with 802.3ad going point to point; this will eliminate any switch issues.

    Have you tested with any of the other bond modes?
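    For reference, the modes the Linux bonding driver offers, summarized from the kernel bonding documentation (mode numbers and names are the driver's own; the commentary on single-flow behavior is ours):

```python
# Linux bonding modes, per Documentation/networking/bonding.txt.
BOND_MODES = {
    0: "balance-rr",      # round-robin; the only mode that stripes ONE flow
    1: "active-backup",   # one active slave, others standby
    2: "balance-xor",     # hash-based slave selection
    3: "broadcast",       # transmit on all slaves
    4: "802.3ad",         # LACP aggregation, hash-based like balance-xor
    5: "balance-tlb",     # adaptive transmit load balancing
    6: "balance-alb",     # adaptive load balancing (tlb + receive)
}

# A single replication task is one TCP stream, so modes 1-6 cap it at one
# NIC's bandwidth; balance-rr can exceed that, but may reorder packets.
print(BOND_MODES[4])
```

    On a direct point-to-point cable, balance-rr would be the one mode worth trying if the goal were to push a single replication stream past 1 GbE.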

  8. #8

    Default

    Yes, that's what we do with the direct connection.
    The network cables were tested beforehand with a Fluke DTX-1800 CAT6a testing
    procedure to eliminate bad cabling as a cause of the poor performance.

    We have also tested without bonding, on a single Intel NIC:
    same result.
    As mentioned in the first posting, replication through a 3Com NIC
    delivers a poor 13 MB/s...
    The expected replication speed of a single task would be about 100 MB/s, constantly.

    We also have a production setup of DSS V6 on HP hardware
    with NC375i and NC360T NICs and a Smart Array P410i;
    overall replication speed there is as expected, with 3 tasks running simultaneously
    over an 802.3ad bond at about 180 MB/s.

    So this seems to be an issue with how DRBD handles the Intel or 3Com NICs,
    since iSCSI performance without any replication is perfectly fine on all NICs
    if we use the machine standalone.

    BTW, we know that a bond will not increase performance for a single replication task,
    but it will increase the available bandwidth when more than one replication task is running,
    as we can see in our production environment.


    Regards
    Ralph
