Thread: Resolving a split brain on replication

  1. #1
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    Make sure the nodes can ping each other.

    Clear the metadata on both nodes, and verify the source/destination settings.

    Re-establish the connections and then add the task.

  2. #2
    Join Date
    Aug 2008
    Posts
    236

    I've already done this. The nodes can see each other plainly. I've practically redone the configuration from scratch and the result is always the same.
    We have also tried replicating from the secondary node to the first node.

  3. #3
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    Check that the NIC settings are the same on both nodes (jumbo frames?).

    Also make sure the volumes are the same size.

    Have you done anything to tune DRBD? Are the settings the same on both nodes?

  4. #4
    Join Date
    Aug 2008
    Posts
    236

    Keep in mind that, as I've already said, I have completely restored the factory configuration and redone the setup, removed the contents of the units, etc. These units are on the same VLAN, and when you configure replication you can see the volume on the other end, which indicates good connectivity. In addition, the task gets created on the remote node.
    As I said previously, it does not appear that this error or situation is resolvable via any of the interfaces I have at my disposal.

  5. #5
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    Send me the logs from each node, via support or by PM.

  6. #6
    Join Date
    Aug 2008
    Posts
    236

  7. #7
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    There are socket errors on the network connection between the nodes.

    ===
    block drbd0: [drbd0_worker/21962] sock_sendmsg time expired, ko = 4294967142
    block drbd0: sock_recvmsg returned -110
    block drbd0: sock_sendmsg returned -32
    block drbd0: peer( Secondary -> Unknown ) conn( WFBitMapS -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
    block drbd0: short read expecting header on sock: r=-110
    block drbd0: short sent ReportBitMap size=4096 sent=1504
    ===
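
    For anyone reading the log excerpt above: the negative return values are Linux errno codes. Below is a minimal Python sketch that decodes them, assuming the standard Linux numbering (32 = EPIPE, 110 = ETIMEDOUT). The helper names are made up for illustration and are not part of any Open-E or DRBD tool. It also reinterprets the large "ko" value as a signed 32-bit number, which is consistent with a small negative counter printed as unsigned, though that reading is an assumption.

```python
import re

# Linux errno values (from errno.h) that appear in the log excerpt.
# Hard-coded so the sketch does not depend on the platform it runs on.
LINUX_ERRNO = {
    32: "EPIPE (broken pipe)",
    110: "ETIMEDOUT (connection timed out)",
}

LOG = """\
block drbd0: [drbd0_worker/21962] sock_sendmsg time expired, ko = 4294967142
block drbd0: sock_recvmsg returned -110
block drbd0: sock_sendmsg returned -32
block drbd0: short read expecting header on sock: r=-110
"""


def decode_errnos(log):
    """Return a list of (code, meaning) for each 'returned -N' or 'r=-N' entry."""
    decoded = []
    for line in log.splitlines():
        match = re.search(r"(?:returned |r=)-(\d+)", line)
        if match:
            code = int(match.group(1))
            decoded.append((code, LINUX_ERRNO.get(code, "unknown")))
    return decoded


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit value."""
    return value - (1 << 32) if value >= (1 << 31) else value


if __name__ == "__main__":
    for code, meaning in decode_errnos(LOG):
        print(f"-{code}: {meaning}")
    print("ko as signed 32-bit:", as_signed32(4294967142))
```

    Read this way, the receive side timed out and the send side then hit a broken pipe, which points at the link between the nodes rather than at the replication task itself.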

    Reset the DRBD configuration to default and see if it works.

    Default values:
    max-buffers=2048
    max-epoch-size=2048
    unplug-watermark=128
    sndbuf-size=0
    al-extents=127
    no-disk-barrier=off
    no-disk-flushes=off
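
    To check that both nodes really carry the same tuning values and that they match the defaults listed above, a small comparison script can help. This is only a sketch: it assumes you have copied each node's values by hand into "key=value" text, and the function names and the example node values are hypothetical.

```python
# Defaults as listed in the post above.
DEFAULTS = {
    "max-buffers": "2048",
    "max-epoch-size": "2048",
    "unplug-watermark": "128",
    "sndbuf-size": "0",
    "al-extents": "127",
    "no-disk-barrier": "off",
    "no-disk-flushes": "off",
}


def parse_settings(text):
    """Parse 'key=value' lines into a dict, ignoring lines without '='."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def find_differences(node_a, node_b, defaults=DEFAULTS):
    """Return {key: (a, b, default)} where the nodes differ from each other
    or from the default. A missing key shows up as None."""
    diffs = {}
    for key, default in defaults.items():
        a = node_a.get(key)
        b = node_b.get(key)
        if a != b or a != default:
            diffs[key] = (a, b, default)
    return diffs


if __name__ == "__main__":
    # Hypothetical values, typed in by hand from each node's tuning page.
    node_a = parse_settings("max-buffers=8192\nmax-epoch-size=2048\nsndbuf-size=0")
    node_b = parse_settings("max-buffers=2048\nmax-epoch-size=2048\nsndbuf-size=0")
    for key, (a, b, default) in find_differences(node_a, node_b).items():
        print(f"{key}: node A={a}, node B={b}, default={default}")
```

    An empty result means both nodes match each other and the defaults for every listed parameter.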

  8. #8
    Join Date
    Aug 2008
    Posts
    236

    Thanks for looking it over.
    We started with the defaults and modified them only as a last resort to see if that would fix it.
    In short, we've already tried re-establishing replication using the default settings.
