Thread: Resolving a split brain on replication

  1. #1
    Join Date
    Aug 2008
    Posts
    236

    Keep in mind that, as I've already said, I have completely restored the factory configuration and redone the setup, removed the contents of the units, and so on. The units are on the same VLAN, and when you configure replication you can see the volume on the other end, which indicates good connectivity. In addition, the task gets created on the remote node.
    As I said previously, this error does not appear to be resolvable through any of the interfaces I have at my disposal.

  2. #2
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    Send me the logs from each node, via support or by PM.

  3. #3
    Join Date
    Aug 2008
    Posts
    236

  4. #4
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    There are socket errors on the network connection between the nodes.

    ===
    block drbd0: [drbd0_worker/21962] sock_sendmsg time expired, ko = 4294967142
    block drbd0: sock_recvmsg returned -110
    block drbd0: sock_sendmsg returned -32
    block drbd0: peer( Secondary -> Unknown ) conn( WFBitMapS -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
    block drbd0: short read expecting header on sock: r=-110
    block drbd0: short sent ReportBitMap size=4096 sent=1504
    ===
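
For anyone reading the excerpt above, here is a minimal sketch of how the negative return codes can be decoded. On Linux, errno 110 is ETIMEDOUT and errno 32 is EPIPE, which matches the timeout followed by the BrokenPipe state in the log. The function names are hypothetical, and reading the `ko` value as a signed 32-bit number is only an assumption about how it was printed.

```python
import re

# Linux errno values, hardcoded so the result does not depend on the
# platform this sketch is run on.
LINUX_ERRNO = {
    32: "EPIPE (broken pipe)",
    110: "ETIMEDOUT (connection timed out)",
}

# The log lines quoted in the post above.
LOG = [
    "block drbd0: [drbd0_worker/21962] sock_sendmsg time expired, ko = 4294967142",
    "block drbd0: sock_recvmsg returned -110",
    "block drbd0: sock_sendmsg returned -32",
]


def decode_returns(log_lines):
    """Return (call, code, meaning) for each 'sock_* returned -N' line."""
    results = []
    for line in log_lines:
        match = re.search(r"(sock_\w+) returned -(\d+)", line)
        if match:
            code = int(match.group(2))
            meaning = LINUX_ERRNO.get(code, "unknown")
            results.append((match.group(1), -code, meaning))
    return results


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return value - (1 << 32) if value >= (1 << 31) else value


if __name__ == "__main__":
    for call, code, meaning in decode_returns(LOG):
        print(f"{call}: {code} -> {meaning}")
    print("ko as signed 32-bit:", as_signed32(4294967142))
```

Both codes point at the TCP link between the nodes rather than at the disks, which is consistent with the diagnosis in this post.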

    Reset the DRBD configuration to the defaults and see if it works.

    default values:
    max-buffers=2048
    max-epoch-size=2048
    unplug-watermark=128
    sndbuf-size=0
    al-extents=127
    no-disk-barrier=off
    no-disk-flushes=off
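
To confirm that both nodes really are back on these defaults, a small comparison like the one below may help. It is only a sketch: the `key=value` text format mirrors the list in this post and is not the actual DRBD configuration file syntax, the function names are hypothetical, and the sample "current" values are made up for illustration.

```python
# Default values exactly as listed in the post above.
DEFAULTS = {
    "max-buffers": "2048",
    "max-epoch-size": "2048",
    "unplug-watermark": "128",
    "sndbuf-size": "0",
    "al-extents": "127",
    "no-disk-barrier": "off",
    "no-disk-flushes": "off",
}


def parse_settings(text):
    """Parse 'key=value' lines into a dict, skipping lines without '='."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def non_default(current):
    """Return {key: (current_value, default_value)} for known keys that differ."""
    return {
        key: (value, DEFAULTS[key])
        for key, value in current.items()
        if key in DEFAULTS and value != DEFAULTS[key]
    }


if __name__ == "__main__":
    # Hypothetical values copied by hand from one node's settings page.
    sample = "max-buffers=8000\nsndbuf-size=0\nal-extents=257"
    for key, (now, default) in non_default(parse_settings(sample)).items():
        print(f"{key}: {now} (default {default})")
```

Running the same check against the values from each node shows quickly whether a leftover tuning change survived the reset.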

  5. #5
    Join Date
    Aug 2008
    Posts
    236

    Thanks for looking it over.
    We started with the defaults and only modified them as a last resort to see if that would fix it.
    In short, we've already tried re-establishing replication with the default settings.

  6. #6
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    Quote Originally Posted by enealDC
    Thanks for looking it over.
    We started with the defaults and only modified them as a last resort to see if that would fix it.
    In short, we've already tried re-establishing replication with the default settings.
    OK, I will dig deeper.

  7. #7
    Join Date
    Oct 2010
    Location
    GA
    Posts
    935

    Check the MTU on the NICs on both nodes:
    eth0 at 1500 is OK, but the others need to be consistent. [9000]

    san01
    eth0 UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
    eth1 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
    eth2 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
    eth3 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
    eth4 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
    eth5 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9000 Metric:1

    san02
    eth0 UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
    eth1 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
    eth2 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
    eth3 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
    eth4 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
    eth5 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
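
The comparison above can also be done mechanically. The following is a minimal sketch that parses the two listings as pasted in this post and reports any interface whose MTU differs between the nodes. eth0 is skipped because the post says 1500 is fine there; the function names and the pasted-text input format are assumptions for illustration.

```python
import re

# The interface listings from the post above, one block per node.
LISTING = """
san01
eth0 UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
eth1 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
eth2 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
eth3 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
eth4 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
eth5 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9000 Metric:1

san02
eth0 UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
eth1 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
eth2 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
eth3 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
eth4 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
eth5 UP BROADCAST RUNNING SLAVE MULTICAST MTU:9216 Metric:1
"""


def parse_mtus(listing):
    """Return {node: {interface: mtu}} from the pasted listing."""
    mtus = {}
    node = None
    for raw in listing.splitlines():
        line = raw.strip()
        match = re.match(r"(\w+)\s.*\bMTU:(\d+)", line)
        if match and node is not None:
            mtus[node][match.group(1)] = int(match.group(2))
        elif line:
            # Any other non-empty line is treated as a node name.
            node = line
            mtus[node] = {}
    return mtus


def mismatched(mtus, ignore=("eth0",)):
    """Return {interface: {node: mtu}} where the nodes disagree."""
    by_iface = {}
    for node, ifaces in mtus.items():
        for iface, mtu in ifaces.items():
            if iface not in ignore:
                by_iface.setdefault(iface, {})[node] = mtu
    return {
        iface: values
        for iface, values in by_iface.items()
        if len(set(values.values())) > 1
    }


if __name__ == "__main__":
    for iface, values in mismatched(parse_mtus(LISTING)).items():
        print(iface, values)
```

With the listings from this post, the only interface reported is eth5, which is 9000 on san01 and 9216 on san02.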
