
Thread: iSCSI Failover - Connection loss when Secondary Node is doing RAID Rebuild

  1. #1
    Join Date
    Aug 2011
    Location
    Germany
    Posts
    22


    Hi,

    Situation:

    iSCSI Failover with two DSS and Storage Replication.
    Connected to two VMware ESXi 4 Servers.

    The secondary/passive node had a failed hard disk in its RAID 5 array, and I replaced it.
    The controller (LSI) is now running the RAID rebuild (copyback) process.

    I started the secondary/passive node, checked the Storage Replication (it was OK) and started the Failover Service.

    At that moment my Windows 2003 virtual machines lost the connection to their SCSI drives, and I was not able to browse the storage from ESXi. VMware did not lose the connection to the virtual iSCSI IP; only the storage was not accessible.

    I have now stopped the secondary/passive node again and opened the RAID GUI. I will wait until the rebuild is finished before starting Open-E again.



    Questions:

    1. Is this only a performance issue while the secondary/passive node is doing a RAID rebuild (LSI hardware controller)? Why does this have an impact on the primary/active node?

    2. If I keep the secondary/passive node offline for 1-2 days until the rebuild is done, what is the best way to start it again?

    - Start Open-E
    - Start Volume Replication and wait until it is consistent
    - Start Failover Service?
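
The ordering proposed in the list above can be sketched as follows. This is only an illustration of the sequence being asked about, not a confirmed Open-E procedure: `start_replication`, `is_consistent` and `start_failover` are hypothetical callables standing in for whatever the administrator does in the DSS GUI, and the polling interval is an arbitrary assumption.

```python
import time


def bring_secondary_back(start_replication, is_consistent, start_failover,
                         poll_seconds=60, max_polls=None, sleep=time.sleep):
    """Run the steps in order and never start failover before consistency.

    All three callables are hypothetical placeholders for manual GUI actions.
    Returns True if failover was started, False if we gave up waiting.
    """
    start_replication()
    polls = 0
    while not is_consistent():
        if max_polls is not None and polls >= max_polls:
            return False  # still inconsistent: leave failover stopped
        sleep(poll_seconds)
        polls += 1
    start_failover()
    return True
```

The point of the sketch is the gate in the middle: the failover step is reached only after the consistency check has succeeded.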



    Best Regards,
    Manuel
    There are only 10 types of people in the world:
    Those who understand binary, and those who don't

  2. #2


    Yes, you are being affected by what is called degraded synchronous replication: it behaves like a RAID 1 mirror, so a slow secondary also slows down the primary.
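
A toy model may make the mirror comparison concrete. It assumes the replication is synchronous in the sense that the primary acknowledges a write only after the secondary has stored it too. The function name and all latency figures are invented for illustration and are not measurements from a DSS system.

```python
def sync_write_latency_ms(primary_ms, secondary_ms, network_rtt_ms=0.0):
    """Latency the initiator sees for one synchronously mirrored write.

    Simplified assumption: both nodes write in parallel and the write is
    acknowledged only when the slower side (including the network round
    trip to the secondary) has finished.
    """
    return max(primary_ms, secondary_ms + network_rtt_ms)


# Illustrative numbers only.
healthy = sync_write_latency_ms(5.0, 5.0, 0.5)      # both arrays healthy
rebuilding = sync_write_latency_ms(5.0, 80.0, 0.5)  # secondary busy rebuilding
```

Under this model the primary's own disks never get faster than the secondary allows, which is one way a rebuild on the passive node could surface as stalled I/O on the active one.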

    If you end up rebuilding the array from scratch, the volumes will have to be recreated, along with the targets, the replication tasks, the virtual IPs and the Auto Failover service. You will then have to click Start for the service again.
    All the best,

    Todd Maxwell



  3. #3

    Hello Todd,

    Thanks for your response.

    It now looks as if the RAID controller is damaged, because the new disk goes to BAD after one hour. I will receive the spare part on Friday.



    In the worst case, if I cannot repair the RAID, what is the procedure?

    - Set up Open-E
    - Set up Storage Replication and wait until it has finished
    - Then set up iSCSI Failover?

    At the moment I hope I can save the RAID. We will replace the controller, then the damaged disk, and try to rebuild it.


    bye Manuel

  4. #4


    Sorry to hear about the issue you are having. Yes, you will have to start from scratch.
    All the best,

    Todd Maxwell



  5. #5

    Oh, BTW: the controller now only shows DEGRADED and is no longer in REBUILD.
    But when I try to power up the server, it boots only to 90% and then the drives are not accessible.

    From the email messages I can see that the servers try to start up the Volume Replication:

    2012/02/08 15:45:19 Replication:Volume Replication: Device lv0001: Connection (from WFConnection to SyncSource). Mode local/remote (from Primary/Unknown to Primary/Secondary). Device lv0000: Connection (from WFConnection to SyncSource). Mode local/remote (from Primary/Unknown to Primary/Secondary).

    15:48:07 Replication:Volume Replication: Device lv0001: Connection (from SyncSource to WFConnection). Mode local/remote (from Primary/Secondary to Primary/Unknown). Device lv0000: Connection (from SyncSource to WFConnection). Mode local/remote (from Primary/Secondary to Primary/Unknown).
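
The two notifications above follow a regular pattern, so the per-device transitions can be pulled out mechanically. The sketch below assumes the message format is exactly as quoted; the function names are hypothetical and the pattern has only been checked against these two messages.

```python
import re

# One per-device entry, as it appears in the quoted notifications.
DEVICE_RE = re.compile(
    r"Device (?P<device>\S+): "
    r"Connection \(from (?P<conn_from>\w+) to (?P<conn_to>\w+)\)\. "
    r"Mode local/remote \(from (?P<mode_from>\w+/\w+) to (?P<mode_to>\w+/\w+)\)\."
)


def parse_replication_event(message):
    """Return one dict per device mentioned in a notification message."""
    return [m.groupdict() for m in DEVICE_RE.finditer(message)]


def lost_peer(event):
    """True when a device went back to waiting for its peer (WFConnection)."""
    return event["conn_to"] == "WFConnection"
```

Applied to the messages above, the first one yields two devices moving from WFConnection to SyncSource, and the second, about three minutes later, shows both dropping back to WFConnection.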


    Let's wait for the new controller... :-(
