We followed the instructions in "Step-by-Step Guide to Synchronous Volume Replication (Block Based) with Fail-over over a LAN (with unicast) Supported by Open-E DSS", and the setup works.
Fail-over when the primary node goes down also works. However, a strange issue occurs when the secondary node goes down.
Below are the steps to reproduce the problem.
1. Both the primary and secondary nodes are up. The iSCSI node statuses are "primary/active" and "secondary/passive".
2. We intentionally turn the secondary node off.
3. After some time, the iSCSI virtual IP can no longer be pinged.
4. Checking the node status on the primary node shows that it has changed to "suspend".
5. iSCSI storage stops working.
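
To pin down how long "some time" in step 3 actually is, a small script like the one below could be run from an iSCSI client while the secondary node is switched off. This is only a rough sketch: it assumes the target listens on the default iSCSI port 3260, and the function names `tcp_reachable` and `watch` are made up for illustration and are not part of Open-E DSS.

```python
import socket
import time

ISCSI_PORT = 3260  # default iSCSI target port; an assumption, adjust if your setup differs


def tcp_reachable(host, port=ISCSI_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def watch(host, port=ISCSI_PORT, interval=5.0, checks=12):
    """Poll host:port and return a list of (timestamp, reachable) samples."""
    samples = []
    for i in range(checks):
        samples.append((time.strftime("%H:%M:%S"), tcp_reachable(host, port)))
        if i < checks - 1:
            time.sleep(interval)
    return samples
```

For example, `watch("192.0.2.10")` (a placeholder address, to be replaced with the real virtual IP) returns timestamped samples, and the first sample marked `False` gives an approximate time at which the virtual IP stopped answering.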
Steps to resume:
1. Turn the secondary node on.
2. On the primary node, stop the fail-over manager.
3. On the primary node, start the replication tasks.
4. On the primary node, start the fail-over manager.