Hi,

This is a general question. We have now had the same recovery problem three times, each time after a complete power loss.

Our equipment: two Open-e DSS V7 nodes in different buildings, working as an A-A-Sync cluster.
- sync-line point-to-point 10Gbit
- storage network line 10Gbit
- admin network line 2Gbit
We have 3 aux paths and some ping nodes in the storage network.
The cluster works well: when we trigger a failover, it works, and we can do maintenance on the passive system and then fail back again. All good.

Each node is protected by its own local UPS. Each UPS provides battery power for 15-30 minutes if line power is lost.

The problem:

We had three global power losses during the last few months, affecting all of our buildings (at night, on the weekend, of course, bad luck ;-) ).
And the behavior of the nodes/cluster was the same each time:

After ten+x minutes, the first node went down when its UPS batteries ran empty.
Another ten minutes later (!), the second one went down as well, after its UPS shut down.
The time in between should have been enough to complete the failover.

We expected to be able to restart the cluster and fail back after powering the nodes on again.
But this didn't happen.

After powering the nodes back on, we were able to re-instantiate all sync tasks.
But the cluster itself could not be started, and we needed help from technical support (they did something over the remote support line that we can't do via the admin interface).

Question: Bug or feature?

Our current theory: the nodes do not seem to retain the cluster state from before the power loss of the second node.
So I think I am misunderstanding how failover works, at least in some detail. Any hints?

Robert