Evgheni Dereveanchin reassigned OVIRT-1494:
-------------------------------------------
Assignee: Evgheni Dereveanchin (was: infra)
RCA for PHX Storage outage on 29.06.2017
----------------------------------------
Key: OVIRT-1494
URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1494
Project: oVirt - virtualization made easy
Issue Type: Task
Reporter: Evgheni Dereveanchin
Assignee: Evgheni Dereveanchin
The PHX storage stopped working around 5:40 GMT today and was brought up manually at 9:12
GMT by shutting down ovirt-storage02 and starting services on the remaining node.
ovirt-storage02 was the active cluster member when an unknown condition triggered a
cluster failover attempt. The failover failed, however: all cluster resources went
offline and did not come up on either node until one of them was shut down
completely.
Opening this ticket to analyze the logs and confirm what triggered the failover and why
it ultimately failed.