[JIRA] (OVIRT-1494) RCA for PHX Storage outage on 29.06.2017

Evgheni Dereveanchin (oVirt JIRA) jira at ovirt-jira.atlassian.net
Thu Jun 29 11:14:01 UTC 2017


Evgheni Dereveanchin created OVIRT-1494:
-------------------------------------------

             Summary: RCA for PHX Storage outage on 29.06.2017
                 Key: OVIRT-1494
                 URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1494
             Project: oVirt - virtualization made easy
          Issue Type: Task
            Reporter: Evgheni Dereveanchin
            Assignee: infra


The PHX storage stopped working around 05:40 GMT today and was brought back up manually at 09:12 GMT, roughly three and a half hours later, by shutting down ovirt-storage02 and starting services on the remaining node.

ovirt-storage02 was the active cluster member when an as-yet-unknown condition triggered a cluster failover attempt. The failover failed, however: all cluster resources went offline and did not come up on either node until one of the nodes was shut down completely.

Opening this ticket to analyze the logs and confirm what triggered the failover and why it ultimately failed.




--
This message was sent by Atlassian JIRA
(v1000.1092.0#100053)

