Does it mean that I have to run the ansible-playbook command from an external server, using the engine server as the host in the inventory, or does it mean that the ansible-playbook command is to be run from within the server where the ovirt-engine service is running, keeping intact these lines in the sample yaml file:
"
- name: oVirt shutdown environment
  hosts: localhost
  connection: local
"
Both options are valid.
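To make the two options concrete, here is a minimal playbook sketch. The role name oVirt.shutdown-env comes from the task output in this thread; the variable names (engine_url, engine_user, engine_password, engine_cafile) are what I recall from the role's README and should be checked against it. The FQDN, the vault variable and the "engine" inventory group are hypothetical placeholders.

```yaml
# Sketch only: variable names are assumed from the role README; verify before use.
- name: oVirt shutdown environment
  # Option 1: run on the engine machine itself (as in the sample file).
  hosts: localhost
  connection: local
  # Option 2: run from an external server. Remove "connection: local" and
  # point "hosts" at the engine machine instead, for example a hypothetical
  # inventory group named "engine" that contains only the engine server.
  # hosts: engine

  vars:
    engine_url: https://ovmgr42.example.com/ovirt-engine/api  # hypothetical FQDN
    engine_user: admin@internal
    engine_password: "{{ vault_engine_password }}"            # hypothetical vault variable
    engine_cafile: /etc/pki/ovirt-engine/ca.pem

  roles:
    - oVirt.shutdown-env
```

In both cases the tasks execute on the engine machine; the only difference is whether Ansible reaches it locally or over SSH.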
Good! It seems to have worked fine in shutdown mode (the default one) in a test 4.2.6 hosted engine environment, where I have 2 hosts (both are hosted engine hosts), the hosted engine VM and 3 other VMs.
Initially ovnode2 is the SPM and also hosts the HostedEngine VM.
If I run the playbook from inside ovmgr42:
[root@ovmgr42 tests]# ansible-playbook test.yml
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit
localhost does not match 'all'
PLAY [oVirt shutdown environment] ******************************************************************
TASK [oVirt.shutdown-env : Populate service facts] *************************************************
ok: [localhost]
TASK [oVirt.shutdown-env : Enforce ovirt-engine machine] *******************************************
skipping: [localhost]
TASK [oVirt.shutdown-env : Enforce ovirt-engine status] ********************************************
skipping: [localhost]
TASK [oVirt.shutdown-env : Login to oVirt] *********************************************************
ok: [localhost]
TASK [oVirt.shutdown-env : Get hosts] **************************************************************
ok: [localhost]
TASK [oVirt.shutdown-env : set_fact] ***************************************************************
ok: [localhost]
TASK [oVirt.shutdown-env : Enforce global maintenance mode] ****************************************
skipping: [localhost]
TASK [oVirt.shutdown-env : Warn about HE global maintenace mode] ***********************************
ok: [localhost] => {
"msg": "HE global maintenance mode has been set; you have to exit it to get the engine VM started when needed\n"
}
TASK [oVirt.shutdown-env : Shutdown of HE hosts] ***************************************************
changed: [localhost] => (item= . . . u'name': u'ovnode1', . . . u'spm': {u'priority': 5, u'status': u'none'}})
changed: [localhost] => (item= . . . u'name': u'ovnode2', . . . u'spm': {u'priority': 5, u'status': u'spm'}})
TASK [oVirt.shutdown-env : Shutdown engine host/VM] ************************************************
Connection to ovmgr42 closed by remote host.
Connection to ovmgr42 closed.
[g.cecchi@ope46 ~]$
At the end the 2 hosts (HP blades) are in power off state, as expected.
ILO event log of ovnode1:
Last Update       Initial Update    Count  Description
09/12/2018 10:13  09/12/2018 10:13  1      Server power removed.
ILO event log of ovnode2:
Last Update       Initial Update    Count  Description
09/12/2018 10:14  09/12/2018 10:14  1      Server power removed.
Due to the time settings, these timestamps correspond to 11:13 and 11:14 in my local time.
In /var/log/libvirt/qemu/HostedEngine.log on node ovnode2:
2018-09-11 17:04:16.388+0000: starting up libvirt version: 3.9.0, . . . hostname: ovnode2
...
2018-09-12 09:11:29.641+0000: shutting down, reason=shutdown
That corresponds to 11:11 local time.
For now I have manually restarted the whole environment.
I started with ovnode2 (which was the SPM and was running the HostedEngine VM at shutdown), keeping ovnode1 powered off. It took some time, because I got messages like these (to be read bottom up):
Host ovnode1 failed to recover. 9/12/18 2:30:21 PM
Host ovnode1 is non responsive. 9/12/18 2:30:21 PM
...
Host ovnode1 is not responding. It will stay in Connecting state for a grace period of 60 seconds and after that an attempt to fence the host will be issued. 9/12/18 2:27:40 PM
Failed to Reconstruct Master Domain for Data Center MYDC42. 9/12/18 2:27:34 PM
VDSM ovnode2 command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=5af30d59-004c-02f2-01c9-0000000000b8, sdUUID=cbc308db-5468-4e6d-aabb-f9d133d05de2' 9/12/18 2:27:33 PM
Invalid status on Data Center MYDC42. Setting status to Non Responsive. 9/12/18 2:27:27 PM
...
ETL Service Started 9/12/18 2:26:27 PM
With ovnode1 still powered off, if I try to power it on from the GUI I get this in the events:
Host ovnode1 became non responsive. Fence operation skipped as the system is still initializing and this is not a host where hosted engine was running on previously. 9/12/18 2:30:21 PM
and as a popup I get this "operation canceled" window:
What does it mean?
In the phrase "the system is still initializing and this is not a host where hosted engine was running", which host does the word "this" refer to?
After some minutes I automatically get (to be read bottom up):