On Tue, Jul 6, 2021 at 5:33 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Tue, Jul 6, 2021 at 5:58 PM Scott Worthington
<scott.c.worthington@gmail.com> wrote:
>
>
>
> On Tue, Jul 6, 2021 at 8:13 AM Nir Soffer <nsoffer@redhat.com> wrote:
>>
>> On Tue, Jul 6, 2021 at 2:29 PM Sandro Bonazzola <sbonazzo@redhat.com> wrote:
>>>
>>>
>>>
>>> On Tue, Jul 6, 2021 at 1:03 PM Nir Soffer <nsoffer@redhat.com> wrote:
>>>>
>>>> On Tue, Jul 6, 2021 at 1:11 PM Nathanaël Blanchet <blanchet@abes.fr> wrote:
>>>> > We are installing the UPS PowerChute client on hypervisors.
>>>> >
>>>> > What is the default behaviour of running VMs when a hypervisor is
>>>> > ordered to shut down: do the VMs live migrate, or do they shut down
>>>> > properly (or even restart on another host because of HA)?
>>>>
>>>> In general VMs are not restarted after an unexpected shutdown, but HA VMs
>>>> are restarted after failures.
>>>>
>>>> If the HA VM has a lease, it can restart safely on another host regardless of
>>>> the original host status. If the HA VM does not have a lease, the system must
>>>> wait until the original host is up again to check if the VM is still
>>>> running on this host.
>>>>
>>>> Arik can add more details on this.
>>>
>>>
>>> I think the question is not related to what happens after the host is back.
>>> I think the question is what happens when the host goes down.
>>> To me, the right way to shut down a host is to put it into maintenance first (VMs are evacuated to other hosts) and then shut it down.
>>
>>
>> Right, but we don't have integration with the UPS, so engine cannot put the host
>> into maintenance when the host loses power and the UPS will shut it down after
>> a few minutes.
>
>
> This is outside the scope of the oVirt team:
>
> Perhaps one could combine multiple applications (NUT + Ansible + Nagios/Zabbix) to notify the oVirt engine to switch a host to maintenance?
>
> NUT[0] could be configured to alert a monitoring system (like Nagios or Zabbix) to trigger an Ansible playbook [1][2] that puts the host in maintenance mode. The trigger should fire before the UPS battery is depleted (you'll have to account for the time it takes to live migrate VMs).

I would trigger this as soon as power is lost. You never know how much time
migration will take, so it is best to migrate all VMs immediately.

It would be nice to integrate this with engine, but we can start with something
like what you describe, which would use the engine API/SDK to prepare the hosts
for graceful shutdown.
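
A minimal sketch of the "trigger as soon as power is lost" idea, in Python. It
assumes the UPS state is available as a NUT ups.status string (for example
"OL" when on line power, "OB" when on battery). The function names and the
to_maintenance callback are hypothetical and only for illustration; the
callback stands in for whatever actually asks engine to move a host to
maintenance (engine API/SDK call, Ansible playbook, ...).

```python
from typing import Callable, Iterable, List


def on_battery(ups_status: str) -> bool:
    """Return True if a NUT ups.status value contains the OB (on battery) flag."""
    # ups.status is a space-separated list of flags, e.g. "OB DISCHRG".
    return "OB" in ups_status.split()


def handle_power_event(
    ups_status: str,
    hosts: Iterable[str],
    to_maintenance: Callable[[str], None],
) -> List[str]:
    """Request maintenance for every host as soon as the UPS is on battery.

    to_maintenance is a hypothetical hook supplied by the caller; this
    sketch does not talk to engine itself.
    """
    if not on_battery(ups_status):
        return []
    requested = []
    for host in hosts:
        # Do not wait for a low-battery flag: migration time is unknown.
        to_maintenance(host)
        requested.append(host)
    return requested
```

The status string could come from something like "upsc <ups>@<host> ups.status",
and the callback could run the ovirt_host Ansible module [2] with
state: maintenance. Both are assumptions about the deployment, not something
tested here.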

We already have a role for immediate shutdown of the whole datacenter: https://github.com/oVirt/ovirt-ansible-shutdown-env
It is now integrated in the Ansible collection: https://github.com/oVirt/ovirt-ansible-collection/tree/master/roles/shutdown_env

 

> [0] Network UPS Tools https://networkupstools.org/docs/user-manual.chunked/index.html
> [1] https://www.ovirt.org/develop/release-management/features/infra/ansible_modules.html
> [2] https://docs.ansible.com/ansible/latest/collections/ovirt/ovirt/ovirt_host_module.html


