[ovirt-users] Shutdown Problems on oVirt Node

Nir Soffer nsoffer at redhat.com
Wed Mar 16 03:24:11 EDT 2016


On Tue, Mar 15, 2016 at 10:28 PM, Joop <jvdwege at xs4all.nl> wrote:
> On 14-3-2016 10:43, Allon Mureinik wrote:
>
>> Odd. Never seen such behavior in any of our set ups.
>> Can you please include vdsm's logs, sanlock's logs and /var/log/messages?
>
> I have noticed the same behaviour, not on server hardware but on the
> workstations which I use as an oVirt test setup.
> One would expect that a shutdown on a host would shut it down cleanly, but
> the only way to get that is to run a small script that will take care of:
> - service ovirt-ha-agent/broker stop
> - shutting down engine if it runs on this host
> - service vdsmd stop
> - service sanlock stop (takes quite a bit of time (~2min?))
> - umount whatever is needed
> - service nfs stop
> - shutdown
>
> This will power off my host, which normally runs my hosted-engine, every time.
> Sanlock seems to be indirectly the problem. wdmd(?) (watchdog daemon) seems
> able to keep the host from powering off; most of the time it will result in
> a reboot, or hanging at 'powering off'.
>
> I spent quite a bit of time looking into logs but have not been able to find
> anything conclusive; this could be because I do not know which log to look at
> or how to dig up enough info to find the root cause.

The issue is probably sanlock - it will refuse to stop if it is maintaining
lockspaces on shared storage. If you kill sanlock, the machine watchdog will
trigger a reboot after a minute or so. This behavior is by design, and it is
what allows oVirt to use locks on shared storage, used for SPM, the hosted
engine HA agent, and the hosted engine VM.
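
One way to check whether sanlock is still maintaining lockspaces before a
shutdown is to look at the output of `sanlock client status`. The following
is only a rough sketch, assuming that lockspace lines in that output start
with "s " and resource lines with "r ". The helper names `held_lockspaces`
and `read_sanlock_status` are made up for this illustration:

```python
import subprocess


def held_lockspaces(status_text):
    """Return the lockspace entries found in `sanlock client status` output.

    Assumption: lockspace lines start with "s " and resource lines with "r ".
    """
    lockspaces = []
    for line in status_text.splitlines():
        line = line.strip()
        if line.startswith("s "):
            lockspaces.append(line[2:].strip())
    return lockspaces


def read_sanlock_status():
    """Run `sanlock client status` on the host and return its stdout."""
    result = subprocess.run(
        ["sanlock", "client", "status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a host, `held_lockspaces(read_sanlock_status())` returning a non-empty
list would suggest that sanlock will refuse to stop until those lockspaces
are released.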

To shut down or reboot a hypervisor, you should first release the sanlock
leases on shared storage.

The process is:

1. Put the hypervisor in maintenance mode via the engine.
    This will migrate the VMs to another hypervisor.
2. Put the hosted engine HA server in local maintenance mode.
3. Reboot.
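
The steps above could be scripted roughly as follows. This is a sketch, not
a tested procedure: step 1 is left as a manual action in the engine, step 2
assumes the `hosted-engine --set-maintenance --mode=local` command, and the
names `PLAN` and `run_plan` are made up for this illustration. It defaults
to a dry run that only prints what it would do:

```python
import subprocess

# Each step is (description, command). A command of None means the step is
# done by hand in the engine; this sketch does not automate it.
PLAN = [
    ("Put the host into maintenance mode via the engine (migrates VMs)", None),
    ("Put the hosted engine HA server into local maintenance",
     ["hosted-engine", "--set-maintenance", "--mode=local"]),
    ("Reboot the host", ["shutdown", "-r", "now"]),
]


def run_plan(plan, dry_run=True):
    """Walk the steps in order; return the commands that were (or would be) run."""
    executed = []
    for description, command in plan:
        print(description)
        if command is None:
            continue  # manual step, nothing to execute
        executed.append(command)
        if dry_run:
            print("  would run: " + " ".join(command))
        else:
            subprocess.run(command, check=True)  # stop at the first failure
    return executed
```

Calling `run_plan(PLAN)` only prints the plan; `run_plan(PLAN, dry_run=False)`
would actually execute the commands, so step 1 has to be completed in the
engine before doing that.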

For an emergency reboot, when you cannot put the host into maintenance:

1. Kill sanlock
    (This will cause a reboot in a minute or so)
2. Reboot

Nir

