On Thu, Mar 7, 2019 at 10:34 AM Yedidyah Bar David <didi@redhat.com> wrote:
On Thu, Mar 7, 2019 at 11:30 AM Martin Sivak <msivak@redhat.com> wrote:
>
> Hi,
>
> there is no way to distinguish an engine that is not responsive
> (software or network issue) from a VM that is being powered off. The
> shutdown takes some time during which you just do not know.

_I_ do not know, but the user might still know beforehand.

> Global
> maintenance informs the tooling in advance that something like this is
> going to happen.

Yes. But users keep forgetting to set it. So I am trying to come up
with something that will fix that :-)

Now we have exactly the opposite:
engine-setup already checks for global maintenance mode, and it exits
if we are on hosted-engine and not in global maintenance mode. The check
reads from the engine DB what the hosts reported when they were last
polled, so there is a bit of latency here.
https://github.com/oVirt/ovirt-engine/blob/master/packaging/setup/plugins/ovirt-engine-common/ovirt-engine/system/he.py#L49
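
To illustrate the behaviour described above, here is a rough sketch. It
is not the code from he.py: HostReport, may_proceed and their fields are
made-up names, and I am assuming that every hosted-engine host must
report global maintenance, which the real plugin may decide differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostReport:
    """Last state a host reported to the engine (hypothetical structure)."""
    name: str
    he_configured: bool       # host takes part in hosted-engine HA
    global_maintenance: bool  # value as of the last poll, so possibly stale


def may_proceed(is_hosted_engine: bool, reports: List[HostReport]) -> bool:
    """Return True if a setup run should be allowed to continue.

    Assumption: on hosted-engine, all HA hosts must report global
    maintenance; with no reports at all we refuse, to stay on the safe side.
    """
    if not is_hosted_engine:
        # Nothing to protect against: no HA agent will restart the engine.
        return True
    he_hosts = [r for r in reports if r.he_configured]
    if not he_hosts:
        return False
    return all(r.global_maintenance for r in he_hosts)


if __name__ == "__main__":
    reports = [
        HostReport("host1", he_configured=True, global_maintenance=True),
        HostReport("host2", he_configured=True, global_maintenance=False),
    ]
    if not may_proceed(True, reports):
        print("Not in global maintenance, aborting setup")
```

Because the reports are only as fresh as the last poll, a maintenance
flag set a few seconds earlier may not be visible yet to such a check.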

Perhaps instead of my original text, use something like "Right before
the engine goes down, it should set global maintenance".

>
> Who do you expect should be touching the shared storage? The engine VM
> itself? That might be possible, but remember the jboss instance is
> just the top of the process hierarchy. There are a lot of components
> where something might break during shutdown (filesystem umount timeout
> for example).

I did say "engine", not "engine vm". But see above for perhaps clearer
text.

>
> Martin
>
> On Thu, Mar 7, 2019 at 9:27 AM Yedidyah Bar David <didi@redhat.com> wrote:
> >
> > Hi all,
> >
> > How about making this change:
> >
> > Right before the engine goes down cleanly, it marks the shared storage
> > saying it did not crash but exited cleanly, and then HE-HA will not
> > try to restart it on another host. Perhaps make this optional, so that
> > users can do clean shutdowns and still test HA cleanly (or some other
> > use cases, where users might not want this).
> >
> > This should help in a lot of cases where people restarted their engine
> > for some reason, e.g. an upgrade, and forgot to set maintenance.
> >
> > Makes sense?
> > --
> > Didi
> > _______________________________________________
> > Devel mailing list -- devel@ovirt.org
> > To unsubscribe send an email to devel-leave@ovirt.org
> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> > List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/WCLSLEVXPHGRHL5BJHPLSYWPPOCMIJOQ/



--
Didi