----- Original Message -----
From: "Yedidyah Bar David" <didi(a)redhat.com>
To: "Doron Fediuck" <dfediuck(a)redhat.com>
Cc: "Sandro Bonazzola" <sbonazzo(a)redhat.com>, "Jiri Moskovcak"
<jmoskovc(a)redhat.com>
Sent: Sunday, March 16, 2014 12:28:27 PM
Subject: Re: Bug 1076530 – engine shouldn't kill the vds running the VM with the
hosted engine
Might be better to discuss this on bugzilla.
Bugzilla is not a mailing list. Moving to engine-devel.
----- Original Message -----
> From: "Doron Fediuck" <dfediuck(a)redhat.com>
> To: "Sandro Bonazzola" <sbonazzo(a)redhat.com>
> Cc: "Yedidyah Bar David" <didi(a)redhat.com>, "Jiri Moskovcak"
> <jmoskovc(a)redhat.com>
> Sent: Sunday, March 16, 2014 12:01:51 PM
> Subject: Bug 1076530 – engine shouldn't kill the vds running the VM with
> the hosted engine
>
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1076530
>
> Sandro,
> I think this would be solved by a better validation during setup /
> deployment.
This can't be done during Validation in the otopi sense of the word.
At that point the engine does not exist yet and so we can't know what
versions it supports etc.
Why not?
You have the vdsm supported versions in a file (dsaversion IIRC)
and you should be able to get the relevant engine info before or
after deploying the DB.
It might be possible (I didn't verify this) to check the versions right
before trying to add the host to the cluster. At that stage we do not
want to abort (as we can during Validation when a check fails).
What can we do? Perhaps offer a few options:
1. Do abort (will do mostly what happens today)
2. Let the user try to manually fix, probably by trying to change
the compatibility version of the cluster, and then try adding the
host again
3. Try to fix ourselves (same) and try adding again
4. Best would be to somehow upgrade libvirt and reconfigure vdsm.
Not sure that's easy or even possible at this stage, where the VM is
running and we do not want to lose it.
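A minimal sketch of the check-then-decide idea above, assuming setup can
obtain two lists of compatibility levels as strings, one from the host's
vdsm and one from the engine. The function names, the inputs, and the
action labels are all hypothetical; this illustrates only the decision,
not any real otopi, vdsm, or engine API.

```python
# Hypothetical sketch; none of these names exist in otopi, vdsm, or the engine.

def parse_level(text):
    """Turn a level such as '3.4' into (3, 4) so levels compare numerically."""
    return tuple(int(part) for part in text.split("."))


def best_common_level(host_levels, engine_levels):
    """Return the highest level both sides support, or None if there is none."""
    common = {parse_level(v) for v in host_levels} & {
        parse_level(v) for v in engine_levels
    }
    return max(common) if common else None


def choose_action(host_levels, engine_levels, cluster_level):
    """Map the version check onto the options listed above."""
    if parse_level(cluster_level) in {parse_level(v) for v in host_levels}:
        return "add-host"  # the host already fits the cluster
    if best_common_level(host_levels, engine_levels) is not None:
        return "change-cluster-level"  # options 2 and 3: fix, then retry
    return "abort"  # option 1: no level in common


print(choose_action(["3.2", "3.3"], ["3.3", "3.4"], "3.4"))
```

Option 4 (upgrading libvirt and reconfiguring vdsm) is left out on
purpose, since it is not clear it can be done while the VM is running.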
Thinking about this again, I am not sure the current behavior is that
bad. "Fixing" by re-installing with the correct versions is probably
way simpler than fixing after installation is (mostly) complete.
>
> I'm not keen on adding hosted-engine logic into the engine code.
Not sure about that. Not that it would help much, because the root
problem will still have to be solved, but in principle it might be
a good thing if the engine knows that killing some host will kill itself,
and so try harder to not do that and just leave it in some zombie,
requires-manual-action state. This is obviously more important during
normal operation than during installation.
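Roughly, the guard I have in mind could look like the sketch below. It
assumes the engine can somehow learn which host runs its own VM; the
function name, the host identifiers, and the returned labels are
hypothetical and do not reflect the engine's actual fencing code.

```python
# Hypothetical sketch; the engine's real fencing logic is not shown here.

def fencing_decision(target_host, engine_vm_host):
    """Decide what to do with a host the engine would otherwise fence.

    target_host: identifier of the host about to be fenced.
    engine_vm_host: identifier of the host running the engine VM,
                    or None when the engine is not hosted.
    """
    if engine_vm_host is not None and target_host == engine_vm_host:
        # Fencing this host would kill the engine itself, so leave it
        # for the administrator instead.
        return "requires-manual-action"
    return "fence"
```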
--
Didi