[Engine-devel] Bug 1076530 – engine shouldn't kill the vds running the VM with the hosted engine

Sandro Bonazzola sbonazzo at redhat.com
Mon Mar 17 07:11:20 UTC 2014


On 16/03/2014 11:59, Yedidyah Bar David wrote:
> 
> 
> ----- Original Message -----
>> From: "Doron Fediuck" <dfediuck at redhat.com>
>> To: "Yedidyah Bar David" <didi at redhat.com>
>> Cc: "Sandro Bonazzola" <sbonazzo at redhat.com>, "Jiri Moskovcak" <jmoskovc at redhat.com>, "engine-devel"
>> <engine-devel at ovirt.org>
>> Sent: Sunday, March 16, 2014 12:47:43 PM
>> Subject: Re: Bug 1076530 – engine shouldn't kill the vds running the VM with the hosted engine
>>
>>
>>
>> ----- Original Message -----
>>> From: "Yedidyah Bar David" <didi at redhat.com>
>>> To: "Doron Fediuck" <dfediuck at redhat.com>
>>> Cc: "Sandro Bonazzola" <sbonazzo at redhat.com>, "Jiri Moskovcak"
>>> <jmoskovc at redhat.com>
>>> Sent: Sunday, March 16, 2014 12:28:27 PM
>>> Subject: Re: Bug 1076530 – engine shouldn't kill the vds running the VM
>>> with the hosted engine
>>>
>>> Might be better to discuss this on bugzilla.
>>>
>> Bugzilla is not a mailing list. Moving to engine-devel.
>>
>>> ----- Original Message -----
>>>> From: "Doron Fediuck" <dfediuck at redhat.com>
>>>> To: "Sandro Bonazzola" <sbonazzo at redhat.com>
>>>> Cc: "Yedidyah Bar David" <didi at redhat.com>, "Jiri Moskovcak"
>>>> <jmoskovc at redhat.com>
>>>> Sent: Sunday, March 16, 2014 12:01:51 PM
>>>> Subject: Bug 1076530 – engine shouldn't kill the vds running the VM with
>>>> the hosted engine
>>>>
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1076530
>>>>
>>>> Sandro,
>>>> I think this would be solved by better validation during setup /
>>>> deployment.
>>>
>>> This can't be done during Validation in the otopi sense of the word.
>>> At that point the engine does not exist yet and so we can't know what
>>> versions it supports etc.
>>>
>> Why not?
>> You have the vdsm supported versions in a file (dsaversion IIRC)
>> and you should be able to get the relevant engine info before or
>> after deploying the DB.
> 
> The VM does not exist yet at that point. How can you know what the user
> will install on it? You can tell them what they *should* install - e.g.
> "The highest compatibility version supported by this host is 3.4, you
> should install a 3.4 engine inside the engine VM". But we can't know what
> the user actually did until after we connect to the installed and working
> engine.
> 
>>
>>> It might be possible (didn't check) to check the versions right before
>>> trying to add the host to the cluster. This means we do not want to
>>> abort (as we can do during Validation if something does not pass it).
>>> What can we do? Perhaps offer a few options:
>>> 1. Do abort (will do mostly what happens today)
>>> 2. Let the user try to manually fix, probably by trying to change
>>> the compatibility version of the cluster, and then try adding the
>>> host again
>>> 3. Try to fix ourselves (same) and try adding again
>>> 4. Best would be to somehow upgrade libvirt and reconfigure vdsm.
>>> Not sure that's easy or even possible at this stage, where the VM is
>>> running and we do not want to lose it.

We can check VDSM caps in late setup / customization and abort if cluster compatibility is not 3.4.
I'm not sure that VDSM 3.3 is enough for running hosted engine.

We can warn the user about the minimum version of oVirt engine that must be installed inside the VM, and
after that we can check the oVirt engine cluster compatibility and refuse to continue until the cluster
has a correct support level. This will require manual changes, like upgrading the engine in the VM
or fixing the cluster compatibility level if we find an invalid value.
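
A rough sketch of the two checks described above, just to make the idea concrete. This is not the hosted-engine setup code: the function names, the "3.4" default and the way the host and cluster levels are passed in as plain strings are all assumptions for illustration. In a real setup the host levels would come from the VDSM capabilities and the cluster level from the running engine.

```python
def parse_version(text):
    """Turn a version string such as '3.4' into a comparable tuple (3, 4)."""
    return tuple(int(part) for part in text.strip().split("."))


def check_compatibility(host_levels, cluster_level, minimum="3.4"):
    """Return (ok, message) for a host/cluster pair.

    host_levels   -- cluster levels the host reports as supported (hypothetical input)
    cluster_level -- compatibility level of the engine's cluster (hypothetical input)
    minimum       -- lowest level assumed to be acceptable for hosted engine
    """
    supported = sorted(parse_version(v) for v in host_levels)
    cluster = parse_version(cluster_level)
    required = parse_version(minimum)

    # Check 1 (late setup / customization): the host itself must reach the minimum.
    if not supported or supported[-1] < required:
        return False, "host does not support cluster level %s: abort" % minimum

    # Check 2 (after the engine is up): the cluster must be at a usable level.
    if cluster < required:
        return False, ("cluster level %s is below %s: upgrade the engine "
                       "or fix the cluster level" % (cluster_level, minimum))
    if cluster not in supported:
        return False, "host does not support cluster level %s" % cluster_level

    return True, "ok"


if __name__ == "__main__":
    # Example values only; they do not come from a real host.
    print(check_compatibility(["3.2", "3.3", "3.4"], "3.4"))
    print(check_compatibility(["3.2", "3.3"], "3.4"))
```

Versions are compared as integer tuples rather than strings so that, for example, a future "3.10" would sort above "3.4".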


>>>
>>> Thinking about this again, I am not sure the current behavior is that
>>> bad. "Fixing" by re-installing with the correct versions is probably
>>> way simpler than fixing after installation is (mostly) complete.
>>>
>>>>
>>>> I'm not keen on adding hosted-engine logic into the engine code.
>>>
>>> Not sure about that. Not that it would help much, because the root
>>> problem will still have to be solved, but in principle it might be
>>> a good thing if the engine knows that killing some host will kill itself,
>>> and so try harder to not do that and just leave it in some zombie,
>>> requires-manual-action state. This is obviously more important during
>>> normal operation than during installation.
>>> --
>>> Didi
>>>
>>
> 


-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com


