[ovirt-devel] Unable to start VMs

Tomas Jelinek tjelinek at redhat.com
Tue Sep 26 07:26:44 UTC 2017


On Tue, Sep 26, 2017 at 9:17 AM, Miroslava Voglova <mvoglova at redhat.com>
wrote:

> Since 4.0, the architecture family has been renamed by the script
> 04_00_0080_rename_architecture_family. So 'HotPlugCpuSupported',
> 'HotUnplugCpuSupported', 'HotPlugMemorySupported', 'HotUnplugMemorySupported',
> 'IsMigrationSupported', 'IsMemorySnapshotSupported' and 'IsSuspendSupported'
> are all stored in the db with x86, not x86_64. From my point of view, nothing
> is wrong with that particular line in [1].
>
> It could be that somewhere in the code the host architecture is used instead
> of the architecture family when asking for the value of these ConfigValues.
> But that would have thrown an exception even before my patch, because
> '{"x86":"true","ppc":"true"}' was the default value for HotPlugMemorySupported.
>
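
To make the family-versus-arch point concrete, here is a minimal sketch of
the lookup being discussed. It is not the engine code: the Arch enum, the
configValue() map and both lookup helpers are hypothetical stand-ins, and
the only thing taken from this thread is that the config map is keyed by
family (x86, ppc) while name() on the concrete arch gives x86_64.

```java
import java.util.HashMap;
import java.util.Map;

class ArchLookupSketch {
    // Hypothetical stand-in for the engine's architecture enum; only the
    // name()/getFamily() relationship discussed in the thread is modelled.
    enum Arch {
        x86, x86_64, ppc, ppc64;

        Arch getFamily() {
            switch (this) {
                case x86_64: return x86;
                case ppc64:  return ppc;
                default:     return this;
            }
        }
    }

    // Models a per-family config value such as {"x86":"true","ppc":"true"}.
    static Map<String, String> configValue() {
        Map<String, String> value = new HashMap<>();
        value.put("x86", "true");
        value.put("ppc", "true");
        return value;
    }

    // Keyed by the concrete arch name: there is no "x86_64" key, so this
    // returns null, and a caller that unboxes the result would hit an NPE.
    static String lookupByArchName(Arch arch) {
        return configValue().get(arch.name());
    }

    // Keyed by the family name: x86_64 maps to the "x86" entry.
    static String lookupByFamily(Arch arch) {
        return configValue().get(arch.getFamily().name());
    }

    public static void main(String[] args) {
        System.out.println(lookupByArchName(Arch.x86_64)); // prints "null"
        System.out.println(lookupByFamily(Arch.x86_64));   // prints "true"
    }
}
```

If some caller behaves like lookupByArchName, a VM whose arch ended up as
x86_64 would see null for every one of these family-keyed config values.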

I see a code path where the cluster arch can be set to x86_64. It is always
executed for external VMs (imported from an external provider or unmanaged),
but it replaces the arch only as a fallback, when the arch type is not known,
not reported, etc.

@Alexander: by any chance, was this VM an unmanaged one? Or imported? In
the logs you should find something like:
"Illegal architecture type: {}, replacing with x86_64" or "null
architecture type, replacing with x86_64, {}".
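
Roughly, the fallback I mean has the following shape. This is only a sketch
under assumptions, not the real code path: resolveArch, the set of accepted
arch strings and the use of java.util.logging are all hypothetical, and only
the wording of the two log messages follows the ones quoted above.

```java
import java.util.logging.Logger;

class ArchFallbackSketch {
    private static final Logger log = Logger.getLogger("ArchFallbackSketch");

    // Hypothetical helper: returns the reported arch when it is one of the
    // values this sketch accepts, otherwise falls back to x86_64 and logs it.
    static String resolveArch(String reported) {
        if (reported == null) {
            log.warning("null architecture type, replacing with x86_64");
            return "x86_64";
        }
        switch (reported) {
            case "x86_64":
            case "ppc64":
                return reported;
            default:
                log.warning("Illegal architecture type: " + reported
                        + ", replacing with x86_64");
                return "x86_64";
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveArch(null));    // falls back to x86_64
        System.out.println(resolveArch("bogus")); // falls back to x86_64
        System.out.println(resolveArch("ppc64")); // kept as reported
    }
}
```

The point is that the result is the concrete arch x86_64 and not the family
x86, so anything keyed by family would need getFamily() applied afterwards.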

Also, if you create a new VM, can you start it?


>
>
> [1] https://gerrit.ovirt.org/#/c/81464/7/packaging/dbscripts/upgrade/pre_upgrade/0000_config.sql
>
> On Tue, Sep 26, 2017 at 9:00 AM, Tomas Jelinek <tjelinek at redhat.com>
> wrote:
>
>>
>>
>> On Mon, Sep 25, 2017 at 10:08 PM, Roy Golan <rgolan at redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, 25 Sep 2017 at 22:52 Alexander Wels <awels at redhat.com> wrote:
>>>
>>>> On Monday, September 25, 2017 3:50:56 PM EDT Roy Golan wrote:
>>>> > So somewhere in the code somebody used the Arch and not the family.
>>>> > See the enum getFamily() method
>>>> >
>>>>
>>>> Yep, in particular line 23 of FeatureSupported.java.
>>>>
>>> I meant the caller of the method on this line. Do you have it in the
>>> trace so we can see who passed x86_64 as arch?
>>>
>>> > On Mon, 25 Sep 2017 at 22:31 Alexander Wels <awels at redhat.com> wrote:
>>>> > > On Monday, September 25, 2017 3:24:14 PM EDT Roy Golan wrote:
>>>> > > > what JRE are you using? any change with that?
>>>> > >
>>>> > > So I just figured out the problem, and it's really strange. It has
>>>> > > nothing to do with the SSL error that the stack trace mentions. I
>>>> > > manually stepped through the code to see what was going on, and it
>>>> > > turns out it is failing in FeatureSupported.java, in the
>>>> > > supportedInConfig call from hotPlugMemory.
>>>> > >
>>>> > > The Config.<Map>getValue(feature, version.getValue()) call (version is
>>>> > > 4.2) returns a map containing x86=true and ppc=true. But when the map
>>>> > > is looked up with ArchitectureType.name(), it returns null, because
>>>> > > .name() returns x86_64. Now it appears that sometime during the last
>>>> > > few months we dropped the _64 in the ArchitectureType, or at least in
>>>> > > the database.
>>>>
>>>
>> It looks a lot like introduced here: https://gerrit.ovirt.org/#/c/81464/
>>
>> @Mirka: what you think?
>>
>>
>>> > >
>>>> > > As soon as I added a vdc_options entry that contains an x86_64 value
>>>> > > for that key, it started working. I have checked with Greg, who has a
>>>> > > fresh database: he can start VMs with no problem, and his database
>>>> > > contains x86 instead of x86_64.
>>>> > >
>>>> > > > On Mon, 25 Sep 2017 at 21:12 Alexander Wels <awels at redhat.com>
>>>> wrote:
>>>> > > > > Hi guys,
>>>> > > > >
>>>> > > > > I seem to be having an issue starting VMs with the latest master.
>>>> > > > > Whenever I try to start a VM I get a null pointer exception, and
>>>> > > > > the VM doesn't start. I have debugged the engine, and it appears
>>>> > > > > that the null pointer happens after the engine tries to connect to
>>>> > > > > the host. In the stack trace I see SSLPeerUnverifiedException, so
>>>> > > > > it appears something went wrong with a certificate somewhere.
>>>> > > > >
>>>> > > > > I have put my hosts in maintenance and re-enrolled the certificate,
>>>> > > > > but that doesn't appear to be helping at all. Is there any other
>>>> > > > > place I need to look at to make sure the engine can talk to the
>>>> > > > > hosts? This appears to have started after I upgraded Wildfly to 11,
>>>> > > > > so it is possible it has something to do with that as well.
>>>> > > > >
>>>> > > > > Any help figuring this out would be appreciated.
>>>> > > > >
>>>> > > > > Alexander
>>>> > > > > _______________________________________________
>>>> > > > > Devel mailing list
>>>> > > > > Devel at ovirt.org
>>>> > > > > http://lists.ovirt.org/mailman/listinfo/devel
>>>>
>>>>
>>>>
>>>
>>
>>
>