[ovirt-devel] Unable to start VMs

Fred Rolland frolland at redhat.com
Tue Sep 26 11:23:48 UTC 2017


Thanks, it fixed my issue.

On Tue, Sep 26, 2017 at 1:04 PM, Miroslava Voglova <mvoglova at redhat.com>
wrote:

> Promised patch: https://gerrit.ovirt.org/#/c/82203/
>
> On Tue, Sep 26, 2017 at 11:44 AM, Miroslava Voglova <mvoglova at redhat.com>
> wrote:
>
>> So we found where the problem is. Some time ago there was a patch, since
>> reverted, that had two badly formatted JSON values in vdc_option. If your
>> db dates from that time, the bug will appear, because the values are not
>> updated (they are already in the db).
>>
>> To fix this, it's enough to change 'HotPlugMemorySupported' to the value
>> '{"x86":"true","ppc":"true"}' for versions 4.0 - 4.2 and
>> 'HotUnplugMemorySupported' to '{"x86":"true","ppc":"true"}' for 4.2.
>>
>> I will prepare a patch that includes this update in 0000_config in case
>> anyone else has the same problem.
>>
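To make the difference between a well-formed and a badly formatted value concrete, here is a minimal sketch in Java. It assumes the option value is a flat JSON object with quoted string keys and quoted string values. The class VdcOptionJsonCheck and its parse method are hypothetical, not part of ovirt-engine, and the regular expression is only a stand-in for a real JSON parser. The malformed sample is the '{"x86:"true","ppc":"true"}' value quoted further down in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ovirt-engine: accepts only a flat JSON
// object whose keys and values are all double-quoted strings.
final class VdcOptionJsonCheck {
    // One "key":"value" pair; neither side may contain a double quote.
    private static final String PAIR = "\"([^\"]*)\"\\s*:\\s*\"([^\"]*)\"";
    // A whole object: at least one pair, further pairs separated by commas.
    private static final Pattern OBJECT =
            Pattern.compile("\\s*\\{\\s*" + PAIR + "(\\s*,\\s*" + PAIR + ")*\\s*\\}\\s*");
    private static final Pattern PAIR_PATTERN = Pattern.compile(PAIR);

    // Returns the parsed map, or null when the value is not well formed.
    static Map<String, String> parse(String value) {
        if (value == null || !OBJECT.matcher(value).matches()) {
            return null;
        }
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR_PATTERN.matcher(value);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        // Value proposed as the fix in this thread.
        System.out.println(parse("{\"x86\":\"true\",\"ppc\":\"true\"}"));
        // Value with the closing quote missing after x86.
        System.out.println(parse("{\"x86:\"true\",\"ppc\":\"true\"}"));
    }
}
```

With the corrected value the sketch yields a map with the keys x86 and ppc; with the malformed value it yields null, which is the kind of result a caller has to be prepared for.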
>> On Tue, Sep 26, 2017 at 10:12 AM, Fred Rolland <frolland at redhat.com>
>> wrote:
>>
>>> I have the same issue with a new VM:
>>>
>>> 2017-09-26 11:07:59,255+03 ERROR [org.ovirt.engine.core.bll.GetArchitectureCapabilitiesQuery] (default task-2) [c066bc1d-4048-4b5f-bdb6-74fd813aa82e] Query 'GetArchitectureCapabilitiesQuery' failed: null
>>> 2017-09-26 11:07:59,255+03 ERROR [org.ovirt.engine.core.bll.GetArchitectureCapabilitiesQuery] (default task-2) [c066bc1d-4048-4b5f-bdb6-74fd813aa82e] Exception: java.lang.NullPointerException
>>>     at org.ovirt.engine.core.common.FeatureSupported.supportedInConfig(FeatureSupported.java:23) [common.jar:]
>>>     at org.ovirt.engine.core.common.FeatureSupported.hotUnplugMemory(FeatureSupported.java:43) [common.jar:]
>>>     at org.ovirt.engine.core.bll.GetArchitectureCapabilitiesQuery.isSupported(GetArchitectureCapabilitiesQuery.java:66) [bll.jar:]
>>>     at org.ovirt.engine.core.bll.GetArchitectureCapabilitiesQuery.getMap(GetArchitectureCapabilitiesQuery.java:36) [bll.jar:]
>>>     at org.ovirt.engine.core.bll.GetArchitectureCapabilitiesQuery.executeQueryCommand(GetArchitectureCapabilitiesQuery.java:22) [bll.jar:]
>>>     at org.ovirt.engine.core.bll.QueriesCommandBase.executeCommand(QueriesCommandBase.java:106) [bll.jar:]
>>>     at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
>>>     at org.ovirt.engine.core.bll.executor.DefaultBackendQueryExecutor.execute(DefaultBackendQueryExecutor.java:14) [bll.jar:]
>>>
>>> Then:
>>>
>>> 2017-09-26 11:08:08,397+03 ERROR [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-11) [9eb76d15-2bd5-49a9-a7d8-c73cfd55282c] Failed to create VM: null
>>> 2017-09-26 11:08:08,398+03 ERROR [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-11) [9eb76d15-2bd5-49a9-a7d8-c73cfd55282c] Command 'CreateVDSCommand( CreateVDSCommandParameters:{hostId='b6b6a226-8d4f-4929-85d1-b218eceee99e', vmId='f15ddd07-408b-4665-aede-a9efc5716dc7', vm='VM [FEDORA_CINDER]'})' execution failed: java.lang.NullPointerException
>>>
>>>
>>> On Tue, Sep 26, 2017 at 10:26 AM, Tomas Jelinek <tjelinek at redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Sep 26, 2017 at 9:17 AM, Miroslava Voglova <mvoglova at redhat.com>
>>>> wrote:
>>>>
>>>>> Since 4.0 the architecture family has been renamed by the script
>>>>> 04_00_0080_rename_architecture_family. So 'HotPlugCpuSupported',
>>>>> 'HotUnplugCpuSupported', 'HotPlugMemorySupported',
>>>>> 'HotUnplugMemorySupported', 'IsMigrationSupported',
>>>>> 'IsMemorySnapshotSupported' and 'IsSuspendSupported' are all in the db
>>>>> with x86, not x86_64. From my point of view there is nothing wrong with
>>>>> that particular line in [1].
>>>>>
>>>>> It could be that somewhere the code uses the host architecture instead of
>>>>> the architecture family when asking for the value of these ConfigValues.
>>>>> But that would have thrown an exception even before my patch, because
>>>>> '{"x86:"true","ppc":"true"}' was the default value for
>>>>> HotPlugMemorySupported.
>>>>>
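The family versus architecture distinction can be sketched as follows. This is only an illustration: ArchSketch is a hypothetical, cut-down stand-in for the engine's ArchitectureType enum (the real enum may have different members and fields), and FamilyLookupDemo.lookup is an invented helper that shows a lookup keyed by family name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the engine's architecture enum: each concrete
// architecture points at its family, and a family is its own family.
enum ArchSketch {
    x86(null),
    x86_64(x86),
    ppc(null),
    ppc64(ppc);

    private final ArchSketch family;

    ArchSketch(ArchSketch family) {
        this.family = family;
    }

    ArchSketch getFamily() {
        return family == null ? this : family;
    }
}

final class FamilyLookupDemo {
    // Looks the value up by family name, which is how the keys are stored
    // in the db according to this thread (x86, ppc).
    static String lookup(Map<String, String> config, ArchSketch arch) {
        return config.get(arch.getFamily().name());
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("x86", "true");
        config.put("ppc", "true");

        // Keyed by the architecture name: no such entry in the map.
        System.out.println(config.get(ArchSketch.x86_64.name()));
        // Keyed by the family name: the entry is found.
        System.out.println(lookup(config, ArchSketch.x86_64));
    }
}
```

A lookup by arch.name() misses for x86_64, while a lookup by arch.getFamily().name() finds the x86 entry.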
>>>>
>>>> I see a code path where the cluster arch can be set to x86_64 - it is
>>>> always executed for external VMs (imported from external provider or
>>>> unmanaged). It does not happen all the time, it is only a fallback if the
>>>> arch type is not known/reported etc.
>>>>
>>>> @Alexander: by any chance, was this VM an unmanaged one? Or imported?
>>>> In logs you should find something like:
>>>> "Illegal architecture type: {}, replacing with x86_64" or "null
>>>> architecture type, replacing with x86_64, {}".
>>>>
>>>> Also, if you create a new VM, can you start it?
>>>>
>>>>
>>>>>
>>>>>
>>>>> [1] https://gerrit.ovirt.org/#/c/81464/7/packaging/dbscripts/upgrade/pre_upgrade/0000_config.sql
>>>>>
>>>>> On Tue, Sep 26, 2017 at 9:00 AM, Tomas Jelinek <tjelinek at redhat.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 25, 2017 at 10:08 PM, Roy Golan <rgolan at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, 25 Sep 2017 at 22:52 Alexander Wels <awels at redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Monday, September 25, 2017 3:50:56 PM EDT Roy Golan wrote:
>>>>>>>> > So somewhere in the code somebody used the Arch and not the family.
>>>>>>>> > See the enum getFamily() method
>>>>>>>> >
>>>>>>>>
>>>>>>>> Yep, in particular line 23 of FeatureSupported.java.
>>>>>>>>
>>>>>>> I meant the caller of the method on this line. Do you have it in the
>>>>>>> trace so we can see who passed x86_64 as arch?
>>>>>>>
>>>>>>>> > On Mon, 25 Sep 2017 at 22:31 Alexander Wels <awels at redhat.com> wrote:
>>>>>>>> > > On Monday, September 25, 2017 3:24:14 PM EDT Roy Golan wrote:
>>>>>>>> > > > what JRE are you using? any change with that?
>>>>>>>> > >
>>>>>>>> > > So I just figured out the problem, and it's really strange. It has
>>>>>>>> > > nothing to do with the SSL error that the stack trace mentions. I
>>>>>>>> > > manually stepped through the code to see what was going on, and it
>>>>>>>> > > turns out it is failing in FeatureSupported.java in the
>>>>>>>> > > supportedInConfig call from hotPlugMemory.
>>>>>>>> > >
>>>>>>>> > > The Config.<Map>getValue(feature, version.getValue()) call (version
>>>>>>>> > > is 4.2) returns a map containing x86=true and ppc=true. But when the
>>>>>>>> > > map is looked up with ArchitectureType.name(), it returns null,
>>>>>>>> > > because .name() returns x86_64. Now it appears that sometime during
>>>>>>>> > > the last few months we dropped the _64 in the ArchitectureType, or
>>>>>>>> > > at least in the database.
>>>>>>>>
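How a missing entry can turn into a NullPointerException can be sketched like this. The thread does not show what line 23 of FeatureSupported.java actually does, so this is an assumption: the sketch uses auto-unboxing of a missing map entry as one plausible mechanism. MissingKeyDemo and both of its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not the engine's code.
final class MissingKeyDemo {
    // Unguarded: Map.get returns null for a missing key, and unboxing that
    // null Boolean into a boolean throws NullPointerException.
    static boolean unguarded(Map<String, Boolean> supported, String key) {
        return supported.get(key);
    }

    // Guarded: a missing map or a missing entry counts as "not supported".
    static boolean guarded(Map<String, Boolean> supported, String key) {
        return supported != null && Boolean.TRUE.equals(supported.get(key));
    }

    public static void main(String[] args) {
        Map<String, Boolean> supported = new HashMap<>();
        supported.put("x86", true);
        supported.put("ppc", true);

        System.out.println(guarded(supported, "x86_64"));
        try {
            unguarded(supported, "x86_64");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from the unguarded lookup");
        }
    }
}
```

The guarded variant hides the symptom only; the key or the stored value still has to be fixed for the feature to be reported as supported.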
>>>>>>>
>>>>>> It looks a lot like this was introduced here:
>>>>>> https://gerrit.ovirt.org/#/c/81464/
>>>>>>
>>>>>> @Mirka: what do you think?
>>>>>>
>>>>>>
>>>>>>> > >
>>>>>>>> > > As soon as I added a vdc_options entry that contains an x86_64 value
>>>>>>>> > > for that key it started working. Now I have checked with Greg, who
>>>>>>>> > > has a fresh database, that he can start VMs with no problem, and his
>>>>>>>> > > database contains x86 instead of x86_64.
>>>>>>>> > >
>>>>>>>> > > > On Mon, 25 Sep 2017 at 21:12 Alexander Wels <awels at redhat.com> wrote:
>>>>>>>> > > > > Hi guys,
>>>>>>>> > > > >
>>>>>>>> > > > > I seem to be having an issue starting VMs with the latest master.
>>>>>>>> > > > > Whenever I try to start a VM I get a null pointer exception, and
>>>>>>>> > > > > the VM doesn't start. I have debugged the engine, and it appears
>>>>>>>> > > > > that the null pointer happens after the engine tries to connect
>>>>>>>> > > > > to the host. In the stack trace I see SSLPeerUnverifiedException,
>>>>>>>> > > > > so it appears something went wrong with a certificate somewhere.
>>>>>>>> > > > >
>>>>>>>> > > > > I have put my hosts in maintenance and re-enrolled the
>>>>>>>> > > > > certificate, but that doesn't appear to be helping at all. Is
>>>>>>>> > > > > there any other place I need to look at to make sure the engine
>>>>>>>> > > > > can talk to the hosts? This appears to have started after I
>>>>>>>> > > > > upgraded Wildfly to 11, so it is possible it has something to do
>>>>>>>> > > > > with that as well.
>>>>>>>> > > > >
>>>>>>>> > > > > Any help figuring this out would be appreciated.
>>>>>>>> > > > >
>>>>>>>> > > > > Alexander
>>>>>>>> > > > > _______________________________________________
>>>>>>>> > > > > Devel mailing list
>>>>>>>> > > > > Devel at ovirt.org
>>>>>>>> > > > > http://lists.ovirt.org/mailman/listinfo/devel