<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Sep 26, 2017 at 1:58 PM, Alexander Wels <span dir="ltr"><<a href="mailto:awels@redhat.com" target="_blank">awels@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-">On Tuesday, September 26, 2017 3:26:44 AM EDT Tomas Jelinek wrote:<br>
> On Tue, Sep 26, 2017 at 9:17 AM, Miroslava Voglova <<a href="mailto:mvoglova@redhat.com">mvoglova@redhat.com</a>><br>
><br>
> wrote:<br>
> > As of 4.0 the architecture family was renamed in script<br>
> > 04_00_0080_rename_<wbr>architecture_family. So 'HotPlugCpuSupported',<br>
> > 'HotUnplugCpuSupported', 'HotPlugMemorySupported',<br>
> > 'HotUnplugMemorySupported',<br>
> > 'IsMigrationSupported', 'IsMemorySnapshotSupported' and<br>
> > 'IsSuspendSupported' are all in the db with x86, not x86_64. From my point of<br>
> > view there is nothing wrong with that particular line in [1].<br>
> ><br>
> > It could be that somewhere in the code the host architecture is used instead<br>
> > of the architecture family when asking for the value of these ConfigValues. But that would<br>
> > throw an exception even before my patch, because '{"x86:"true","ppc":"true"}'<br>
> > was the default value for HotPlugMemorySupported.<br>
><br>
> I see a code path where the cluster arch can be set to x86_64 - it is<br>
> always executed for external VMs (imported from external provider or<br>
> unmanaged). It does not happen all the time, it is only a fallback if the<br>
> arch type is not known/reported etc.<br>
><br>
> @Alexander: by any chance, was this VM an unmanaged one? Or imported? In<br>
> logs you should find something like:<br>
> "Illegal architecture type: {}, replacing with x86_64" or "null<br>
> architecture type, replacing with x86_64, {}".<br>
><br>
> Also, if you create a new VM, can you start it?<br>
><br>
<br>
</span>No, it's an old database though, from pre-4.0 times. These VMs have never been<br>
unmanaged or imported from external providers. I did not see that in the log;<br>
I had to manually step through the code to end up in the right place that<br>
causes the NPE. Like I said before, line 23 in FeatureSupported.java is the<br>
culprit IMO. It does:<br>
<br>
String value = archOptions.get(arch.name());<br>
<br>
arch is an ArchitectureType, and arch.name() returns x86_64, and if I understand<br>
right they should have done arch.getFamily().name(), which does happen 2 lines<br>
below it. Honestly I don't understand how any VMs are able to run with the<br>
code like that, since they all check whether you can do memory hot plug<br>
before starting, and that check runs through this piece of code, which based<br>
on the contents of [1] should throw an NPE since the database should not<br>
contain the x86_64 entries.<br></blockquote><div><br></div><div>The reason it did not work is that there was a syntax error in the vdc_options table, causing <br></div><div>Config.<Map>getValue(feature, version.getValue()) to return null.</div><div><br></div><div>VMs normally run because, if there is no entry for x86_64, the code then checks x86 two lines below.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> > [1]<br>
> > *<a href="https://gerrit.ovirt.org/#/c/81464/7/packaging/dbscripts/upgrade/pre_upg" rel="noreferrer" target="_blank">https://gerrit.ovirt.org/#/c/<wbr>81464/7/packaging/dbscripts/<wbr>upgrade/pre_upg</a><br>
> > rade/0000_config.sql<br>
> > <<a href="https://gerrit.ovirt.org/#/c/81464/7/packaging/dbscripts/upgrade/pre_upg" rel="noreferrer" target="_blank">https://gerrit.ovirt.org/#/c/<wbr>81464/7/packaging/dbscripts/<wbr>upgrade/pre_upg</a><br>
> > rade/0000_config.sql>*<br>
<div class="gmail-HOEnZb"><div class="gmail-h5">> ><br>
> > On Tue, Sep 26, 2017 at 9:00 AM, Tomas Jelinek <<a href="mailto:tjelinek@redhat.com">tjelinek@redhat.com</a>><br>
> ><br>
> > wrote:<br>
> >> On Mon, Sep 25, 2017 at 10:08 PM, Roy Golan <<a href="mailto:rgolan@redhat.com">rgolan@redhat.com</a>> wrote:<br>
> >>> On Mon, 25 Sep 2017 at 22:52 Alexander Wels <<a href="mailto:awels@redhat.com">awels@redhat.com</a>> wrote:<br>
> >>>> On Monday, September 25, 2017 3:50:56 PM EDT Roy Golan wrote:<br>
> >>>> > So somewhere in the code somebody used the Arch and not the family.<br>
> >>>><br>
> >>>> See the<br>
> >>>><br>
> >>>> > enum getFamily() method<br>
> >>>><br>
> >>>> Yep, in particular line 23 of FeatureSupported.java.<br>
> >>>><br>
> >>>> I meant the caller of the method on this line. Do you have it in the<br>
> >>><br>
> >>> trace so we can see who passed x86_64 as arch ?<br>
> >>><br>
> >>> > On Mon, 25 Sep 2017 at 22:31 Alexander Wels <<a href="mailto:awels@redhat.com">awels@redhat.com</a>> wrote:<br>
> >>>> > > On Monday, September 25, 2017 3:24:14 PM EDT Roy Golan wrote:<br>
> >>>> > > > what JRE are you using? any change with that?<br>
> >>>> > ><br>
> >>>> > > So I just figured out the problem, and its really strange. It has<br>
> >>>><br>
> >>>> nothing<br>
> >>>><br>
> >>>> > > to<br>
> >>>> > > do with the SSL as the stack trace is mentioning. I manually<br>
> >>>> > > stepped<br>
> >>>> > > through<br>
> >>>> > > the code to see what was going on and it turns out it is failing in<br>
> >>>> > > FeatureSupported.java in supportedInConfig call from hotPlugMemory.<br>
> >>>> > ><br>
> >>>> > > The Config.<Map>getValue(feature, version.getValue()) (version is<br>
> >>>><br>
> >>>> 4.2) is<br>
> >>>><br>
> >>>> > > returning a map containing x86=true and ppc=true. But then it<br>
> >>>><br>
> >>>> compares<br>
> >>>><br>
> >>>> > > this to<br>
> >>>> > > ArchitectureType.name() it returns null, because .name() returns<br>
> >>>><br>
> >>>> x86_64. Now<br>
> >>>><br>
> >>>> > > it<br>
> >>>> > > appears that sometime during the last few months we dropped the _64<br>
> >>>><br>
> >>>> in the<br>
> >>>><br>
> >>>> > > ArchitectureType, or at least in the database.<br>
> >><br>
> >> It looks a lot like introduced here: <a href="https://gerrit.ovirt.org/#/c/81464/" rel="noreferrer" target="_blank">https://gerrit.ovirt.org/#/c/<wbr>81464/</a><br>
> >><br>
> >> @Mirka: what you think?<br>
> >><br>
> >>>> > > As soon as I added a vdc_options entry that contains an x86_64 value for that<br>
> >>>><br>
> >>>> key it<br>
> >>>><br>
> >>>> > > started working. Now I have checked with Greg who has a fresh<br>
> >>>><br>
> >>>> database<br>
> >>>><br>
> >>>> > > that he<br>
> >>>> > > can start VMs no problem, and his database contains x86 instead of<br>
> >>>><br>
> >>>> x86_64.<br>
> >>>><br>
> >>>> > > > On Mon, 25 Sep 2017 at 21:12 Alexander Wels <<a href="mailto:awels@redhat.com">awels@redhat.com</a>><br>
> >>>><br>
> >>>> wrote:<br>
> >>>> > > > > Hi guys,<br>
> >>>> > > > ><br>
> >>>> > > > > I seem to be having an issue starting VMs with the latest<br>
> >>>> > > > > master.<br>
> >>>> > ><br>
> >>>> > > Whenever<br>
> >>>> > ><br>
> >>>> > > > > I<br>
> >>>> > > > > try to start a VM I get null pointer exception. And the VM<br>
> >>>><br>
> >>>> doesn't<br>
> >>>><br>
> >>>> > > start.<br>
> >>>> > ><br>
> >>>> > > > > I<br>
> >>>> > > > > have debugged the engine, and it appears that the null pointer<br>
> >>>><br>
> >>>> happens<br>
> >>>><br>
> >>>> > > > > after<br>
> >>>> > > > > the engine tries to connect to the host. In the stack trace I<br>
> >>>><br>
> >>>> see<br>
> >>>><br>
> >>>> > > > > SSLPeerUnverifiedException, so it appears something went wrong<br>
> >>>><br>
> >>>> with a<br>
> >>>><br>
> >>>> > > > > certificate somewhere.<br>
> >>>> > > > ><br>
> >>>> > > > > I have put my hosts in maintenance and re-enrolled the<br>
> >>>><br>
> >>>> certificate, but<br>
> >>>><br>
> >>>> > > > > that<br>
> >>>> > > > > doesn't appear to be helping at all. Any other place I need to<br>
> >>>><br>
> >>>> look at<br>
> >>>><br>
> >>>> > > to<br>
> >>>> > ><br>
> >>>> > > > > make<br>
> >>>> > > > > sure the engine can talk to the hosts? This appears to have<br>
> >>>><br>
> >>>> started<br>
> >>>><br>
> >>>> > > after<br>
> >>>> > ><br>
> >>>> > > > > I<br>
> >>>> > > > > upgraded Wildfly to 11, so it is possible it has something to<br>
> >>>><br>
> >>>> do with<br>
> >>>><br>
> >>>> > > that<br>
> >>>> > ><br>
> >>>> > > > > as<br>
> >>>> > > > > well.<br>
> >>>> > > > ><br>
> >>>> > > > > Any help figuring this out would be appreciated.<br>
> >>>> > > > ><br>
> >>>> > > > > Alexander<br>
> >>>> > > > > ______________________________<wbr>_________________<br>
> >>>> > > > > Devel mailing list<br>
> >>>> > > > > <a href="mailto:Devel@ovirt.org">Devel@ovirt.org</a><br>
> >>>> > > > > <a href="http://lists.ovirt.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">http://lists.ovirt.org/<wbr>mailman/listinfo/devel</a><br>
> >>><br>
> >>> ______________________________<wbr>_________________<br>
> >>> Devel mailing list<br>
> >>> <a href="mailto:Devel@ovirt.org">Devel@ovirt.org</a><br>
> >>> <a href="http://lists.ovirt.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">http://lists.ovirt.org/<wbr>mailman/listinfo/devel</a><br>
<br>
<br>
</div></div></blockquote></div><br></div></div>