<div dir="ltr"><div><a href="https://gerrit.ovirt.org/#/c/78536">https://gerrit.ovirt.org/#/c/78536</a> broke network functional tests but a fix was merged today: <a href="https://gerrit.ovirt.org/#/c/78925/">https://gerrit.ovirt.org/#/c/78925/</a><br></div><div><br></div><div>I tried to run OST with my fix yesterday and still encountered the same failures.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jul 4, 2017 at 2:25 PM, Michal Skrivanek <span dir="ltr"><<a href="mailto:michal.skrivanek@redhat.com" target="_blank">michal.skrivanek@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><span class=""><blockquote type="cite"><div>On 4 Jul 2017, at 13:00, Eyal Edri <<a href="mailto:eedri@redhat.com" target="_blank">eedri@redhat.com</a>> wrote:</div><br class="m_5159822599123635676Apple-interchange-newline"><div><div dir="ltr"><div>I was able to reproduce the error [1] on a manual run with only new vdsm from [2],</div><div>and also to verify that w/o this change, while using latest tested run [3] it works.</div><div><br></div><div>So I think this proves quite clearly the problem is one of the latest VDSM patches.</div></div></div></blockquote><div><br></div></span>There is only a single patch between vdsms [1] and [3]</div><div><a href="https://gerrit.ovirt.org/#/c/78536" target="_blank">https://gerrit.ovirt.org/#/c/<wbr>78536</a></div><div><div class="h5"><div><br><blockquote type="cite"><div><div dir="ltr"><div><br></div><div>I'm running again the test with the suspected bad VDSM and hopefully will be able to extract the env to tar.gz file</div><div>which anyone can import using the lago demo tool.</div><div><br></div><div><br></div><div><br></div><div>[1] <a href="http://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/748/" 
target="_blank">http://jenkins.ovirt.org/v<wbr>iew/oVirt%20system%20tests/job<wbr>/ovirt-system-tests_manual/748<wbr>/</a></div><div>[2] <a href="http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-x86_64/2694/" target="_blank">http://jenkins.ovirt.org/j<wbr>ob/vdsm_master_build-artifacts<wbr>-el7-x86_64/2694/</a></div><div>[3] <a href="http://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/747/" target="_blank">http://jenkins.ovirt.org/v<wbr>iew/oVirt%20system%20tests/job<wbr>/ovirt-system-tests_manual/747<wbr>/</a></div><div><br></div><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jul 4, 2017 at 1:30 PM, Nadav Goldin <span dir="ltr"><<a href="mailto:ngoldin@redhat.com" target="_blank">ngoldin@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi, sorry for posting late, I had a brief look at this yesterday:<br>
1. I couldn't replicate it locally - which means it is most likely a<br>
recent change.<br>
2. I looked at the libvirt XMLs Lago generated for the hosts, as a new<br>
version (0.40) is used this week - and they seem OK - specifically<br>
memory and vcpus (which was my initial suspect).<br>
3. I saw two Engine patches, from shortly before it started to<br>
fail, which *might* be related as far as I can tell, but it is out of my<br>
scope to say (CC'ed patch owners):<br>
<br>
core: Make VmAnalyzer to treat a migrated Paused VM as success -<br>
<a href="https://gerrit.ovirt.org/78305" rel="noreferrer" target="_blank">https://gerrit.ovirt.org/78305</a><br>
<br>
fix custom fencing default config setting<br>
<a href="https://gerrit.ovirt.org/78720" rel="noreferrer" target="_blank">https://gerrit.ovirt.org/78720</a><br>
<br>
Shot in the dark - could it be that the 'CPUOverloaded' filter was not<br>
active before for some reason?<br>
<br>
Also, there are some exceptions in host0 vdsm log[1], failing to get<br>
VM stats, though I can't tell if they are specific to this failure.<br>
<br>
Of course this is not a complete analysis, but I hope it helps.<br>
<br>
<br>
[1] <a href="http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/7431/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/lago-basic-suite-master-host0/_var_log/vdsm/vdsm.log" rel="noreferrer" target="_blank">http://jenkins.ovirt.org/job/t<wbr>est-repo_ovirt_experimental_ma<wbr>ster/7431/artifact/exported-ar<wbr>tifacts/basic-suit-master-el7/<wbr>test_logs/basic-suite-master/p<wbr>ost-006_migrations.py/lago-bas<wbr>ic-suite-master-host0/_var_log<wbr>/vdsm/vdsm.log</a><br>
<span class="m_5159822599123635676m_2989331196243842324gmail-HOEnZb"><font color="#888888"><br>
<br>
Nadav.<br>
</font></span><div class="m_5159822599123635676m_2989331196243842324gmail-HOEnZb"><div class="m_5159822599123635676m_2989331196243842324gmail-h5"><br>
<br>
<br>
<br>
<br>
On Tue, Jul 4, 2017 at 12:46 PM, Eyal Edri <<a href="mailto:eedri@redhat.com" target="_blank">eedri@redhat.com</a>> wrote:<br>
><br>
><br>
> On Tue, Jul 4, 2017 at 12:18 PM, Michal Skrivanek<br>
> <<a href="mailto:michal.skrivanek@redhat.com" target="_blank">michal.skrivanek@redhat.com</a>> wrote:<br>
>><br>
>><br>
>> On 3 Jul 2017, at 15:35, Shlomo Ben David <<a href="mailto:sbendavi@redhat.com" target="_blank">sbendavi@redhat.com</a>> wrote:<br>
>><br>
>> Hi,<br>
>><br>
>> Test failed: [ 006_migrations.migrate_vm ]<br>
>> Link to suspected patches: N/A<br>
>> Link to Job:<br>
>> <a href="http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/7431/" rel="noreferrer" target="_blank">http://jenkins.ovirt.org/job/t<wbr>est-repo_ovirt_experimental_ma<wbr>ster/7431/</a><br>
>> Link to all logs:<br>
>> <a href="http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/7431/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/" rel="noreferrer" target="_blank">http://jenkins.ovirt.org/job/t<wbr>est-repo_ovirt_experimental_ma<wbr>ster/7431/artifact/exported-ar<wbr>tifacts/basic-suit-master-el7/<wbr>test_logs/basic-suite-master/p<wbr>ost-006_migrations.py/</a><br>
>> Error snippet from the log:<br>
>><br>
>> <error><br>
>><br>
>> "Fault reason is "Operation Failed". Fault detail is "[Cannot migrate VM.<br>
>> There is no host that satisfies current scheduling constraints. See below<br>
>> for details:, The host lago-basic-suite-master-host0 did not satisfy<br>
>> internal filter CPUOverloaded because its CPU is too loaded.]"<br>
>><br>
>> </error><br>
>><br>
>> <engine log><br>
>><br>
>> 2017-07-02 16:43:22,829-04 INFO<br>
>> [org.ovirt.engine.core.bll.Mig<wbr>rateVmToServerCommand] (default task-27)<br>
>> [87508047-fdc5-4a2f-9692-c83f7<wbr>b55bbc2] Lock Acquired to object<br>
>> 'EngineLock:{exclusiveLocks='[<wbr>2b34910d-cef2-44d6-a274-30e847<wbr>3eb5d9=VM]',<br>
>> sharedLocks=''}'<br>
>> 2017-07-02 16:43:22,833-04 DEBUG<br>
>> [org.ovirt.engine.core.dal.dbb<wbr>roker.PostgresDbEngineDialect$<wbr>PostgresSimpleJdbcCall]<br>
>> (default task-27) [87508047-fdc5-4a2f-9692-c83f7<wbr>b55bbc2] Compiled stored<br>
>> procedure. Call string is [{call getdiskvmelementspluggedtovm(?<wbr>)}]<br>
>> 2017-07-02 16:43:22,833-04 DEBUG<br>
>> [org.ovirt.engine.core.dal.dbb<wbr>roker.PostgresDbEngineDialect$<wbr>PostgresSimpleJdbcCall]<br>
>> (default task-27) [87508047-fdc5-4a2f-9692-c83f7<wbr>b55bbc2] SqlCall for<br>
>> procedure [GetDiskVmElementsPluggedToVm] compiled<br>
>> 2017-07-02 16:43:22,843-04 DEBUG<br>
>> [org.ovirt.engine.core.dal.dbb<wbr>roker.PostgresDbEngineDialect$<wbr>PostgresSimpleJdbcCall]<br>
>> (default task-27) [87508047-fdc5-4a2f-9692-c83f7<wbr>b55bbc2] Compiled stored<br>
>> procedure. Call string is [{call getattacheddisksnapshotstovm(?<wbr>, ?)}]<br>
>> 2017-07-02 16:43:22,843-04 DEBUG<br>
>> [org.ovirt.engine.core.dal.dbb<wbr>roker.PostgresDbEngineDialect$<wbr>PostgresSimpleJdbcCall]<br>
>> (default task-27) [87508047-fdc5-4a2f-9692-c83f7<wbr>b55bbc2] SqlCall for<br>
>> procedure [GetAttachedDiskSnapshotsToVm] compiled<br>
>> 2017-07-02 16:43:22,919-04 INFO<br>
>> [org.ovirt.engine.core.bll.sch<wbr>eduling.SchedulingManager] (default task-27)<br>
>> [87508047-fdc5-4a2f-9692-c83f7<wbr>b55bbc2] Candidate host<br>
>> 'lago-basic-suite-master-host0<wbr>' ('46bdc63d-98f5-4eee-81aa-2fb8<wbr>8b8f7cbe') was<br>
>> filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'CPUOverloaded'<br>
>> (correlation id: null)<br>
>> 2017-07-02 16:43:22,920-04 WARN<br>
>> [org.ovirt.engine.core.bll.Mig<wbr>rateVmToServerCommand] (default task-27)<br>
>> [87508047-fdc5-4a2f-9692-c83f7<wbr>b55bbc2] Validation of action<br>
>> 'MigrateVmToServer' failed for user admin@internal-authz. Reasons:<br>
>> VAR__ACTION__MIGRATE,VAR__TYPE<wbr>__VM,SCHEDULING_ALL_HOSTS_FILT<wbr>ERED_OUT,VAR__FILTERTYPE__INTE<wbr>RNAL,$hostName<br>
>> lago-basic-suite-master-host0,<wbr>$filterName<br>
>> CPUOverloaded,VAR__DETAIL__CPU<wbr>_OVERLOADED,SCHEDULING_HOST_FI<wbr>LTERED_REASON_WITH_DETAIL<br>
>><br>
>><br>
>><br>
>> This has nothing to do with migration.<br>
>> CPUOverloaded is a scheduling policy; unless there was a change in<br>
>> that area, the obvious explanation would be that the host has a CPU overload<br>
>> condition.<br>
>> I briefly looked at the logs and see "cpuUser": "83.40", "cpuSys": "16.59",<br>
>> "cpuIdle": "0.08", which indeed suggests an overload; from the same sample I<br>
>> can see it's vdsm ("cpuUserVdsmd": "77.38", "cpuSysVdsmd": "18.44").<br>
>><br>
>> Since similar values have been reported consistently for some time, and<br>
>> there is a setupNetworks and storage rescan prior to the failure, and<br>
>> there is no other indication of anything wrong, I'd just say the environment<br>
>> or the order of tests or timing has changed, but there is nothing wrong with the<br>
>> oVirt code.<br>
>> Did any of that change recently? Does it reproduce locally?<br>
><br>
><br>
> AFAIK, no significant changes were made to the environment or the tests.<br>
> We will try to reproduce it locally and also on the manual job, but from<br>
> what it looks like it is very consistent (unlike other race failures we've seen<br>
> lately) and continues to fail on the same tests, so it's either a change in<br>
> oVirt or something else that we're not thinking of.<br>
><br>
>><br>
>><br>
>> Thanks,<br>
>> michal<br>
>><br>
>> 2017-07-02 16:43:22,920-04 INFO<br>
>> [org.ovirt.engine.core.bll.Mig<wbr>rateVmToServerCommand] (default task-27)<br>
>> [87508047-fdc5-4a2f-9692-c83f7<wbr>b55bbc2] Lock freed to object<br>
>> 'EngineLock:{exclusiveLocks='[<wbr>2b34910d-cef2-44d6-a274-30e847<wbr>3eb5d9=VM]',<br>
>> sharedLocks=''}'<br>
>> 2017-07-02 16:43:22,929-04 DEBUG<br>
>> [org.ovirt.engine.core.utils.t<wbr>imer.FixedDelayJobListener]<br>
>> (DefaultQuartzScheduler7) [] Rescheduling<br>
>> <a href="http://DEFAULT.org" target="_blank">DEFAULT.org</a>.ovirt.engine.core.<wbr>bll.ColdRebootAutoStartVmsRunn<wbr>er.startFailedAutoStartVms#-92<wbr>23372036854775733<br>
>> as there is no unfired trigger.<br>
>> 2017-07-02 16:43:22,932-04 ERROR<br>
>> [org.ovirt.engine.api.restapi.<wbr>resource.AbstractBackendResour<wbr>ce] (default<br>
>> task-27) [] Operation Failed: [Cannot migrate VM. There is no host that<br>
>> satisfies current scheduling constraints. See below for details:, The host<br>
>> lago-basic-suite-master-host0 did not satisfy internal filter CPUOverloaded<br>
>> because its CPU is too loaded.]<br>
>> 2017-07-02 16:43:23,331-04 DEBUG<br>
>> [org.ovirt.engine.core.utils.t<wbr>imer.FixedDelayJobListener]<br>
>> (DefaultQuartzScheduler2) [] Rescheduling<br>
>> <a href="http://DEFAULT.org" target="_blank">DEFAULT.org</a>.ovirt.engine.core.<wbr>bll.HaAutoStartVmsRunner.start<wbr>FailedAutoStartVms#-9223372036<wbr>854775793<br>
>> as there is no unfired trigger.<br>
>> 2017-07-02 16:43:23,332-04 DEBUG<br>
>> [org.ovirt.engine.core.utils.t<wbr>imer.FixedDelayJobListener]<br>
>> (DefaultQuartzScheduler2) [] Rescheduling<br>
>> <a href="http://DEFAULT.org" target="_blank">DEFAULT.org</a>.ovirt.engine.core.<wbr>bll.tasks.CommandCallbacksPoll<wbr>er.invokeCallbackMethods#-9223<wbr>372036854775783<br>
>> as there is no unfired trigger.<br>
>><br>
>> </engine log><br>
>><br>
>><br>
>><br>
>> Best Regards,<br>
>><br>
>> Shlomi Ben-David | Software Engineer | Red Hat ISRAEL<br>
>> RHCSA | RHCVA | RHCE<br>
>> IRC: shlomibendavid (on #rhev-integ, #rhev-dev, #rhev-ci)<br>
>><br>
>> OPEN SOURCE - 1 4 011 && 011 4 1<br>
>><br>
>> ______________________________<wbr>_________________<br>
>> Devel mailing list<br>
>> <a href="mailto:Devel@ovirt.org" target="_blank">Devel@ovirt.org</a><br>
>> <a href="http://lists.ovirt.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">http://lists.ovirt.org/mailman<wbr>/listinfo/devel</a><br>
>><br>
>><br>
>><br>
><br>
><br>
><br>
><br>
> --<br>
><br>
> Eyal edri<br>
><br>
><br>
> ASSOCIATE MANAGER<br>
><br>
> RHV DevOps<br>
><br>
> EMEA VIRTUALIZATION R&D<br>
><br>
><br>
> Red Hat EMEA<br>
><br>
</div></div><span class="m_5159822599123635676m_2989331196243842324gmail-im m_5159822599123635676m_2989331196243842324gmail-HOEnZb">> TRIED. TESTED. TRUSTED.<br>
> phone: <a href="tel:%2B972-9-7692018" value="+97297692018" target="_blank">+972-9-7692018</a><br>
> irc: eedri (on #tlv #rhev-dev #rhev-integ)<br>
><br>
</span><div class="m_5159822599123635676m_2989331196243842324gmail-HOEnZb"><div class="m_5159822599123635676m_2989331196243842324gmail-h5">> ______________________________<wbr>_________________<br>
> Devel mailing list<br>
> <a href="mailto:Devel@ovirt.org" target="_blank">Devel@ovirt.org</a><br>
> <a href="http://lists.ovirt.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">http://lists.ovirt.org/mailman<wbr>/listinfo/devel</a><br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="m_5159822599123635676m_2989331196243842324gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div style="font-family:overpass,sans-serif;margin:0px;padding:0px;font-size:14px;text-transform:uppercase;font-weight:bold"><font color="#cc0000">Eyal edri</font></div><div style="font-family:overpass,sans-serif;font-weight:bold;margin:0px;padding:0px;font-size:14px;text-transform:uppercase"><br></div><p style="font-family:overpass,sans-serif;font-size:10px;margin:0px 0px 4px;text-transform:uppercase">ASSOCIATE MANAGER</p><p style="font-family:overpass,sans-serif;font-size:10px;margin:0px 0px 4px;text-transform:uppercase">RHV DevOps</p><p style="font-family:overpass,sans-serif;font-size:10px;margin:0px 0px 4px;text-transform:uppercase">EMEA VIRTUALIZATION R&D</p><p style="font-family:overpass,sans-serif;font-size:10px;margin:0px 0px 4px;text-transform:uppercase"><br></p><div style="font-family:overpass,sans-serif;margin:0px;font-size:10px;color:rgb(153,153,153)"><a href="https://www.redhat.com/" style="color:rgb(0,136,206);margin:0px" target="_blank">Red Hat EMEA</a></div><table border="0" style="font-family:overpass,sans-serif;font-size:inherit"><tbody><tr><td width="100px"><a href="https://red.ht/sig" style="color:rgb(17,85,204)" target="_blank"><img src="https://www.redhat.com/profiles/rh/themes/redhatdotcom/img/logo-red-hat-black.png" width="90" height="auto"></a></td><td style="font-size:10px"><a href="https://redhat.com/trusted" style="color:rgb(204,0,0);font-weight:bold" target="_blank">TRIED. TESTED. TRUSTED.</a></td></tr></tbody></table></div><div>phone: <a href="tel:+972%209-769-2018" value="+97297692018" target="_blank">+972-9-7692018</a><br>irc: eedri (on #tlv #rhev-dev #rhev-integ)</div></div></div></div></div></div></div></div></div>
</div></div>
</div></blockquote></div><br></div></div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><p style="color:rgb(0,0,0);font-family:overpass,sans-serif;font-weight:bold;margin:0px;padding:0px;font-size:14px;text-transform:uppercase"><span>IRIT</span> <span>GOIHMAN</span></p><p style="color:rgb(0,0,0);font-family:overpass,sans-serif;font-size:10px;margin:0px 0px 4px;text-transform:uppercase"><span>SOFTWARE ENGINEER</span></p><p style="color:rgb(0,0,0);font-family:overpass,sans-serif;font-size:10px;margin:0px 0px 4px;text-transform:uppercase"><span>EMEA VIRTUALIZATION R&D</span></p><p style="font-family:overpass,sans-serif;margin:0px;font-size:10px;color:rgb(153,153,153)"><a href="https://www.redhat.com/" style="color:rgb(0,136,206);margin:0px" target="_blank">Red Hat <span>EMEA</span></a></p><p style="font-family:overpass,sans-serif;margin:0px 0px 6px;font-size:10px;color:rgb(153,153,153)"></p><table border="0" style="color:rgb(0,0,0);font-family:overpass,sans-serif;font-size:medium"><tbody><tr><td width="100px"><a href="https://red.ht/sig" target="_blank"><img src="https://www.redhat.com/files/brand/email/sig-redhat.png" width="90" height="auto"></a></td><td style="font-size:10px"><div><a href="https://redhat.com/trusted" style="color:rgb(204,0,0);font-weight:bold" target="_blank">TRIED. TESTED. 
TRUSTED.</a></div></td></tr></tbody></table><div style="color:rgb(0,0,0);font-family:overpass,sans-serif;font-size:10px"><div style="color:rgb(153,153,153)"><a href="https://twitter.com/redhatnews" title="twitter" style="background:url("https://www.redhat.com/files/brand/email/sm-twitter.png") 0px 50%/16px no-repeat transparent;height:20px;color:rgb(119,119,119);display:inline-block;line-height:20px;padding-left:16px" target="_blank">@redhatnews</a> <a href="https://www.linkedin.com/company/red-hat" title="LinkedIn" style="background:url("https://www.redhat.com/files/brand/email/sm-linkedin.png") 0px 50%/16px no-repeat transparent;height:20px;color:rgb(119,119,119);display:inline-block;line-height:20px;padding-left:16px" target="_blank">Red Hat</a> <a href="https://www.facebook.com/RedHatInc" title="Facebook" style="background:url("https://www.redhat.com/files/brand/email/sm-facebook.png") 0px 50%/16px no-repeat transparent;height:20px;color:rgb(119,119,119);display:inline-block;line-height:20px;padding-left:16px" target="_blank">Red Hat</a></div></div><div style="color:rgb(0,0,0);font-family:overpass,sans-serif;font-size:10px"></div></div></div></div></div>
</div>