[ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 03-07-2017 ] [ 006_migrations.migrate_vm ]

Eyal Edri eedri at redhat.com
Tue Jul 4 09:46:35 UTC 2017


On Tue, Jul 4, 2017 at 12:18 PM, Michal Skrivanek <
michal.skrivanek at redhat.com> wrote:

>
> On 3 Jul 2017, at 15:35, Shlomo Ben David <sbendavi at redhat.com> wrote:
>
> Hi,
>
> Test failed: [ 006_migrations.migrate_vm ]
> Link to suspected patches: N/A
> Link to Job: http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_
> master/7431/
> Link to all logs:
> Error snippet from the log: http://jenkins.ovirt.org/job/test-repo_ovirt_
> experimental_master/7431/artifact/exported-artifacts/
> basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/
>
> <error>
>
>  "Fault reason is "Operation Failed". Fault detail is "[Cannot migrate VM.
> There is no host that satisfies current scheduling constraints. See below
> for details:, The host lago-basic-suite-master-host0 did not satisfy
> internal filter CPUOverloaded because its CPU is too loaded.]"
>
> </error>
>
> <engine log>
>
> 2017-07-02 16:43:22,829-04 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
> (default task-27) [87508047-fdc5-4a2f-9692-c83f7b55bbc2] Lock Acquired to
> object 'EngineLock:{exclusiveLocks='[2b34910d-cef2-44d6-a274-30e8473eb5d9=VM]',
> sharedLocks=''}'
> 2017-07-02 16:43:22,833-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.
> PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-27)
> [87508047-fdc5-4a2f-9692-c83f7b55bbc2] Compiled stored procedure. Call
> string is [{call getdiskvmelementspluggedtovm(?)}]
> 2017-07-02 16:43:22,833-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.
> PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-27)
> [87508047-fdc5-4a2f-9692-c83f7b55bbc2] SqlCall for procedure
> [GetDiskVmElementsPluggedToVm] compiled
> 2017-07-02 16:43:22,843-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.
> PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-27)
> [87508047-fdc5-4a2f-9692-c83f7b55bbc2] Compiled stored procedure. Call
> string is [{call getattacheddisksnapshotstovm(?, ?)}]
> 2017-07-02 16:43:22,843-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.
> PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-27)
> [87508047-fdc5-4a2f-9692-c83f7b55bbc2] SqlCall for procedure
> [GetAttachedDiskSnapshotsToVm] compiled
> 2017-07-02 16:43:22,919-04 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager]
> (default task-27) [87508047-fdc5-4a2f-9692-c83f7b55bbc2] Candidate host
> 'lago-basic-suite-master-host0' ('46bdc63d-98f5-4eee-81aa-2fb88b8f7cbe')
> was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'CPUOverloaded'
> (correlation id: null)
> 2017-07-02 16:43:22,920-04 WARN  [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
> (default task-27) [87508047-fdc5-4a2f-9692-c83f7b55bbc2] Validation of
> action 'MigrateVmToServer' failed for user admin at internal-authz. Reasons:
> VAR__ACTION__MIGRATE,VAR__TYPE__VM,SCHEDULING_ALL_HOSTS_
> FILTERED_OUT,VAR__FILTERTYPE__INTERNAL,$hostName
> lago-basic-suite-master-host0,$filterName CPUOverloaded,VAR__DETAIL__
> CPU_OVERLOADED,SCHEDULING_HOST_FILTERED_REASON_WITH_DETAIL
>
>
>
> This has nothing to do with migration
> The CPUOverload is a scheduling policy, unless there was any change in
> that area the obvious explanation would be that the host has a CPU overload
> condition.
> I briefly looked at logs and see ""cpuUser": "83.40", "cpuSys": "16.59",
> "cpuIdle": “0.08”” which indeed suggests an overload, from the same sample
> I can see it’s vdsm ("cpuUserVdsmd": “77.38”, cpuSysVdsmd": “18.44"
>
> Since similar values are consistently being reported for some time, and
> there is a setupNetworks and storage rescan prior to the the failure, and
> there is no other indication of anything wrong, I’d just say the
> environment or the order of tests or timing has changed, but nothing wrong
> with the oVirt code
> Did any of that changed recently? Does it reproduce locally?
>

AFAIK, no significant environment changes or tests were done.
We will try to reproduce it locally and also on the manual job,  but from
what it looks it is very consistent (unlike other race failures we've seen
lately ) and continues to fails on the same tests, so its either a change
in oVirt or something else that we're not thinking on.


>
> Thanks,
> michal
>
> 2017-07-02 16:43:22,920-04 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
> (default task-27) [87508047-fdc5-4a2f-9692-c83f7b55bbc2] Lock freed to
> object 'EngineLock:{exclusiveLocks='[2b34910d-cef2-44d6-a274-30e8473eb5d9=VM]',
> sharedLocks=''}'
> 2017-07-02 16:43:22,929-04 DEBUG [org.ovirt.engine.core.utils.timer.FixedDelayJobListener]
> (DefaultQuartzScheduler7) [] Rescheduling DEFAULT.org.ovirt.engine.core.
> bll.ColdRebootAutoStartVmsRunner.startFailedAutoStartVms#-9223372036854775733
> as there is no unfired trigger.
> 2017-07-02 16:43:22,932-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource]
> (default task-27) [] Operation Failed: [Cannot migrate VM. There is no host
> that satisfies current scheduling constraints. See below for details:, The
> host lago-basic-suite-master-host0 did not satisfy internal filter
> CPUOverloaded because its CPU is too loaded.]
> 2017-07-02 16:43:23,331-04 DEBUG [org.ovirt.engine.core.utils.timer.FixedDelayJobListener]
> (DefaultQuartzScheduler2) [] Rescheduling DEFAULT.org.ovirt.engine.core.
> bll.HaAutoStartVmsRunner.startFailedAutoStartVms#-9223372036854775793 as
> there is no unfired trigger.
> 2017-07-02 16:43:23,332-04 DEBUG [org.ovirt.engine.core.utils.timer.FixedDelayJobListener]
> (DefaultQuartzScheduler2) [] Rescheduling DEFAULT.org.ovirt.engine.core.
> bll.tasks.CommandCallbacksPoller.invokeCallbackMethods#-9223372036854775783
> as there is no unfired trigger.
>
> <engine log>
>
>
>
> Best Regards,
>
> Shlomi Ben-David | Software Engineer | Red Hat ISRAEL
> RHCSA | RHCVA | RHCE
> IRC: shlomibendavid (on #rhev-integ, #rhev-dev, #rhev-ci)
>
> OPEN SOURCE - 1 4 011 && 011 4 1
>
> _______________________________________________
> Devel mailing list
> Devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>
>
>
> _______________________________________________
> Devel mailing list
> Devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>



-- 

Eyal edri


ASSOCIATE MANAGER

RHV DevOps

EMEA VIRTUALIZATION R&D


Red Hat EMEA <https://www.redhat.com/>
<https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ovirt.org/pipermail/devel/attachments/20170704/9181d41f/attachment-0001.html>


More information about the Devel mailing list