[ovirt-devel] Experimental Flow for Master Fails to Run a VM

Eyal Edri eedri at redhat.com
Sun Dec 4 16:17:09 UTC 2016


Some more info on changes that were merged during Friday:

From the looks of it, [1] is the culprit. I'd like to avoid another db
upgrade issue caused by a revert, so can anyone from dev please handle this?

[1] https://gerrit.ovirt.org/#/c/67470/

core: New VM has RND device by default

Script adds urandom rng device to Blank template and all predefined
instance types. This causes that new VMs will inherit such RNG device.
Custom instance types are not changed. The assumption is that if they
were created without a RNG device, it was an intentional decision.

Change-Id: I93a51b67c0e8bff06152d9fe7a4315efd509774d
Bug-Url: https://bugzilla.redhat.com/1337101
Signed-off-by: Jakub Niedermertl <jniederm at redhat.com>
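To make the suspected failure mode concrete, here is a minimal Python sketch of
the kind of RNG device definition a VM would inherit from such a template, plus
a check against a restricted list of backend paths. The function names and the
allowed-source set are assumptions made for illustration; they are not taken
from engine, vdsm or libvirt code, and the exact set of paths an older libvirt
accepts should be verified on the affected host.

```python
import xml.etree.ElementTree as ET

# Assumption: an older libvirt accepts only these paths for
# <backend model='random'>. Verify against the libvirt on the host.
LEGACY_ALLOWED_SOURCES = frozenset({"/dev/random", "/dev/hwrng"})


def build_rng_device_xml(source_path):
    """Return a libvirt-style <rng> device with a 'random' backend."""
    rng = ET.Element("rng", model="virtio")
    backend = ET.SubElement(rng, "backend", model="random")
    backend.text = source_path
    return ET.tostring(rng, encoding="unicode")


def rng_source_accepted(device_xml, allowed_sources=LEGACY_ALLOWED_SOURCES):
    """Check whether the backend path of an <rng> device is in the allowed set."""
    backend = ET.fromstring(device_xml).find("backend")
    return backend is not None and backend.text in allowed_sources


if __name__ == "__main__":
    device = build_rng_device_xml("/dev/urandom")
    print(device)
    print("accepted by legacy set:", rng_source_accepted(device))
```

Under these assumptions a device backed by /dev/urandom is rejected while one
backed by /dev/random passes, which matches the symptom reported further down
the thread.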

ovirt-engine changelog

* d9fcf3b - engine: Reduce the severity of check-upgrade log (3 days ago)
Moti Asayag <masayag at redhat.com>
* 4616f4d - core: New VM has RND device by default (3 days ago) Jakub
Niedermertl <jniederm at redhat.com>
* c95365a - core: Fix of NPE when creating new instance type (3 days ago)
Jakub Niedermertl <jniederm at redhat.com>
* 2ff5c4d - restapi: Reflecting template RNG settings to new VM (3 days
ago) Jakub Niedermertl <jniederm at redhat.com>
* f8bdfa0 - frontend: use authz name instead of profile name for sysprep (3
days ago) Ondra Machacek <omachace at redhat.com>


vdsm changelog:

* a149cb7 - Adding simple client for sending gauge metrics to statsd port
using udp (3 days ago) Yaniv Bronhaim <ybronhei at redhat.com>
* d5c00b9 - API: move vm parameters fixup in a method (3 days ago)
Francesco Romani <fromani at redhat.com>
* 8180dfb - hostdev: add test for massive number of devices (3 days ago)
Martin Polednik <mpolednik at redhat.com>
* f904734 - Remove the usage of clientIF from GlusterApi (3 days ago)
Ramesh Nachimuthu <rnachimu at redhat.com>
* 36c0ce6 - rename method wrapApiMethod to _wrap_api_method (3 days ago)
Ramesh Nachimuthu <rnachimu at redhat.com>
* 3383158 - vmfakecon: optimize HostDeviceStub (3 days ago) Martin Polednik
<mpolednik at redhat.com>
* 316893c - hostdev: use cElementTree (3 days ago) Martin Polednik
<mpolednik at redhat.com>
* c56619e - client: document ConnectionError exception (3 days ago) Irit
Goihman <igoihman at redhat.com>
* f5d605e - py3: take Queue from six.moves (3 days ago) Dan Kenigsberg
<danken at redhat.com>



On Sun, Dec 4, 2016 at 3:34 PM, Eyal Edri <eedri at redhat.com> wrote:

> FYI,
>
> I opened a bug [1] to track this issue, since I don't see any attempts to
> resolve it on the thread. Hopefully a bug will get more attention.
> I opened it on VDSM since we see the libvirt error there; feel free to move
> product/team.
>
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1401303
>
> On Sun, Dec 4, 2016 at 1:23 PM, Eyal Edri <eedri at redhat.com> wrote:
>
>> Not sure if relevant, but Juan posted a fix for SDK4 the last time this
>> happened (though that was a different failure, on log-collector):
>>
>> https://gerrit.ovirt.org/#/c/67213/
>>
>> * Added `urandom` to the `RngSource` enumerated type.
>>
>> On Sun, Dec 4, 2016 at 9:17 AM, Eyal Edri <eedri at redhat.com> wrote:
>>
>>> And it's still failing since Friday.
>>> Since we don't have official CentOS 7.3 repos yet (hopefully we'll have
>>> them this week, but as of this moment they are not published), we have to
>>> either revert the offending patch or send a quick fix.
>>>
>>> Right now none of the experimental flows for master are working, and the
>>> nightly RPMs are not being refreshed.
>>>
>>>
>>>
>>> On Fri, Dec 2, 2016 at 9:41 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Dec 2, 2016 2:11 PM, "Anton Marchukov" <amarchuk at redhat.com> wrote:
>>>>
>>>> Hello Martin.
>>>>
>>>> By outdated, do you mean the old libvirt? If so, that is the libvirt
>>>> available in CentOS 7.2. There is no 7.3 yet.
>>>>
>>>>
>>>> Right, this is the issue.
>>>> Y.
>>>>
>>>>
>>>> Anton.
>>>>
>>>> On Fri, Dec 2, 2016 at 1:07 PM, Martin Polednik <mpolednik at redhat.com>
>>>> wrote:
>>>>
>>>>> On 02/12/16 10:55 +0100, Anton Marchukov wrote:
>>>>>
>>>>>> Hello All.
>>>>>>
>>>>>> Engine log can be viewed here:
>>>>>>
>>>>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3838/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity.py/lago-basic-suite-master-engine/_var_log_ovirt-engine/engine.log
>>>>>>
>>>>>> I see the following exception there:
>>>>>>
>>>>>> 2016-12-02 04:29:24,030-05 DEBUG [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker) [83b6b5d] Message received: {"jsonrpc": "2.0", "id": "ec254aad-441b-47e7-a644-aebddcc1d62c", "result": true}
>>>>>> 2016-12-02 04:29:24,030-05 ERROR [org.ovirt.vdsm.jsonrpc.client.JsonRpcClient] (ResponseWorker) [83b6b5d] Not able to update response for "ec254aad-441b-47e7-a644-aebddcc1d62c"
>>>>>> 2016-12-02 04:29:24,041-05 DEBUG [org.ovirt.engine.core.utils.timer.FixedDelayJobListener] (DefaultQuartzScheduler3) [47a31d72] Rescheduling DEFAULT.org.ovirt.engine.core.bll.gluster.GlusterSyncJob.refreshLightWeightData#-9223372036854775775 as there is no unfired trigger.
>>>>>> 2016-12-02 04:29:24,024-05 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (default task-12) [d932871a-af4f-4fc9-9ee5-f7a0126a7b85] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Timeout during xml-rpc call
>>>>>>         at org.ovirt.engine.core.vdsbroker.vdsbroker.FutureVDSCommand.get(FutureVDSCommand.java:73) [vdsbroker.jar:]
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> 2016-12-02 04:29:24,042-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (default task-12) [d932871a-af4f-4fc9-9ee5-f7a0126a7b85] Timeout waiting for VDSM response: Internal timeout occured
>>>>>> 2016-12-02 04:29:24,044-05 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (default task-12) [d932871a-af4f-4fc9-9ee5-f7a0126a7b85] START, GetCapabilitiesVDSCommand(HostName = lago-basic-suite-master-host0, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='5eb7019e-28a3-4f93-9188-685b6c64a2f5', vds='Host[lago-basic-suite-master-host0,5eb7019e-28a3-4f93-9188-685b6c64a2f5]'}), log id: 58f448b8
>>>>>> 2016-12-02 04:29:24,044-05 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.impl.Message] (default task-12) [d932871a-af4f-4fc9-9ee5-f7a0126a7b85] SEND
>>>>>> destination:jms.topic.vdsm_requests
>>>>>> reply-to:jms.topic.vdsm_responses
>>>>>> content-length:105
>>>>>>
>>>>>>
>>>>>> Please note that this runs on localhost with a local bridge, so it is
>>>>>> not likely to be the network itself.
>>>>>>
>>>>>
>>>>> The main issue I see is that the VM run command has actually failed
>>>>> because libvirt does not accept /dev/urandom as an RNG source [1]. That
>>>>> source was introduced by an engine patch which, according to git log, was
>>>>> posted around Mon Nov 28. Also adding Jakub: either the engine should not
>>>>> send this, or the lago host is outdated.
>>>>>
>>>>> [1]
>>>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3838/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity.py/lago-basic-suite-master-host0/_var_log_vdsm/vdsm.log
>>>>>
>>>>>
>>>>>> Anton.
>>>>>>
>>>>>> On Fri, Dec 2, 2016 at 10:43 AM, Anton Marchukov <amarchuk at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> FYI. Experimental flow for master currently fails to run a VM. The
>>>>>>> test times out after waiting for 180 seconds:
>>>>>>>
>>>>>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3838/testReport/(root)/004_basic_sanity/vm_run/
>>>>>>>
>>>>>>> This reproduced over the 23 runs that happened tonight, so it sounds
>>>>>>> like a regression to me:
>>>>>>>
>>>>>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/
>>>>>>>
>>>>>>> I will update here with additional information once I find it.
>>>>>>>
>>>>>>> Last successful run was with this patch:
>>>>>>>
>>>>>>> https://gerrit.ovirt.org/#/c/66416/ (vdsm: API: move vm parameters fixup
>>>>>>> in a method)
>>>>>>>
>>>>>>> Known to start failing around this patch:
>>>>>>>
>>>>>>> https://gerrit.ovirt.org/#/c/67647/ (vdsmapi: fix a typo in string
>>>>>>> formatting)
>>>>>>>
>>>>>>> Please note that we do not have gating implemented yet, so anything
>>>>>>> that was merged between those two patches might have caused this (not
>>>>>>> necessarily in the vdsm project).
>>>>>>>
>>>>>>> Anton.
>>>>>>> --
>>>>>>> Anton Marchukov
>>>>>>> Senior Software Engineer - RHEV CI - Red Hat
>>>>> _______________________________________________
>>>>>> Devel mailing list
>>>>>> Devel at ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>>>
>>>
>>> --
>>> Eyal Edri
>>> Associate Manager
>>> RHV DevOps
>>> EMEA ENG Virtualization R&D
>>> Red Hat Israel
>>>
>>> phone: +972-9-7692018
>>> irc: eedri (on #tlv #rhev-dev #rhev-integ)



-- 
Eyal Edri
Associate Manager
RHV DevOps
EMEA ENG Virtualization R&D
Red Hat Israel

phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)