On Wed, Jul 5, 2017 at 10:13 AM Eyal Edri <eedri@redhat.com> wrote:
On Wed, Jul 5, 2017 at 10:02 AM, Yaniv Kaul <ykaul@redhat.com> wrote:


On Wed, Jul 5, 2017 at 9:39 AM, Irit Goihman <igoihman@redhat.com> wrote:
https://gerrit.ovirt.org/#/c/78536 was indeed the offending patch. The change was reverted and OST should pass now.

- Do we know why?
- O-S-T seems to be a great tool for finding JSON-RPC/STOMP issues. I suggest running it on every change related to these. 

In addition, now that we have collectd installed, can't we add a test that checks whether CPU/memory consumption spikes above normal, and fails before the suite reaches actions like VM run/migration?
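A minimal sketch of what such a check could look like, assuming the CPU utilisation samples have already been exported from collectd by some other means. The function name, the baseline value and the thresholds are all hypothetical and not part of OST:

```python
def cpu_overloaded(samples, baseline, tolerance=0.25, min_consecutive=3):
    """Return True if CPU utilisation stays above the allowed limit.

    samples: CPU utilisation percentages in chronological order
             (assumed to come from collectd; how they are read is out of scope).
    baseline: utilisation considered normal for the suite (hypothetical value).
    tolerance: allowed relative excess over the baseline.
    min_consecutive: number of samples in a row that must exceed the limit,
                     so that a single short peak does not fail the test.
    """
    limit = baseline * (1 + tolerance)
    run = 0
    for value in samples:
        # Count consecutive samples above the limit; reset on a normal sample.
        run = run + 1 if value > limit else 0
        if run >= min_consecutive:
            return True
    return False
```

A test placed before the VM run/migration steps could assert `not cpu_overloaded(samples, baseline)` and so fail early with a clear message instead of a scheduling error later on.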

On my list.
 

Y.
 

On Tue, Jul 4, 2017 at 5:19 PM, Eyal Edri <eedri@redhat.com> wrote:
Guys,

I think we have established which vdsm works (git sha1: 28558d7) and what the changelog is from that point until the failure, so you have the list of changes and the steps to reproduce locally.
Again, this is reproducible both on CI and locally, so please go over the changes or reproduce the problem locally and look at the issue on a live system.





On Tue, Jul 4, 2017 at 5:07 PM, Piotr Kliczewski <piotr.kliczewski@gmail.com> wrote:
Looking at the last experimental job, the reason for the failure is:

2017-07-04 09:39:10,491-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-18) [] Operation Failed: [Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:, The host lago-basic-suite-master-host0 did not satisfy internal filter CPUOverloaded because its CPU is too loaded.]

Do we think that vdsm increased its cpu consumption recently?
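For anyone scanning the engine log for this kind of rejection, here is a small sketch that pulls the host, filter name and reason out of messages shaped like the one quoted above. The pattern is derived only from that single line, so other scheduling messages may be worded differently; the function name is hypothetical:

```python
import re

# Pattern based solely on the log line quoted in this thread.
FILTER_RE = re.compile(
    r"The host (?P<host>\S+) did not satisfy internal filter "
    r"(?P<filter>\w+) because (?P<reason>[^.\]]+)"
)


def scheduling_failures(log_text):
    """Yield (host, filter, reason) for each filter rejection found in log_text."""
    for match in FILTER_RE.finditer(log_text):
        yield (
            match.group("host"),
            match.group("filter"),
            match.group("reason").strip(),
        )
```

Feeding it the contents of the engine log and grouping the results by filter name would show quickly whether CPUOverloaded is the only filter rejecting the host.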

On Tue, Jul 4, 2017 at 3:54 PM, Irit Goihman <igoihman@redhat.com> wrote:
I've checked vdsm logs and couldn't find anything related to my change.
I'll run OST without my changes and see if it runs successfully.

On Tue, Jul 4, 2017 at 4:49 PM, Eyal Edri <eedri@redhat.com> wrote:


On Tue, Jul 4, 2017 at 4:29 PM, Dafna Ron <dron@redhat.com> wrote:
This issue is reproduced locally as well.

You can run the following to reproduce locally:

./run_suite.sh -s http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-x86_64/2694/ basic-suite-master

The environment will still be running afterwards, which allows you to inspect the live system.
If you have any issues, please ping me and I will help any way I can.

Thanks,
Dafna



Here is the list of changes from the verified vdsm (the one currently in tested) to HEAD:

* 74b2276 - (HEAD -> master, origin/master, origin/HEAD) stomp: add integration tests for client reconnect (6 hours ago) Irit Goihman <igoihman@redhat.com>
* 2a2f6cd - stomp: set default heartbeat values and add grace period (6 hours ago) Irit Goihman <igoihman@redhat.com>
* 56c306a - tests: Make random uuid test repeatable (17 hours ago) Nir Soffer <nsoffer@redhat.com>
* 864d4e3 - python3: Fix UUID packing/unpacking on python 3 (17 hours ago) Nir Soffer <nsoffer@redhat.com>
* 4ac4221 - python3: Improve uuid packing tests (17 hours ago) Nir Soffer <nsoffer@redhat.com>
* d264c8d - python3: Run misc_test in python 3 (17 hours ago) Nir Soffer <nsoffer@redhat.com>
* f923b0b - storage: Added disk type change logging (18 hours ago) Denis Chaplygin <dchaplyg@redhat.com>
* f1d54a1 - net: Unneeded newline is added when updating only the mtu (25 hours ago) Edward Haas <edwardh@redhat.com>
* 9056d61 - virt: metadata: remove dead code (26 hours ago) Francesco Romani <fromani@redhat.com>
* 08982b4 - virt: network: use core.find_device_guest_address (31 hours ago) Francesco Romani <fromani@redhat.com>
* 62e2bc5 - python3: Run qcow2_test on python 3 (2 days ago) Nir Soffer <nsoffer@redhat.com>
* 42f5efb - stomp: implement client reconnect (2 days ago) Irit Goihman <igoihman@redhat.com>
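With twelve candidate commits, a bisection needs far fewer suite runs than testing each change in turn. The sketch below is only an illustration: `is_bad` is a hypothetical callback (for example, build vdsm at that commit and run basic-suite-master against it), and it assumes the newest commit fails and that every commit after the first failing one fails too.

```python
# Commits from the list above, reversed so the oldest comes first.
COMMITS = [
    "42f5efb", "62e2bc5", "08982b4", "9056d61", "f1d54a1", "f923b0b",
    "d264c8d", "4ac4221", "864d4e3", "56c306a", "2a2f6cd", "74b2276",
]


def first_bad(commits, is_bad):
    """Binary-search commits (oldest first) for the first one that fails.

    is_bad: hypothetical callable taking a commit id and returning True
            when the suite fails with that commit.
    Assumes the last commit is bad and that failures are monotonic.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]
```

For these twelve commits this calls `is_bad` at most four times. `git bisect` does the same thing directly on the repository, if running it by hand is more convenient.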
 



On 07/04/2017 01:35 PM, Barak Korren wrote:


On 4 July 2017 at 14:32, Irit Goihman <igoihman@redhat.com> wrote:
https://gerrit.ovirt.org/#/c/78536 broke network functional tests but a fix was merged today: https://gerrit.ovirt.org/#/c/78925/

I tried to run OST with my fix yesterday and still encountered the same failures.

Here is a reproducer of the failure with the fix patch:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1061/

So that was probably not it...


--
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted


_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel




--

Eyal edri


ASSOCIATE MANAGER

RHV DevOps

EMEA VIRTUALIZATION R&D


Red Hat EMEA

TRIED. TESTED. TRUSTED.
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)



--

IRIT GOIHMAN

SOFTWARE ENGINEER

EMEA VIRTUALIZATION R&D

Red Hat EMEA

