[CQ]: 97841,2 (vdsm) failed "ovirt-master" system tests

On Tue, Feb 26, 2019, oVirt Jenkins <jenkins@ovirt.org> wrote:

Change 97841,2 (vdsm) is probably the reason behind recent system test failures in the "ovirt-master" change queue and needs to be fixed. This change has been removed from the testing queue. Artifacts built from this change will not be released until it is fixed.
For further details about the change see: https://gerrit.ovirt.org/#/c/97841/2
For failed test results see: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/13141/

On Tue, Feb 26, 2019, Dafna Ron <dron@redhat.com> wrote:

The error is caused by not being able to reach the storage:

2019-02-26 07:49:32,804-05 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-3) [] Operation Failed: [Storage domain cannot be reached. Please ensure it is accessible from the host(s).]

This is not related to the patch, which only adds a 4.3 branch. Adding Evgheni to rule out any issues on the infra side, as we have had some random failures due to accessibility of VMs in the past few days.
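
As a quick way to confirm what the engine reports about the storage domains, here is a minimal sketch using the oVirt Python SDK (ovirt-engine-sdk4); the engine URL and credentials are placeholders, not values from this thread:

```python
# Minimal sketch (placeholder URL/credentials): list storage domains and the
# status the engine reports for them via the REST API.
import ovirtsdk4 as sdk

connection = sdk.Connection(
    url='https://engine.example.org/ovirt-engine/api',  # placeholder engine URL
    username='admin@internal',                           # placeholder credentials
    password='secret',
    insecure=True,  # or pass ca_file='ca.pem' to verify the engine certificate
)
try:
    for sd in connection.system_service().storage_domains_service().list():
        # For attached domains the active/inactive state is reported per data
        # center, so a None status here is normal; unattached or problematic
        # domains show it directly.
        print(sd.name, sd.type, sd.status)
finally:
    connection.close()
```

If a domain looks fine from the engine's side but a specific host cannot reach it, checking mount/NFS connectivity from that host would be the next step.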

On Wed, Feb 27, 2019, Eyal Edri <eedri@redhat.com> wrote:

Why would there be random failures accessing the VMs? I see vdsm has been failing for a few runs already; it might be a real regression that needs to be investigated, not a random failure. It might be worth communicating it to devel and getting someone to help with the investigation, if possible.
--
Eyal Edri
Manager, RHV/CNV DevOps
EMEA Virtualization R&D
Red Hat EMEA <https://www.redhat.com/>
phone: +972-9-7692018 | irc: eedri (on #tlv #rhev-dev #rhev-integ)

On Wed, Feb 27, 2019, Dafna Ron <dron@redhat.com> wrote:

1. It happens on several projects and on several tests.
2. After re-running the changes, they pass.

Slowness on the servers/network can cause VMs to start up more slowly, which can cause certain tests to fail, as we can see here. I am not saying that we should not involve dev; I am saying that I would like to make sure the infrastructure is OK first, since reporting issues to developers on the list when it is not a regression can do more harm than good.
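
To illustrate the failure mode described above (nothing regressed, the VM simply comes up slower than the test's fixed window), here is a generic polling sketch; `vm_is_up` is a hypothetical check, not a function from the test suite:

```python
import time

def wait_for(condition, timeout=120, interval=2):
    """Poll `condition` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return False

# A test written like this fails whenever slow storage/network pushes the VM
# boot past the fixed timeout, even though the code under test did not change:
#
#     assert wait_for(lambda: vm_is_up("vm0"), timeout=180), "VM did not come up in time"
```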

On Wed, Feb 27, 2019, Galit Rosenthal <grosenth@redhat.com> wrote:

It looks like vdsm is failing on the same server (ovirt-srv23.phx.ovirt.org) all the time, with the same error. It looks like an infra issue.
--
Galit Rosenthal
Software Engineer
Red Hat <https://www.redhat.com/>
galit@gmail.com | T: 972-9-7692230

An ovirt-provider-ovn change also failed on the same test on the same host, which means this is not specific to vdsm and reinforces that it is an infra issue.
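
One way to sanity-check the "always the same host" observation across several runs is to count how many log files in each run mention the suspect host; this sketch assumes a hypothetical local layout of runs/<build-number>/ with each job's logs downloaded and unpacked inside (not a path used by the CI itself):

```python
import glob
import os

HOST = "ovirt-srv23.phx.ovirt.org"

# Hypothetical layout: logs from several change-queue-tester builds downloaded
# locally under runs/<build-number>/.
for run in sorted(glob.glob("runs/*/")):
    hits = 0
    for root, _dirs, files in os.walk(run):
        for name in files:
            try:
                with open(os.path.join(root, name), errors="ignore") as fh:
                    if HOST in fh.read():
                        hits += 1
            except OSError:
                continue
    print(f"{run}: {hits} log files mention {HOST}")
```
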
participants (4)
- Dafna Ron
- Eyal Edri
- Galit Rosenthal
- oVirt Jenkins