OST fails in 002_bootstrap_pytest.py - setup_storage.sh

Looks like an infrastructure issue setting up storage on the engine host.

Here are 2 failing builds with unrelated changes:
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6677/
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6678/

Is this a known issue?

Error Message

AssertionError: setup_storage.sh failed. Exit code is 1 assert 1 == 0 -1 +0

Stacktrace

prefix = <ovirtlago.prefix.OvirtPrefix object at 0x7f6fd2b998d0>

    @pytest.mark.run(order=14)
    def test_configure_storage(prefix):
        engine = prefix.virt_env.engine_vm()
        result = engine.ssh(
            [
                '/tmp/setup_storage.sh',
            ],
        )
>       assert result.code == 0, 'setup_storage.sh failed. Exit code is %s' % result.code
E       AssertionError: setup_storage.sh failed. Exit code is 1
E       assert 1 == 0
E         -1
E         +0

The pytest traceback is nice, but in this case it does not show any useful info.

Since we run a script using ssh, the error message should include the process stdout and stderr, which can probably explain the failure.

Also I wonder why this code is called as a test (test_configure_storage). This looks like a setup step, so it should run as a fixture.

Nir
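
A minimal sketch of the suggestion above, carrying stdout and stderr in the failure message. The `CommandResult` tuple is a hypothetical stand-in for whatever `engine.ssh()` returns; only `result.code` appears in the traceback above, so the `out` and `err` attribute names are assumptions here.

```python
from collections import namedtuple

# Hypothetical stand-in for the object returned by engine.ssh().
# Only `code` is visible in the traceback; `out` and `err` are assumed names.
CommandResult = namedtuple('CommandResult', ['code', 'out', 'err'])


def check_script_result(name, result):
    """Raise an error that carries stdout and stderr when a script fails."""
    if result.code != 0:
        raise RuntimeError(
            '{} failed with exit code: {}.\nstdout:\n{}\nstderr:\n{}'.format(
                name, result.code, result.out, result.err
            )
        )
    return result
```

With a helper like this, the test report shows the script output directly instead of only `assert 1 == 0`.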

On Fri, Mar 20, 2020 at 9:35 PM Nir Soffer <nsoffer@redhat.com> wrote:
> Here are 2 failing builds with unrelated changes:
> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6677/
> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6678/

Rebuilding still fails in setup_storage:
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6679/testReport/
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6680/testReport/

On 3/21/20 1:18 AM, Nir Soffer wrote:
> On Fri, Mar 20, 2020 at 9:35 PM Nir Soffer <nsoffer@redhat.com> wrote:
>> Since we run a script using ssh, the error message should include the process stdout and stderr, which can probably explain the failure.

I posted https://gerrit.ovirt.org/#/c/107830/ to improve logging during storage setup. Unfortunately, AFAICS it didn't fail, so I guess we'll have to merge it and wait for a failed job to get some helpful logs.

>> Also I wonder why this code is called as a test (test_configure_storage). This looks like a setup step, so it should run as a fixture.

That's true, but the pytest porting effort was about providing a bare minimum to move away from nose. Organizing the tests into proper setup/fixtures is a huge task and will probably be implemented incrementally in the near future.

On Mon, Mar 23, 2020 at 1:25 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
> I posted https://gerrit.ovirt.org/#/c/107830/ to improve logging during storage setup. Unfortunately, AFAICS it didn't fail, so I guess we'll have to merge it and wait for a failed job to get some helpful logs.

Thanks.

It still fails for me with current code:
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6689/testReport/

Same when using current vdsm master.

> Organizing the tests into proper setup/fixtures is a huge task and will probably be implemented incrementally in the near future.

Understood

On 3/23/20 2:17 PM, Nir Soffer wrote:
> It still fails for me with current code:
> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6689/testReport/
>
> Same when using current vdsm master.

I updated the patch according to your suggestions and am currently trying out OST for the 4th time - all previous runs succeeded. I guess I'm out of luck :)

On Mon, Mar 23, 2020 at 3:26 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
> I updated the patch according to your suggestions and am currently trying out OST for the 4th time - all previous runs succeeded. I guess I'm out of luck :)

It succeeds on your local OST setup but fails on Jenkins?

On 3/23/20 2:53 PM, Nir Soffer wrote:
> It succeeds on your local OST setup but fails on Jenkins?

No, I mean Jenkins - both check-patch runs didn't fail on this script. I also tried running OST manually twice and the same thing happened. Anyway - the patch has been merged now, so if any failure occurs in CQ we should know what's going on.

On 3/23/20 3:10 PM, Marcin Sobczyk wrote:
> Anyway - the patch has been merged now, so if any failure occurs in CQ we should know what's going on.

Ok, finally caught a failure in CQ [1]:

[2020-03-23T14:14:09.836Z]     if result.code != 0:
[2020-03-23T14:14:09.836Z]         msg = (
[2020-03-23T14:14:09.836Z]             'setup_storage.sh failed with exit code: {}.\n'
[2020-03-23T14:14:09.836Z]             'stdout:\n{}'
[2020-03-23T14:14:09.836Z]             'stderr:\n{}'
[2020-03-23T14:14:09.836Z]         ).format(result.code, result.out, result.err)
[2020-03-23T14:14:09.836Z] >       raise RuntimeError(msg)
[2020-03-23T14:14:09.836Z] E RuntimeError: setup_storage.sh failed with exit code: 1.
[2020-03-23T14:14:09.836Z] E stdout:
[2020-03-23T14:14:09.836Z] E Reposync & Extra Sources Content 0.0 B/s | 0 B 00:00
[2020-03-23T14:14:09.836Z] E stderr:
[2020-03-23T14:14:09.836Z] E + set -xe
[2020-03-23T14:14:09.836Z] E + MAIN_NFS_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2
[2020-03-23T14:14:09.836Z] E + ISCSI_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_3
[2020-03-23T14:14:09.836Z] E + NUM_LUNS=5
[2020-03-23T14:14:09.836Z] E ++ uname -r
[2020-03-23T14:14:09.836Z] E ++ awk -F. '{print $(NF-1)}'
[2020-03-23T14:14:09.836Z] E + DIST=el8_1
[2020-03-23T14:14:09.836Z] E + main
[2020-03-23T14:14:09.836Z] E ++ hostname
[2020-03-23T14:14:09.836Z] E + [[ lago-basic-suite-master-engine == *\i\p\v\6* ]]
[2020-03-23T14:14:09.836Z] E + install_deps
[2020-03-23T14:14:09.836Z] E + systemctl disable --now kdump.service
[2020-03-23T14:14:09.836Z] E Removed /etc/systemd/system/multi-user.target.wants/kdump.service.
[2020-03-23T14:14:09.836Z] E + yum install --nogpgcheck -y nfs-utils rpcbind lvm2 targetcli sg3_utils iscsi-initiator-utils lsscsi policycoreutils-python-utils
[2020-03-23T14:14:09.836Z] E Failed to download metadata for repo 'alocalsync'
[2020-03-23T14:14:09.836Z] E Error: Failed to download metadata for repo 'alocalsync'

[1] https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-master_change-que...
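
Because the script runs with `set -xe`, the last traced command (a line starting with `+`) before the script exits is the one that failed. As a rough sketch, the Jenkins timestamp and the pytest `E` marker can be stripped from such a log to find that command; `strip_prefix` and `failed_command` are hypothetical helper names, and the regular expression assumes the exact prefix format seen in the log above.

```python
import re

# Jenkins timestamp such as "[2020-03-23T14:14:09.836Z]", optionally followed
# by the pytest "E" or ">" marker. Assumes the format seen in this thread.
PREFIX = re.compile(
    r'^\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z\]\s?(?:[E>]\s+)?'
)


def strip_prefix(line):
    """Remove the Jenkins timestamp and pytest marker from one log line."""
    return PREFIX.sub('', line)


def failed_command(lines):
    """Return (last traced command, output lines printed after it)."""
    last_cmd = None
    tail = []
    for raw in lines:
        line = strip_prefix(raw)
        if line.startswith('+'):
            # A new "set -x" trace line: remember it and reset the output.
            last_cmd = line.lstrip('+ ').rstrip()
            tail = []
        elif last_cmd is not None:
            tail.append(line)
    return last_cmd, tail


sample = [
    "[2020-03-23T14:14:09.836Z] E + install_deps",
    "[2020-03-23T14:14:09.836Z] E + systemctl disable --now kdump.service",
    "[2020-03-23T14:14:09.836Z] E + yum install --nogpgcheck -y nfs-utils",
    "[2020-03-23T14:14:09.836Z] E Failed to download metadata for repo 'alocalsync'",
]
command, output = failed_command(sample)
print(command)
print(output)
```

Applied to the full log, this points at the `yum install` step and the repository metadata error, which suggests a problem with the 'alocalsync' repo rather than with the storage setup itself.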

On Mon, Mar 23, 2020 at 3:16 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
> Ok, finally caught a failure in CQ [1]:
> [...]
> [2020-03-23T14:14:09.836Z] E Failed to download metadata for repo 'alocalsync'
> [2020-03-23T14:14:09.836Z] E Error: Failed to download metadata for repo 'alocalsync'
>
> [1] https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-master_change-que...

Galit, could you please take a look?

--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.

I will look at it.

On Mon, Mar 23, 2020 at 4:18 PM Martin Perina <mperina@redhat.com> wrote:
> Galit, could you please take a look?

--
GALIT ROSENTHAL
SOFTWARE ENGINEER
Red Hat <https://www.redhat.com/>
galit@redhat.com T: 972-9-7692230
<https://red.ht/sig>

I run it now locally using the extra sources as it runs in the CQ and it didn't fail for me. I will continue to investigate tomorrow, Marcin, did you see this issue also in check_patch or only in CQ? Regards, Galit On Mon, Mar 23, 2020 at 4:29 PM Galit Rosenthal <grosenth@redhat.com> wrote:
I will look at it.
On Mon, Mar 23, 2020 at 4:18 PM Martin Perina <mperina@redhat.com> wrote:
On Mon, Mar 23, 2020 at 3:16 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
On 3/23/20 3:10 PM, Marcin Sobczyk wrote:
On 3/23/20 2:53 PM, Nir Soffer wrote:
On Mon, Mar 23, 2020 at 3:26 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
On 3/23/20 2:17 PM, Nir Soffer wrote: > On Mon, Mar 23, 2020 at 1:25 PM Marcin Sobczyk > <msobczyk@redhat.com> wrote: >> >> On 3/21/20 1:18 AM, Nir Soffer wrote: >> >> On Fri, Mar 20, 2020 at 9:35 PM Nir Soffer <nsoffer@redhat.com> >> wrote: >>> Looks like infrastructure issue setting up storage on engine host. >>> >>> Here are 2 failing builds with unrelated changes: >>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6677/ >>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6678/ >> Rebuilding still fails in setup_storage: >> >>
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6679/testReport/
>> >> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6680/testReport/ >> >> >>> Is this a known issue? >>> >>> Error Message >>> >>> AssertionError: setup_storage.sh failed. Exit code is 1 assert 1 >>> == 0 -1 +0 >>> >>> Stacktrace >>> >>> prefix = <ovirtlago.prefix.OvirtPrefix object at 0x7f6fd2b998d0> >>> >>> @pytest.mark.run(order=14) >>> def test_configure_storage(prefix): >>> engine = prefix.virt_env.engine_vm() >>> result = engine.ssh( >>> [ >>> '/tmp/setup_storage.sh', >>> ], >>> ) >>>> assert result.code == 0, 'setup_storage.sh failed. Exit >>>> code is %s' % result.code >>> E AssertionError: setup_storage.sh failed. Exit code is 1 >>> E assert 1 == 0 >>> E -1 >>> E +0 >>> >>> >>> The pytest traceback is nice, but in this case it is does not >>> show any useful info. >>> >>> Since we run a script using ssh, the error message should include >>> the process stdout and stderr >>> which probably can explain the failure. >> I posted https://gerrit.ovirt.org/#/c/107830/ to improve logging >> during storage setup. >> Unfortunately AFAICS it didn't fail, so I guess we'll have to >> merge it and wait for a failed job to get some helpful logs. > Thanks. > > It still fails for me with current code: > https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6689/testReport/ > > > Same when using current vdsm master. Updated the patch according to your suggestions and currently trying out OST for the 4th time - all previous runs succeeded. I guess I'm out of luck :) It succeeds on your local OST setup but fail on Jenkins? No, I mean jenkins - both check-patch runs didn't fail on this script. I also tried running OST manually twice and same thing happened. Anyway - the patch has been merged now so if any failure occurs in CQ we should know what's going on. Ok, finally caught a failure in CQ [1]:
[2020-03-23T14:14:09.836Z] if result.code != 0: [2020-03-23T14:14:09.836Z] msg = ( [2020-03-23T14:14:09.836Z] 'setup_storage.sh failed with exit code: {}.\n' [2020-03-23T14:14:09.836Z] 'stdout:\n{}' [2020-03-23T14:14:09.836Z] 'stderr:\n{}' [2020-03-23T14:14:09.836Z] ).format(result.code, result.out, result.err) [2020-03-23T14:14:09.836Z] > raise RuntimeError(msg) [2020-03-23T14:14:09.836Z] E RuntimeError: setup_storage.sh failed with exit code: 1. [2020-03-23T14:14:09.836Z] E stdout: [2020-03-23T14:14:09.836Z] E Reposync & Extra Sources Content 0.0 B/s | 0 B 00:00 [2020-03-23T14:14:09.836Z] E stderr: [2020-03-23T14:14:09.836Z] E + set -xe [2020-03-23T14:14:09.836Z] E + MAIN_NFS_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2 [2020-03-23T14:14:09.836Z] E + ISCSI_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_3 [2020-03-23T14:14:09.836Z] E + NUM_LUNS=5 [2020-03-23T14:14:09.836Z] E ++ uname -r [2020-03-23T14:14:09.836Z] E ++ awk -F. '{print $(NF-1)}' [2020-03-23T14:14:09.836Z] E + DIST=el8_1 [2020-03-23T14:14:09.836Z] E + main [2020-03-23T14:14:09.836Z] E ++ hostname [2020-03-23T14:14:09.836Z] E + [[ lago-basic-suite-master-engine == *\i\p\v\6* ]] [2020-03-23T14:14:09.836Z] E + install_deps [2020-03-23T14:14:09.836Z] E + systemctl disable --now kdump.service [2020-03-23T14:14:09.836Z] E Removed /etc/systemd/system/multi-user.target.wants/kdump.service. [2020-03-23T14:14:09.836Z] E + yum install --nogpgcheck -y nfs-utils rpcbind lvm2 targetcli sg3_utils iscsi-initiator-utils lsscsi policycoreutils-python-utils [2020-03-23T14:14:09.836Z] E Failed to download metadata for repo 'alocalsync' [2020-03-23T14:14:09.836Z] E Error: Failed to download metadata for repo 'alocalsync'
[1] https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-master_change-que...
Galit, could you please take a look?
>>> Also I wonder why this code is called as a test
>>> (test_configure_storage). This looks like setup
>>> step so it should run as a fixture.
>> That's true, but the pytest porting effort was about providing a
>> bare minimum to move away from nose.
>> Organizing the tests into proper setup/fixtures is a huge task and
>> will be probably implemented
>> incrementally in the nearest future.
> Understood
-- Martin Perina Manager, Software Engineering Red Hat Czech s.r.o.
--
GALIT ROSENTHAL
SOFTWARE ENGINEER
Red Hat
galit@redhat.com T: 972-9-7692230 <https://red.ht/sig>
-- GALIT ROSENTHAL SOFTWARE ENGINEER Red Hat <https://www.redhat.com/> galit@redhat.com T: 972-9-7692230 <https://red.ht/sig>
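
For readers following the thread: the CQ failure report quoted in this message only became readable once the check raised an error carrying the script's stdout and stderr. Below is a minimal sketch of that pattern, not the OST code itself. It assumes a local subprocess as a stand-in for the Lago engine.ssh(...) call, and the names run_script and ScriptError are hypothetical, chosen only for illustration.

```python
import subprocess
import sys


class ScriptError(RuntimeError):
    """Raised when a setup script exits with a non-zero code (hypothetical name)."""


def run_script(argv):
    """Run argv locally and return its stdout; raise ScriptError on failure.

    Assumption: a local subprocess stands in for the remote ssh call used
    in the thread, whose result exposes an exit code, stdout and stderr.
    """
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # Put everything a reader needs to diagnose the failure in the message.
        # A newline separates the two sections so stderr starts on its own line.
        raise ScriptError(
            '{} failed with exit code: {}.\n'
            'stdout:\n{}\n'
            'stderr:\n{}'.format(
                argv[0], result.returncode, result.stdout, result.stderr
            )
        )
    return result.stdout


if __name__ == '__main__':
    # Simulate a failing script: writes to both streams, then exits with 1.
    failing = [
        sys.executable, '-c',
        "import sys; print('partial output'); "
        "sys.stderr.write('repo unreachable'); sys.exit(1)",
    ]
    try:
        run_script(failing)
    except ScriptError as exc:
        print(exc)
```

In a pytest suite, a helper like this could be called from a session-scoped fixture instead of a test function, which is the direction suggested earlier in the thread. That restructuring is not shown here.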

On 3/23/20 8:51 PM, Galit Rosenthal wrote:
> I run it now locally using the extra sources as it runs in the CQ and it didn't fail for me.
> I will continue to investigate tomorrow,
> Marcin, did you see this issue also in check_patch or only in CQ?

I wasn't aware of the issue till Nir raised it - I was working with the patch previously and both check-patch and manual runs were fine. I think it concerns only CQ then.

> Regards, Galit
On Mon, Mar 23, 2020 at 4:29 PM Galit Rosenthal <grosenth@redhat.com> wrote:
I will look at it.
Hi Galit

I can see the issue again - now in manual OST runs:

https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests...

Regards, Marcin

It looks like the local repo stops running. When I run curl before the failure, just to check the status, I can see that it isn't accessible.

I'm trying to see where it fails or what causes it to fail.

I managed to reproduce it on BM.
-- GALIT ROSENTHAL SOFTWARE ENGINEER Red Hat <https://www.redhat.com/> galit@redhat.com T: 972-9-7692230 <https://red.ht/sig>
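
For readers who want to repeat the curl check described in this message from Python, here is a minimal sketch. It relies on the standard yum/dnf repository layout, in which the metadata index lives at repodata/repomd.xml under the repo's base URL. The function name repo_is_reachable is hypothetical, and the base URL of the 'alocalsync' repo is not given in the thread, so it must be supplied by the caller.

```python
import sys
import urllib.request


def repo_is_reachable(base_url, timeout=5.0):
    """Return True if <base_url>/repodata/repomd.xml can be fetched.

    A failure here corresponds to the "Failed to download metadata for
    repo" error reported by yum in the CQ log.
    """
    url = base_url.rstrip('/') + '/repodata/repomd.xml'
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib.error.URLError and HTTPError are OSError subclasses, so this
        # covers refused connections, timeouts and HTTP error statuses.
        return False


if __name__ == '__main__':
    # Usage: python check_repo.py <repo base URL>
    # The URL is an argument because the real repo address is an unknown here.
    if len(sys.argv) == 2:
        state = 'reachable' if repo_is_reachable(sys.argv[1]) else 'NOT reachable'
        print('{} is {}'.format(sys.argv[1], state))
```

Calling a probe like this right before each package installation step would show whether the repo server was already down at that point, which is the question being investigated above.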

I thought that moving setup_storage would mitigate the issue:

https://gerrit.ovirt.org/#/c/107989/

But it just postponed the error to a later phase; now adding the host fails with the same issue:

Failed to download metadata for repo 'alocalsync'

https://jenkins.ovirt.org/view/oVirt system tests/job/ovirt-system-tests_manual/6710

So Galit, please take a look; oVirt CQ has been suffering from this issue for more than a week now.
-- Martin Perina Manager, Software Engineering Red Hat Czech s.r.o.

After investigating, it looks like the issues started when this patch was merged.
Marcin, can you help debug it?

https://gerrit.ovirt.org/#/c/107399/

Thanks, Galit
-- GALIT ROSENTHAL SOFTWARE ENGINEER Red Hat <https://www.redhat.com/> galit@redhat.com T: 972-9-7692230 <https://red.ht/sig>
participants (4)
- Galit Rosenthal
- Marcin Sobczyk
- Martin Perina
- Nir Soffer