On Sun, Dec 18, 2016 at 6:08 PM, Barak Korren <bkorren(a)redhat.com> wrote:
On 18 December 2016 at 17:26, Nir Soffer <nsoffer(a)redhat.com> wrote:
> On Sun, Dec 18, 2016 at 4:17 PM, Barak Korren <bkorren(a)redhat.com> wrote:
> We see a lot of these errors in the rest of the log. This means something
> is wrong with this vg.
>
> This needs deeper investigation from storage developers on both the engine
> and vdsm sides, but I would start by making sure we use clean luns. We are
> not trying to test esoteric negative flows in the system tests.
Here is the storage setup script:
https://gerrit.ovirt.org/gitweb?p=ovirt-system-tests.git;a=blob;f=common/...
iscsiadm -m discovery -t sendtargets -p 127.0.0.1
iscsiadm -m node -L all
This is alarming. Before we serve these luns, we should log out
from these nodes, and remove the nodes.
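A minimal sketch of the cleanup I have in mind, assuming it runs on the machine that did the discovery, before the luns are served. The DRY_RUN switch and the function names are hypothetical, added only so the commands can be previewed without touching the system; they are not part of the setup script.

```shell
#!/bin/sh
# Sketch only: drop stale iSCSI state before serving fresh luns.
# DRY_RUN is a hypothetical switch for illustration; when set to 1 the
# commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

cleanup_iscsi_nodes() {
    # Log out of every recorded node; ignore the failure if no session exists.
    run iscsiadm -m node -U all || true
    # Delete the node records so the next discovery starts from scratch.
    run iscsiadm -m node -o delete || true
}
```

Running `DRY_RUN=1 cleanup_iscsi_nodes` prints the two iscsiadm commands; without DRY_RUN they are executed and need root.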
All storage used in the system tests comes from the engine VM itself,
and is placed on a newly allocated QCOW2 file (exposed as /dev/sde to
the engine VM), so it's unlikely the LUNs are not clean.
We did not change code related to getDeviceList lately; these getPV errors
tell us that there is an issue in a lower-level component or in the storage
server.
Does this test pass with an older version of vdsm? Of engine?
> Did we change something in the system tests project or lago while we
> were not looking?
Not likely as well:
https://gerrit.ovirt.org/gitweb?p=ovirt-system-tests.git;a=shortlog
The ovirt-system-tests project has its own CI, testing against the
last nightly (we will move it to the last build that passed the tests
soon). So we are unlikely to merge breaking code there.
It depends on the tests.
Do you have a test that logs in to the target and creates a vg using
the luns?
Then again
we're not gating the OS packages so some breakage may have gone in via
CentOS repos...
Are these failures with CentOS 7.2 or 7.3? Both?
> Can we reproduce this issue manually with the same engine and vdsm versions?
You have several options:
1: Get engine+vdsm builds from Jenkins:
http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-fc24-x86...
http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-x86_64/
(Getting the exact builds that went into a given OST run takes tracing
back the job invocation links from that run)
2: Use the latest experimental repo:
http://resources.ovirt.org/repos/ovirt/experimental/master/latest/rpm/el7/
3: Run lago and OST locally:
(as documented here:
http://ovirt-system-tests.readthedocs.io/en/latest/
you'd need to pass in the vdsm and engine packages to use)
Do you know how to set up the system so it runs all the setup code up to
the code that causes the getPV errors?
We need to inspect the system at this point.
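As a rough sketch of what I would look at once the system is stopped at that point, assuming a shell on the host where the getPV errors show up. The DRY_RUN switch and the function names are hypothetical, for illustration only; the commands themselves are the standard iscsiadm, lsblk, multipath and lvm tools.

```shell
#!/bin/sh
# Sketch only: collect the storage state around the getPV errors.
# DRY_RUN is a hypothetical switch for illustration; when set to 1 the
# commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

inspect_storage() {
    # Active iSCSI sessions on this host.
    run iscsiadm -m session || true
    # Block devices as the kernel sees them.
    run lsblk || true
    # Multipath maps built on top of the luns.
    run multipath -ll || true
    # Physical volumes and the vg each one belongs to.
    run pvs -o pv_name,vg_name,pv_uuid || true
    # Volume groups known to lvm.
    run vgs || true
}
```

`DRY_RUN=1 inspect_storage` only lists the commands; run it as root without DRY_RUN to collect the actual output.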
Nir