On Sun, Dec 18, 2016 at 6:08 PM, Barak Korren <bkorren@redhat.com> wrote:
> On 18 December 2016 at 17:26, Nir Soffer <nsoffer@redhat.com> wrote:
>> On Sun, Dec 18, 2016 at 4:17 PM, Barak Korren <bkorren@redhat.com> wrote:
>
>> We see a lot of these errors in the rest of the log. This means something
>> is wrong with this vg.
>>
>> Needs deeper investigation from storage developer on both engine and vdsm side,
>> but I would start by making sure we use clean luns. We are not trying
>> to test esoteric negative flows in the system tests.
>
> Here is the storage setup script:
> https://gerrit.ovirt.org/gitweb?p=ovirt-system-tests.git;a=blob;f=common/deploy-scripts/setup_storage_unified_he_extra_iscsi_el7.sh;hb=refs/heads/master
25 iscsiadm -m discovery -t sendtargets -p 127.0.0.1
26 iscsiadm -m node -L all
This is alarming. Before we serve these luns, we should log out
from these nodes, and remove the nodes.
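
A minimal sketch of the cleanup I have in mind, assuming it runs on the
machine that executed the two iscsiadm commands above. The function name
and the ISCSIADM override are hypothetical, added only so the steps can be
dry-run with a stub; they are not part of the OST script.

```shell
# Sketch only: log out of all iSCSI nodes and delete their records.
# ISCSIADM can be overridden to dry-run the steps (assumption of this sketch).
ISCSIADM="${ISCSIADM:-iscsiadm}"

cleanup_iscsi_nodes() {
    # Log out of every node we have a record for; tolerate "no sessions".
    "$ISCSIADM" -m node -U all || true
    # Delete all node records so nothing logs in again automatically.
    "$ISCSIADM" -m node -o delete || true
}
```

Calling cleanup_iscsi_nodes before the luns are served to the hosts should
leave the machine with no sessions and no node records.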
> All storage used in the system tests comes from the engine VM itself,
> and is placed on a newly allocated QCOW2 file (exposed as /dev/sde to
> the engine VM), so it's unlikely the LUNs are not clean.
We did not change code related to getDeviceList lately; these getPV errors
tell us that there is an issue in a lower level component or the storage
server.
Does this test pass with an older version of vdsm? Of engine?
>> Did we change something in the system tests project or lago while we
>> were not looking?
>
> Not likely as well:
> https://gerrit.ovirt.org/gitweb?p=ovirt-system-tests.git;a=shortlog
>
> ovirt-system-tests project has got its own CI, testing against the
> last nightly (we will move it to the last build that passed the tests
> soon). So we are unlikely to merge breaking code there.
It depends on the tests.
Do you have a test that logs in to the target and creates a vg using
the luns?
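
Something along these lines, as a rough sketch and not an existing OST
test. The function name, the RUN dry-run prefix, and the arguments are
hypothetical; the caller must pass the real portal and the device that
backs the lun. Note that it overwrites whatever is on that device.

```shell
# Sketch only: discover, log in, then create and remove a throwaway VG.
# RUN=echo turns this into a dry run (assumption of this sketch).
RUN="${RUN:-}"

try_vg_on_lun() {
    portal="$1"   # address of the iSCSI server
    device="$2"   # block device backing the lun, supplied by the caller
    vg="$3"       # throwaway VG name

    # Stop at the first failing step so the failing layer is obvious.
    $RUN iscsiadm -m discovery -t sendtargets -p "$portal" &&
    $RUN iscsiadm -m node -L all &&
    $RUN pvcreate "$device" &&
    $RUN vgcreate "$vg" "$device" &&
    $RUN vgremove -f "$vg" &&
    $RUN pvremove "$device"
}
```

If this fails on a fresh setup, the problem is below vdsm.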
> Then again
> we're not gating the OS packages so some breakage may have gone in via
> CentOS repos...
Are these failures with CentOS 7.2 or 7.3? Both?
>> Can we reproduce this issue manually with same engine and vdsm versions?
>
> You have several options:
> 1: Get engine+vdsm builds from Jenkins:
> http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-fc24-x86_64/
> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-x86_64/
> (Getting the exact builds that went into a given OST run takes tracing
> back the job invocation links from that run)
>
> 2: Use the latest experimental repo:
> http://resources.ovirt.org/repos/ovirt/experimental/master/latest/rpm/el7/
>
> 3: Run lago and OST locally:
> (as documented here:
> http://ovirt-system-tests.readthedocs.io/en/latest/
> you'd need to pass in the vdsm and engine packages to use)
Do you know how to set up the system so it runs all the setup code up to
the code that causes the getPV errors?
We need to inspect the system at this point.
Nir