[ovirt-devel] test-repo_ovirt_experimental_master job - failed

Yaniv Kaul ykaul at redhat.com
Sun Oct 30 10:40:07 UTC 2016


On Sun, Oct 30, 2016 at 12:26 PM, Nadav Goldin <ngoldin at redhat.com> wrote:

> Hi all, bumping this thread due to an almost identical failure[1]:
>
> ovirt-log-collector/ovirt-log-collector-20161030053238.log:2016-10-30
> 05:33:09::ERROR::__main__::791::root:: Failed to collect logs from:
> 192.168.200.4; /bin/ls:
> /rhev/data-center/mnt/blockSD/63c4fdd3-5d0f-4d16-b1e5-5f43caa4cf82/master/tasks/6b3b6aa1-808c-42df-9db7-52349f8533f2/6b3b6aa1-808c-42df-9db7-52349f8533f2.job.0:
> No such file or directory
> ovirt-log-collector/ovirt-log-collector-20161030053238.log-/bin/ls:
> cannot access
> /rhev/data-center/mnt/blockSD/63c4fdd3-5d0f-4d16-b1e5-5f43caa4cf82/master/tasks/6b3b6aa1-808c-42df-9db7-52349f8533f2/6b3b6aa1-808c-42df-9db7-52349f8533f2.recover.1:
> No such file or directory
> ovirt-log-collector/ovirt-log-collector-20161030053238.log-/bin/ls:
> cannot access
> /rhev/data-center/mnt/blockSD/63c4fdd3-5d0f-4d16-b1e5-5f43caa4cf82/master/tasks/6b3b6aa1-808c-42df-9db7-52349f8533f2/6b3b6aa1-808c-42df-9db7-52349f8533f2.task:
> No such file or directory
> ovirt-log-collector/ovirt-log-collector-20161030053238.log-/bin/ls:
> cannot access
> /rhev/data-center/mnt/blockSD/63c4fdd3-5d0f-4d16-b1e5-5f43caa4cf82/master/tasks/6b3b6aa1-808c-42df-9db7-52349f8533f2/6b3b6aa1-808c-42df-9db7-52349f8533f2.recover.0:
> No such file or directory
>
> To be sure, I checked lago/OST and couldn't find any stage that
> references '/rhev' or manipulates ovirt-log-collector; the only
> customization is an 'ovirt-log-collector.conf' with user/password.
> The code that pulls the logs in OST[2] runs the following command on
> the engine VM (and that is where it fails):
>
> ovirt-log-collector --conf /root/ovirt-log-collector.conf
>
> The failure comes right after the 'add_secondary_storage_domains'[3]
> test, all of whose steps ran successfully.
>

Not exactly.


>
> Can anyone look into this?
>

It may be my fault, in a way. I added the log collector test to run in
parallel with the tests that add the secondary storage domains. The
directories it tries to access may or may not exist at that moment, so
this is probably racy. I don't think it should fail, but I can certainly
see why it can.
The easiest 'fix' would be to split it into its own test (I wanted to save
execution time, as most of the time spent in the secondary storage domains
test is not really useful).
Y.
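
To illustrate the kind of race described above: a collector that first
lists a directory and then inspects each entry can see "No such file or
directory" if another process removes an entry in between. Below is a
minimal Python sketch of a listing that tolerates this. It is only an
illustration, not how ovirt-log-collector or the sos vdsm plugin work;
the function names stat_names and list_sizes are hypothetical.

```python
import errno
import os


def stat_names(directory, names):
    """Return {name: size_in_bytes} for the given names, skipping vanished ones."""
    sizes = {}
    for name in names:
        path = os.path.join(directory, name)
        try:
            sizes[name] = os.stat(path).st_size
        except OSError as exc:
            # ENOENT means the entry was removed after it was listed. That is
            # the benign race, so skip it. Any other error is re-raised.
            if exc.errno != errno.ENOENT:
                raise
    return sizes


def list_sizes(directory):
    """List a directory and stat each entry, tolerating concurrent removals."""
    try:
        names = os.listdir(directory)
    except OSError as exc:
        if exc.errno != errno.ENOENT:
            raise
        # The whole directory vanished before it could be listed.
        return {}
    return stat_names(directory, names)
```

Treating only ENOENT as benign keeps real problems (permissions, I/O
errors) visible while ignoring entries that disappear mid-collection.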



>
> Thanks,
> Nadav.
>
> [1] http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-fc24-x86_64/141/console
> [2] https://github.com/oVirt/ovirt-system-tests/blob/master/basic_suite_master/test-scenarios/002_bootstrap.py#L490
> [3] https://github.com/oVirt/ovirt-system-tests/blob/master/basic_suite_master/test-scenarios/002_bootstrap.py#L243
>
>
> On Tue, Sep 20, 2016 at 9:45 AM, Sandro Bonazzola <sbonazzo at redhat.com>
> wrote:
> >
> >
> >
> > On Fri, Sep 9, 2016 at 1:19 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
> >>
> >> Indeed, this is the log collector. I wonder if we collect its logs...
> >> Y.
> >
> >
> > This can't be log-collector; it could be the sos vdsm plugin.
> > That said, if we run log-collector within lago we should collect the
> > results as job artifacts.
> >
> >
> >>
> >>
> >>
> >> On Thu, Sep 8, 2016 at 6:54 PM, Eyal Edri <eedri at redhat.com> wrote:
> >>>
> >>> I'm pretty sure lago or ovirt system tests aren't doing it, but it's
> >>> the log collector which is running during that test. I'm not near a
> >>> computer, so I can't verify it yet.
> >>>
> >>>
> >>> On Sep 8, 2016 6:05 PM, "Nir Soffer" <nsoffer at redhat.com> wrote:
> >>>>
> >>>> On Thu, Sep 8, 2016 at 5:45 PM, Eyal Edri <eedri at redhat.com> wrote:
> >>>> > Adding devel.
> >>>> >
> >>>> > On Thu, Sep 8, 2016 at 5:43 PM, Shlomo Ben David <sbendavi at redhat.com>
> >>>> > wrote:
> >>>> >>
> >>>> >> Hi,
> >>>> >>
> >>>> >> Job [1] is failing with the following error:
> >>>> >>
> >>>> >> lago.ssh: DEBUG: Command 8de75538 on lago_basic_suite_master_engine
> >>>> >> errors:
> >>>> >>  ERROR: Failed to collect logs from: 192.168.200.2; /bin/ls:
> >>>> >> /rhev/data-center/mnt/blockSD/eb8c9f48-5f23-48dc-ab7d-9451890fd422/master/tasks/1350bed7-443e-4ae6-ae1f-9b24d18c70a8.temp:
> >>>> >> No such file or directory
> >>>> >> /bin/ls: cannot open directory
> >>>> >> /rhev/data-center/mnt/blockSD/eb8c9f48-5f23-48dc-ab7d-9451890fd422/master/tasks/1350bed7-443e-4ae6-ae1f-9b24d18c70a8.temp:
> >>>> >> No such file or directory
> >>>>
> >>>> This looks like a lago issue - it should never read anything inside
> >>>> /rhev.
> >>>>
> >>>> This is a private directory for vdsm; no other process should ever
> >>>> depend on the content inside this directory, or even on the fact
> >>>> that it exists.
> >>>>
> >>>> In particular, /rhev/data-center/mnt/blockSD/*/master/tasks/*.temp
> >>>> is not a log file, and lago should not collect it.
> >>>>
> >>>> Nir
> >>>>
> >>>> >> lago.utils: ERROR: Error while running thread
> >>>> >> Traceback (most recent call last):
> >>>> >>   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 53, in _ret_via_queue
> >>>> >>     queue.put({'return': func()})
> >>>> >>   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic_suite_master/test-scenarios/002_bootstrap.py", line 493, in log_collector
> >>>> >>     result.code, 0, 'log collector failed. Exit code is %s' % result.code
> >>>> >>   File "/usr/lib/python2.7/site-packages/nose/tools/trivial.py", line 29, in eq_
> >>>> >>     raise AssertionError(msg or "%r != %r" % (a, b))
> >>>> >> AssertionError: log collector failed. Exit code is 2
> >>>> >>
> >>>> >>
> >>>> >> * The previous issue (SDK) is already fixed, and now we have a new
> >>>> >> issue in the same area.
> >>>> >>
> >>>> >>
> >>>> >> [1] -
> >>>> >> http://jenkins.ovirt.org/view/experimental%20jobs/job/test-repo_ovirt_experimental_master/1462/testReport/(root)/002_bootstrap/add_secondary_storage_domains/
> >>>> >>
> >>>> >>
> >>>> >> Best Regards,
> >>>> >>
> >>>> >> Shlomi Ben-David | DevOps Engineer | Red Hat ISRAEL
> >>>> >> RHCSA | RHCE
> >>>> >> IRC: shlomibendavid (on #rhev-integ, #rhev-dev, #rhev-ci)
> >>>> >>
> >>>> >> OPEN SOURCE - 1 4 011 && 011 4 1
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> > --
> >>>> > Eyal Edri
> >>>> > Associate Manager
> >>>> > RHV DevOps
> >>>> > EMEA ENG Virtualization R&D
> >>>> > Red Hat Israel
> >>>> >
> >>>> > phone: +972-9-7692018
> >>>> > irc: eedri (on #tlv #rhev-dev #rhev-integ)
> >>>> >
> >>>> > _______________________________________________
> >>>> > Devel mailing list
> >>>> > Devel at ovirt.org
> >>>> > http://lists.ovirt.org/mailman/listinfo/devel
> >>
> >>
> >
> >
> >
> > --
> > Sandro Bonazzola
> > Better technology. Faster innovation. Powered by community collaboration.
> > See how it works at redhat.com
> >
> >
>