I see that this failure happens a lot on "ovirt-srv19.phx.ovirt.org", and in different projects that use Ansible.
Not sure it is related, but I found (and removed) a stale lago environment in "/dev/shm" that was created by ovirt-system-tests_he-basic-iscsi-suite-master.
The stale environment prevented the suite from running in "/dev/shm".
The maximum number of semaphore arrays on both ovirt-srv19.phx.ovirt.org and ovirt-srv23.phx.ovirt.org (which runs the ansible suite successfully) is 128.
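For future triage on these slaves, a minimal sketch (nothing here is taken from the job itself) that sums up what is occupying /dev/shm, so a stale lago prefix or leftover sem.* files stand out immediately:

import os

SHM = '/dev/shm'  # tmpfs that backs lago's scratch space and POSIX semaphores

# Sum sizes per top-level entry under /dev/shm, largest first.
sizes = {}
for root, dirs, files in os.walk(SHM):
    for name in files:
        path = os.path.join(root, name)
        top = os.path.relpath(path, SHM).split(os.sep)[0]
        try:
            sizes[top] = sizes.get(top, 0) + os.path.getsize(path)
        except OSError:
            pass  # entry vanished while we were walking
for top, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
    print('%10d KiB  %s' % (size // 1024, top))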

On Mon, Mar 19, 2018 at 3:37 PM, Yedidyah Bar David <didi@redhat.com> wrote:
Failed also here:

http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4540/

The patch triggering this affects many suites, and the job failed during ansible-suite-master.

On Mon, Mar 19, 2018 at 3:10 PM, Eyal Edri <eedri@redhat.com> wrote:
Gal and Daniel are looking into it; strange that it's not affecting all suites.

On Mon, Mar 19, 2018 at 2:11 PM, Dominik Holler <dholler@redhat.com> wrote:
Looks like /dev/shm has run out of space.
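
As far as I know, Python's multiprocessing locks are POSIX semaphores created with sem_open(), which glibc backs with small files on the /dev/shm tmpfs, so ENOSPC from SemLock points at that filesystem being full rather than at a kernel semaphore limit. A quick check, assuming it is run on the affected slave:

import os

st = os.statvfs('/dev/shm')
free_mib = st.f_bavail * st.f_frsize // (1024 * 1024)
total_mib = st.f_blocks * st.f_frsize // (1024 * 1024)
print('/dev/shm: %d MiB free of %d MiB' % (free_mib, total_mib))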

On Mon, 19 Mar 2018 13:33:28 +0200
Leon Goldberg <lgoldber@redhat.com> wrote:

> Hey, any updates?
>
> On Sun, Mar 18, 2018 at 10:44 AM, Edward Haas <ehaas@redhat.com>
> wrote:
>
> > We are doing nothing special there, just executing Ansible through its API.
> >
> > On Sun, Mar 18, 2018 at 10:42 AM, Daniel Belenky
> > <dbelenky@redhat.com> wrote:
> >
> >> It's not a space issue. Other suites ran successfully on that slave after
> >> yours. I think the problem is the max-semaphores setting, though I don't
> >> know what you're doing to reach that limit.
> >>
> >> [dbelenky@ovirt-srv18 ~]$ ipcs -ls
> >>
> >> ------ Semaphore Limits --------
> >> max number of arrays = 128
> >> max semaphores per array = 250
> >> max semaphores system wide = 32000
> >> max ops per semop call = 32
> >> semaphore max value = 32767
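> >>
> >> Note that ipcs -ls reports the System V limits, while as far as I can tell
> >> Python's multiprocessing allocates POSIX semaphores (small files under
> >> /dev/shm), which are not counted against those numbers. A rough check of
> >> the SysV side, assuming the standard /proc layout, would be:
> >>
> >> # Compare current SysV semaphore arrays against the SEMMNI limit (128 here).
> >> with open('/proc/sys/kernel/sem') as f:
> >>     semmsl, semmns, semopm, semmni = map(int, f.read().split())
> >> with open('/proc/sysvipc/sem') as f:
> >>     arrays = max(len(f.readlines()) - 1, 0)  # first line is a header
> >> print('SysV semaphore arrays in use: %d of %d' % (arrays, semmni))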
> >>
> >>
> >> On Sun, Mar 18, 2018 at 10:31 AM, Edward Haas <ehaas@redhat.com>
> >> wrote:
> >>> http://jenkins.ovirt.org/job/ovirt-system-tests_network-suite-master/
> >>>
> >>> On Sun, Mar 18, 2018 at 10:24 AM, Daniel Belenky
> >>> <dbelenky@redhat.com> wrote:
> >>>
> >>>> Hi Edi,
> >>>>
> >>>> Are there any logs? Where are you running the suite? May I have a link?
> >>>>
> >>>> On Sun, Mar 18, 2018 at 8:20 AM, Edward Haas <ehaas@redhat.com>
> >>>> wrote:
> >>>>> Good morning,
> >>>>>
> >>>>> We are running a test module with Ansible in the OST network suite, and
> >>>>> over the weekend it started failing with "OSError: [Errno 28] No space
> >>>>> left on device" when attempting to take a lock in the Python
> >>>>> multiprocessing module.
> >>>>>
> >>>>> It smells like a slave resource problem; could someone help investigate?
> >>>>>
> >>>>> Thanks,
> >>>>> Edy.
> >>>>>
> >>>>> =================================== FAILURES ===================================
> >>>>> ______________________ test_ovn_provider_create_scenario _______________________
> >>>>>
> >>>>> os_client_config = None
> >>>>>
> >>>>>     def test_ovn_provider_create_scenario(os_client_config):
> >>>>> >       _test_ovn_provider('create_scenario.yml')
> >>>>>
> >>>>> network-suite-master/tests/test_ovn_provider.py:68:
> >>>>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> >>>>> network-suite-master/tests/test_ovn_provider.py:78: in _test_ovn_provider
> >>>>>     playbook.run()
> >>>>> network-suite-master/lib/ansiblelib.py:127: in run
> >>>>>     self._run_playbook_executor()
> >>>>> network-suite-master/lib/ansiblelib.py:138: in _run_playbook_executor
> >>>>>     pbex = PlaybookExecutor(**self._pbex_args)
> >>>>> /usr/lib/python2.7/site-packages/ansible/executor/playbook_executor.py:60: in __init__
> >>>>>     self._tqm = TaskQueueManager(inventory=inventory, variable_manager=variable_manager, loader=loader, options=options, passwords=self.passwords)
> >>>>> /usr/lib/python2.7/site-packages/ansible/executor/task_queue_manager.py:104: in __init__
> >>>>>     self._final_q = multiprocessing.Queue()
> >>>>> /usr/lib64/python2.7/multiprocessing/__init__.py:218: in Queue
> >>>>>     return Queue(maxsize)
> >>>>> /usr/lib64/python2.7/multiprocessing/queues.py:63: in __init__
> >>>>>     self._rlock = Lock()
> >>>>> /usr/lib64/python2.7/multiprocessing/synchronize.py:147: in __init__
> >>>>>     SemLock.__init__(self, SEMAPHORE, 1, 1)
> >>>>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> >>>>>
> >>>>> self = <Lock(owner=unknown)>, kind = 1, value = 1, maxvalue = 1
> >>>>>
> >>>>>     def __init__(self, kind, value, maxvalue):
> >>>>> >       sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
> >>>>> E       OSError: [Errno 28] No space left on device
> >>>>>
> >>>>> /usr/lib64/python2.7/multiprocessing/synchronize.py:75: OSError
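> >>>>>
> >>>>> A minimal probe along the same call path, assuming it is run on the affected
> >>>>> slave, would show whether the host can currently allocate such a lock at all:
> >>>>>
> >>>>> import errno
> >>>>> import multiprocessing
> >>>>>
> >>>>> try:
> >>>>>     q = multiprocessing.Queue()  # same SemLock allocation as TaskQueueManager above
> >>>>>     print('OK: SemLock allocated, /dev/shm has room')
> >>>>> except OSError as e:
> >>>>>     if e.errno == errno.ENOSPC:
> >>>>>         print('ENOSPC: the tmpfs backing the POSIX semaphore (/dev/shm) is full')
> >>>>>     else:
> >>>>>         raise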
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>> DANIEL BELENKY
> >>>>
> >>>> RHV DEVOPS
> >>>>
> >>>
> >>>
> >>
> >>
> >> --
> >>
> >> DANIEL BELENKY
> >>
> >> RHV DEVOPS
> >>
> >
> >




--

Eyal Edri

MANAGER
RHV DevOps
EMEA VIRTUALIZATION R&D

Red Hat EMEA
TRIED. TESTED. TRUSTED.
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)





--
Didi



--
GAL BEN HAIM
RHV DEVOPS