[ovirt-devel] oVirt system tests currently failing to AddHost on master

Dan Kenigsberg danken at redhat.com
Fri Dec 23 16:25:26 UTC 2016


On Fri, Dec 23, 2016 at 6:20 PM, Barak Korren <bkorren at redhat.com> wrote:
> On 22 December 2016 at 21:56, Nir Soffer <nsoffer at redhat.com> wrote:
>> On Thu, Dec 22, 2016 at 9:12 PM, Fred Rolland <frolland at redhat.com> wrote:
>>> SuperVdsm fails to start:
>>>
>>> MainThread::ERROR::2016-12-22
>>> 12:42:08,699::supervdsmServer::317::SuperVdsm.Server::(main) Could not start
>>> Super Vdsm
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/supervdsmServer", line 297, in main
>>>     server = manager.get_server()
>>>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 493, in
>>> get_server
>>>     self._authkey, self._serializer)
>>>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 162, in
>>> __init__
>>>     self.listener = Listener(address=address, backlog=16)
>>>   File "/usr/lib64/python2.7/multiprocessing/connection.py", line 136, in
>>> __init__
>>>     self._listener = SocketListener(address, family, backlog)
>>>   File "/usr/lib64/python2.7/multiprocessing/connection.py", line 260, in
>>> __init__
>>>     self._socket.bind(address)
>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>     return getattr(self._sock,name)(*args)
>>> error: [Errno 2] No such file or directory
>>>
>>>
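The "[Errno 2] No such file or directory" on bind() usually means the
directory that is supposed to hold the unix socket does not exist yet. A
minimal sketch that reproduces the same failure (not vdsm code; the socket
path below is only an illustration):

    import socket

    # Binding a unix socket whose parent directory is missing raises the
    # same socket.error seen in the supervdsm traceback above.
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind('/run/nonexistent-dir/svdsm.sock')  # socket.error: [Errno 2]
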
>>> On Thu, Dec 22, 2016 at 7:54 PM, Barak Korren <bkorren at redhat.com> wrote:
>>>>
>>>> It's hard to tell when this started, because we had so many package
>>>> issues that made the tests fail before reaching that point for most of
>>>> the day.
>>>>
>>>> Since we currently have an issue in Lago with collecting AddHost logs
>>>> (hopefully we'll resolve this in the next release early next week),
>>>> I've run the tests locally and attached the bundle of generated logs
>>>> to this message.
>>>>
>>>> Included in the attached file are engine logs, host-deploy logs and
>>>> VDSM logs for both test hosts.
>>>>
>>>> From a quick look inside, it seems the issue is VDSM failing to start.
>>
>> From host-deploy/ovirt-host-deploy-20161222124209-192.168.203.4-604a4799.log:
>>
>> 2016-12-22 12:42:05 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.executeRaw:813 execute: ('/bin/systemctl', 'start',
>> 'vdsmd.service'), executable='None', cwd='None', env=None
>> 2016-12-22 12:42:09 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start',
>> 'vdsmd.service'), rc=1
>> 2016-12-22 12:42:09 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.execute:921 execute-output: ('/bin/systemctl', 'start',
>> 'vdsmd.service') stdout:
>>
>>
>> 2016-12-22 12:42:09 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.execute:926 execute-output: ('/bin/systemctl', 'start',
>> 'vdsmd.service') stderr:
>> A dependency job for vdsmd.service failed. See 'journalctl -xe' for details.
>>
>> This means that one of the services vdsm depends on could not start.
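A quick, hedged way to see which dependency actually failed on the host
(sketch only; it just asks systemd about each unit vdsmd pulls in):

    import subprocess

    # List vdsmd's dependency units and flag any that are not active.
    deps = subprocess.check_output(
        ['systemctl', 'list-dependencies', '--plain', 'vdsmd.service'])
    for unit in deps.split():
        if subprocess.call(['systemctl', '--quiet', 'is-active', unit]) != 0:
            print('inactive/failed dependency: %s' % unit)
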
>>
>> 2016-12-22 12:42:09 DEBUG otopi.context context._executeMethod:142
>> method exception
>> Traceback (most recent call last):
>>   File "/tmp/ovirt-bUCuRxXXzU/pythonlib/otopi/context.py", line 132,
>> in _executeMethod
>>     method['method']()
>>   File "/tmp/ovirt-bUCuRxXXzU/otopi-plugins/ovirt-host-deploy/vdsm/packages.py",
>> line 209, in _start
>>     self.services.state('vdsmd', True)
>>   File "/tmp/ovirt-bUCuRxXXzU/otopi-plugins/otopi/services/systemd.py",
>> line 141, in state
>>     service=name,
>> RuntimeError: Failed to start service 'vdsmd'
>>
>> This error is not very useful for anyone. What we need in the otopi log is
>> the output of 'journalctl -xe' (as suggested by systemctl).
>>
>> Didi, can we collect this info when starting a service fails?
>>
>> Barak, can you log in to the host with this error and collect the output?
>>
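Regarding collecting that info in the otopi log: a hedged sketch of the idea
(not actual otopi code; the helper name is made up), capturing the tail of
the journal for the unit whenever starting it fails:

    import logging
    import subprocess

    def log_journal_for_unit(unit, lines=50):
        # Dump the last `lines` journal entries for the unit into the log,
        # so the log bundle alone is enough to diagnose the failure.
        try:
            out = subprocess.check_output(
                ['journalctl', '--no-pager', '-n', str(lines), '-u', unit],
                stderr=subprocess.STDOUT,
            )
            logging.getLogger(__name__).debug('journal for %s:\n%s', unit, out)
        except Exception:
            logging.getLogger(__name__).exception('journal collection failed')

    # e.g. in the failure path that raised "Failed to start service 'vdsmd'":
    # log_journal_for_unit('vdsmd.service')
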
> By the time I logged in to the host, all IP addresses were gone (I'm
> guessing the setup process killed dhclient), so I had to work via the
> serial console.
>
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> state UP qlen 1000
>     link/ether 54:52:c0:a8:cb:02 brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5652:c0ff:fea8:cb02/64 scope link
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> state UP qlen 1000
>     link/ether 54:52:c0:a8:cc:02 brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5652:c0ff:fea8:cc02/64 scope link
>        valid_lft forever preferred_lft forever
> 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> state UP qlen 1000
>     link/ether 54:52:c0:a8:cc:03 brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5652:c0ff:fea8:cc03/64 scope link
>        valid_lft forever preferred_lft forever
> 5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> state UP qlen 1000
>     link/ether 54:52:c0:a8:ca:02 brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5652:c0ff:fea8:ca02/64 scope link
>        valid_lft forever preferred_lft forever
>
>
> Here is the interesting stuff I can gather from journalctl:
>
> Dec 22 12:42:06 lago-basic-suite-master-host0
> ovirt-imageio-daemon[5007]: Traceback (most recent call last):
> Dec 22 12:42:06 lago-basic-suite-master-host0
> ovirt-imageio-daemon[5007]: File "/usr/bin/ovirt-imageio-daemon", line
> 14, in <module>


Thanks, Barak.

My guess stays with

Bug 1400003 - imageio fails during system startup

as the culprit.

