[ovirt-devel] [ OST Failure Report ] [ oVirt Master ] [ 06-11-2017 ] [ 002_bootstrap.verify_add_hosts ]

Yedidyah Bar David didi at redhat.com
Mon Nov 6 14:15:41 UTC 2017


On Mon, Nov 6, 2017 at 1:57 PM, Dafna Ron <dron at redhat.com> wrote:
> adding Didi.
>
>
> On 11/06/2017 11:51 AM, Ala Hino wrote:
>
> Suspected patch (https://gerrit.ovirt.org/#/c/83612/) is about cold merge
> and has nothing to do with host deploy.
>
> On Mon, Nov 6, 2017 at 1:39 PM, Dafna Ron <dron at redhat.com> wrote:
>>
>> Hi,
>>
>> We failed test 002_bootstrap.verify_add_hosts
>>
>> I can see we only tried to install one of the hosts (host-0) and failed.
>> The second host has no log, which means we did not try to deploy it.
>>
>> The error suggests that ovirt-imageio-daemon failed to start. However,
>> there is another message, about conflicting vdsm and libvirt
>> configurations, that I think should also be addressed.
>>
>> Link to suspected patches: https://gerrit.ovirt.org/#/c/83612/
>>
>>
>> Link to Job:
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/
>>
>>
>> Link to all logs:
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/
>>
>>
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20171106025647-lago-basic-suite-master-host-0-5530ab1f.log
>>
>>
>> (Relevant) error snippet from the log:
>>
>> <error>
>>
>> 2017-11-06 02:56:46,526-0500 DEBUG
>> otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921
>> execute-output: ('/usr/bin/vdsm-tool', 'configure', '--force') stdout:
>>
>> Checking configuration status...
>>
>> abrt is not configured for vdsm
>> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on
>> vdsm configuration
>> lvm requires configuration
>> libvirt is not configured for vdsm yet
>> FAILED: conflicting vdsm and libvirt-qemu tls configuration.
>> vdsm.conf with ssl=True requires the following changes:
>> libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
>> qemu.conf: spice_tls=1.
>> multipath requires configuration
>>
>>
>> 2017-11-06 02:56:47,551-0500 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.execute:926 execute-output: ('/usr/bin/systemctl', 'start',
>> 'ovirt-imageio-daemon.service') stderr:
>> Job for ovirt-imageio-daemon.service failed because the control process
>> exited with error code. See "systemctl status ovirt-imageio-daemon.service"
>> and "journalctl -xe" for details.
>>
>> 2017-11-06 02:56:47,552-0500 DEBUG otopi.context
>> context._executeMethod:143 method exception
>> Traceback (most recent call last):
>>   File "/tmp/ovirt-R4R8gZhaQI/pythonlib/otopi/context.py", line 133, in
>> _executeMethod
>>     method['method']()
>>   File
>> "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/ovirt-host-deploy/vdsm/packages.py",
>> line 179, in _start
>>     self.services.state('ovirt-imageio-daemon', True)
>>   File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/otopi/services/systemd.py",
>> line 141, in state
>>     service=name,
>> RuntimeError: Failed to start service 'ovirt-imageio-daemon'
>> 2017-11-06 02:56:47,553-0500 ERROR otopi.context
>> context._executeMethod:152 Failed to execute stage 'Closing up': Failed to
>> start service 'ovirt-imageio-daemon'

/var/log/messages on the host [1] shows why the daemon failed to start:

Nov  6 02:56:47 lago-basic-suite-master-host-0 systemd: Starting oVirt
ImageIO Daemon...
Nov  6 02:56:47 lago-basic-suite-master-host-0 python: detected
unhandled Python exception in '/usr/bin/ovirt-imageio-daemon'
Nov  6 02:56:47 lago-basic-suite-master-host-0 python: can't
communicate with ABRT daemon, is it running? [Errno 2] No such file or
directory
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
Traceback (most recent call last):
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
server.main(sys.argv)
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py",
line 57, in main
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
start(config)
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py",
line 85, in start
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
WSGIRequestHandler)
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
self.server_bind()
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in
server_bind
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
HTTPServer.server_bind(self)
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in
server_bind
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
SocketServer.TCPServer.server_bind(self)
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
self.socket.bind(self.server_address)
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
File "/usr/lib64/python2.7/socket.py", line 224, in meth
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
return getattr(self._sock,name)(*args)
Nov  6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon:
socket.error: [Errno 98] Address already in use
Nov  6 02:56:47 lago-basic-suite-master-host-0 systemd:
ovirt-imageio-daemon.service: main process exited, code=exited,
status=1/FAILURE
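
For context, Errno 98 on Linux is EADDRINUSE: bind() fails because another
socket already holds the address. Below is a minimal sketch that reproduces
the same class of error. It is not the daemon's code; the loopback address,
the kernel-chosen port and the try_bind helper are assumptions made only for
illustration.

```python
import errno
import socket


def try_bind(host, port):
    """Try to bind a TCP socket; return None on success, else the errno."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


# First listener: port 0 asks the kernel for any free port (illustrative only).
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
first.listen(1)
port = first.getsockname()[1]

# A second bind to the same address fails while the first socket is listening.
result = try_bind("127.0.0.1", port)
print(errno.errorcode.get(result))

first.close()
```

So the traceback above only says that something was still bound to the
daemon's address at 02:56:47; it does not say what.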

ovirt-host-deploy stops the daemon and, less than 350 ms later, tries to
start it again:

2017-11-06 02:56:47,203-0500 DEBUG
otopi.plugins.otopi.services.systemd plugin.executeRaw:863
execute-result: ('/usr/bin/systemctl', 'stop',
'ovirt-imageio-daemon.service'), rc=0
...
2017-11-06 02:56:47,550-0500 DEBUG
otopi.plugins.otopi.services.systemd plugin.executeRaw:863
execute-result: ('/usr/bin/systemctl', 'start',
'ovirt-imageio-daemon.service'), rc=1
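
If this is a race between the stop and the start (an assumption, I have not
confirmed it), one way to test the theory is to wait until the port can be
bound before starting the service again. A rough sketch follows. The helpers
port_is_free and wait_for_port_free are hypothetical, not part of otopi or
ovirt-host-deploy, and the port the daemon listens on is not shown in these
logs, so it is left as a parameter.

```python
import socket
import time


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
        return True
    except OSError:  # e.g. [Errno 98] Address already in use
        return False
    finally:
        probe.close()


def wait_for_port_free(port, host="127.0.0.1", timeout=5.0, interval=0.1):
    """Poll until the port can be bound or the timeout expires.

    Returns True if the port became free, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while not port_is_free(port, host):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

Calling wait_for_port_free() between the 'systemctl stop' and the
'systemctl start' would show whether the old process was still holding the
socket. Probing by binding is itself racy, so this is a debugging aid rather
than a fix.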

Also, the imageio daemon's log [2] looks a bit odd to me: it has 5 'Starting'
lines, but none of the other lines I would expect from reading its source,
and which I do see in another run that finished successfully [3].

Adding Idan, but not sure it's a bug in the daemon.

[1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/

[2] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/ovirt-imageio-daemon/daemon.log

[3] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3628/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/ovirt-imageio-daemon/daemon.log

>>
>> </error>
>>
>>
>
>



-- 
Didi

