
On Mon, Nov 6, 2017 at 1:57 PM, Dafna Ron <dron@redhat.com> wrote:
adding Didi.
On 11/06/2017 11:51 AM, Ala Hino wrote:
Suspected patch (https://gerrit.ovirt.org/#/c/83612/) is about cold merge and has nothing to do with host deploy.
On Mon, Nov 6, 2017 at 1:39 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We failed test 002_bootstrap.verify_add_hosts
I can see we only tried to install one of the hosts (host-0), and that failed. The second host has no log, which means we did not try to deploy it.
The error suggests that ovirt-imageio-daemon failed to start. However, there is another message that I think should be addressed, about conflicting vdsm and libvirt configurations.
Link to suspected patches: https://gerrit.ovirt.org/#/c/83612/
Link to Job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/
Link to all logs: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
(Relevant) error snippet from the log:
<error>
2017-11-06 02:56:46,526-0500 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921 execute-output: ('/usr/bin/vdsm-tool', 'configure', '--force') stdout:
Checking configuration status...
abrt is not configured for vdsm
WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
lvm requires configuration
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.
multipath requires configuration
2017-11-06 02:56:47,551-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service') stderr: Job for ovirt-imageio-daemon.service failed because the control process exited with error code. See "systemctl status ovirt-imageio-daemon.service" and "journalctl -xe" for details.
2017-11-06 02:56:47,552-0500 DEBUG otopi.context context._executeMethod:143 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-R4R8gZhaQI/pythonlib/otopi/context.py", line 133, in _executeMethod
    method['method']()
  File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 179, in _start
    self.services.state('ovirt-imageio-daemon', True)
  File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/otopi/services/systemd.py", line 141, in state
    service=name,
RuntimeError: Failed to start service 'ovirt-imageio-daemon'
2017-11-06 02:56:47,553-0500 ERROR otopi.context context._executeMethod:152 Failed to execute stage 'Closing up': Failed to start service 'ovirt-imageio-daemon'
In /var/log/messages of the host [1], there is:

Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: Starting oVirt ImageIO Daemon...
Nov 6 02:56:47 lago-basic-suite-master-host-0 python: detected unhandled Python exception in '/usr/bin/ovirt-imageio-daemon'
Nov 6 02:56:47 lago-basic-suite-master-host-0 python: can't communicate with ABRT daemon, is it running? [Errno 2] No such file or directory
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: Traceback (most recent call last):
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: server.main(sys.argv)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 57, in main
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: start(config)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 85, in start
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: WSGIRequestHandler)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.server_bind()
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: HTTPServer.server_bind(self)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: SocketServer.TCPServer.server_bind(self)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.socket.bind(self.server_address)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/socket.py", line 224, in meth
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: return getattr(self._sock,name)(*args)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: socket.error: [Errno 98] Address already in use
Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: ovirt-imageio-daemon.service: main process exited, code=exited, status=1/FAILURE

ovirt-host-deploy stops it, and immediately tries to start it:

2017-11-06 02:56:47,203-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'stop', 'ovirt-imageio-daemon.service'), rc=0
...
2017-11-06 02:56:47,550-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service'), rc=1

Also, imageio-daemon's log [2] looks a bit weird to me - it has 5 'Starting' lines, but no other lines I would have expected to have, reading its source, and as I can see in another run, that did finish successfully [3].

Adding Idan, but not sure it's a bug in the daemon.

[1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
[2] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
[3] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
</error>
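For anyone who wants to poke at the "Address already in use" part locally: the two systemctl results quoted above are about 0.35 s apart (02:56:47,203 for stop, 02:56:47,550 for start), and the daemon dies in socket.bind() with Errno 98. Below is a minimal sketch, not taken from otopi or ovirt-host-deploy, of how one could check whether a TCP port can be bound before starting a service again. The helper names port_is_free and wait_until_port_free are hypothetical, and host/port are parameters because this thread does not say which address the daemon binds. The check sets SO_REUSEADDR because Python's HTTPServer does the same before binding, so a failure here should mean something still holds the port rather than leftover TIME_WAIT state. It does not tell you who holds the port, and it may not be the right fix for this failure.

```python
import errno
import socket
import time


def port_is_free(host, port):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Mirror HTTPServer, which sets SO_REUSEADDR before it binds.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        return True
    except socket.error as e:
        if e.errno == errno.EADDRINUSE:
            return False
        # Any other error (permissions, bad address) is not ours to hide.
        raise
    finally:
        sock.close()


def wait_until_port_free(host, port, timeout=5.0, interval=0.1):
    """Poll until the port can be bound or the timeout expires.

    Returns True if the port became free, False if it was still busy
    when the timeout ran out.
    """
    deadline = time.time() + timeout
    while True:
        if port_is_free(host, port):
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Demo with a throwaway listener on an ephemeral port (an assumption
    # for illustration only, unrelated to the daemon's real port).
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind(("127.0.0.1", 0))
    holder.listen(1)
    busy_port = holder.getsockname()[1]
    print("free while held:", port_is_free("127.0.0.1", busy_port))
    holder.close()
    print("free after close:", wait_until_port_free("127.0.0.1", busy_port))
```

Something like wait_until_port_free(host, port) between the stop and the start would show whether the old process simply had not released the socket yet, or whether something else entirely is listening there.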
-- Didi